Cassandra.Lunch
Resources from weekly Zoom lunches revolving around Apache Cassandra and Apache Cassandra-related topics. Hosted by Anant Corporation.
Join Cassandra Lunch Weekly at 12 PM EST Every Thursday!
Watch Cassandra Lunches Live and Subscribe to Our YouTube Channel to Keep Up to Date!
If you would like to be a guest speaker, you can reach us at solutions@anant.us. If you would like to sponsor Cassandra Lunch, please reach us at the email listed above.
Check out the Cassandra.Lunch playlist on YouTube
Table of Contents
Apache Cassandra Lunch Online Meetup #10: Cassandra 4.0
- We discuss and take an in-depth look at the improvements and new features that come with Cassandra 4.0.
- We discuss various Cassandra distributions ranging from Cassandra / Cassandra Compliant Databases on JVM, Cassandra Compliant Databases on C++, Cassandra as a Service / Managed Cassandra Based on Open Source Cassandra, and Cassandra as a Service / Managed Cassandra Based on Proprietary Technology.
- We cover Kubernetes, discussing what it is and how it works with Docker and Cassandra. We also look at some of Kubernetes' competitors and a variety of open-source tools for Kubernetes, which give insight into why we consider Kubernetes a worthwhile investment when working with databases.
- We discuss a number of projects and platforms that you can use to jumpstart your Cassandra projects. They make useful educational resources as well as good starting codebases for new projects. We also discuss a recent article on the Yugabyte blog about Cassandra.
- We discuss methods for finding and diagnosing issues in Cassandra clusters with ELK/FEK/BEK.
- We discuss Cassandra Backup / Restoration. We also discuss disaster avoidance, disaster recovery, and different tools that can be used for backup and restoration of your Cassandra data. Also, we discuss an example scenario of how someone has set up multi-node clusters and how they go about data backup and restoration.
- We discuss Cassandra anti-entropy, the process of comparing the data of all replicas and updating each replica to the newest version. We also look at repair and synchronization in Cassandra and how you can prepare for the unexpected.
- We discuss deletion and tombstones in Cassandra.
- Guest speaker, Ryan Quey, a full stack data engineer, discusses a personal project he has been working on called java-podcast-processor, which is a tool to find podcast metadata over an external API, store them, get their RSS feeds, and run ETL using Airflow, Kafka, Spark, and Cassandra. The particular Cassandra distribution used is Elassandra, which allows seamless integration with Elasticsearch. The data is also displayed using a Gatsby app and served using Flask.
- We discuss the combined use of relational databases and Cassandra. We cover the advantages of using relational databases and Cassandra separately, as well as the advantages of and methods for using both concurrently.
- We discuss Cassandra read and write paths, which is how Cassandra stores and retrieves data at high speeds. We do not cover how Cassandra replicates data because that is its own subject, but we take a look at these four sub-topics: Write Path, Update / Delete, Maintenance Path, and Read Path.
- We discuss Cassandra and Staged Event-Driven Architecture with an emphasis on Cassandra stages / thread pools. We also discuss a few different tools that we can use to monitor these stages and thread pools in order to keep your Cassandra running as smoothly as possible.
- We discuss deployment and administration tools for Cassandra. We also discuss a number of tools for the installation, configuration, monitoring, and administration of Cassandra clusters.
- We discuss packaged and DIY methods for Lucene-based indexes on Cassandra, as well as some pros and cons of using Lucene-based indexes on Cassandra.
- We discuss a number of use cases for Cassandra, focusing on Cassandra's place in running a digital business technology platform.
- We discuss how Cassandra is used for real-time data platforms and cover different reference architectures in which Cassandra is and can be used.
- We discuss the different ways to deploy Cassandra, whether on bare metal, virtual machines, or containers, along with their pros, cons, and deployment tools.
- We discuss specific scenarios for Cassandra's backup and restore, some methods for restoring data to a Cassandra cluster, and how factors like the topology of a cluster or the need for constant uptime can affect the backup/restore process.
- We discuss updates regarding Cassandra and Kubernetes after the recent KubeCon event.
- We discuss the basics of using Spark and Cassandra together, the advantages of each, and the advantages of using them together. We also discuss the potential drawbacks, and configuration methods for avoiding those drawbacks.
- We discuss open-source tools that can be used for BI with Cassandra including a live demo using DSE, Presto, and Metabase.
- We discuss the various ways of moving data into and out of Cassandra clusters.
- We discuss using Terraform and Ansible to set up the infrastructure for and handle the provisioning of a new Cassandra cluster
- We discuss how to use Liquibase with Cassandra and DataStax Astra.
- We discuss some basic data operations that you can do with Apache Spark and Cassandra.
- We discuss various databases that can run on top of Cassandra.
- We discuss CQL Copy and how we can use it for Cassandra data operations.
- We discuss Apache Spark projects that interact with Cassandra specifically through Cassandra’s SSTables
- We discuss general updates to Apache Cassandra and relevant articles of interest.
- We discuss Scylla’s Spark Migrator and walk through how we can use the Scylla Migrator for Cassandra Data Operations.
- We discuss Cassandra on Kubernetes and give an introduction to Docker, Kubernetes, and Helm.
- We cover SSTable files and their relation to SSTableLoader, and we walk through an example of using SSTableLoader to load data taken from one cluster into a new, empty cluster.
- We will introduce DSBulk or DataStax Bulk Loader, and show how we can use it with tools like sed and awk to do ETL on Cassandra data.
- In Apache Cassandra Lunch #45, we will discuss how you can stream tweets using Twitter4S (Scala Twitter client) and save them to Cassandra using Alpakka Cassandra.
- In Apache Cassandra Lunch #46, we will discuss how we can use Apache Spark jobs written in Scala to do Cassandra data operations, which will include a live walkthrough!
- In Cassandra Lunch #48, we will discuss using Airflow and Cassandra together. Airflow provides a Cassandra connection type and a Cassandra operator. We will explore what we can do to manage a Cassandra cluster via Airflow.
- We will discuss how to use Spark SQL to do Cassandra data operations such as moving data in Apache Cassandra tables.
- In Apache Cassandra Lunch #50, we will discuss how you can use Apache Spark and Apache Cassandra to perform basic Machine Learning tasks.
- In Cassandra Lunch #51, we will discuss an overview of Cassandra cluster architecture, not to be confused with the Cassandra database architecture. Specifically, we will look at using Cassandra datacenters to isolate workloads.
- In Cassandra Lunch #52, we will continue our discussion on using Airflow and Cassandra together. Last time we discussed the Cassandra operators and how they allow us to manipulate data on a Cassandra cluster. This time we will explore what other Cassandra processes we can manage from Airflow.
- We will discuss how we can set up a Cassandra ETL pipeline using Airflow and Spark
- We will discuss the process of and reasons for migrating your database from SQL (PostgreSQL) to NoSQL (Cassandra).
- We will discuss using Spark Parquet tables in DSEFS and DSE analytics.
- In Cassandra Lunch #58, Rahul Singh will be leading a presentation covering a Cassandra topic we're sure you won't want to miss.
- We will discuss the use of functions (Default and UDFs) in Cassandra.
- In Apache Cassandra Lunch #60, we will discuss how we can use Apache NiFi with Apache Cassandra.
- In Apache Cassandra Lunch #61, we will discuss different ways of indexing and working with Elassandra
- In Apache Cassandra Lunch #62, guest speaker Sarma Pydipally will be presenting on the Grafana Dashboard for Cassandra.
Apache Cassandra Lunch #63: How to Install Cassandra 4.0 From a Tarball On Linux
- In Apache Cassandra Lunch #63, Rahul Singh, CEO of Anant, walked through a live demo of how to install Cassandra 4.0 from a tarball on Linux.
Apache Cassandra Lunch #64: Cassandra for .NET Developers
- In Cassandra Lunch #64, Eric Ramseur (Co-founder, Customer Experience Architect, and Sitecore MVP of Anant) will present on Cassandra for .NET developers.
- In Apache Cassandra Lunch #65 we will discuss how the Spark Cassandra Connector pushes some parts of a query down to Cassandra, and what that has to do with normal Spark SQL predicate pushdown.
- In Apache Cassandra Lunch #66, we will discuss the enterprise version of DBeaver and demonstrate how it can be used with a Cassandra Database.
- In Apache Cassandra Lunch #67, we will discuss how to move data from open-source Cassandra to DataStax Astra using DSBulk/Scylla Migrator.
- In Apache Cassandra Lunch #68, we will introduce the DataStax Apache Kafka Connector and discuss how we can use it to connect Apache Kafka and Cassandra.
Apache Cassandra Lunch #69: k8ssandra
- In Apache Cassandra Lunch #69, we will discuss getting started with k8ssandra
- In Cassandra Lunch #70, we will discuss the basics of Apache Cassandra and set up a standalone Apache Cassandra instance.
- In Cassandra Lunch #71, we will discuss how DataStax Astra can be used as a back-end for a React client. We will demo a small application with a user profile.
- In Cassandra Lunch #72, we will discuss how we can use Databricks with Cassandra.
- In Cassandra Lunch #73, we discuss an overview and comparison of DataStax dependencies for Cassandra, Spark, and Graph.
- In Cassandra Lunch #74, Technical Marketing Manager at ScyllaDB, Peter Corless, presents on ScyllaDB and some of the advantages of using ScyllaDB over open-source Cassandra.
- In Cassandra Lunch #75, we are going to look at getting started with DataStax Enterprise on Docker.
Apache Cassandra Lunch #76: Tombstone Mitigation Strategies - Aaron Ploetz
- In Cassandra Lunch #76, Aaron Ploetz, Tech Author at DataStax, is going to be presenting on Tombstone Mitigation Strategies.
- In Cassandra Lunch #77, we will show you how you can connect to your DataStax Astra database using standalone CQLSH.
- In Cassandra Lunch #78, we will deploy Cassandra using DSE Operator to Kubernetes
Apache Cassandra Lunch #79: Cassandra API in Cosmos DB
- In Cassandra Lunch #79, we will discuss how Cosmos DB compares to Cassandra by reconfiguring an older project, which puts a REST API over a Cassandra table using Cassandra drivers, to use data stored in Cosmos DB instead. We will also discuss Cosmos DB's Cassandra API and its connections to cqlsh.
- In Cassandra Lunch #80: How to Use Cassandra for Content Management, we will be using DataStax Astra as our database to demonstrate content management in Cassandra.
- In Cassandra Lunch #81, we will discuss how we can use Redash to do BI on Cassandra data!
- In Cassandra Lunch #82, we will discuss how to set up an Instaclustr managed Cassandra on Next.js.
- In Cassandra Lunch #83, we will introduce Aiven's Managed Cassandra offering and show how we can connect to Aiven with Node.js and CQLSH
- In Apache Cassandra Lunch #84, Rahul Singh, CEO of Anant, will be presenting on Data Platform Design around Cassandra, Spark, and Kafka.
Apache Cassandra Lunch #85: Top 10 Open-Source Projects Using Cassandra in 2022
- In Cassandra Lunch #85, we will discuss some of the most popular open-source projects using Cassandra in 2022.
- In Cassandra Lunch #86, we will discuss the DataStax Astra Terraform Provider and discuss how it can be used to manage DataStax Astra infrastructure
- In Cassandra Lunch #87, we will work on using AstraDB's included Stargate API layer to substitute for the written Node and Python APIs in our Cassandra.api project.
Apache Cassandra Lunch #88: Cadence
- In Cassandra Lunch #87, we will work on using AstraDB's included Stargate API layer to substitute for the written Node and Python APIs in our Cassandra.api project.
Apache Cassandra Lunch #89: Semi-Structured Data in Cassandra
- In Cassandra Lunch #89, we will discuss how to store and parse semi-structured data in Cassandra using Spark
Apache Cassandra Lunch #90: Securing Apache Cassandra
- In Cassandra Lunch #90, CEO of Anant, Rahul Singh, will discuss the different ways to secure Apache Cassandra. This is going to be an overview of the built-in features as well as other options that can be used
Apache Cassandra Lunch #91: Collections in Cassandra
- In Cassandra Lunch #91, we will discuss the collection types in Cassandra and how the frozen modifier changes the way that Cassandra interacts with them.
Apache Cassandra Lunch #92: Securing Apache Cassandra - Managing Roles and Permissions
- In Cassandra Lunch #92, CEO of Anant, Rahul Singh, will discuss how to design and manage roles and permissions in Apache Cassandra to secure multiple applications and users for a growing platform with new use cases.
- In Cassandra Lunch #93, we will discuss how to use k8ssandra on Digital Ocean
- In Cassandra Lunch #94, Arpan Patel will discuss how to connect StreamSets and Cassandra.
Apache Cassandra Lunch #95: Spark Graph Operations with DSEGraphFrames Scala API
- In Cassandra Lunch #95, Obioma Anomnachi will discuss the DSEGraphFrames library which allows Spark to perform operations on graph databases.
Apache Cassandra Lunch #96: Apache Cassandra Change Data Capture (CDC) Strategies
- In Cassandra Lunch #96, Rahul Singh, CEO of Anant, will discuss different ways to get change data into and out of Cassandra using a few different strategies which could work out for your platform.
Apache Cassandra Lunch #97: Cassandra DataSource for Grafana
- In Apache Cassandra Lunch #97, Obioma Anomnachi will discuss using the new Cassandra Datasource for Grafana to visualize any time series data stored in Cassandra.
Apache Cassandra Lunch #98: Cassandra on k3s
- In Cassandra Lunch #98, Stefan Nikolovski will discuss Cassandra on k3s.
- In Cassandra Lunch #99, Arpan Patel will discuss the CQL Arithmetic Operators that are now supported in Cassandra 4.0!
- In Cassandra Lunch #100, CEO of Anant, Rahul Singh, will discuss which companies currently use Cassandra and what products use it as their backend. We will also take a look at how far Cassandra has come, how players like Scylla and Yugabyte have brought value, and how SaaS and managed service providers can help.
Apache Cassandra Lunch #101: IoT and Cassandra
- In Apache Cassandra Lunch #101, Obioma Anomnachi will discuss the use of Cassandra for IoT (Internet of Things) workloads. We will discuss data modeling for IoT, as well as different ways devices might send data back to the cluster.
Apache Cassandra Lunch #102: Choreography vs Orchestration
- In Apache Cassandra Lunch #102, Stefan Nikolovski will discuss Choreography vs Orchestration / Google Workflows.
Apache Cassandra Lunch #103: Cassandra Cluster Architecture in UML and the Azure Digital Twin Domain Language
- In Cassandra Lunch #103, Nicholas Brackley will discuss how to connect to an Azure Digital Twin resource, view the models in Azure’s environment, and investigate the functions available using the DTDL resources on Azure’s platform.
Apache Cassandra Lunch #104: DataOps - Cleaning Data in Apache Cassandra
- In Apache Cassandra Lunch #104, CEO of Anant, Rahul Singh, will discuss methods and strategies to manage big data in Apache Cassandra after you’ve already stored it. We’ll discuss how to delete or apply TTLs after the fact, how to operationalize processes with Apache Airflow and Apache Spark, and how to manage data hygiene as a strategy so that you’re not stuck with bad data later.
Apache Cassandra Lunch #105: Cassandra, Presto, and Airflow
- In Apache Cassandra Lunch #105, Arpan Patel will discuss how to run read, join, and write queries on Cassandra by Presto orchestrated via Airflow
Apache Cassandra Lunch #106: SSL with Apache Cassandra
- In Cassandra Lunch #106, Dipan Shah will discuss enabling SSL on an Apache Cassandra cluster.
Apache Cassandra Lunch #107: Guardrails
- In Cassandra Lunch #107, Dipan Shah will discuss how Guardrails work in Apache Cassandra.
Apache Cassandra Lunch #108: Developing Enterprise Consciousness with Apache Cassandra
- In Apache Cassandra Lunch #108, CEO of Anant, Rahul Singh, will discuss developing enterprise consciousness with Apache Cassandra
Apache Cassandra Lunch #109: DataStax cql-proxy
- In Apache Cassandra Lunch #109, Arpan Patel will discuss DataStax’s cql-proxy tool and show how you can use it with DataStax Astra
Apache Cassandra Lunch #110: Full Query Logging
- In Apache Cassandra Lunch #110, Dipan Shah will discuss full query logging.
Apache Cassandra Lunch #112: Azure Cassandra Proxy
- In Apache Cassandra Lunch #112, Arpan Patel will discuss Azure's Cassandra Dual Write Proxy
Apache Cassandra Lunch #113: ScyllaDB V: NoSQL Innovations for Extreme Scale
- With the release of ScyllaDB Open Source 5.0, users have a Raft of new capabilities to manage and scale their NoSQL databases (all puns intended). Discover what's new, and why industry gamechangers are moving their workloads to ScyllaDB.
Apache Cassandra Lunch #114: Cassandra Virtual Tables
- In Apache Cassandra Lunch #114, Dipan Shah will discuss virtual tables in Apache Cassandra 4.0.
Apache Cassandra Lunch #115: Google Dataproc and DataStax Astra
- In Cassandra Lunch #115, Arpan Patel will discuss how to connect Google Dataproc and DataStax Astra with a demo showing you what configurations you will need to get the connection working!
Apache Cassandra Lunch #116: 5 Disciplines of a Cassandra Expert
- Apache Cassandra is an exceptionally powerful distributed database used by some of the world’s most popular online services. However, the adoption of Cassandra requires some fundamental disciplines to operate it effectively. Hayato Shimizu is a veteran Cassandra architect, having had a hand in some of the world’s largest deployments, from media companies to banks. Hayato will take you through some of the key disciplines you should adopt to ensure Cassandra provides your users and customers the reliability and performance you need.
Apache Cassandra Lunch #117: Integrating YugaByteDB with Microsoft.NET Applications
- In this edition of our Apache Cassandra Lunch, Eric Ramseur, the head of growth at Anant, will walk attendees through the process of integrating YugaByte real-time data into Microsoft .NET applications.
Apache Cassandra Lunch #118: Yugabyte, Spark, and Airflow
- In Apache Cassandra Lunch #118, Arpan Patel will discuss how we can do some simple ETL against Yugabyte using Spark, which will be orchestrated with Airflow.
- In Apache Cassandra Lunch #119, Rahul Singh will cover a refresher on GUI desktop/web tools for users that want to get their hands dirty with Cassandra but don't want to deal with CQLSH to do simple queries. Some of the tools are web-based and others are installed on your desktop. Since the beginning days of Cassandra, a lot has changed and there are many options for command-line-haters to use Cassandra.
Apache Cassandra Lunch #120: Apache Cassandra Monitoring Made Easy with AxonOps
- Apache Cassandra is an exceptionally powerful distributed database used by some of the world’s most popular online services. Having the right tools to monitor your Cassandra Cluster is crucial to ensuring an always-on service. Discover a simple and easy way to identify issues and maintain the health of your database with AxonOps. In this session, Johnny will show how easy it is to start monitoring your Cassandra cluster in minutes. He will explain the various aspects and features of Cassandra that need to be monitored, how to do it and most importantly why! Approaches for backups and Cassandra repairs will be discussed and explored in detail. Learn how AxonOps significantly reduces the complexity and overhead when looking after Cassandra and ensures your Cassandra cluster is reliable and resilient. Experienced developer, DevOps, architect and AxonOps co-founder, Johnny Miller, has worked with a wide variety of companies – from small start-ups to large enterprises. He has been working with Cassandra for many years and has a deep understanding of the challenges facing modern companies looking to adopt Apache Cassandra.
Apache Cassandra Lunch #121: Migrating to Azure Managed Instance for Apache Cassandra
- In Cassandra Lunch #121, Obioma Anomnachi will discuss the use of the Azure Cassandra Migrator for moving data between Apache Cassandra and Azure Managed Instance for Apache Cassandra, comparing it to the alternate migration process of setting up cross datacenter replication on a hybrid cluster.
Apache Cassandra Lunch #122: CDC for Apache Cassandra - Dipan
- In Apache Cassandra Lunch #122, Dipan Shah will demonstrate how to enable CDC for Apache Cassandra and then send the data to a destination database.
Apache Cassandra Lunch #123: ChatGPT w/ Cassandra: Super Charge Your Skills
- Informal talk on how ChatGPT is changing the world and what it means for people in the Cassandra ecosystem. Learn how to use GPT to help with Development / Admin work with some basic prompts. We'll also discuss how it can be used in data processing in conjunction with Cassandra data as a source and as a sink.
Apache Cassandra Lunch #124: CDC for Apache Cassandra Part 2
- In Apache Cassandra Lunch #124, Dipan Shah continues his demonstration on how to enable CDC for Apache Cassandra and then move data from Apache Pulsar into a few destination databases.