Cassandra for Developers (CD)


About this Course

Apache Cassandra is an open source distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients, and places a high value on performance.

This course walks the developer through Cassandra architecture and practical tips and tricks, building on a solid understanding of NoSQL design principles, and covers some administrative functions. It also includes a good dose of hands-on programming labs.

Who should attend

Application Architects and Developers

Class Prerequisites

Learners should meet the following prerequisites before class:

  • Comfortable with the Java programming language (most programming exercises are in Java)
  • Able to navigate the Linux command line
  • Basic knowledge of Linux editors (vi / nano) for modifying code

What You Will Learn

By the end of this course, you should be able to:

  • Understand NoSQL in terms of Big Data, including the NoSQL ecosystem
  • Implement Cassandra nodes, clusters and datacenters
  • Construct data modelling solutions with CQL
  • Design multiple use cases from various domains
  • Work with Cassandra drivers
  • Understand Cassandra design under the hood
  • Administer a Cassandra implementation including hardware requirements

Outline: Cassandra for Developers (CD)

Module 1: Introduction to Big Data / NoSQL
  • NoSQL overview
  • CAP theorem
  • When is NoSQL appropriate
  • NoSQL ecosystem
Module 2: Cassandra Basics
  • Cassandra nodes, clusters, datacenters
  • Keyspaces, tables, rows and columns
  • Partitioning, replication, tokens
  • Quorum and consistency levels
  • Labs
Module 3: Data Modelling part 1
  • Introduction to CQL
  • CQL Datatypes
  • Creating keyspaces & tables
  • Choosing columns and types
  • Choosing primary keys
  • Data layout for rows and columns
  • Time to live (TTL), create, insert, update
  • Querying with CQL
  • CQL updates
  • Labs
Module 4: Data Modelling part 2
  • Creating and using secondary indexes
  • Denormalization and join avoidance
  • Composite keys (partition keys and clustering keys)
  • Time series data
  • Best practices for time series data
  • Counters
  • Lightweight transactions (LWT)
Module 5: Data Modelling
  • Labs: group design sessions, in which multiple use cases from various domains are presented
  • Students work in groups to come up with designs and models
  • Discuss various designs and analyze decisions
  • Lab: implement ‘Netflix’ data models and generate data
Module 6: Cassandra drivers
  • Introduction to the Java driver
  • CRUD (Create / Read / Update / Delete) operations using the Java client
  • Asynchronous queries
  • Labs
Module 7: Cassandra Internals
  • Understand Cassandra design under the hood
  • SSTables, memtables
  • Caching
  • Read path, write path
  • Vnodes
Module 8: Administration
  • Hardware selection
  • Installation media / options
  • Cassandra best practices (compaction, garbage collection)
  • Troubleshooting tools and tips
  • Lab: students install their own Cassandra node
  • Lab: benchmarking and optimizing C* installation
Module 9: Bonus Lab
  • Implement a music service like Pandora / Spotify on Cassandra
Classroom Training

Duration 3 days

  • United States: US$ 2,500
Online Training

Duration 3 days

  • United States: US$ 2,500