Cassandra supports horizontal scalability, achieved by adding more than one node to a Cassandra cluster
Cassandra uses a peer-to-peer architecture, with each node connected to all other nodes
Each Cassandra node performs all database operations and can serve client requests without a need for a master node
Nodes in a cluster communicate with each other via Seeds and Gossip
Seeds - Each node is configured with a list of seeds, which is simply a list of other nodes. A seed node is used to bootstrap a node when it first joins a cluster
Gossip - Gossip is the protocol used by Cassandra nodes for peer-to-peer communication. Gossip
informs a node about the state of all other nodes
A cluster is subdivided into racks and data centers. These terms are Cassandra's representation of
a real-world rack and data center
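
To make the gossip idea above concrete, here is a toy simulation in Python. It is only a sketch under simplifying assumptions: every node starts out knowing only its own state, each round every node exchanges state with one randomly chosen peer, and the two keep the newer version of each entry. The names merge, gossip_round and run_until_converged are hypothetical, and the real protocol's message format, seeds and failure detection are not modelled.

```python
import random


def merge(local, remote):
    """Merge two state maps, keeping the higher version seen for each node."""
    merged = dict(local)
    for node, version in remote.items():
        if version > merged.get(node, -1):
            merged[node] = version
    return merged


def gossip_round(views, rng):
    """Every node exchanges state with one randomly chosen peer."""
    names = list(views)
    for name in names:
        peer = rng.choice([n for n in names if n != name])
        combined = merge(views[name], views[peer])
        # Both sides end the exchange with the same merged view.
        views[name] = dict(combined)
        views[peer] = dict(combined)


def run_until_converged(node_names, seed=0, max_rounds=1000):
    """Return (rounds needed, final views); rounds is None if not converged."""
    rng = random.Random(seed)
    # Assumption: initially each node only knows its own state (version 1).
    views = {name: {name: 1} for name in node_names}
    for round_no in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(len(view) == len(node_names) for view in views.values()):
            return round_no, views
    return None, views


if __name__ == "__main__":
    rounds, views = run_until_converged(["node1", "node2", "node3", "node4"])
    print("converged after", rounds, "round(s)")
    print(views["node1"])
```

After a few rounds every node holds an entry for every other node, even though no node ever contacted all of its peers directly.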
Database Structures
Cassandra stores data in tables, where each table is organised in rows and columns, much like a
relational database
Tables are grouped in keyspaces. A keyspace can be used to group tables that serve a similar purpose from
a business perspective, such as all transactional tables, metadata tables, or user information tables
Each table has a defined primary key. The primary key is divided into a partition key and clustering columns
The partition key is used by Cassandra to index the data. All rows that share a common partition key make up a
single data partition, which is the basic unit of data partitioning, storage and retrieval in Cassandra
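
The relationship between the partition key, clustering columns and partitions can be sketched in plain Python. This is an in-memory illustration only, not how Cassandra stores data on disk. The table contents and the column names customer_id and order_time are made up for the example; customer_id plays the role of the partition key and order_time the role of a clustering column, which determines the order of rows inside a partition.

```python
from collections import defaultdict

# Hypothetical rows of an "orders" table; the column names are illustrative.
rows = [
    {"customer_id": "c1", "order_time": 3, "item": "pen"},
    {"customer_id": "c2", "order_time": 1, "item": "ink"},
    {"customer_id": "c1", "order_time": 1, "item": "book"},
    {"customer_id": "c1", "order_time": 2, "item": "lamp"},
]


def build_partitions(rows, partition_key, clustering_column):
    """Group rows by partition key, then order each group by the clustering column."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)
    for partition in partitions.values():
        partition.sort(key=lambda r: r[clustering_column])
    return dict(partitions)


partitions = build_partitions(rows, "customer_id", "order_time")
for key, partition in partitions.items():
    print(key, [r["item"] for r in partition])
# c1 ['book', 'lamp', 'pen']
# c2 ['ink']
```

All rows for customer c1 land in one partition, so a query for that customer's orders reads a single partition.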
Partitioning
A partition key is converted to a token by a partitioner
The tokens are signed 64-bit integer values between -2^63 and 2^63-1, and this range is referred to as the token
range
Each Cassandra node owns a portion of this range and is the primary owner of the data whose tokens fall within that portion
A token is used to precisely locate the data among the nodes and on the data storage of the corresponding node
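
As a rough illustration of how a partition key becomes a token, the sketch below hashes the key and interprets the first 8 bytes of the digest as a signed 64-bit integer. Note the assumption: Cassandra's default partitioner uses a Murmur3 hash, which the Python standard library does not provide, so SHA-256 stands in here. The function name toy_token is hypothetical, and the values it returns will not match real Cassandra tokens.

```python
import hashlib

MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def toy_token(partition_key: str) -> int:
    """Stand-in partitioner: NOT Cassandra's real hash, for illustration only."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    # The first 8 bytes, read as a signed integer, always fit the token range.
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


if __name__ == "__main__":
    for key in ("c1", "c2", "c3"):
        print(key, toy_token(key))
```

The same key always maps to the same token, which is what allows any node to work out where a given partition lives.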
Here is a simplified example to illustrate token range assignment. Suppose only 100 tokens (0-99) are used for a
Cassandra cluster with three nodes. Each node is assigned approximately 33 tokens:
node1: 0-33
node2: 34-66
node3: 67-99
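
The lookup implied by this simplified example can be sketched in Python as follows. The ranges are the ones from the example (tokens 0-99 over three nodes). The function name node_for_token is hypothetical, and a binary search over the sorted range ends is just one reasonable way to do the lookup.

```python
from bisect import bisect_left

# Inclusive upper bound of each node's range, taken from the example above.
RANGE_ENDS = [33, 66, 99]
NODES = ["node1", "node2", "node3"]


def node_for_token(token: int) -> str:
    """Return the node that owns the given token in the 0-99 toy range."""
    if not 0 <= token <= 99:
        raise ValueError("token must be between 0 and 99 in this toy example")
    # bisect_left finds the first range whose upper bound is >= token.
    return NODES[bisect_left(RANGE_ENDS, token)]


if __name__ == "__main__":
    for token in (0, 33, 34, 66, 67, 99):
        print(token, node_for_token(token))
```

Tokens on a boundary, such as 33 and 34, show where one node's range ends and the next begins.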