1. Apache Cassandra is a free, open-source distributed database management system.
2. It is a NoSQL (Not only SQL) database. Most Cassandra Query Language (CQL) commands and syntax are similar to SQL. DML statements in Cassandra do not require a “commit”; they are committed automatically.
3. It is designed to handle large amounts of data (big data) across many servers, providing high availability with no single point of failure.
4. It supports clusters that span multiple datacentres. By clustering we mean a set of loosely or tightly coupled nodes (including seeds and contact points) that can be viewed as a single system.
5. It is highly scalable. When new nodes (machines) are added, Cassandra's read and write throughput both increase linearly, with no downtime or interruption to applications.
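To make the linear scaling claim in point 5 concrete, here is a minimal arithmetic sketch in Python. It is an idealized model, not a benchmark: the function name `estimated_throughput` and the per-node figure of 10,000 operations per second are assumptions made up for illustration, and real throughput depends on hardware, data model, and workload.

```python
def estimated_throughput(node_count: int, ops_per_node: float) -> float:
    """Idealized linear model: total throughput is proportional to node count."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    return node_count * ops_per_node


# Hypothetical figure, chosen only for illustration.
PER_NODE_OPS = 10_000

# Doubling the number of nodes doubles the estimated throughput.
for nodes in (3, 6, 12):
    print(nodes, "nodes ->", estimated_throughput(nodes, PER_NODE_OPS), "ops/s")
```

Under this model, going from 3 to 6 nodes doubles the estimate, which is what "linear" means here.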
Before we explain the architecture, let us first define some of the terms related to Cassandra.
Node : A node is a server where the actual data is stored; depending on its role, it may also be called a seed or a contact point. Each node is identified by its address, e.g. the IP address of the server, xxx.xxx.xxx.xxx
DataCentre : A datacentre is a group of related nodes. Suppose your application is used by both Asian and American customers; in this case you can have two datacentres. In a disaster scenario, if one datacentre goes down, the other can keep the application running, and no data will be lost as long as the data is replicated to both datacentres. For this reason, datacentres are generally created according to geographical location: each datacentre contains the nodes located in its own region.
Cluster : A cluster is a component that contains one or more datacentres. By clustering we mean a set of loosely or tightly coupled nodes (including seeds and contact points) that can be viewed as a single system.
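The relationship between the three terms above (a cluster contains datacentres, and a datacentre contains nodes) can be sketched as a small Python model. This is only a toy illustration of the hierarchy and of the disaster scenario described for datacentres. It is not Cassandra's API, and the class names, the cluster name, and the IP addresses are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    address: str      # IP address of the server that stores the data
    up: bool = True   # whether the server is currently reachable


@dataclass
class DataCenter:
    name: str
    nodes: List[Node] = field(default_factory=list)

    def is_available(self) -> bool:
        # In this toy model a datacentre can serve requests
        # as long as at least one of its nodes is up.
        return any(node.up for node in self.nodes)


@dataclass
class Cluster:
    name: str
    datacenters: Dict[str, DataCenter] = field(default_factory=dict)

    def add_datacenter(self, dc: DataCenter) -> None:
        self.datacenters[dc.name] = dc

    def available_datacenters(self) -> List[str]:
        return [name for name, dc in self.datacenters.items() if dc.is_available()]


# Hypothetical cluster with one datacentre per geographical region.
cluster = Cluster("demo_cluster")
cluster.add_datacenter(DataCenter("asia", [Node("10.0.1.1"), Node("10.0.1.2")]))
cluster.add_datacenter(DataCenter("america", [Node("10.0.2.1"), Node("10.0.2.2")]))

# Simulate a disaster that takes every node in the "asia" datacentre offline.
for node in cluster.datacenters["asia"].nodes:
    node.up = False

print(cluster.available_datacenters())  # prints ['america']
```

Even with the whole "asia" datacentre down, the cluster as a single system still has a datacentre that can serve the application.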