Monday, May 18, 2015

Cassandra | Setup

We learnt about Cassandra in the previous post. Here we will set up a fully distributed Cassandra cluster and run a client against it.

Installation

We will install a fully distributed Cassandra cluster on three nodes. We are using the following details for the installation (for the complete setup):

  • Installation base directory:
      • /home/anishsneh/installs
  • Installation user name:
      • anishsneh
  • Hostnames: 
      • server01 (first node, say with IP address 172.16.70.131)
      • server02 (second node, say with IP address 172.16.70.132)
      • server03 (third node, say with IP address 172.16.70.133)
Note that in Cassandra all the nodes are equal: there is no MASTER or SLAVE, and hence NO SINGLE POINT OF FAILURE.

  • Install Cassandra
    • Download the Apache Cassandra binary package from the Apache website.
    • Extract the downloaded package to /home/anishsneh/installs, so that we have:
      [anishsneh@server01 installs]$ ls -ltr apache-cassandra-2.1.5/
      total 360
      -rw-r--r--. 1 anishsneh anishsneh   2117 Apr 27 07:33 NOTICE.txt
      -rw-r--r--. 1 anishsneh anishsneh  64431 Apr 27 07:33 NEWS.txt
      -rw-r--r--. 1 anishsneh anishsneh  11609 Apr 27 07:33 LICENSE.txt
      -rw-r--r--. 1 anishsneh anishsneh 245971 Apr 27 07:33 CHANGES.txt
      drwxr-xr-x. 2 anishsneh anishsneh   4096 May 17 15:37 interface
      drwxr-xr-x. 4 anishsneh anishsneh   4096 May 17 15:37 javadoc
      drwxr-xr-x. 3 anishsneh anishsneh   4096 May 17 15:37 lib
      drwxr-xr-x. 3 anishsneh anishsneh   4096 May 17 15:37 pylib
      drwxr-xr-x. 4 anishsneh anishsneh   4096 May 17 15:37 tools
      drwxr-xr-x. 2 anishsneh anishsneh   4096 May 17 15:37 bin
      drwxrwxr-x. 2 anishsneh anishsneh   4096 May 17 15:51 logs
      drwxrwxr-x. 5 anishsneh anishsneh   4096 May 17 15:51 data
      drwxr-xr-x. 3 anishsneh anishsneh   4096 May 17 16:46 conf
      
    • Repeat the above steps on all three nodes.
  • Configure Cluster 
    • Set CASSANDRA_HOME="/home/anishsneh/installs/apache-cassandra-2.1.5" in ~/.bashrc (or wherever environment variables are maintained), then reload the profile/shell.
    • On the first node, edit $CASSANDRA_HOME/conf/cassandra.yaml with the following:
      cluster_name: 'HELLO_CLUSTER'
      
      listen_address: 172.16.70.131
      
      rpc_address: 172.16.70.131
      
      seeds: "172.16.70.131,172.16.70.132,172.16.70.133"
      
      Here we are assuming the first node has IP address 172.16.70.131. In the stock cassandra.yaml the seeds entry sits under the parameters of seed_provider, so edit it in place there. Other properties such as data_file_directories and commitlog_directory can be changed if needed.
    • On the first node, uncomment or update the following properties in the script $CASSANDRA_HOME/conf/cassandra-env.sh:
      JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=172.16.70.131"
      
      LOCAL_JMX=no
      
      JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=false"
      
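      Here we are assuming the first node has IP address 172.16.70.131. Note that these settings open JMX to remote hosts without authentication, which is acceptable only for a test setup.
    • Repeat the above steps on all three nodes (with their respective IP addresses).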
  • Start/Run Cluster
    • Execute $CASSANDRA_HOME/bin/cassandra on all three nodes. This starts the Cassandra server on each node, and the three servers join one cluster (as per the information provided in cassandra.yaml).
  • Verify Cluster
    • On one of the nodes, go to $CASSANDRA_HOME/bin and execute the following command:
      [anishsneh@server01 bin]$ ./nodetool -h server01 status
      Datacenter: datacenter1
      =======================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
      UN  172.16.70.131  188.19 KB  256     64.9%             0fc990e2-c257-4dfc-aec0-b151efd634d7  rack1
      UN  172.16.70.132  187.5 KB   256     67.8%             ba280c97-295c-4056-85f0-3c11594a3676  rack1
      UN  172.16.70.133  153.47 KB  256     67.3%             3a670717-401c-419a-8b89-73c1426df67b  rack1
      
      We may execute a few more commands, such as:
      [anishsneh@server01 bin]$ ./nodetool -h server01 version
      ReleaseVersion: 2.1.5
      
      
      [anishsneh@server01 bin]$ ./nodetool -h server01 info
      ID                     : 0fc990e2-c257-4dfc-aec0-b151efd634d7
      Gossip active          : true
      Thrift active          : true
      Native Transport active: true
      Load                   : 188.19 KB
      Generation No          : 1431991363
      Uptime (seconds)       : 537
      Heap Memory (MB)       : 84.14 / 484.00
      Off Heap Memory (MB)   : 0.00
      Data Center            : datacenter1
      Rack                   : rack1
      Exceptions             : 0
      Key Cache              : entries 11, size 824 bytes, capacity 24 MB, 21 hits, 38 requests, 0.553 recent hit rate, 14400 save period in seconds
      Row Cache              : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
      Counter Cache          : entries 0, size 0 bytes, capacity 12 MB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
      Token                  : (invoke with -T/--tokens to see all 256 tokens)
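
The status check above can also be scripted. The following is a minimal sketch, not part of the original setup: it assumes the `nodetool status` layout shown above, where each node row starts with a two-letter status/state code (UN means Up/Normal). The function name count_unhealthy_nodes is hypothetical, and the DN row in the demo is altered from the real output purely for illustration.

```shell
#!/usr/bin/env bash
# Count the node rows in `nodetool status` output (read from stdin)
# whose status/state code is anything other than UN (Up/Normal).
count_unhealthy_nodes() {
    # Node rows are the lines whose first field is a two-letter code:
    # U or D (Up/Down) followed by N, L, J or M (Normal/Leaving/Joining/Moving).
    # Header lines such as "--  Address ..." do not match and are skipped.
    awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { bad++ } END { print bad + 0 }'
}

# Demo with two rows in the format shown above; the second row has been
# changed to DN so that the count is non-zero.
printf '%s\n' \
    'UN  172.16.70.131  188.19 KB  256  64.9%  0fc990e2-c257-4dfc-aec0-b151efd634d7  rack1' \
    'DN  172.16.70.132  187.5 KB   256  67.8%  ba280c97-295c-4056-85f0-3c11594a3676  rack1' \
    | count_unhealthy_nodes   # prints 1
```

Against a live cluster the same function could be fed directly, for example `$CASSANDRA_HOME/bin/nodetool -h server01 status | count_unhealthy_nodes`, where a result of 0 means every listed node is Up/Normal.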
      
      

CQLSH Client

Cassandra ships with a very useful command line client, CQLSH, an interactive shell for CQL (Cassandra Query Language). Here we will connect to the Cassandra cluster using CQLSH and execute various CRUD operations. CQLSH can be launched using the $CASSANDRA_HOME/bin/cqlsh script on any of the nodes (or wherever Cassandra is installed):
[anishsneh@server01 bin]$ ./cqlsh server01
Connected to HELLO_CLUSTER at server01:9042.
[cqlsh 5.0.1 | Cassandra 2.1.5 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
Create KEYSPACE:
cqlsh> CREATE KEYSPACE IF NOT EXISTS demo_keyspace WITH replication={'class' : 'SimpleStrategy', 'replication_factor':1};
Use the created KEYSPACE:
cqlsh> USE demo_keyspace;
Create COLUMN FAMILY:
cqlsh:demo_keyspace> CREATE TABLE IF NOT EXISTS demo_table(id varchar, login varchar, full_name varchar, country_code varchar, PRIMARY KEY(id));
Describe KEYSPACE:
cqlsh:demo_keyspace> DESCRIBE KEYSPACE demo_keyspace;

CREATE KEYSPACE demo_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'}  AND durable_writes = true;

CREATE TABLE demo_keyspace.demo_table (
    id text PRIMARY KEY,
    country_code text,
    full_name text,
    login text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

Insert records into COLUMN FAMILY:
cqlsh:demo_keyspace> INSERT INTO demo_table(id, login, full_name, country_code) values('USR0000001', 'anishsneh', 'Anish Sneh', 'IN');
cqlsh:demo_keyspace> INSERT INTO demo_table(id, login, full_name, country_code) values('USR0000002', 'rakeshk', 'Rakesh K', 'UK');
cqlsh:demo_keyspace> INSERT INTO demo_table(id, login, full_name, country_code) values('USR0000003', 'ballys', 'Bally S', 'US');
cqlsh:demo_keyspace> INSERT INTO demo_table(id, login, full_name, country_code) values('USR0000004', 'yogeshd', 'Yogesh D', 'US');
Select records from COLUMN FAMILY:
cqlsh:demo_keyspace> SELECT * FROM demo_table;

 id         | country_code | full_name  | login
------------+--------------+------------+-----------
 USR0000001 |           IN | Anish Sneh | anishsneh
 USR0000004 |           US |   Yogesh D |   yogeshd
 USR0000003 |           US |    Bally S |    ballys
 USR0000002 |           UK |   Rakesh K |   rakeshk

(4 rows)
Delete record from COLUMN FAMILY:
cqlsh:demo_keyspace> DELETE FROM demo_table WHERE id = 'USR0000002';
cqlsh:demo_keyspace> SELECT * FROM demo_table;

 id         | country_code | full_name  | login
------------+--------------+------------+-----------
 USR0000001 |           IN | Anish Sneh | anishsneh
 USR0000004 |           US |   Yogesh D |   yogeshd
 USR0000003 |           US |    Bally S |    ballys

(3 rows)
Update record in COLUMN FAMILY:
cqlsh:demo_keyspace> UPDATE demo_table SET country_code = 'CA' WHERE id = 'USR0000001';
cqlsh:demo_keyspace> SELECT * FROM demo_table;

 id         | country_code | full_name  | login
------------+--------------+------------+-----------
 USR0000001 |           CA | Anish Sneh | anishsneh
 USR0000004 |           US |   Yogesh D |   yogeshd
 USR0000003 |           US |    Bally S |    ballys

(3 rows)
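
The interactive session above can also be replayed non-interactively. The sketch below is an illustration rather than part of the original walkthrough: it writes a subset of the statements to a file and passes it to cqlsh with the -f option. The file name demo_keyspace.cql and its location under TMPDIR are assumptions, and the cqlsh call is skipped when the script is not found under CASSANDRA_HOME.

```shell
#!/usr/bin/env bash
# Write a subset of the demo statements to a CQL file so they can be
# replayed in one go. The file name and location are arbitrary choices.
cql_file="${TMPDIR:-/tmp}/demo_keyspace.cql"

cat > "$cql_file" <<'CQL'
CREATE KEYSPACE IF NOT EXISTS demo_keyspace
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
USE demo_keyspace;
CREATE TABLE IF NOT EXISTS demo_table(
  id varchar, login varchar, full_name varchar, country_code varchar,
  PRIMARY KEY(id));
INSERT INTO demo_table(id, login, full_name, country_code)
  VALUES('USR0000001', 'anishsneh', 'Anish Sneh', 'IN');
UPDATE demo_table SET country_code = 'CA' WHERE id = 'USR0000001';
SELECT * FROM demo_table;
CQL

# Run the file only when cqlsh is available; server01 is the host used above.
if [ -x "${CASSANDRA_HOME:-}/bin/cqlsh" ]; then
    "$CASSANDRA_HOME/bin/cqlsh" -f "$cql_file" server01
else
    echo "cqlsh not found; statements written to $cql_file"
fi
```

Because the CREATE statements use IF NOT EXISTS and the INSERT and UPDATE target a fixed primary key, running the file more than once should leave the table in the same state.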

Cassandra CQL queries can also be used with the DataStax Java driver (a Java-based high-level client). Demo programs can be found at anishsneh@git.
