Sunday, September 7, 2014

HBase | Setup

We learnt about HBase in the previous post. Here we will set up a fully distributed HBase cluster and run a client against it.


For the installation we will use the three-node Hadoop YARN cluster (as set up and described in an earlier post).
On top of it we will set up a three-node HBase cluster.

Note that we are using the following details for the complete setup:

     - Installation base directory:  
      • /home/anishsneh/installs
     - Installation user name:
      • anishsneh
     - Hostnames: 
      • server01 (master+slave)
      • server02 (only slave)
      • server03 (only slave)
Steps to install HBase (on top of the Hadoop 2 cluster):
  1. Install HBase - we will use HBase 0.98.4 (Hadoop2)
    • Download hbase-0.98.4-hadoop2-bin.tar.gz from the HBase website. Note that we are using the Hadoop 2 build of the HBase binary.
    • Extract the downloaded package to /home/anishsneh/installs, so that we have:
      [anishsneh@server01 installs]$ ls -ltr hbase-0.98.4-hadoop2
      total 172
      -rw-r--r--.  1 anishsneh anishsneh    897 Jun  6 10:33 NOTICE.txt
      -rw-r--r--.  1 anishsneh anishsneh  11358 Jun  6 10:33 LICENSE.txt
      -rw-r--r--.  1 anishsneh anishsneh   1377 Jul 14 18:23 README.txt
      drwxr-xr-x.  2 anishsneh anishsneh   4096 Jul 14 18:23 conf
      drwxr-xr-x.  4 anishsneh anishsneh   4096 Jul 14 18:23 bin
      -rw-r--r--.  1 anishsneh anishsneh 134544 Jul 14 18:27 CHANGES.txt
      drwxr-xr-x.  7 anishsneh anishsneh   4096 Jul 14 19:37 hbase-webapps
      drwxr-xr-x. 29 anishsneh anishsneh   4096 Jul 14 19:45 docs
      drwxrwxr-x.  3 anishsneh anishsneh   4096 Sep  7 14:47 lib
    • Repeat the above steps on all three hosts.
    • Create the hdfs://server01:9000/data/hbase directory on HDFS and change its permissions to 777 (for this demo)
    • Create the /home/anishsneh/installs/tmp/hbase directory on the local file system of all three servers (i.e. server01, server02, server03)
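The directory preparation above can be scripted. The following is a minimal sketch, not the exact commands used for this post: it assumes the hdfs command is on the PATH of server01 and that passwordless SSH between the nodes is already in place (as is usual for a Hadoop cluster). The run helper, the prepare_dirs function and the DRY_RUN switch are our own illustrative additions; by default the script only prints what it would do.

```shell
#!/usr/bin/env bash
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
INSTALL_DIR=/home/anishsneh/installs
HOSTS="server01 server02 server03"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

prepare_dirs() {
  # HBase root directory on HDFS, world-writable for this demo only.
  run hdfs dfs -mkdir -p hdfs://server01:9000/data/hbase
  run hdfs dfs -chmod 777 hdfs://server01:9000/data/hbase
  # Local temporary directory on every node (assumes passwordless SSH).
  for host in $HOSTS; do
    run ssh "anishsneh@$host" mkdir -p "$INSTALL_DIR/tmp/hbase"
  done
}

prepare_dirs
```

Once the printed commands look right, running the script with DRY_RUN=0 on server01 executes them.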
  2. Configure Cluster
    • Set HBASE_HOME=/home/anishsneh/installs/hbase-0.98.4-hadoop2 in ~/.bashrc (or wherever you maintain environment variables), then reload the profile/bash
    • Edit $HBASE_HOME/conf/hbase-site.xml with the following
    • Repeat the above steps on all three hosts.
    • Edit the regionservers file $HBASE_HOME/conf/regionservers on the MASTER node (i.e. server01 in our case), such that:
      [anishsneh@server01 installs]$ cat $HBASE_HOME/conf/regionservers
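The contents of hbase-site.xml and regionservers are not reproduced in this step. As a rough sketch of what such a setup typically contains, the script below writes both files into a scratch directory for review. The property names (hbase.rootdir, hbase.cluster.distributed, hbase.zookeeper.quorum, hbase.tmp.dir) are standard HBase settings, but the values are our assumptions derived from the paths and hostnames listed earlier. The function name write_conf_sketch and the scratch directory are hypothetical.

```shell
#!/usr/bin/env bash
# Writes an ASSUMED hbase-site.xml and regionservers file into the given directory.
write_conf_sketch() {
  out_dir="$1"
  mkdir -p "$out_dir"

  cat > "$out_dir/hbase-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Assumed: HBase data lives in the HDFS directory created in step 1 -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://server01:9000/data/hbase</value>
  </property>
  <!-- Fully distributed mode -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- Assumed: a ZooKeeper quorum peer runs on each of the three nodes -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>server01,server02,server03</value>
  </property>
  <!-- Assumed: the local temporary directory created in step 1 -->
  <property>
    <name>hbase.tmp.dir</name>
    <value>/home/anishsneh/installs/tmp/hbase</value>
  </property>
</configuration>
EOF

  # One region server hostname per line (server01 is master + slave).
  printf '%s\n' server01 server02 server03 > "$out_dir/regionservers"
}

write_conf_sketch "${TMPDIR:-/tmp}/hbase-conf-sketch"
```

After reviewing the generated files, they can be copied into $HBASE_HOME/conf on each host.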
  3. Start Cluster
    • Run $HBASE_HOME/bin/
    • On the MASTER node, execute the jps command ($JAVA_HOME/bin/jps); it should show the following running processes:
      [anishsneh@server01 installs]$ jps
      49360 NodeManager
      50841 HQuorumPeer
      50926 HMaster
      51047 HRegionServer
      49194 DataNode
      51241 Jps
      49019 NameNode
      49428 JobHistoryServer
      49075 SecondaryNameNode
      49253 ResourceManager

      On the SLAVE nodes, execute the jps command ($JAVA_HOME/bin/jps); it should show the following running processes:
      [anishsneh@server02 installs]$ jps
      35334 DataNode
      36000 HRegionServer
      36260 Jps
      35930 HQuorumPeer
      Note that three new processes are started on the MASTER node, i.e. HQuorumPeer, HMaster and HRegionServer. On the SLAVE nodes two processes, HQuorumPeer and HRegionServer, are started.
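The jps check in this step can be automated. The sketch below defines a small helper, missing_daemons (a hypothetical name of ours), that reads jps output and prints every expected HBase process that is not running. The script name in the first bullet of this step is cut off; a stock HBase distribution ships bin/start-hbase.sh for starting the whole cluster, so that is presumably what would be run first, but that is our assumption.

```shell
#!/usr/bin/env bash
# Print each expected daemon name that is absent from the jps output on stdin.
missing_daemons() {
  jps_output="$(cat)"
  for name in "$@"; do
    # jps prints "<pid> <name>"; compare the second column exactly.
    echo "$jps_output" | awk '{print $2}' | grep -qx "$name" || echo "$name"
  done
}

# On the master (server01) all three HBase daemons are expected:
#   jps | missing_daemons HQuorumPeer HMaster HRegionServer
# On a slave (server02, server03) there is no HMaster:
#   jps | missing_daemons HQuorumPeer HRegionServer
if command -v jps >/dev/null 2>&1; then
  jps | missing_daemons HQuorumPeer HRegionServer
fi
```

An empty result means every expected process was found; any name printed points to a daemon that did not start.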
  4. Verify Installation
    • Open the URL http://MASTER_SERVER_HOSTNAME:60010/master-status (for our demo, http://server01:60010/master-status). The following page should appear:

      HBase Master Console

Commandline Client

We will run the command-line client here and execute various CRUD operations. The command-line client (hbase-shell) can be launched with the command $HBASE_HOME/bin/hbase shell:
  • Launch HBase Shell
    [anishsneh@server01 installs]$ $HBASE_HOME/bin/hbase shell
    2014-09-07 15:41:25,864 INFO  [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
    HBase Shell; enter 'help<RETURN>' for list of supported commands.
    Type "exit<RETURN>" to leave the HBase Shell
    Version 0.98.4-hadoop2, r890e852ce1c51b71ad180f626b71a2a1009246da, Mon Jul 14 19:45:06 PDT 2014
  • Verify connectivity
    hbase(main):001:0> status
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/home/anishsneh/installs/hbase-0.98.4-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/home/anishsneh/installs/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See for an explanation.
    Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/anishsneh/installs/hadoop-2.2.0/lib/native/ which might have disabled stack guard. The VM will try to fix the stack guard now.
    It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
    2014-09-07 15:44:01,138 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    3 servers, 0 dead, 0.6667 average load
  • Create table
    hbase(main):003:0> create 'users', 'xu'
    0 row(s) in 1.3660 seconds
    => Hbase::Table - users
  • Show tables
    hbase(main):004:0> list
    1 row(s) in 0.1240 seconds
    => ["users"]
  • Describe table
    hbase(main):005:0> describe 'users'
    DESCRIPTION                                                                                                 ENABLED                                                    
     'users', {NAME => 'xu', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER=> 'ROW', REPLICATION_SCOPE => '0', VER true                                                       
     SIONS => '1', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false',                                                            
      BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}                                                                                                    
    1 row(s) in 0.1430 seconds
  • Insert records
    hbase(main):006:0> put 'users', 'u001', 'xu:name', 'Anish Sneh'
    0 row(s) in 0.3240 seconds
    hbase(main):007:0> put 'users', 'u001', 'xu:login', 'asneh'
    0 row(s) in 0.0230 seconds
    hbase(main):008:0> put 'users', 'u001', 'xu:country', 'India'
    0 row(s) in 0.0210 seconds
    hbase(main):009:0> put 'users', 'u002', 'xu:name', 'Rakesh Shukla'
    0 row(s) in 0.0330 seconds
    hbase(main):010:0> put 'users', 'u002', 'xu:login', 'rshukla'
    0 row(s) in 0.0220 seconds
    hbase(main):011:0> put 'users', 'u002', 'xu:country', 'USA'
    0 row(s) in 0.0270 seconds
  • Select a single record
    hbase(main):012:0> get 'users', 'u001'
    COLUMN                                     CELL                                                                                                                        
     xu:country                                timestamp=1410130969069, value=India
     xu:login                                  timestamp=1410130951616, value=asneh
     xu:name                                   timestamp=1410130944209, value=Anish Sneh                                                                                   
    3 row(s) in 0.0390 seconds
  • Select all records (scan a table)
    hbase(main):013:0> scan 'users'
    ROW                                        COLUMN+CELL                                                                                                                 
     u001                                      column=xu:country, timestamp=1410130969069, value=India
     u001                                      column=xu:login, timestamp=1410130951616, value=asneh
     u001                                      column=xu:name, timestamp=1410130944209, value=Anish Sneh
     u002                                      column=xu:country, timestamp=1410130987810, value=USA
     u002                                      column=xu:login, timestamp=1410130982420, value=rshukla
     u002                                      column=xu:name, timestamp=1410130976169, value=Rakesh Shukla                                                                
    2 row(s) in 0.0540 seconds
  • Delete a row from a table (one cell at a time)
    hbase(main):020:0> delete 'users', 'u002', 'xu:name'
    0 row(s) in 0.1300 seconds
    hbase(main):022:0> delete 'users', 'u002', 'xu:login'
    0 row(s) in 0.0210 seconds
    hbase(main):023:0> delete 'users', 'u002', 'xu:country'
    0 row(s) in 0.0280 seconds
    hbase(main):024:0> scan 'users'
    ROW                                        COLUMN+CELL                                                                                                                 
     u001                                      column=xu:country, timestamp=1410130969069, value=India                                                                     
     u001                                      column=xu:login, timestamp=1410130951616, value=asneh                                                                       
     u001                                      column=xu:name, timestamp=1410130944209, value=Anish Sneh                                                                   
    1 row(s) in 0.0290 seconds
  • Delete a table (it must be disabled before it can be dropped)
    hbase(main):025:0> disable 'users'
    0 row(s) in 1.4920 seconds
    hbase(main):026:0> drop 'users'
    0 row(s) in 0.5290 seconds
    hbase(main):027:0> list
    0 row(s) in 0.0750 seconds
    => []
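The interactive session above can also be replayed without typing each command, since the HBase shell accepts a file of commands as an argument. The following is a minimal sketch under our own assumptions: the script file name is hypothetical, HBASE_HOME is expected to be set as in the configuration step, and the list of commands is a shortened version of the session. It uses deleteall, which removes a whole row in one command instead of deleting each cell separately.

```shell
#!/usr/bin/env bash
# Hypothetical location for the command file.
SCRIPT_FILE="${TMPDIR:-/tmp}/users_demo.hbase"

# Shortened replay of the CRUD session shown in this post.
cat > "$SCRIPT_FILE" <<'EOF'
create 'users', 'xu'
put 'users', 'u001', 'xu:name', 'Anish Sneh'
put 'users', 'u002', 'xu:name', 'Rakesh Shukla'
get 'users', 'u001'
scan 'users'
deleteall 'users', 'u002'
scan 'users'
disable 'users'
drop 'users'
exit
EOF

# Run the file through the HBase shell if it is available on this machine.
HBASE_BIN="${HBASE_HOME:-}/bin/hbase"
if [ -x "$HBASE_BIN" ]; then
  "$HBASE_BIN" shell "$SCRIPT_FILE"
else
  echo "hbase not found; commands written to $SCRIPT_FILE"
fi
```

The trailing exit makes the shell return to the prompt once the commands have run, which is convenient when the replay is part of a larger script.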
We will learn more about Java-based usage and the Phoenix JDBC client in the next post.