Introduction
Setting up a local HBase environment
| Software | Download link |
|---|---|
| hadoop-2.7.2 | Download |
| zookeeper-3.4.10.tar.gz | Download |
| hbase-1.2.6-bin.tar.gz | Download |
hadoop
Detailed steps are given in my other post on Spark practice (Spark实践), so they are not repeated here.
zookeeper
Extract the archive and update the environment variables:
tar xvf zookeeper-3.4.10.tar.gz -C /usr/local
export JAVA_HOME=/usr/local/jdk1.8.0_161
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib
export SCALA_HOME=/usr/local/scala-2.11.7
export SPARK_HOME=/usr/local/spark-2.2.0-bin-hadoop2.7
export ZOOKEEPER_HOME=/usr/local/zookeeper-3.4.10
export HBASE_HOME=/usr/local/hbase-1.2.6
export PATH=$JAVA_HOME/bin:/usr/local/hadoop-2.7.2/bin:/usr/local/hadoop-2.7.2/sbin:$SCALA_HOME/bin:$SPARK_HOME/bin:$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin:$PATH
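These exports have to be reloaded before the new commands are available in the current shell. Assuming they were appended to /etc/profile (the file is not named above, so adjust if you used ~/.bashrc instead):

```bash
# Assumption: the exports above were appended to /etc/profile
source /etc/profile
# quick check that the ZooKeeper scripts are now on the PATH
which zkServer.sh
```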
Edit the configuration files
Edit /usr/local/zookeeper-3.4.10/conf/zoo.cfg as follows.
Add the data directory:
dataDir=/usr/local/zookeeper-3.4.10/data
Add the cluster quorum and leader-election ports:
    server.0=SparkMaster:2888:3888
    server.1=SparkWorker1:2888:3888
    server.2=SparkWorker2:2888:3888
    server.3=SparkWorker3:2888:3888
    server.4=SparkWorker4:2888:3888
Enable listening on all IPs for inter-node communication:
    quorumListenOnAllIPs=true
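A fresh install only ships conf/zoo_sample.cfg, so a common way to get a zoo.cfg to edit is to copy the sample and then append the lines above; the sample already defines tickTime, initLimit, syncLimit and clientPort=2181. A minimal sketch, assuming the default 3.4.x distribution layout:

```bash
cd /usr/local/zookeeper-3.4.10/conf
# start from the sample config shipped with ZooKeeper 3.4.x
cp zoo_sample.cfg zoo.cfg
```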
Create the file /usr/local/zookeeper-3.4.10/data/myid and write this node's cluster id into it.

The ids can be 0, 1, 2, 3, 4 and keep growing; id 0 is not necessarily the master node, the number is only there to tell the nodes apart.
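For example, following the server.N entries in the zoo.cfg above, the myid files would be written like this (a sketch; run on each node, or over ssh from SparkMaster):

```bash
# on SparkMaster (server.0 in zoo.cfg above)
mkdir -p /usr/local/zookeeper-3.4.10/data
echo 0 > /usr/local/zookeeper-3.4.10/data/myid

# on SparkWorker1 (server.1) the file would contain 1,
# on SparkWorker2 (server.2) it would contain 2, and so on
```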
Sync the installation directory to the other nodes of the cluster, then start it:
rsync -av /usr/local/zookeeper-3.4.10/ SparkWorker1:/usr/local/zookeeper-3.4.10
rsync -av /usr/local/zookeeper-3.4.10/ SparkWorker2:/usr/local/zookeeper-3.4.10
(repeat for SparkWorker3 and SparkWorker4; a loop version is sketched below)
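```bash
# assumes the same install path and passwordless ssh to every worker
for node in SparkWorker1 SparkWorker2 SparkWorker3 SparkWorker4; do
  rsync -av /usr/local/zookeeper-3.4.10/ "$node":/usr/local/zookeeper-3.4.10
done
```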
zkServer.sh start
ZooKeeper has to be started on every node. You can do this remotely from the master over ssh, e.g. ssh root@SparkWorker1 "***" (a loop version is sketched below).
zkServer.sh status
Checks the ZooKeeper status of the current node (whether it is the leader or a follower).
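A possible way to start and then check every node from SparkMaster in one go, assuming passwordless ssh as root; /etc/profile is sourced explicitly because non-interactive ssh sessions may not load it:

```bash
NODES="SparkMaster SparkWorker1 SparkWorker2 SparkWorker3 SparkWorker4"
# start ZooKeeper on every node
for node in $NODES; do
  ssh root@"$node" "source /etc/profile && zkServer.sh start"
done
# then check which node became the leader
for node in $NODES; do
  echo "== $node =="
  ssh root@"$node" "source /etc/profile && zkServer.sh status"
done
```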

Note: the author built a five-node cluster in a virtual environment running hadoop, spark, zookeeper and hbase. The screenshot originally shown here (omitted) captured the processes with the HDFS, YARN, spark, zookeeper and hbase services all started: `QuorumPeerMain` belongs to zookeeper, `HRegionServer` and `HMaster` to hbase, `NameNode` and `SecondaryNameNode` to the HDFS master node (the slave nodes run `DataNode`), `ResourceManager` to YARN, and `Master` to spark.
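Since the screenshot is not reproduced here, the same check can be done with `jps` on each node; the expected daemons are exactly the ones listed in the note above.

```bash
# expected on the master node (pids will differ):
#   QuorumPeerMain               -> zookeeper
#   HMaster                      -> hbase
#   NameNode, SecondaryNameNode  -> HDFS master daemons
#   ResourceManager              -> YARN
#   Master                       -> spark
# worker nodes show DataNode and HRegionServer instead
jps
```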
hbase
Extract the archive and update the environment variables:
tar xvf hbase-1.2.6-bin.tar.gz -C /usr/local
export JAVA_HOME=/usr/local/jdk1.8.0_161
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib
export SCALA_HOME=/usr/local/scala-2.11.7
export SPARK_HOME=/usr/local/spark-2.2.0-bin-hadoop2.7
export ZOOKEEPER_HOME=/usr/local/zookeeper-3.4.10
export HBASE_HOME=/usr/local/hbase-1.2.6
export PATH=$JAVA_HOME/bin:/usr/local/hadoop-2.7.2/bin:/usr/local/hadoop-2.7.2/sbin:$SCALA_HOME/bin:$SPARK_HOME/bin:$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin:$PATH
Edit the configuration files
/usr/local/hbase-1.2.6/conf/hbase-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_161
export HBASE_MANAGES_ZK=false   # use the external zookeeper cluster set up above instead of the one bundled with HBase
/usr/local/hbase-1.2.6/conf/hbase-site.xml
<property>  
  <name>hbase.rootdir</name>  
  <value>hdfs://SparkMaster:9000/hbase</value>  
</property>  
<property>  
  <name>hbase.cluster.distributed</name>  
  <value>true</value>  
</property>  
<property>  
  <name>hbase.zookeeper.quorum</name>  
  <value>SparkMaster</value>  <!-- Note: use each node's own hostname here, e.g. SparkWorker1 on SparkWorker1; you can sync the directory first and then edit this value on each node -->
</property>  
<property>  
  <name>dfs.replication</name>  
  <value>1</value>  
</property>  
/usr/local/hbase-1.2.6/conf/regionservers
The value here is each node's own hostname, e.g. SparkWorker1 on SparkWorker1; you can sync the directory first and then edit it on each node.
Sync to the other nodes (repeat the rsync for each worker), then start the cluster:
rsync -av /usr/local/hbase-1.2.6/ SparkWorker2:/usr/local/hbase-1.2.6
start-hbase.sh
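Once start-hbase.sh returns, a quick sanity check is to open the HBase shell (used throughout the next section) and ask the cluster for its status; `status` and `version` are built-in shell commands.

```
hbase shell
> status     # number of live servers, dead servers, average load
> version    # should report 1.2.6
```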
hbase basic operations (create, read, update, delete)
Create a table
> create 'users','user_id','address','info'
List all tables
> list 
Get a table's description
> describe 'users'  
Drop a table
> create 'users_tmp','user_id','address','info' 
> disable 'users_tmp'  
> drop 'users_tmp' 
Insert records
put 'table name','row key (identifier)','column family:column','value'
> put 'users','xiaoming','info:age','24'
> put 'users','xiaoming','info:birthday','1987-06-17'
> put 'users','xiaoming','info:company','alibaba'
> put 'users','xiaoming','address:country','china'
> put 'users','xiaoming','address:province','zhejiang'
> put 'users','xiaoming','address:city','hangzhou'
> put 'users','zhangyifei','info:birthday','1987-4-17'
> put 'users','zhangyifei','info:favorite','movie'
> put 'users','zhangyifei','info:company','alibaba'
> put 'users','zhangyifei','address:country','china'
> put 'users','zhangyifei','address:province','guangdong'
> put 'users','zhangyifei','address:city','jieyang'
> put 'users','zhangyifei','address:town','xianqiao'
Get a single record
Get all data for a given row key (id)
> get 'users','xiaoming'  
Get all data of one column family for an id
> get 'users','xiaoming','info'  
Get the data of one column inside a column family for an id
> get 'users','xiaoming','info:age'  
Update a record
> put 'users','xiaoming','info:age' ,'29'  
> get 'users','xiaoming','info:age'  
> put 'users','xiaoming','info:age' ,'30'  
> get 'users','xiaoming','info:age'  
Get multiple versions of a cell's data (see the note after these commands)
> get 'users','xiaoming',{COLUMN=>'info:age',VERSIONS=>1}  
> get 'users','xiaoming',{COLUMN=>'info:age',VERSIONS=>2}  
> get 'users','xiaoming',{COLUMN=>'info:age',VERSIONS=>3}  
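Note: in HBase 1.x a column family keeps only one version by default, so the queries above will all return the latest value unless the family is configured to retain more. A possible way to raise the limit for the 'info' family of this table (depending on your settings you may need to disable the table first):

```
> alter 'users', NAME => 'info', VERSIONS => 3
```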
Get a specific version of a cell's data (by timestamp)
> get 'users','xiaoming',{COLUMN=>'info:age',TIMESTAMP=>1364874937056}  
Scan the whole table
> scan 'users'  
Delete the 'info:age' field of row xiaoming
> delete 'users','xiaoming','info:age'  
> get 'users','xiaoming'  
Delete an entire row
> deleteall 'users','xiaoming'  
Count the rows in a table
> count 'users'  
Truncate (empty) the table; internally this disables, drops and recreates it
> truncate 'users'  
This write-up was done in a hurry and still has rough edges; deeper study to follow.
