Introduction
Setting up a local HBase environment
Software | Download |
---|---|
hadoop | Click to download |
zookeeper-3.4.10.tar.gz | Click to download |
hbase-1.2.6-bin.tar.gz | Click to download |
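If those links are unavailable, the matching releases can usually be fetched from the Apache archive; the URLs below follow the usual archive layout and are assumptions, so verify them before use (hadoop-2.7.2 is assumed because that is the version referenced later in the environment variables):

wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.2/hadoop-2.7.2.tar.gz
wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz
wget https://archive.apache.org/dist/hbase/1.2.6/hbase-1.2.6-bin.tar.gz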
hadoop
The detailed steps are covered in my other blog post, Spark实践 (Spark practice), so they are not repeated here.
zookeeper
Extract the package and update the environment variables (e.g., in /etc/profile):
tar xvf zookeeper-3.4.10.tar.gz -C /usr/local
export PATH
export JAVA_HOME=/usr/local/jdk1.8.0_161
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib
export SCALA_HOME=/usr/local/scala-2.11.7
export SPARK_HOME=/usr/local/spark-2.2.0-bin-hadoop2.7
export ZOOKEEPER_HOME=/usr/local/zookeeper-3.4.10
export HBASE_HOME=/usr/local/hbase-1.2.6
export PATH=$JAVA_HOME/bin:/usr/local/hadoop-2.7.2/bin:/usr/local/hadoop-2.7.2/sbin:$SCALA_HOME/bin:$SPARK_HOME/bin:$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin:$PATH
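After saving the changes (assuming they were added to /etc/profile), reload the file and spot-check one of the new variables:

source /etc/profile
echo $ZOOKEEPER_HOME    # should print /usr/local/zookeeper-3.4.10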
Modify the configuration file
Edit /usr/local/zookeeper-3.4.10/conf/zoo.cfg with the following content.
Add the data directory
dataDir=/usr/local/zookeeper-3.4.10/data
Add the cluster quorum ports
server.0=SparkMaster:2888:3888
server.1=SparkWorker1:2888:3888
server.2=SparkWorker2:2888:3888
server.3=SparkWorker3:2888:3888
server.4=SparkWorker4:2888:3888
Allow the quorum to listen on all network interfaces
quorumListenOnAllIPs=true
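Note that zoo.cfg also needs the baseline settings that ship in the stock zoo_sample.cfg; the values below are its defaults, listed here as a reminder rather than as part of this particular setup:

tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181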
Create the file /usr/local/zookeeper-3.4.10/data/myid and write this node's cluster id into it.
The ids can be 0, 1, 2, 3, 4 and so on without limit; id 0 is not necessarily the leader. The number only distinguishes the nodes, and it must match the N in that node's server.N entry in zoo.cfg.
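For example, on SparkMaster, which is server.0 in the zoo.cfg above, the file could be created like this (repeat on every node with its own number):

mkdir -p /usr/local/zookeeper-3.4.10/data
echo 0 > /usr/local/zookeeper-3.4.10/data/myid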
Sync the installation to the other cluster nodes and start the cluster
rsync -av /usr/local/zookeeper-3.4.10/ SparkWorker1:/usr/local/zookeeper-3.4.10
rsync -av /usr/local/zookeeper-3.4.10/ SparkWorker2:/usr/local/zookeeper-3.4.10
...
zkServer.sh start
This must be started on every node; you can run it from the master node remotely over ssh, e.g. ssh root@SparkWorker1 "***" (a sketch follows below).
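A minimal sketch of starting all the worker nodes from the master, assuming passwordless ssh as root and the hostnames used above:

for node in SparkWorker1 SparkWorker2 SparkWorker3 SparkWorker4; do
    ssh root@$node "source /etc/profile && zkServer.sh start"
done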
zkServer.sh status
Check the ZooKeeper status.
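On a healthy ensemble, exactly one node reports itself as the leader and the rest as followers; the last line of the status output looks roughly like this:

Mode: follower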
Note: I built a five-node cluster in a virtual environment running hadoop, spark, zookeeper and hbase. The screenshot above shows the state with hadoop's HDFS and YARN, spark, zookeeper and hbase services all started: `QuorumPeerMain` corresponds to zookeeper, `HRegionServer` and `HMaster` to hbase, `NameNode` and `SecondaryNameNode` to the HDFS master node (the slave nodes run `DataNode`), `ResourceManager` to YARN, and `Master` to spark.
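For reference, a rough sketch of what jps might print on the master node with everything running; the PIDs are made up and the exact process list depends on how roles are assigned:

$ jps
2305 NameNode
2512 SecondaryNameNode
2673 ResourceManager
3101 Master
3340 QuorumPeerMain
3587 HMaster
3904 Jps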
hbase
Extract the package and update the environment variables; the /etc/profile entries shown in the zookeeper section above already include HBASE_HOME and $HBASE_HOME/bin, so nothing new needs to be added there.
tar xvf hbase-1.2.6-bin.tar.gz -C /usr/local
Modify the configuration files
/usr/local/hbase-1.2.6/conf/hbase-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_161
# let HBase use the external zookeeper cluster set up above instead of managing its own
export HBASE_MANAGES_ZK=false
/usr/local/hbase-1.2.6/conf/hbase-site.xml
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://SparkMaster:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <!-- Note: use each node's own hostname here, e.g. SparkWorker1 on SparkWorker1;
         you can sync the folder to the other nodes first and then edit it on each one. -->
    <name>hbase.zookeeper.quorum</name>
    <value>SparkMaster</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
/usr/local/hbase-1.2.6/conf/regionservers
The value here is each node's own hostname, e.g. on SparkWorker1 put SparkWorker1; you can sync the folder first and then edit it on each node.
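For example, following this per-node scheme, the regionservers file on SparkWorker1 would contain just its own hostname:

SparkWorker1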
Sync to the other nodes and start the cluster
rsync -av /usr/local/hbase-1.2.6/ SparkWorker2:/usr/local/hbase-1.2.6
start-hbase.sh
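To verify that HBase is up, jps should now show HMaster (and HRegionServer on the region server nodes), and the HBase shell should respond to the status command:

hbase shell
> status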
Basic hbase operations (create, read, update, delete)
Create a table
> create 'users','user_id','address','info'
List all tables
> list
Get a table's description
> describe 'users'
Drop a table
> create 'users_tmp','user_id','address','info'
> disable 'users_tmp'
> drop 'users_tmp'
Insert records
put 'table name','row key','column family:qualifier','value'
put 'users','xiaoming','info:age','24'
put 'users','xiaoming','info:birthday','1987-06-17'
put 'users','xiaoming','info:company','alibaba'
put 'users','xiaoming','address:country','china'
put 'users','xiaoming','address:province','zhejiang'
put 'users','xiaoming','address:city','hangzhou'
put 'users','zhangyifei','info:birthday','1987-4-17';
put 'users','zhangyifei','info:favorite','movie';
put 'users','zhangyifei','info:company','alibaba';
put 'users','zhangyifei','address:country','china'
put 'users','zhangyifei','address:province','guangdong';
put 'users','zhangyifei','address:city','jieyang';
put 'users','zhangyifei','address:town','xianqiao'
Get a record
Get all data for one row key
> get 'users','xiaoming'
Get all data for one row key within one column family
> get 'users','xiaoming','info'
Get the data for one column of one column family for a row key
> get 'users','xiaoming','info:age'
Update a record
> put 'users','xiaoming','info:age' ,'29'
> get 'users','xiaoming','info:age'
> put 'users','xiaoming','info:age' ,'30'
> get 'users','xiaoming','info:age'
Get versioned data for a cell
> get 'users','xiaoming',{COLUMN=>'info:age',VERSIONS=>1}
> get 'users','xiaoming',{COLUMN=>'info:age',VERSIONS=>2}
> get 'users','xiaoming',{COLUMN=>'info:age',VERSIONS=>3}
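Note that in HBase 1.x a column family keeps only one version by default, so the VERSIONS queries above return a single cell unless the family is configured to retain more, for example:

> alter 'users', NAME => 'info', VERSIONS => 3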
Get a specific version of a cell's data
> get 'users','xiaoming',{COLUMN=>'info:age',TIMESTAMP=>1364874937056}
Scan the whole table
> scan 'users'
Delete the 'info:age' field of row xiaoming
> delete 'users','xiaoming','info:age'
> get 'users','xiaoming'
Delete a whole row
> deleteall 'users','xiaoming'
Count the rows in the table
> count 'users'
Truncate the table
> truncate 'users'
This write-up was done in a hurry and still has rough edges; a deeper look will follow.