There are three ways to set up an HBase environment: 1. Local mode: requires only one node (only an HMaster, no HRegionServer), does not need ZooKeeper, and stores data on the local file system; 2. Pseudo-distributed mode: requires only one node (HMaster and HRegionServer run on the same node), needs ZooKeeper, and stores data on HDFS; 3. Fully-distributed mode: requires at least 3 nodes (one HMaster node and at least 2 HRegionServer nodes), needs ZooKeeper, and stores data on HDFS. This section walks through setting up HBase in fully-distributed mode.
Cluster environment:
master 192.168.126.111
slave1 192.168.126.112
slave2 192.168.126.113
slave3 192.168.126.114
Installation media used in this section:
hbase-2.0.1-bin.tar.gz (extraction code: h04f)
zookeeper-3.4.10.tar.gz (extraction code: 31j4)
1. Set Up the Fully-Distributed Hadoop Environment
For the fully-distributed Hadoop setup, see the article 《Hadoop从入门到精通3:Hadoop2.x环境搭建之全分布模式》.
2. Set Up a Standalone ZooKeeper Environment
ZooKeeper can be deployed in two common ways: 1. standalone mode; 2. cluster mode. The differences are:
- Standalone mode needs only one ZooKeeper node; cluster mode needs at least 3 nodes (one leader and 2 followers);
- In standalone mode, if the ZooKeeper node goes down, any HA that depends on it (such as HBase HA) becomes unavailable;
- In cluster mode, if the leader goes down, ZooKeeper internally elects a new leader from the followers, which provides a stronger guarantee for HA.
Here we cover the standalone ZooKeeper setup:
2.1 Upload the ZooKeeper Installation Package
[root@master ~]# cd /root/tools/
[root@master tools]# ls
zookeeper-3.4.10.tar.gz
2.2 Extract the ZooKeeper Installation Package
[root@master tools]# tar -zxvf zookeeper-3.4.10.tar.gz -C /root/trainings/
2.3 Configure ZooKeeper Environment Variables
[root@master tools]# vim /root/.bash_profile
ZOOKEEPER_HOME=/root/trainings/zookeeper-3.4.10
export ZOOKEEPER_HOME
PATH=$ZOOKEEPER_HOME/bin:$PATH
export PATH
[root@master tools]# source /root/.bash_profile
2.4 Configure ZooKeeper Parameters
Create the directory where ZooKeeper will store its data:
[root@master ~]# mkdir /root/trainings/zookeeper-3.4.10/tmp
Create the myid file that holds the ZooKeeper server ID, and write 1 into it:
[root@master ~]# vim /root/trainings/zookeeper-3.4.10/tmp/myid
1
Edit the ZooKeeper configuration file zoo.cfg:
[root@master ~]# cd /root/trainings/zookeeper-3.4.10/conf
[root@master conf]# cp zoo_sample.cfg zoo.cfg
[root@master conf]# vim zoo.cfg
#dataDir=/tmp/zookeeper
dataDir=/root/trainings/zookeeper-3.4.10/tmp
server.1=master:2888:3888
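The single server.1 entry above is all a standalone deployment needs. For comparison, in cluster mode zoo.cfg lists every quorum member, and each node's myid must match its own server.N number. A minimal sketch, assuming three hypothetical hosts named node1–node3 (not part of this tutorial's cluster), that generates the quorum section and the per-node myid in a scratch directory:

```shell
# Sketch only: build the quorum section of a cluster-mode zoo.cfg and this
# node's myid file in a temporary directory (hostnames node1-3 are hypothetical).
ZK_DATA=$(mktemp -d)
MYID=1                      # this node's ID; use 2 and 3 on the other nodes
for i in 1 2 3; do
  # server.N=host:peerPort:leaderElectionPort
  echo "server.$i=node$i:2888:3888"
done > "$ZK_DATA/zoo.cfg.quorum"
echo "$MYID" > "$ZK_DATA/myid"
cat "$ZK_DATA/zoo.cfg.quorum" "$ZK_DATA/myid"
```

On a real cluster these lines would go at the end of zoo.cfg on every node, while myid lives under dataDir and differs per node.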
2.5 Start ZooKeeper
[root@master ~]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /root/trainings/zookeeper-3.4.10/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@master ~]# jps
1977 Jps
1962 QuorumPeerMain
3. Set Up the Fully-Distributed HBase Environment
3.1 Download the HBase Installation Package
You can download the HBase package from the installation media links above, or get the latest release from the HBase website, then use a tool such as WinSCP to upload it to the /root/tools directory on the master node.
[root@master ~]# cd /root/tools/
[root@master tools]# ls
hbase-2.0.1-bin.tar.gz
3.2 Extract HBase to the Installation Directory
Extract the HBase package into the installation directory /root/trainings/:
[root@master tools]# tar -zxvf hbase-2.0.1-bin.tar.gz -C /root/trainings/
3.3 Configure HBase Environment Variables
Add HBase to the PATH environment variable (repeat on all four machines):
[root@master tools]# cd /root/trainings/hbase-2.0.1/
[root@master hbase-2.0.1]# pwd
/root/trainings/hbase-2.0.1
[root@master hbase-2.0.1]# vim /root/.bash_profile
HBASE_HOME=/root/trainings/hbase-2.0.1
export HBASE_HOME
PATH=$HBASE_HOME/bin:$PATH
export PATH
[root@master hbase-2.0.1]# source /root/.bash_profile
3.4 Edit the HBase Configuration Files
Start the fully-distributed Hadoop cluster:
[root@master sbin]# ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [master]
master: starting namenode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-namenode-master.out
slave1: starting datanode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-datanode-slave1.out
slave2: starting datanode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-datanode-slave2.out
slave3: starting datanode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-datanode-slave3.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /root/trainings/hadoop-2.7.3/logs/hadoop-root-secondarynamenode-master.out
starting yarn daemons
starting resourcemanager, logging to /root/trainings/hadoop-2.7.3/logs/yarn-root-resourcemanager-master.out
slave1: starting nodemanager, logging to /root/trainings/hadoop-2.7.3/logs/yarn-root-nodemanager-slave1.out
slave3: starting nodemanager, logging to /root/trainings/hadoop-2.7.3/logs/yarn-root-nodemanager-slave3.out
slave2: starting nodemanager, logging to /root/trainings/hadoop-2.7.3/logs/yarn-root-nodemanager-slave2.out
Create a directory on HDFS to store HBase data:
[root@master ~]# hdfs dfs -mkdir /hbase
Go to the $HBASE_HOME/conf directory and edit the following configuration files:
[root@master conf]# pwd
/root/trainings/hbase-2.0.1/conf
[root@master conf]# vim hbase-env.sh
# The java implementation to use. Java 1.8+ required.
# export JAVA_HOME=/usr/java/jdk1.8.0/
export JAVA_HOME=/root/trainings/jdk1.8.0_144
# Note: HBASE_MANAGES_ZK=true tells HBase to manage its own bundled ZooKeeper.
# If you did not install a separate ZooKeeper in step 2, set it to true; otherwise set it to false.
export HBASE_MANAGES_ZK=false
[root@master conf]# vim hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
[root@master conf]# vim regionservers
slave1
slave2
slave3
Note: HBase 2.x requires Java 1.8 or later.
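Since an old JDK will only fail at startup, it is worth checking the version beforehand on each node. A small sketch that parses a `java -version` banner line (the sample banners below are illustrative):

```shell
# Sketch: extract the version from a "java -version" banner line and check it
# against HBase 2.x's Java 1.8+ requirement.
check_java() {
  banner="$1"                              # e.g. java version "1.8.0_144"
  ver=$(echo "$banner" | awk -F'"' '{print $2}')
  major=$(echo "$ver" | cut -d. -f1)
  minor=$(echo "$ver" | cut -d. -f2)
  if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 8 ]; }; then
    echo "OK: $ver"
  else
    echo "Too old: $ver"
  fi
}
check_java "$(java -version 2>&1 | head -n 1)" 2>/dev/null || true
check_java 'java version "1.8.0_144"'      # prints: OK: 1.8.0_144
check_java 'java version "1.7.0_80"'       # prints: Too old: 1.7.0_80
```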
3.5 Distribute the Configured HBase Directory to the Slave Nodes
[root@master ~]# cd /root/trainings
[root@master trainings]# scp -r hbase-2.0.1 root@slave1:/root/trainings/
[root@master trainings]# scp -r hbase-2.0.1 root@slave2:/root/trainings/
[root@master trainings]# scp -r hbase-2.0.1 root@slave3:/root/trainings/
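The three copies can also be expressed as one loop over the slave hostnames; a sketch with a dry-run guard so the commands can be reviewed before anything is transferred:

```shell
# Sketch: distribute the HBase directory to every slave in one loop.
# DRY_RUN=1 only prints each command; set it to 0 to actually copy.
DRY_RUN=1
for host in slave1 slave2 slave3; do
  cmd="scp -r /root/trainings/hbase-2.0.1 root@$host:/root/trainings/"
  if [ "$DRY_RUN" = 1 ]; then
    echo "$cmd"
  else
    $cmd
  fi
done
```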
4. Use HBase
4.1 Start the HBase Cluster
[root@master ~]# start-hbase.sh
master: running zookeeper, logging to /root/trainings/hbase-2.0.1/bin/../logs/hbase-root-zookeeper-master.out
running master, logging to /root/trainings/hbase-2.0.1/logs/hbase-root-master-master.out
slave1: running regionserver, logging to /root/trainings/hbase-2.0.1/bin/../logs/hbase-root-regionserver-slave1.out
slave2: running regionserver, logging to /root/trainings/hbase-2.0.1/bin/../logs/hbase-root-regionserver-slave2.out
slave3: running regionserver, logging to /root/trainings/hbase-2.0.1/bin/../logs/hbase-root-regionserver-slave3.out
[root@master ~]# jps
4273 HMaster
4485 Jps
2249 NameNode
1962 QuorumPeerMain
2460 SecondaryNameNode
2622 ResourceManager
[root@slave1 ~]# jps
1993 DataNode
2089 NodeManager
2665 HRegionServer
2827 Jps
[root@slave2 ~]# jps
2628 HRegionServer
1990 DataNode
2086 NodeManager
2806 Jps
[root@slave3 ~]# jps
2819 Jps
2119 NodeManager
2631 HRegionServer
1999 DataNode
As you can see, once the fully-distributed HBase cluster starts, the HMaster process and the HRegionServer processes run on different nodes.
You can monitor HBase status through the web UI on port 16010.
4.2 Use the HBase Shell
The hbase shell command enters the HBase command-line interface:
[root@master ~]# hbase shell
2018-07-16 23:32:59,553 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.1, r987f7b6d37c2fcacc942cc66e5c5122aba8fdfbe, Wed Jun 13 12:03:55 PDT 2018
Took 0.0020 seconds
hbase(main):001:0> create 'tblStudent','Info','Grade'
Created table tblStudent
Took 1.8144 seconds
=> Hbase::Table - tblStudent
hbase(main):002:0> put 'tblStudent','stu001','Info:name','Tom'
Took 0.1655 seconds
hbase(main):003:0> put 'tblStudent','stu001','Info:age','25'
Took 0.0129 seconds
hbase(main):004:0> put 'tblStudent','stu001','Grade:chinese','88'
Took 0.0053 seconds
hbase(main):005:0> put 'tblStudent','stu001','Grade:math','90'
Took 0.0080 seconds
hbase(main):006:0> put 'tblStudent','stu002','Info:name','Jack'
Took 0.0042 seconds
hbase(main):007:0> put 'tblStudent','stu002','Info:age','23'
Took 0.6333 seconds
hbase(main):008:0> put 'tblStudent','stu002','Grade:english','78'
Took 0.0457 seconds
hbase(main):009:0> put 'tblStudent','stu002','Grade:math','60'
Took 0.0108 seconds
hbase(main):010:0> scan 'tblStudent'
ROW COLUMN+CELL
stu001 column=Grade:chinese, timestamp=1531755222379, value=88
stu001 column=Grade:math, timestamp=1531755227442, value=90
stu001 column=Info:age, timestamp=1531755216220, value=25
stu001 column=Info:name, timestamp=1531755211017, value=Tom
stu002 column=Grade:english, timestamp=1531755253054, value=78
stu002 column=Grade:math, timestamp=1531755258620, value=60
stu002 column=Info:age, timestamp=1531755246729, value=23
stu002 column=Info:name, timestamp=1531755232841, value=Jack
2 row(s)
Took 0.0607 seconds
hbase(main):011:0> quit
The quit command exits the HBase shell.
Check the data produced under the /hbase directory on HDFS:
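Besides put and scan, a few other standard HBase shell commands are handy when exploring the table created above; run them inside the same hbase shell session (outputs omitted here):

```
get 'tblStudent','stu001'               # fetch a single row by row key
describe 'tblStudent'                   # show the table's column-family schema
count 'tblStudent'                      # count the rows in the table
delete 'tblStudent','stu002','Grade:math'   # delete one cell
```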
[root@master ~]# /root/trainings/hadoop-2.7.3/bin/hdfs dfs -ls /hbase
Found 12 items
drwxr-xr-x - root supergroup 0 2018-07-16 23:22 /hbase/.hbck
drwxr-xr-x - root supergroup 0 2018-07-16 23:29 /hbase/.tmp
drwxr-xr-x - root supergroup 0 2018-07-16 23:28 /hbase/MasterProcWALs
drwxr-xr-x - root supergroup 0 2018-07-16 23:29 /hbase/WALs
drwxr-xr-x - root supergroup 0 2018-07-16 23:22 /hbase/archive
drwxr-xr-x - root supergroup 0 2018-07-16 23:22 /hbase/corrupt
drwxr-xr-x - root supergroup 0 2018-07-16 23:29 /hbase/data
-rw-r--r-- 3 root supergroup 42 2018-07-16 23:22 /hbase/hbase.id
-rw-r--r-- 3 root supergroup 7 2018-07-16 23:22 /hbase/hbase.version
drwxr-xr-x - root supergroup 0 2018-07-16 23:22 /hbase/mobdir
drwxr-xr-x - root supergroup 0 2018-07-16 23:28 /hbase/oldWALs
drwx--x--x - root supergroup 0 2018-07-16 23:22 /hbase/staging
[root@master ~]# /root/trainings/hadoop-2.7.3/bin/hdfs dfs -ls /hbase/data/default
Found 1 items
drwxr-xr-x - root supergroup 0 2018-07-16 23:33 /hbase/data/default/tblStudent
[root@master ~]# /root/trainings/hadoop-2.7.3/bin/hdfs dfs -ls /hbase/data/default/tblStudent
Found 3 items
drwxr-xr-x - root supergroup 0 2018-07-16 23:33 /hbase/data/default/tblStudent/.tabledesc
drwxr-xr-x - root supergroup 0 2018-07-16 23:33 /hbase/data/default/tblStudent/.tmp
drwxr-xr-x - root supergroup 0 2018-07-16 23:34 /hbase/data/default/tblStudent/886b369644f808734fa92b6a774d81fe
[root@master ~]# /root/trainings/hadoop-2.7.3/bin/hdfs dfs -ls /hbase/data/default/tblStudent/886b369644f808734fa92b6a774d81fe
Found 4 items
-rw-r--r-- 3 root supergroup 45 2018-07-16 23:33 /hbase/data/default/tblStudent/886b369644f808734fa92b6a774d81fe/.regioninfo
drwxr-xr-x - root supergroup 0 2018-07-16 23:34 /hbase/data/default/tblStudent/886b369644f808734fa92b6a774d81fe/Grade
drwxr-xr-x - root supergroup 0 2018-07-16 23:34 /hbase/data/default/tblStudent/886b369644f808734fa92b6a774d81fe/Info
drwxr-xr-x - root supergroup 0 2018-07-16 23:34 /hbase/data/default/tblStudent/886b369644f808734fa92b6a774d81fe/recovered.edits
4.3 Stop the Fully-Distributed HBase Cluster
[root@master ~]# stop-hbase.sh
stopping hbase...........
master: running zookeeper, logging to /root/trainings/hbase-2.0.1/bin/../logs/hbase-root-zookeeper-master.out
master: no zookeeper to stop because no pid file /tmp/hbase-root-zookeeper.pid
[root@master ~]# jps
5122 Jps
2249 NameNode
1962 QuorumPeerMain
2460 SecondaryNameNode
2622 ResourceManager
[root@slave1 ~]# jps
2998 Jps
1993 DataNode
2089 NodeManager
[root@slave2 ~]# jps
1990 DataNode
2086 NodeManager
2952 Jps
[root@slave3 ~]# jps
2119 NodeManager
2990 Jps
1999 DataNode
This section covered setting up HBase in fully-distributed mode. Have fun!