HBase provides a command-line tool, hbase shell. This section walks through the commands most commonly used in it.
1. Enter the hbase shell environment
Start Hadoop
[root@localhost ~]# start-all.sh
Start ZooKeeper
[root@localhost ~]# zkServer.sh start
Start HBase
[root@localhost ~]# start-hbase.sh
Open the hbase shell
[root@localhost ~]# hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.1, r987f7b6d37c2fcacc942cc66e5c5122aba8fdfbe, Wed Jun 13 12:03:55 PDT 2018
Took 0.0016 seconds
hbase(main):001:0>
2. Create a table (simple form)
hbase(main):001:0> create 'tblStudent','Info','Grade'
Created table tblStudent
Took 1.3997 seconds
=> Hbase::Table - tblStudent
3. Create a table (advanced form)
hbase(main):001:0> create 'tblStudent',{NAME=>'Info',VERSIONS=>'3',COMPRESSION=>'SNAPPY'},
{NAME=>'Grade',VERSIONS=>'2',COMPRESSION=>'LZO'}
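The braces in the command above are ordinary Ruby hash literals, because the HBase shell is a JRuby REPL. The following is a minimal sketch in plain Ruby, run outside the shell, of what those arguments look like as data. The locals `name`, `versions` and `compression` are stand-ins introduced here for the constants the shell predefines; that those constants are plain strings is an assumption of this sketch.

```ruby
# Stand-ins for the shell's predefined constants NAME, VERSIONS and COMPRESSION
# (assumption: they are plain strings). Lowercase locals keep this runnable
# outside the HBase shell.
name, versions, compression = 'NAME', 'VERSIONS', 'COMPRESSION'

# One hash per column family, mirroring the create command above.
info  = { name => 'Info',  versions => '3', compression => 'SNAPPY' }
grade = { name => 'Grade', versions => '2', compression => 'LZO' }

families = [info, grade]
families.each do |cf|
  puts "#{cf[name]}: versions=#{cf[versions]}, compression=#{cf[compression]}"
end
```

Because each column family is just a hash, further attributes from the describe output can be added as extra key/value pairs in the same braces.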
4. List existing tables
hbase(main):002:0> list
TABLE
tblStudent
1 row(s)
Took 0.0192 seconds
=> ["tblStudent"]
5. View the table structure
hbase(main):003:0> describe 'tblStudent'
Table tblStudent is ENABLED
tblStudent
COLUMN FAMILIES DESCRIPTION
{NAME => 'Grade',
VERSIONS => '1',
EVICT_BLOCKS_ON_CLOSE => 'false',
NEW_VERSION_BEHAVIOR => 'false',
KEEP_DELETED_CELLS => 'FALSE',
CACHE_DATA_ON_WRITE => 'false',
DATA_BLOCK_ENCODING => 'NONE',
TTL => 'FOREVER',
MIN_VERSIONS => '0',
REPLICATION_SCOPE => '0',
BLOOMFILTER => 'ROW',
CACHE_INDEX_ON_WRITE => 'false',
IN_MEMORY => 'false',
CACHE_BLOOMS_ON_WRITE => 'false',
PREFETCH_BLOCKS_ON_OPEN => 'false',
COMPRESSION => 'NONE',
BLOCKCACHE => 'true',
BLOCKSIZE => '65536'}
{NAME => 'Info',
VERSIONS => '1',
EVICT_BLOCKS_ON_CLOSE => 'false',
NEW_VERSION_BEHAVIOR => 'false',
KEEP_DELETED_CELLS => 'FALSE',
CACHE_DATA_ON_WRITE => 'false',
DATA_BLOCK_ENCODING => 'NONE',
TTL => 'FOREVER',
MIN_VERSIONS => '0',
REPLICATION_SCOPE => '0',
BLOOMFILTER => 'ROW',
CACHE_INDEX_ON_WRITE => 'false',
IN_MEMORY => 'false',
CACHE_BLOOMS_ON_WRITE => 'false',
PREFETCH_BLOCKS_ON_OPEN => 'false',
COMPRESSION => 'NONE',
BLOCKCACHE => 'true',
BLOCKSIZE => '65536'}
2 row(s)
Took 0.1060 seconds
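Notes on the column family attributes shown in the table description:
- NAME: the name of the column family.
- VERSIONS: the maximum number of versions of a value to keep.
- EVICT_BLOCKS_ON_CLOSE: whether the column family's cached blocks are evicted from the block cache when an HFile is closed.
- NEW_VERSION_BEHAVIOR: whether the column family uses the newer handling of versions and delete markers.
- KEEP_DELETED_CELLS: whether deleted cells are retained after a delete.
- CACHE_DATA_ON_WRITE: whether data blocks are added to the block cache when they are written.
- DATA_BLOCK_ENCODING: the data block encoding.
- TTL: the time to live of the data in the column family, in seconds. The default is 2147483647 (Integer.MAX_VALUE), roughly 68 years, which describe shows as FOREVER. Set it according to how long the data needs to live: data older than the TTL no longer appears in reads and is physically removed at the next major compaction. TTL interacts with MIN_VERSIONS. With MIN_VERSIONS => '0', all data in the family is removed once its TTL expires. With a nonzero MIN_VERSIONS, the newest MIN_VERSIONS versions are kept and the rest are removed; for example, MIN_VERSIONS => '1' keeps only the single newest version.
- MIN_VERSIONS: the minimum number of versions to keep after a compaction.
- REPLICATION_SCOPE: set this to 1 when configuring replication for the HBase cluster.
- BLOOMFILTER: the Bloom filter type (ROW in the output above).
- CACHE_INDEX_ON_WRITE: whether index blocks are added to the block cache when they are written.
- IN_MEMORY: enables aggressive caching, giving the column family's blocks higher priority in the block cache. The default is false; consider true for column families that see relatively many random reads.
- CACHE_BLOOMS_ON_WRITE: whether Bloom filter blocks are added to the block cache when they are written.
- PREFETCH_BLOCKS_ON_OPEN: whether blocks are prefetched into the block cache when an HFile is opened.
- COMPRESSION: the compression algorithm; the default is NONE. Compression reduces storage at the cost of extra computation. Commonly used algorithms include GZ, LZO, and SNAPPY.
- BLOCKCACHE: whether data blocks are cached; the default is true.
- BLOCKSIZE: the HFile data block size; the default is 64 KB (65536 bytes).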
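The TTL and MIN_VERSIONS notes above can be checked with a little arithmetic. The sketch below first converts the default TTL to years, then applies a toy retention rule to a few hypothetical write times. The `surviving_versions` helper and its rule are an illustration written for this section, not HBase's compaction code.

```ruby
seconds_per_year = 365 * 24 * 60 * 60
default_ttl = 2**31 - 1                  # Integer.MAX_VALUE, in seconds
years = default_ttl.to_f / seconds_per_year
puts format('default TTL is about %.1f years', years)

# Toy rule: a version survives if it is still within the TTL, or if it is
# among the newest min_versions versions of the cell.
def surviving_versions(timestamps, now:, ttl:, min_versions:)
  newest_first = timestamps.sort.reverse
  kept = newest_first.each_with_index.select { |ts, index|
    (now - ts) <= ttl || index < min_versions
  }
  kept.map(&:first)
end

writes = [100, 200, 300]                 # hypothetical write times, in seconds

p surviving_versions(writes, now: 1000, ttl: 60,  min_versions: 0)  # everything expired
p surviving_versions(writes, now: 1000, ttl: 60,  min_versions: 1)  # newest one is kept
p surviving_versions(writes, now: 320,  ttl: 150, min_versions: 1)  # two still within TTL
```

With MIN_VERSIONS at 0 an expired cell disappears entirely, while a value of 1 always leaves the most recent value readable.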
6. Modify the table structure
hbase(main):004:0> disable 'tblStudent'
Took 0.5471 seconds
hbase(main):005:0> alter 'tblStudent',{NAME=>'Info',VERSIONS=>'2'}
Updating all regions with the new schema...
All regions updated.
Done.
Took 1.2771 seconds
hbase(main):006:0> enable 'tblStudent'
Took 0.7709 seconds
7. Insert records
hbase(main):007:0> put 'tblStudent','stu001','Info:name','Tom'
Took 0.0697 seconds
hbase(main):008:0> put 'tblStudent','stu001','Info:age','22'
Took 0.0050 seconds
hbase(main):009:0> put 'tblStudent','stu002','Info:name','Jack'
Took 0.0078 seconds
hbase(main):010:0> put 'tblStudent','stu002','Info:age','25'
Took 0.0050 seconds
8. View a single record
hbase(main):011:0> get 'tblStudent','stu001'
COLUMN CELL
Info:age timestamp=1532361034202, value=22
Info:name timestamp=1532360807963, value=Tom
1 row(s)
Took 0.0359 seconds
9. View all records
hbase(main):012:0> scan 'tblStudent'
ROW COLUMN+CELL
stu001 column=Info:age, timestamp=1532361034202, value=22
stu001 column=Info:name, timestamp=1532361309771, value=Tom
stu002 column=Info:age, timestamp=1532361111248, value=25
stu002 column=Info:name, timestamp=1532361052962, value=Jack
2 row(s)
Took 0.0112 seconds
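The difference between get and scan can be pictured with a toy in-memory table: get looks up one row by its key, while scan walks all rows in row-key order. This is only a sketch of the idea using a plain Ruby hash; the lambdas `get` and `scan` are names chosen here and do not call HBase.

```ruby
# Toy stand-in for the table: row key => { 'family:qualifier' => value }.
# Rows are deliberately inserted out of order.
table = {
  'stu002' => { 'Info:name' => 'Jack', 'Info:age' => '25' },
  'stu001' => { 'Info:name' => 'Tom',  'Info:age' => '22' }
}

# get: fetch a single row by key, with its columns in sorted order.
get = ->(row_key) { table.fetch(row_key, {}).sort.to_h }

# scan: visit every row in ascending row-key order.
scan = lambda do
  table.sort_by { |row_key, _| row_key }
       .map { |row_key, cells| [row_key, cells.sort.to_h] }
end

p get.call('stu001')
scan.call.each { |row_key, cells| puts "#{row_key} #{cells}" }
```

The sorted output mirrors the transcripts above, where stu001 precedes stu002 and Info:age precedes Info:name.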
10. Count records
hbase(main):013:0> count 'tblStudent'
2 row(s)
Took 0.0388 seconds
=> 2
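Note that count reports rows, not individual column values: the four puts in step 7 produced four cells but only two rows. A small sketch of that distinction, with the cells written out by hand as plain Ruby arrays:

```ruby
# The four puts from step 7, as [row key, column, value] cells.
cells = [
  ['stu001', 'Info:name', 'Tom'],
  ['stu001', 'Info:age',  '22'],
  ['stu002', 'Info:name', 'Jack'],
  ['stu002', 'Info:age',  '25']
]

row_count  = cells.map(&:first).uniq.size  # distinct row keys, as count reports
cell_count = cells.size                    # individual column values

puts "#{row_count} row(s), #{cell_count} cell(s)"
```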
11. View all data in one column of a table
hbase(main):014:0> scan 'tblStudent',{COLUMNS=>'Info:name'}
ROW COLUMN+CELL
stu001 column=Info:name, timestamp=1532361309771, value=Tom
stu002 column=Info:name, timestamp=1532361052962, value=Jack
2 row(s)
Took 0.0093 seconds
12. Update a record (write it again)
hbase(main):015:0> put 'tblStudent','stu001','Info:name','Tommy'
Took 0.0063 seconds
hbase(main):016:0> scan 'tblStudent',{COLUMNS=>'Info:name'}
ROW COLUMN+CELL
stu001 column=Info:name, timestamp=1532361423526, value=Tommy
stu002 column=Info:name, timestamp=1532361052962, value=Jack
2 row(s)
Took 0.0064 seconds
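HBase has no separate update command: a second put to the same row and column writes a new version with a newer timestamp, and reads return the newest version by default. The sketch below models a single cell as a list of versions. The helpers `put_version` and `visible` are hypothetical names for this illustration, and the timestamps are taken from the scan output in steps 11 and 12.

```ruby
# Toy model of one cell: every put appends a [timestamp, value] version.
def put_version(history, timestamp, value)
  history + [[timestamp, value]]
end

# Newest version first, keeping at most max_versions of them.
def visible(history, max_versions)
  history.sort_by { |timestamp, _| -timestamp }.first(max_versions)
end

history = []
history = put_version(history, 1532361309771, 'Tom')
history = put_version(history, 1532361423526, 'Tommy')

p visible(history, 1)   # what a default read returns
p visible(history, 2)   # both versions, as retained with VERSIONS => '2'
```

Since step 6 set VERSIONS to 2 for the Info family, the older value should still be retrievable in the shell by asking for more than one version, for example get 'tblStudent','stu001',{COLUMN=>'Info:name',VERSIONS=>2}.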
13. Delete one column of a record
hbase(main):017:0> delete 'tblStudent','stu002','Info:name'
Took 0.0078 seconds
hbase(main):018:0> scan 'tblStudent',{COLUMNS=>'Info:name'}
ROW COLUMN+CELL
stu001 column=Info:name, timestamp=1532361423526, value=Tommy
1 row(s)
Took 0.0078 seconds
14. Remove all data from a table
hbase(main):019:0> truncate 'tblStudent'
Truncating 'tblStudent' table (it may take a while):
Disabling table...
Truncating table...
Took 1.2209 seconds
hbase(main):020:0> scan 'tblStudent'
ROW COLUMN+CELL
0 row(s)
Took 0.1419 seconds
15. Disable a single table
hbase(main):017:0> disable 'tblStudent'
Took 0.8602 seconds
hbase(main):040:0> scan 'tblStudent'
ROW COLUMN+CELL
ERROR: Table tblStudent is disabled!
16. Enable a single table
hbase(main):018:0> enable 'tblStudent'
Took 0.0098 seconds
hbase(main):042:0> scan 'tblStudent'
ROW COLUMN+CELL
0 row(s)
Took 0.0044 seconds
17. Disable multiple tables
Regular expressions are supported:
hbase(main):026:0> list
TABLE
tblStudent
tblStudent1
tblStudent2
tblStudent3
4 row(s)
Took 0.0168 seconds
=> ["tblStudent", "tblStudent1", "tblStudent2", "tblStudent3"]
hbase(main):030:0> disable_all 'tblStudent[1-9]'
tblStudent1
tblStudent2
tblStudent3
Disable the above 3 tables (y/n)?
y
3 tables successfully disabled
Took 6.1949 seconds
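To see which names a pattern such as 'tblStudent[1-9]' selects before confirming with y, the match can be tried out in plain Ruby. This sketch assumes the pattern has to match the whole table name, so it is anchored with \A and \z; the table list is copied from the list output above.

```ruby
tables = %w[tblStudent tblStudent1 tblStudent2 tblStudent3]

# Assumption: the whole table name must match, hence the \A and \z anchors.
pattern = /\AtblStudent[1-9]\z/

matched = tables.grep(pattern)
p matched
```

The bare tblStudent is not selected because the character class requires one digit after the name, which agrees with the three tables listed in the confirmation prompt.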
18. Enable multiple tables
Regular expressions are supported:
hbase(main):027:0> enable_all 'tblStudent[1-9]'
tblStudent1
tblStudent2
tblStudent3
Enable the above 3 tables (y/n)?
y
3 tables successfully enabled
Took 3.8096 seconds
19. Drop a single table
hbase(main):031:0> disable 'tblStudent'
Took 0.0069 seconds
hbase(main):032:0> drop 'tblStudent'
Took 0.2532 seconds
hbase(main):033:0> list
TABLE
tblStudent1
tblStudent2
tblStudent3
3 row(s)
Took 0.0068 seconds
=> ["tblStudent1", "tblStudent2", "tblStudent3"]
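The transcript disables the table before dropping it because HBase refuses to drop a table that is still enabled. The toy model below captures just that rule. The `drop_table` helper, the catalog hash and the error message are all made up for this sketch and are not the HBase client API.

```ruby
# Toy catalog: table name => enabled flag.
def drop_table(catalog, name)
  raise ArgumentError, "Table #{name} is enabled. Disable it first." if catalog.fetch(name)
  catalog.reject { |table, _| table == name }
end

catalog = { 'tblStudent' => true, 'tblStudent1' => true }

begin
  drop_table(catalog, 'tblStudent')             # still enabled, so this is refused
rescue ArgumentError => e
  puts "ERROR: #{e.message}"
end

catalog = catalog.merge('tblStudent' => false)  # the disable step
catalog = drop_table(catalog, 'tblStudent')     # now the drop goes through
p catalog.keys
```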
20. Drop multiple tables
Regular expressions are supported:
hbase(main):034:0> disable_all 'tblStudent[1-9]'
tblStudent1
tblStudent2
tblStudent3
Disable the above 3 tables (y/n)?
y
3 tables successfully disabled
Took 1.8435 seconds
hbase(main):035:0> drop_all 'tblStudent[1-9]'
tblStudent1
tblStudent2
tblStudent3
Drop the above 3 tables (y/n)?
y
3 tables successfully dropped
Took 3.0995 seconds
hbase(main):036:0> list
TABLE
0 row(s)
Took 0.0078 seconds
=> []
21. Exit the hbase shell environment
hbase(main):040:0> quit
[root@master ~]#