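Metadata is among the most important data in Kylin, and backing it up is a critical part of operations work.
This article describes how to back up Kylin metadata so that it can be restored or migrated later.
1. Kylin Metadata
1.1 Introduction to Kylin Metadata
Kylin organizes all of its metadata (cube, cube_desc, model_desc, project, table, and so on) as a hierarchical file system. By default, however, it stores that metadata in HBase rather than in an ordinary file system.
This can be seen in Kylin's configuration file, conf/kylin.properties: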
## The metadata store in hbase
#kylin.metadata.url=kylin_metadata@hbase
The kylin.metadata.url option indicates that Kylin's metadata is stored in the kylin_metadata table in HBase.
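As a minimal sketch of how this setting could be checked from a script, the function below reads kylin.metadata.url from a properties file. The function name get_metadata_url is hypothetical, and the fallback to kylin_metadata@hbase when the line is commented out is an assumption based on the commented value shown in the excerpt above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kylin): print the effective
# kylin.metadata.url from a kylin.properties file.
get_metadata_url() {
    local props_file="$1"
    local value
    # Match only uncommented assignments; lines starting with '#' do not
    # match the anchored pattern. If the key appears twice, the last wins.
    value=$(sed -n -E \
        's/^[[:space:]]*kylin\.metadata\.url[[:space:]]*=[[:space:]]*(.*[^[:space:]])[[:space:]]*$/\1/p' \
        "$props_file" 2>/dev/null | tail -n 1)
    # Assumption: with the option commented out, the value shown in the
    # commented line (kylin_metadata@hbase) is the one in effect.
    echo "${value:-kylin_metadata@hbase}"
}
```

It could be called as get_metadata_url "$KYLIN_HOME/conf/kylin.properties"; the part before the @ is the HBase table name.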
1.2 Kylin Metadata Operations
[root@hadoop2 bin]# ./metastore.sh
usage: metastore.sh backup
metastore.sh fetch DATA
metastore.sh reset
metastore.sh refresh-cube-signature
metastore.sh restore PATH_TO_LOCAL_META
metastore.sh list RESOURCE_PATH
metastore.sh cat RESOURCE_PATH
metastore.sh remove RESOURCE_PATH
metastore.sh clean [--delete true]
1.3 Backing Up Metadata
[root@hadoop2 bin]# ./metastore.sh backup
Starting backup to /usr/local/apps/kylin/meta_backups/meta_2018_05_25_15_11_32
Retrieving hadoop conf dir...
KYLIN_HOME is set to /usr/local/apps/kylin
Retrieving hive dependency...
Retrieving hbase dependency...
Retrieving hadoop conf dir...
Retrieving kafka dependency...
Retrieving Spark dependency...
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apps/apache-kylin-2.3.1-bin/tool/kylin-tool-2.3.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/apps/apache-kylin-2.3.1-bin/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
The log shows that the metadata is backed up to a local directory, and it also reveals the naming format of that directory:
/usr/local/apps/kylin/meta_backups/meta_2018_05_25_15_11_32
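The naming format lends itself to simple scripting. The sketch below assumes every field is zero-padded as in the path above, so that sorting the names alphabetically also sorts them by time. Both function names are hypothetical and are not part of Kylin.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Kylin) built around the
# meta_YYYY_MM_DD_HH_MM_SS naming seen in the backup log.

# Print the directory name a backup taken right now would get.
backup_dir_name() {
    date +meta_%Y_%m_%d_%H_%M_%S
}

# Print the most recent backup directory under the given parent directory.
# Glob results are sorted, and zero-padded fields make that order
# chronological, so the last match is the newest backup.
latest_backup() {
    local parent="$1"
    local dir latest=""
    for dir in "$parent"/meta_*; do
        [ -d "$dir" ] || continue
        latest="$dir"
    done
    # Returns non-zero when no backup directory exists.
    [ -n "$latest" ] && echo "$latest"
}
```

For the installation shown here, latest_backup /usr/local/apps/kylin/meta_backups would print the newest meta_* directory.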
1.4 Inspecting the Backup Directory
[root@hadoop2 meta_2018_05_25_15_11_32]# ll
total 8
drwxr-xr-x 2 root root 50 May 25 15:11 acl
drwxr-xr-x 2 root root 29 May 25 15:11 cube
drwxr-xr-x 2 root root 29 May 25 15:11 cube_desc
drwxr-xr-x 3 root root 24 May 25 15:11 cube_statistics
drwxr-xr-x 3 root root 47 May 25 15:11 dict
drwxr-xr-x 2 root root 314 May 25 15:11 execute
drwxr-xr-x 2 root root 4096 May 25 15:11 execute_output
drwxr-xr-x 2 root root 39 May 25 15:11 model_desc
drwxr-xr-x 2 root root 33 May 25 15:11 project
drwxr-xr-x 2 root root 19 May 25 15:11 query
drwxr-xr-x 2 root root 172 May 25 15:11 table
drwxr-xr-x 2 root root 172 May 25 15:11 table_exd
drwxr-xr-x 4 root root 68 May 25 15:11 table_snapshot
drwxr-xr-x 2 root root 19 May 25 15:11 user
-rw-r--r-- 1 root root 38 Jan 1 1970 UUID
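| Directory | Contents |
|---|---|
| project | Basic project information and declarations of the other metadata types the project contains |
| model_desc | Basic information and structural definition of the data model |
| cube_desc | Basic information and structural definition of the Cube model |
| cube | Basic information about Cube instances and their Cube Segments |
| cube_statistics | Statistics for Cube instances |
| table | Basic table information, such as Hive information |
| table_exd | Extended table information, such as dimensions |
| table_snapshot | Snapshots of Lookup tables |
| dict | Dictionaries for columns that use dictionary encoding |
| execute | Step information for Cube build jobs |
| execute_output | Step output of Cube build jobs |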
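A quick sanity check on a backup is to compare its subdirectories against the table above. The following is only a sketch: the function name is hypothetical, and treating every directory in the table as expected is an assumption, since an instance with no cubes or jobs may legitimately lack some of them.

```shell
#!/usr/bin/env bash
# Hypothetical check (not part of Kylin): report which of the directories
# from the table above are absent from a backup directory.
missing_backup_dirs() {
    local backup_dir="$1"
    local name
    for name in project model_desc cube_desc cube cube_statistics \
                table table_exd table_snapshot dict execute execute_output; do
        # Print the name of each expected directory that does not exist.
        [ -d "$backup_dir/$name" ] || echo "$name"
    done
}
```

An empty result means all listed directories are present. Any names printed are worth a second look before relying on the backup.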
1.5 Restoring Metadata
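Note that reset only clears the existing metadata store. The backup itself is loaded with restore, which takes the local backup directory as its argument (the path below is the one created in section 1.3):
[root@hadoop2 bin]# ./metastore.sh reset
[root@hadoop2 bin]# ./metastore.sh restore /usr/local/apps/kylin/meta_backups/meta_2018_05_25_15_11_32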
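Once the restore has finished, click the "Reload Metadata" button on the "System" page of the Web UI to refresh the metadata cache. The latest metadata will then be visible.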
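Because a restore overwrites live metadata, it can be worth taking a fresh backup immediately beforehand. The wrapper below is a sketch of that idea, not an official procedure. The names safe_restore and METASTORE are hypothetical, and it uses only the backup and restore subcommands listed in the usage output in section 1.2. Whether to run reset first is left to the operator.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of Kylin): back up the current metadata
# before restoring an older copy, so the restore itself can be undone.
# METASTORE is an assumed variable pointing at Kylin's metastore.sh.
METASTORE="${METASTORE:-${KYLIN_HOME:-}/bin/metastore.sh}"

safe_restore() {
    local backup_dir="$1"
    if [ ! -d "$backup_dir" ]; then
        echo "not a directory: $backup_dir" >&2
        return 1
    fi
    # Stop if the safety backup fails, so nothing is overwritten.
    "$METASTORE" backup || return 1
    "$METASTORE" restore "$backup_dir"
}
```

A call such as safe_restore /usr/local/apps/kylin/meta_backups/meta_2018_05_25_15_11_32 would first write a new meta_* directory and then load the older one.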