Upgrading Druid 0.12.0 Standalone to a 0.17.0 Cluster
The old Druid version is limited in functionality; to use newer features you have to upgrade. This document demonstrates one upgrade path from a standalone deployment to a cluster. Both environments are simulated and are provided for reference only.
By default everything uses local deep storage. After the upgrade to the Druid 0.17.0 cluster, deep storage moves to HDFS, and the cluster no longer uses the default (Derby) metadata store for metadata management.
For basic environment preparation and checks, see Linux环境准备及检测.md.
For installing the new Druid cluster, refer to the Druid 0.17.0 cluster deployment document (Druid0.17.0版本集群实施文档).
In what follows, the new installation's home is referred to as $NEW_DRUID_HOME and the current Druid 0.12.0 installation's home as $DRUID_HOME.
It is assumed that the Druid 0.17.0 cluster is already installed and that the Druid 0.12.0 metadata has already been imported into MySQL; for that step see the document on migrating Druid metadata from Derby to MySQL (Druid元数据从Derby转成MySQL).
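For orientation, the settings in the new cluster's common.runtime.properties that this migration relies on might look roughly like the sketch below. The property and extension names are standard Druid settings; the MySQL connection details are placeholders, and the HDFS directory matches the /demo/druid path used throughout this document.
# Assumed excerpt from conf/druid/cluster/_common/common.runtime.properties (values are examples)
druid.extensions.loadList=["druid-hdfs-storage", "mysql-metadata-storage"]
# Metadata storage in MySQL instead of the default Derby
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://192.168.1.91:3306/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=druid
# Deep storage on HDFS; this directory must match the -h path used when exporting the metadata
druid.storage.type=hdfs
druid.storage.storageDirectory=/demo/druid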
<font color="red">Note: when deep storage is being migrated to HDFS, an extra parameter must be added to the metadata export command: -h specifies the deep storage path on HDFS, e.g. -h /demo/druid/</font>
Copy the segment data to the HDFS path configured as deep storage for Druid 0.17.0
[1] Fetch the segments files from the Druid 0.12.0 installation
Compress the whole segments directory into a single archive (using tar -C so the archive stores a relative segments/ path and extracts cleanly later); if it is very large, split it into several archives, or just copy it with scp to a node that has an HDFS client.
My data here is small, so I used scp and copied it straight to the HDFS machine.
tar -zcf $DRUID_HOME/var/druid/segments.tar.gz -C $DRUID_HOME/var/druid segments
scp $DRUID_HOME/var/druid/segments.tar.gz 192.168.1.91:/opt
[2] Make sure the deep storage location configured for Druid 0.17.0 is exactly the same as the path that was given to the metadata export
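One quick way to double-check the configured location from the command line (assuming the standard apache-druid-0.17.0 layout under $NEW_DRUID_HOME):
grep druid.storage $NEW_DRUID_HOME/conf/druid/cluster/_common/common.runtime.properties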
One conversion has to be taken into account here: HDFS path names must not contain a colon (":"), so when the metadata was exported, the colons in the segment paths written into the HDFS loadSpec were already replaced.
- The record exported for local storage looks like this:
ttDemo_2020-06-11T16:00:00.000Z_2020-06-11T17:00:00.000Z_2020-06-12T03:58:40.472Z,ttDemo,2020-06-12T03:59:31.408Z,2020-06-11T16:00:00.000Z,2020-06-11T17:00:00.000Z,1,2020-06-12T03:58:40.472Z,1,"{""dataSource"":""ttDemo"",""interval"":""2020-06-11T16:00:00.000Z/2020-06-11T17:00:00.000Z"",""version"":""2020-06-12T03:58:40.472Z"",""loadSpec"":{""type"":""local"",""path"":""/opt/install/druid-0.12.0/var/druid/segments/ttDemo/2020-06-11T16:00:00.000Z_2020-06-11T17:00:00.000Z/2020-06-12T03:58:40.472Z/0/index.zip""},""dimensions"":""tagName,sendTS,tagValue,isGood"",""metrics"":"""",""shardSpec"":{""type"":""numbered"",""partitionNum"":0,""partitions"":0},""binaryVersion"":9,""size"":21684530,""identifier"":""ttDemo_2020-06-11T16:00:00.000Z_2020-06-11T17:00:00.000Z_2020-06-12T03:58:40.472Z""}"
- The record exported for HDFS storage looks like this:
ttDemo_2020-06-11T16:00:00.000Z_2020-06-11T17:00:00.000Z_2020-06-12T03:58:40.472Z,ttDemo,2020-06-12T03:59:31.408Z,2020-06-11T16:00:00.000Z,2020-06-11T17:00:00.000Z,1,2020-06-12T03:58:40.472Z,1,"{""dataSource"":""ttDemo"",""interval"":""2020-06-11T16:00:00.000Z/2020-06-11T17:00:00.000Z"",""version"":""2020-06-12T03:58:40.472Z"",""loadSpec"":{""type"":""hdfs"",""path"":""/demo/druid/ttDemo/2020-06-11T16_00_00.000Z_2020-06-11T17_00_00.000Z/2020-06-12T03_58_40.472Z/0/index.zip""},""dimensions"":""tagName,sendTS,tagValue,isGood"",""metrics"":"""",""shardSpec"":{""type"":""numbered"",""partitionNum"":0,""partitions"":0},""binaryVersion"":9,""size"":21684530,""identifier"":""ttDemo_2020-06-11T16:00:00.000Z_2020-06-11T17:00:00.000Z_2020-06-12T03:58:40.472Z""}"
Look carefully at the two path values:
/opt/install/druid-0.12.0/var/druid/segments/ttDemo/2020-06-11T16:00:00.000Z_2020-06-11T17:00:00.000Z/2020-06-12T03:58:40.472Z/0/index.zip
/demo/druid/ttDemo/2020-06-11T16_00_00.000Z_2020-06-11T17_00_00.000Z/2020-06-12T03_58_40.472Z/0/index.zip
The two paths differ, which is why the next step is needed: the directory names in the deep storage files themselves have to be changed to match.
- In the first-level directories under a datasource, replace every ":" with "_".
- In the second-level directories under each of those first-level directories, replace every ":" with "_" as well.
I did this with a small program I wrote myself. It is a very simple implementation, only meant for this test:
import java.io.File;

/**
 * Description : Replaces ":" with "_" in the directory names under a given path,
 *               so the segment directories become valid HDFS path names.
 * PackageName : cn.itdeer
 * ProjectName : itdeerlab-tools
 * CreatorName : itdeer.cn
 * CreateTime : 2020/6/15/16:20
 */
public class RenameOfFile {

    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("Usage: java RenameOfFile <datasource directory>");
            System.exit(1);
        }
        renameDir(args[0]);
    }

    /**
     * Renames every entry under the given directory by replacing ":" with "_",
     * then recurses into the renamed entry. Names without a colon are left untouched.
     */
    private static void renameDir(String path) {
        File dirFile = new File(path);
        if (!dirFile.exists()) {
            return;
        }
        // listFiles() returns null when the path is a regular file, which ends the recursion
        File[] children = dirFile.listFiles();
        if (children == null) {
            return;
        }
        for (File child : children) {
            String newName = child.getName().replace(":", "_");
            File renamed = new File(path + File.separator + newName);
            if (!child.getName().equals(newName) && !child.renameTo(renamed)) {
                System.out.println("Failed to rename: " + child.getAbsolutePath());
                continue;
            }
            System.out.println(renamed.getPath());
            renameDir(renamed.getPath());
        }
    }
}
The argument to pass is the datasource directory; the program renames both levels of directories underneath it (and leaves names without a colon untouched). A usage sketch follows.
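A minimal sketch of compiling and running it (the file name RenameOfFile.java and the path /opt/segments/ttDemo are only illustrative; run it against each datasource directory before the files are put onto HDFS):
javac RenameOfFile.java
java RenameOfFile /opt/segments/ttDemo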
[3] Put the data onto the HDFS file system
cd /opt/ && tar -zxf segments.tar.gz && rm -fr segments.tar.gz
hadoop fs -put /opt/segments/* /demo/druid/
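If the put fails because the target directory does not exist yet, create it first (assuming the user running the command has write permission on HDFS):
hadoop fs -mkdir -p /demo/druid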
[4] Check the data on HDFS
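A listing like the one below can be produced with something along these lines (the path is the same one used in the put step above):
hadoop fs -ls /demo/druid/ttDemo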
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-11T16_00_00.000Z_2020-06-11T17_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-11T17_00_00.000Z_2020-06-11T18_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-11T18_00_00.000Z_2020-06-11T19_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-11T19_00_00.000Z_2020-06-11T20_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-11T20_00_00.000Z_2020-06-11T21_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-11T21_00_00.000Z_2020-06-11T22_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-11T22_00_00.000Z_2020-06-11T23_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-11T23_00_00.000Z_2020-06-12T00_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-12T00_00_00.000Z_2020-06-12T01_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-12T01_00_00.000Z_2020-06-12T02_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-12T09_00_00.000Z_2020-06-12T10_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-12T10_00_00.000Z_2020-06-12T11_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-12T11_00_00.000Z_2020-06-12T12_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-12T12_00_00.000Z_2020-06-12T13_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-12T13_00_00.000Z_2020-06-12T14_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-12T14_00_00.000Z_2020-06-12T15_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-12T15_00_00.000Z_2020-06-12T16_00_00.000Z
drwxr-xr-x - root supergroup 0 2020-06-15 17:41 /demo/druid/ttDemo/2020-06-12T16_00_00.000Z_2020-06-12T17_00_00.000Z
[5] Start the Druid cluster
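How exactly the processes are started depends on the cluster deployment document referenced above; with the standard apache-druid-0.17.0 start scripts it would roughly be the following (the mapping of roles to nodes is an assumption for this sketch):
$NEW_DRUID_HOME/bin/start-cluster-master-with-zk-server   # on the master node (Coordinator/Overlord, with ZooKeeper)
$NEW_DRUID_HOME/bin/start-cluster-data-server             # on each data node (Historical/MiddleManager)
$NEW_DRUID_HOME/bin/start-cluster-query-server            # on each query node (Broker/Router)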
Once the cluster is up, you can see that all the segment files are loaded automatically.
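Besides checking the web console, the load status can also be verified through the Coordinator API; a small sketch, assuming the Coordinator listens on its default port 8081 (replace coordinator-host with the real host name):
curl http://coordinator-host:8081/druid/coordinator/v1/datasources
curl http://coordinator-host:8081/druid/coordinator/v1/loadstatus
The first call lists the datasources the cluster knows about, and the second reports the percentage of segments that have been loaded for each of them.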
That completes the upgrade of the standalone Druid to a cluster. The main things to watch are the permissions on the data files and directories, and keeping the paths consistent with what is configured in the configuration files; apart from that there were no problems.