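DataX is an offline data synchronization tool/platform widely used inside Alibaba Group. It provides efficient data synchronization between heterogeneous data sources, including MySQL, Oracle, SQL Server, PostgreSQL, HDFS, Hive, ADS, HBase, TableStore (OTS), MaxCompute (ODPS), and DRDS.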
The latest version already supports ES (Elasticsearch), so these notes record the setup steps.
Environment requirements:
Linux
JDK1.8
Python2.6
wget http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz
wget http://www.trieuvan.com/apache/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz
Extract the archive and it is ready to use. All jobs are configured under /datax/job
Start command
python datax.py ./job/stream2stream.json
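The job file passed to datax.py is a JSON document that names a reader, a writer, and some speed settings. As a rough sketch, the Python 3 snippet below builds a stream-to-stream job modeled on the sample job bundled with DataX. The helper names build_stream_job and dump_job and the default values are illustrative assumptions, not part of DataX; check the plugin documentation for the parameters your version accepts.

```python
import json


def build_stream_job(record_count=10, channels=1):
    """Return a dict describing a streamreader -> streamwriter DataX job.

    The layout follows the sample job bundled with DataX; the default
    values here are illustrative assumptions.
    """
    reader = {
        "name": "streamreader",
        "parameter": {
            # Number of synthetic records to generate.
            "sliceRecordCount": record_count,
            "column": [
                {"type": "long", "value": "10"},
                {"type": "string", "value": "hello, DataX"},
            ],
        },
    }
    writer = {
        "name": "streamwriter",
        # print=True asks the writer to echo each record to stdout.
        "parameter": {"encoding": "UTF-8", "print": True},
    }
    return {
        "job": {
            "content": [{"reader": reader, "writer": writer}],
            "setting": {"speed": {"channel": channels}},
        }
    }


def dump_job(job):
    """Serialize the job dict to pretty-printed JSON text."""
    return json.dumps(job, indent=2, ensure_ascii=False)


# Print the JSON so it can be redirected into a job file.
print(dump_job(build_stream_job()))
```

Saving the printed text as ./job/stream2stream.json should give a file that can be passed to the start command above.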
Plenty of people are lazy, so a visual web UI integrating DataX came into being.
Download and install:
git clone https://github.com/pengls/datax-admin.git
Install Maven (mvn) for the build
tar zxvf apache-maven-3.6.3-bin.tar.gz && mv apache-maven-3.6.3 /usr/local/maven3
Add it to the environment variables
export M2_HOME=/usr/local/maven3
export PATH=$PATH:$JAVA_HOME/bin:$M2_HOME/bin
Test
mvn -v
Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: /usr/local/maven3
Java version: 1.8.0_251, vendor: Oracle Corporation, runtime: /usr/local/jdk1.8/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-514.26.2.el7.x86_64", arch: "amd64", family: "unix"
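Before building, it can help to confirm that every required tool is reachable. The following is a minimal Python 3 sketch, assuming the executables are named java, python, and mvn on your system; the function name missing_tools is hypothetical.

```python
import shutil


def missing_tools(tools=("java", "python", "mvn"), which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("java, python and mvn are all on PATH")
```

If mvn is reported as missing, the M2_HOME and PATH exports above were most likely not applied to the current shell.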
1. Edit the resources/application.yml file under datax_admin
cat /opt/datax-web/datax-admin/src/main/resources/application.yml
# configure mybatis-plus SQL log output
logging:
  level:
    com.wugui.datax.admin.mapper: error
  path: ./data/applogs/admin ## create this directory if it does not exist
Run the datax_web.sql file under doc/db
2. Edit the resources/application.yml file under datax_executor
# log config
logging:
  config: classpath:logback.xml
  path: ./data/applogs/executor/jobhandler
Change the log path (path) as needed
datax:
  job:
    admin:
      ### datax-web admin address
      addresses: http://127.0.0.1:8080
    executor:
      appname: datax-executor
      ip:
      port: 9999
      ### job log path
      logpath: ./data/applogs/executor/jobhandler
      ### job log retention days
      logretentiondays: 30
  executor:
    jsonpath: /Users/mac/data/applogs
  pypath: /Users/mac/tools/datax/bin/datax.py
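The jsonpath and pypath values above are machine-specific, so they are an easy place to make a mistake. As a hedged sketch in Python 3, the check below reports whether the two paths exist before the executor is started. The function name check_executor_paths is hypothetical, and the sample paths are simply the ones from the config above.

```python
import os


def check_executor_paths(jsonpath, pypath):
    """Return a list of problems; an empty list means both paths exist."""
    problems = []
    if not os.path.isdir(jsonpath):
        problems.append("jsonpath is not an existing directory: " + jsonpath)
    if not os.path.isfile(pypath):
        problems.append("pypath is not an existing file: " + pypath)
    return problems


# Sample values copied from the executor config; replace with your own.
for problem in check_executor_paths("/Users/mac/data/applogs",
                                    "/Users/mac/tools/datax/bin/datax.py"):
    print(problem)
```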
Build and deploy
1. Install Maven locally (installation details are omitted here)
2. Run mvn package -Dmaven.test.skip=true
3. After a successful build, copy datax-admin-2.1.1.jar and datax-executor-2.1.1.jar from the target directories of the datax-admin and datax-executor modules to the directory of your choice
4. Start datax-admin-2.1.1.jar and datax-executor-2.1.1.jar separately
5. Example start commands:
nohup java -Xmx1024M -Xms1024M -Xmn448M -XX:MaxMetaspaceSize=192M -XX:MetaspaceSize=192M -jar datax-admin-2.1.1.jar &
nohup java -Xmx1024M -Xms1024M -Xmn448M -XX:MaxMetaspaceSize=192M -XX:MetaspaceSize=192M -jar datax-executor-2.1.1.jar &
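Both start commands share the same JVM options, so they can be generated from one place. The Python 3 sketch below is one possible way to do that, assuming the jar names from the steps above; java_command and start_in_background are hypothetical helpers, and the log file name is up to you.

```python
import subprocess

# JVM options taken from the example start commands.
JVM_OPTS = [
    "-Xmx1024M", "-Xms1024M", "-Xmn448M",
    "-XX:MaxMetaspaceSize=192M", "-XX:MetaspaceSize=192M",
]


def java_command(jar):
    """Build the argument list for launching one jar."""
    return ["java"] + JVM_OPTS + ["-jar", jar]


def start_in_background(jar, log_file):
    """Start the jar in its own session, appending its output to log_file.

    Roughly comparable to `nohup java ... &`; returns the Popen handle.
    """
    with open(log_file, "ab") as log:
        return subprocess.Popen(
            java_command(jar),
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # detach from the calling terminal (POSIX)
        )


# Show the command lines without launching anything.
for jar in ("datax-admin-2.1.1.jar", "datax-executor-2.1.1.jar"):
    print(" ".join(java_command(jar)))
```

Calling start_in_background("datax-admin-2.1.1.jar", "admin.out") from the directory that holds the jar would then launch the admin service. Start the admin first, since the executor registers with the address configured under datax.job.admin.addresses.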
Access test
http://127.0.0.1:8080/index.html#/dashboard
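The same check can be scripted instead of opening a browser. This is a minimal Python 3 sketch that only tests whether the admin address answers an HTTP request; is_up is a hypothetical helper. The URL omits the #/dashboard fragment because fragments are handled by the browser and never sent to the server.

```python
import urllib.request


def is_up(url, timeout=3.0):
    """Return True if `url` answers with a 2xx/3xx status, else False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


print("datax-admin reachable:", is_up("http://127.0.0.1:8080/index.html"))
```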