Installing DataX
Tool deployment
System Requirements
Linux
Apache Maven 3.x (Compile DataX)
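A quick way to check these requirements before installing is sketched below. The function name check_prerequisites is hypothetical, and the executable names mvn, java, and python are assumptions: only Linux and Maven are listed above, while java and python are included because DataX runs on the JVM and is launched through datax.py.

```python
import platform
import shutil


def check_prerequisites(tools=("mvn", "java", "python")):
    """Report whether the OS is Linux and which of the given tools are on PATH."""
    report = {"linux": platform.system() == "Linux"}
    for tool in tools:
        # shutil.which returns the executable's path, or None if it is not on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name:8s} {'ok' if ok else 'missing'}")
```

Maven only matters when compiling DataX from source; the prebuilt tarball used in the next step does not need it.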
Method: download the DataX package directly from the DataX download address:
cd /data/datax
wget http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz
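After the download finishes, extract the archive to a local directory. Sync jobs are then started through datax.py in the bin directory:
Self-check script: [root@mysqls:datax]$ bin/datax.py job/job.json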
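The extract-and-run step can also be scripted. The sketch below is only an illustration: the helper names extract_datax and selfcheck_command are hypothetical, and it assumes the tarball unpacks into a top-level datax directory (consistent with the paths in the log further down, but not verified here). Which Python interpreter datax.py expects depends on the DataX build, so the interpreter is left as a parameter.

```python
import os
import sys
import tarfile


def extract_datax(archive_path, dest_dir):
    """Unpack the DataX tarball into dest_dir and return the DataX home directory."""
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest_dir)
    # Assumption: the archive contains a single top-level "datax" directory.
    return os.path.join(dest_dir, "datax")


def selfcheck_command(datax_home, python=sys.executable):
    """Build the argument list equivalent to `bin/datax.py job/job.json`."""
    return [
        python,
        os.path.join(datax_home, "bin", "datax.py"),
        os.path.join(datax_home, "job", "job.json"),
    ]
```

The returned list can be passed to subprocess.run(..., check=True) to launch the self-check and fail loudly on a non-zero exit code.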
job.json is a test JSON file:
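{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "streamreader",
                    "parameter": {
                        "column": [
                            {
                                "type": "string",
                                "value": "0101"
                            }
                        ],
                        "sliceRecordCount": "10"
                    }
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {
                        "encoding": "UTF-8",
                        "print": true
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": "2"
            }
        }
    }
}
sliceRecordCount: the number of records each channel generates from this column definition (10 here).
channel: the number of concurrent channels that run this content (2 here), so the job emits 2 × 10 = 20 records in total.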
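Because JSON has no comment syntax and does not allow trailing commas, one option is to generate the job file rather than edit it by hand. The sketch below rebuilds the same test job as a Python dict and serializes it with json.dumps. The helper names build_stream_job and expected_records are hypothetical, and the channel × sliceRecordCount rule is an assumption that matches the 20 records reported in the log for this streamreader job; it is not taken from the DataX documentation.

```python
import json


def build_stream_job(value="0101", slice_record_count=10, channels=2):
    """Return the streamreader -> streamwriter test job as a plain dict."""
    return {
        "job": {
            "content": [
                {
                    "reader": {
                        "name": "streamreader",
                        "parameter": {
                            "column": [{"type": "string", "value": value}],
                            "sliceRecordCount": str(slice_record_count),
                        },
                    },
                    "writer": {
                        "name": "streamwriter",
                        "parameter": {"encoding": "UTF-8", "print": True},
                    },
                }
            ],
            "setting": {"speed": {"channel": str(channels)}},
        }
    }


def expected_records(job):
    """Assumed total output: number of channels times records per channel."""
    reader = job["job"]["content"][0]["reader"]
    per_channel = int(reader["parameter"]["sliceRecordCount"])
    channels = int(job["job"]["setting"]["speed"]["channel"])
    return channels * per_channel


job = build_stream_job()
# json.dumps always emits strict JSON, so the result parses without errors.
text = json.dumps(job, indent=4)
```

Writing text to job/job.json gives a file equivalent to the one above, and expected_records(job) gives the record count to compare against the job summary.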
Output:
0101
0101
0101
0101
0101
0101
0101
0101
0101
0101
0101
0101
0101
0101
0101
0101
0101
0101
0101
0101
2020-02-02 12:00:42.272 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[107]ms
2020-02-02 12:00:42.272 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[1] is successed, used[119]ms
2020-02-02 12:00:42.273 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2020-02-02 12:00:52.148 [job-0] INFO StandAloneJobContainerCommunicator - Total 20 records, 80 bytes | Speed 8B/s, 2 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2020-02-02 12:00:52.148 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2020-02-02 12:00:52.149 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] do post work.
2020-02-02 12:00:52.150 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] do post work.
2020-02-02 12:00:52.150 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2020-02-02 12:00:52.152 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /data/dataX/datax/hook
2020-02-02 12:00:52.154 [job-0] INFO JobContainer -
[total cpu info] =>
    averageCpu | maxDeltaCpu | minDeltaCpu
    -1.00%     | -1.00%      | -1.00%
[total gc info] =>
    NAME         | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
    PS MarkSweep | 0            | 0               | 0               | 0.000s      | 0.000s         | 0.000s
    PS Scavenge  | 0            | 0               | 0               | 0.000s      | 0.000s         | 0.000s
2020-02-02 12:00:52.154 [job-0] INFO JobContainer - PerfTrace not enable!
2020-02-02 12:00:52.155 [job-0] INFO StandAloneJobContainerCommunicator - Total 20 records, 80 bytes | Speed 8B/s, 2 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2020-02-02 12:00:52.156 [job-0] INFO JobContainer -
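任务启动时刻 (job start time)            : 2020-02-02 12:00:42
任务结束时刻 (job end time)              : 2020-02-02 12:00:52
任务总计耗时 (total elapsed time)        : 10s
任务平均流量 (average throughput)        : 8B/s
记录写入速度 (record write speed)        : 2rec/s
读出记录总数 (total records read)        : 20
读写失败总数 (total read/write failures) : 0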
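The figures in this summary can be cross-checked against each other. The sketch below parses the "Total ... records" line copied from the log above and redoes the arithmetic. The regular expression and the name parse_summary are illustrative assumptions; the pattern has only been matched against this one log line and may need adjusting for other DataX versions.

```python
import re

# Matches the record, byte, and error counts in the communicator summary line.
SUMMARY_RE = re.compile(
    r"Total (?P<records>\d+) records, (?P<bytes>\d+) bytes \|.*?"
    r"Error (?P<error_records>\d+) records, (?P<error_bytes>\d+) bytes"
)

SAMPLE = (
    "2020-02-02 12:00:52.155 [job-0] INFO StandAloneJobContainerCommunicator - "
    "Total 20 records, 80 bytes | Speed 8B/s, 2 records/s | "
    "Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | "
    "All Task WaitReaderTime 0.000s | Percentage 100.00%"
)


def parse_summary(line):
    """Return the counters from a summary line as ints, or None if absent."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


stats = parse_summary(SAMPLE)
elapsed_seconds = 10  # total elapsed time reported in the summary above

bytes_per_second = stats["bytes"] / elapsed_seconds       # 80 / 10
records_per_second = stats["records"] / elapsed_seconds   # 20 / 10
bytes_per_record = stats["bytes"] / stats["records"]      # 80 / 20
```

For this run the numbers agree with the log: 80 bytes over 10 s is 8 B/s, 20 records over 10 s is 2 records/s, and 4 bytes per record equals the length of the value "0101".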