https://github.com/alibaba/canal
Canal: Alibaba's component for incremental subscription & consumption of MySQL binlog. Aliyun DRDS ( https://www.aliyun.com/product/drds ), Alibaba TDDL secondary indexes, and small-table replication are powered by canal, as is Aliyun Data Lake Analytics ( https://www.aliyun.com/product/datalakeanalytics ).
Contents
Getting Started with Canal
Docker Quick Start
https://github.com/alibaba/canal/wiki/Docker-QuickStart
MySQL Requirements
https://github.com/alibaba/canal/wiki/AdminGuide
a. The current open-source version of canal supports MySQL 5.7 and below (Alibaba runs MySQL 5.7.13, 5.6.10, 5.5.18, and 5.1.40/48 internally). Note: MySQL 4.x has not been rigorously tested, but should in theory be compatible.
b. Canal works by reading the MySQL binlog, so binlog writing must be enabled on the MySQL server, with the binlog format set to ROW:
```ini
[mysqld]
log-bin=mysql-bin   # adding this line enables the binlog
binlog-format=ROW   # use row mode
server_id=1         # required for MySQL replication; must not clash with canal's slaveId
```
After restarting the database, quickly verify that the `my.cnf` settings took effect:
```
mysql> show variables like 'binlog_format';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+

mysql> show variables like 'log_bin';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin       | ON    |
+---------------+-------+
```
If the my.cnf settings do not take effect, see:
https://stackoverflow.com/questions/38288646/changes-to-my-cnf-dont-take-effect-ubuntu-16-04-mysql-5-6
https://stackoverflow.com/questions/52736162/set-binlog-for-mysql-5-6-ubuntu16-4
c. Canal works by posing as a MySQL slave, so the account it connects with must have the relevant replication privileges:
```sql
CREATE USER canal IDENTIFIED BY 'canal';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
-- GRANT ALL PRIVILEGES ON *.* TO 'canal'@'%';
FLUSH PRIVILEGES;
```
For an existing account, check its privileges with:

```sql
SHOW GRANTS FOR 'canal';
```
Running the Canal Container
Note: be explicit about the port number and the configuration file paths.
Incorrect way to run it
What does running it incorrectly mean? It means starting the container with some essential configuration missing: the container does run, but the result differs from what we would expect.
```shell
# Run the canal container
docker run -p 11111:11111 --name canal -d canal/canal-server
# With a configuration file mapped in
docker run -p 11111:11111 --name canal -v /usr/local/canal/canal.properties:/home/admin/canal-server/conf/canal.properties -d canal/canal-server
```
canal.properties configuration: pay attention to the id and instance settings (in practice, the Docker image supports a single instance without modifying canal.properties):
```properties
#################################################
#########       common argument       ###########
#################################################
#canal.manager.jdbc.url=jdbc:mysql://127.0.0.1:3306/canal_manager?useUnicode=true&characterEncoding=UTF-8
#canal.manager.jdbc.username=root
#canal.manager.jdbc.password=121212
# id must be unique (default id = 1)
canal.id = 10002
canal.ip =
canal.port = 11111
canal.metrics.pull.port = 11112
canal.zkServers =
# flush data to zk
canal.zookeeper.flush.period = 1000
canal.withoutNetty = false
# tcp, kafka, RocketMQ
canal.serverMode = tcp
# flush meta cursor/parse position to file
canal.file.data.dir = ${canal.conf.dir}
canal.file.flush.period = 1000
## memory store RingBuffer size, should be Math.pow(2,n)
canal.instance.memory.buffer.size = 16384
## memory store RingBuffer used memory unit size, default 1kb
canal.instance.memory.buffer.memunit = 1024
## memory store gets mode used MEMSIZE or ITEMSIZE
canal.instance.memory.batch.mode = MEMSIZE
canal.instance.memory.rawEntry = true

## detecting config
canal.instance.detecting.enable = false
#canal.instance.detecting.sql = insert into retl.xdual values(1,now()) on duplicate key update x=now()
canal.instance.detecting.sql = select 1
canal.instance.detecting.interval.time = 3
canal.instance.detecting.retry.threshold = 3
canal.instance.detecting.heartbeatHaEnable = false

# support maximum transaction size, more than the size of the transaction will be cut into multiple transactions delivery
canal.instance.transaction.size = 1024
# mysql fallback connected to new master should fallback times
canal.instance.fallbackIntervalInSeconds = 60

# network config
canal.instance.network.receiveBufferSize = 16384
canal.instance.network.sendBufferSize = 16384
canal.instance.network.soTimeout = 30

# binlog filter config
canal.instance.filter.druid.ddl = true
canal.instance.filter.query.dcl = false
canal.instance.filter.query.dml = false
canal.instance.filter.query.ddl = false
canal.instance.filter.table.error = false
canal.instance.filter.rows = false
canal.instance.filter.transaction.entry = false

# binlog format/image check
canal.instance.binlog.format = ROW,STATEMENT,MIXED
canal.instance.binlog.image = FULL,MINIMAL,NOBLOB

# binlog ddl isolation
canal.instance.get.ddl.isolation = false

# parallel parser config
canal.instance.parser.parallel = true
## concurrent thread number, default 60% available processors, suggest not to exceed Runtime.getRuntime().availableProcessors()
#canal.instance.parser.parallelThreadSize = 16
## disruptor ringbuffer size, must be power of 2
canal.instance.parser.parallelBufferSize = 256

# table meta tsdb info
canal.instance.tsdb.enable = true
canal.instance.tsdb.dir = ${canal.file.data.dir:../conf}/${canal.instance.destination:}
canal.instance.tsdb.url = jdbc:h2:${canal.instance.tsdb.dir}/h2;CACHE_SIZE=1000;MODE=MYSQL;
canal.instance.tsdb.dbUsername = canal
canal.instance.tsdb.dbPassword = canal
# dump snapshot interval, default 24 hour
canal.instance.tsdb.snapshot.interval = 24
# purge snapshot expire, default 360 hour (15 days)
canal.instance.tsdb.snapshot.expire = 360

# aliyun ak/sk, support rds/mq
canal.aliyun.accessKey =
canal.aliyun.secretKey =

#################################################
#########       destinations        ############
#################################################
canal.destinations = example
# conf root dir
canal.conf.dir = ../conf
# auto scan instance dir add/remove and start/stop instance
canal.auto.scan = true
canal.auto.scan.interval = 5

canal.instance.tsdb.spring.xml = classpath:spring/tsdb/h2-tsdb.xml
#canal.instance.tsdb.spring.xml = classpath:spring/tsdb/mysql-tsdb.xml

canal.instance.global.mode = spring
canal.instance.global.lazy = false
#canal.instance.global.manager.address = 127.0.0.1:1099
#canal.instance.global.spring.xml = classpath:spring/memory-instance.xml
canal.instance.global.spring.xml = classpath:spring/file-instance.xml
#canal.instance.global.spring.xml = classpath:spring/default-instance.xml

#################################################
#########            MQ             ############
#################################################
canal.mq.servers = 127.0.0.1:6667
canal.mq.retries = 0
canal.mq.batchSize = 16384
canal.mq.maxRequestSize = 1048576
canal.mq.lingerMs = 100
canal.mq.bufferMemory = 33554432
canal.mq.canalBatchSize = 50
canal.mq.canalGetTimeout = 100
canal.mq.flatMessage = true
canal.mq.compressionType = none
canal.mq.acks = all
# use transaction for kafka flatMessage batch produce
canal.mq.transaction = false
#canal.mq.properties. =
```
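Two of the values above are easy to get wrong: canal.id must be unique across servers, and the ring-buffer sizes must be powers of two (the comments say "should be Math.pow(2,n)" and "must be power of 2"). The power-of-two rule can be sketched as follows; this helper is purely illustrative and is not part of canal itself:

```java
// Minimal sketch of the power-of-two constraint on canal's ring-buffer sizes,
// e.g. canal.instance.memory.buffer.size = 16384 and
// canal.instance.parser.parallelBufferSize = 256. Illustrative only.
public class BufferSizeCheck {

    // A positive integer is a power of two iff exactly one bit is set.
    public static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    public static void main(String[] args) {
        System.out.println("16384 -> " + isPowerOfTwo(16384)); // memory buffer size above
        System.out.println("256   -> " + isPowerOfTwo(256));   // parser buffer size above
        System.out.println("1000  -> " + isPowerOfTwo(1000));  // not a valid size
    }
}
```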
Correct way to run it, option 1
```shell
# With the instance configuration file mapped in
docker run -p 11111:11111 --name canal -v /usr/local/canal/example/instance.properties:/home/admin/canal-server/conf/example/instance.properties -d canal/canal-server
```
In instance.properties, you only need to set the address of the MySQL container instance: `canal.instance.master.address=172.17.0.4:3306`
```properties
#################################################
## mysql serverId , v1.0.26+ will autoGen
# canal.instance.mysql.slaveId=0

# enable gtid use true/false
canal.instance.gtidon=false

# position info
canal.instance.master.address=172.17.0.4:3306
canal.instance.master.journal.name=
canal.instance.master.position=
canal.instance.master.timestamp=
canal.instance.master.gtid=

# rds oss binlog
canal.instance.rds.accesskey=
canal.instance.rds.secretkey=
canal.instance.rds.instanceId=

# table meta tsdb info
canal.instance.tsdb.enable=true
#canal.instance.tsdb.url=jdbc:mysql://127.0.0.1:3306/canal_tsdb
#canal.instance.tsdb.dbUsername=canal
#canal.instance.tsdb.dbPassword=canal

#canal.instance.standby.address =
#canal.instance.standby.journal.name =
#canal.instance.standby.position =
#canal.instance.standby.timestamp =
#canal.instance.standby.gtid=

# username/password
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
canal.instance.connectionCharset = UTF-8
# enable druid Decrypt database password
canal.instance.enableDruid=false
#canal.instance.pwdPublicKey=MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBALK4BUxdDltRRE5/zXpVEVPUgunvscYFtEip3pmLlhrWpacX7y7GCMo2/JM6LeHmiiNdH1FWgGCpUfircSwlWKUCAwEAAQ==

# table regex
canal.instance.filter.regex=.*\\..*
# table black regex
canal.instance.filter.black.regex=

# mq config
canal.mq.topic=example
# dynamic topic route by schema or table regex
#canal.mq.dynamicTopic=mytest1.user,mytest2\\..*,.*\\..*
canal.mq.partition=0
# hash partition config
#canal.mq.partitionsNum=3
#canal.mq.partitionHash=test.table:id^name,.*\\..*
#################################################
```
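The canal.instance.filter.regex value is matched against fully qualified "schema.table" names. Canal applies it through its own filter machinery; the sketch below only illustrates the regex semantics of the default value, using hypothetical table names. Note that the backslash is doubled in the properties file, so the effective regex is `.*\..*` (any schema, any table):

```java
// Illustration of how a "schema.table" name matches the default filter ".*\\..*".
// This is not canal's actual filter implementation, just the regex semantics.
public class FilterRegexDemo {

    public static boolean matchesFilter(String qualifiedName) {
        // Effective regex .*\..* : requires at least one dot, i.e. a schema qualifier
        return qualifiedName.matches(".*\\..*");
    }

    public static void main(String[] args) {
        System.out.println(matchesFilter("service_db.sys_user")); // qualified name: matches
        System.out.println(matchesFilter("sys_user"));            // no schema: does not match
    }
}
```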
Correct way to run it, option 2
Reference: https://my.oschina.net/amhuman/blog/1941540
The following command runs canal directly and enables incremental binlog consumption:
```shell
docker run --name canal \
  -e canal.instance.master.address=192.168.1.111:3365 \
  -e canal.instance.dbUsername=canal \
  -e canal.instance.dbPassword=canal \
  -p 11111:11111 -d canal/canal-server
```
Opening the Server Port
```shell
# Open the port
firewall-cmd --zone=public --add-port=11111/tcp --permanent
# Reload the firewall
firewall-cmd --reload
```
Java Test Client
Use the sample client provided by canal for testing: https://github.com/alibaba/canal/wiki/ClientExample
```java
package com.alibaba.otter;

import java.net.InetSocketAddress;
import java.util.List;

import com.alibaba.otter.canal.client.CanalConnector;
import com.alibaba.otter.canal.client.CanalConnectors;
import com.alibaba.otter.canal.common.utils.AddressUtils;
import com.alibaba.otter.canal.protocol.CanalEntry.Column;
import com.alibaba.otter.canal.protocol.CanalEntry.Entry;
import com.alibaba.otter.canal.protocol.CanalEntry.EntryType;
import com.alibaba.otter.canal.protocol.CanalEntry.EventType;
import com.alibaba.otter.canal.protocol.CanalEntry.RowChange;
import com.alibaba.otter.canal.protocol.CanalEntry.RowData;
import com.alibaba.otter.canal.protocol.Message;

/**
 * @author PJL
 *
 * @note Captures insert/update/delete events
 * @package com.alibaba.otter
 * @filename SimpleCanalClientExample.java
 * @date 2019-04-16 09:16:24
 */
public class SimpleCanalClientExample {

    public static void main(String args[]) {
        // Create the connection
        CanalConnector connector = CanalConnectors.newSingleConnector(
                new InetSocketAddress("192.168.1.111"/* AddressUtils.getHostIp() */, 11111),
                "example", "canal", "canal");
        int batchSize = 1000;
        int emptyCount = 0;
        try {
            connector.connect();
            connector.subscribe(".*\\..*");
            connector.rollback();
            int totalEmptyCount = 120;
            while (emptyCount < totalEmptyCount) {
                Message message = connector.getWithoutAck(batchSize); // fetch up to batchSize entries
                long batchId = message.getId();
                int size = message.getEntries().size();
                if (batchId == -1 || size == 0) {
                    emptyCount++;
                    System.out.println("empty count : " + emptyCount);
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                    }
                } else {
                    emptyCount = 0;
                    // System.out.printf("message[batchId=%s,size=%s] \n", batchId, size);
                    printEntry(message.getEntries());
                }
                connector.ack(batchId); // acknowledge the batch
                // connector.rollback(batchId); // on failure, roll the batch back
            }
            System.out.println("empty too many times, exit");
        } finally {
            connector.disconnect();
        }
    }

    private static void printEntry(List<Entry> entrys) {
        for (Entry entry : entrys) {
            if (entry.getEntryType() == EntryType.TRANSACTIONBEGIN
                    || entry.getEntryType() == EntryType.TRANSACTIONEND) {
                continue;
            }
            RowChange rowChage = null;
            try {
                rowChage = RowChange.parseFrom(entry.getStoreValue());
            } catch (Exception e) {
                throw new RuntimeException("ERROR ## parser of eromanga-event has an error , data:" + entry.toString(), e);
            }
            EventType eventType = rowChage.getEventType();
            // The header carries the schema name, binlog file/offset, table name and event type
            String logFileName = entry.getHeader().getLogfileName();
            long logFileOffset = entry.getHeader().getLogfileOffset();
            String dbName = entry.getHeader().getSchemaName();
            String tableName = entry.getHeader().getTableName();
            System.out.println(String.format("=======> binlog[%s:%s] , name[%s,%s] , eventType : %s",
                    logFileName, logFileOffset, dbName, tableName, eventType));
            for (RowData rowData : rowChage.getRowDatasList()) {
                if (eventType == EventType.DELETE) { // delete: only the before-image exists
                    printColumn(rowData.getBeforeColumnsList());
                } else if (eventType == EventType.INSERT) { // insert: only the after-image exists
                    printColumn(rowData.getAfterColumnsList());
                } else {
                    System.out.println("-------> before");
                    printColumn(rowData.getBeforeColumnsList());
                    System.out.println("-------> after");
                    printColumn(rowData.getAfterColumnsList());
                }
            }
        }
    }

    private static void printColumn(List<Column> columns) {
        for (Column column : columns) {
            System.out.println(column.getName() + " : " + column.getValue() + "    update=" + column.getUpdated());
        }
    }
}
```
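To compile the client above, the canal client library must be on the classpath. A Maven dependency along these lines should work; the version shown is an assumption, so use the latest release:

```xml
<!-- canal client library; version 1.1.4 is an assumption, pick the latest release -->
<dependency>
    <groupId>com.alibaba.otter</groupId>
    <artifactId>canal.client</artifactId>
    <version>1.1.4</version>
</dependency>
```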
Test results from the incorrectly run container
This test tried to capture database insert and delete events, but nothing was captured; the cause needs further investigation. The client's idle-timeout mechanism then kicks in:
```
empty count : 1
empty count : 2
empty count : 3
...
empty count : 118
empty count : 119
empty count : 120
empty too many times, exit
```
Test results from the correctly run container
```
=======> binlog[mysql-bin.000004:3504] , name[service_db,sys_user] , eventType : INSERT
id : 52    update=true
name : boonya    update=true
age : 28    update=true
empty count : 1 ... empty count : 15
=======> binlog[mysql-bin.000004:3790] , name[service_db,sys_user] , eventType : INSERT
id : 53    update=true
name : boonya    update=true
age : 28    update=true
empty count : 1 ... empty count : 24
=======> binlog[mysql-bin.000004:4076] , name[service_db,sys_user] , eventType : INSERT
id : 54    update=true
name : boonya    update=true
age : 28    update=true
=======> binlog[mysql-bin.000004:4362] , name[service_db,sys_user] , eventType : INSERT
id : 55    update=true
name : boonya    update=true
age : 28    update=true
empty count : 1
=======> binlog[mysql-bin.000004:4648] , name[service_db,sys_user] , eventType : INSERT
id : 56    update=true
name : boonya    update=true
age : 28    update=true
=======> binlog[mysql-bin.000004:4934] , name[service_db,sys_user] , eventType : INSERT
id : 57    update=true
name : boonya    update=true
age : 28    update=true
empty count : 1 ... empty count : 35
=======> binlog[mysql-bin.000004:5220] , name[service_db,sys_user] , eventType : INSERT
id : 58    update=true
name : boonya    update=true
age : 28    update=true
=======> binlog[mysql-bin.000004:5506] , name[service_db,sys_user] , eventType : INSERT
id : 59    update=true
name : boonya    update=true
age : 28    update=true
empty count : 1
=======> binlog[mysql-bin.000004:5792] , name[service_db,sys_user] , eventType : INSERT
id : 60    update=true
name : boonya    update=true
age : 28    update=true
=======> binlog[mysql-bin.000004:6078] , name[service_db,sys_user] , eventType : INSERT
id : 61    update=true
name : boonya    update=true
age : 28    update=true
empty count : 1 ... empty count : 20
```
Database Issues
MySQL 5.7 reports the following error:
```
[Err] 1055 - Expression #1 of ORDER BY clause is not in GROUP BY clause and contains nonaggregated column 'information_schema.PROFILING.SEQ' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
```
Fix:
```sql
show variables like "sql_mode";
set sql_mode='';
set sql_mode='NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES';
```
Canal is only a channel for incremental logs. To build database replication or backup on top of it, you also need Otter, which defines Channels, Pipelines, Nodes (machine nodes), source and target data sources, database table mappings, master/standby configuration, and more to implement incremental database backup.