0. summary
1. 下载和安装
2. 参数
3. 导出示例
4. 如何实现一致性的多个线程导出
5. 原理参考
1. 下载和安装
### 下载 ####
git clone https://github.com/maxbube/mydumper
#### 安装需要的包 ####
yum -y install cmake
yum -y install glib2-devel
yum -y install pcre-devel
yum -y install zlib-devel
yum -y install openssl-devel
yum -y install gcc-c++
#### 安装过程 ####
[root@test-1 mydumper]# cmake .
-- The CXX compiler identification is GNU 4.8.5
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Using mysql-config: /usr/local/mysql/bin/mysql_config
-- Found MySQL: /usr/local/mysql/include, /usr/local/mysql/lib/libmysqlclient.so;/usr/lib64/libpthread.so;/usr/lib64/libm.so;/usr/lib64/librt.so;/usr/lib64/libdl.so
CMake Warning at docs/CMakeLists.txt:9 (message):
Unable to find Sphinx documentation generator
-- ------------------------------------------------
-- MYSQL_CONFIG = /usr/local/mysql/bin/mysql_config
-- CMAKE_INSTALL_PREFIX = /usr/local
-- BUILD_DOCS = ON
-- WITH_BINLOG = OFF
-- RUN_CPPCHECK = OFF
-- Change a values with: cmake -D<Variable>=<Value>
-- ------------------------------------------------
--
-- Configuring done
-- Generating done
-- Build files have been written to: /usr/local/mydumper
[root@test-1 mydumper]# echo $?
0
[root@test-1 mydumper]# make
Scanning dependencies of target mydumper
[ 25%] Building C object CMakeFiles/mydumper.dir/mydumper.c.o
[ 50%] Building C object CMakeFiles/mydumper.dir/server_detect.c.o
[ 75%] Building C object CMakeFiles/mydumper.dir/g_unix_signal.c.o
Linking C executable mydumper
[ 75%] Built target mydumper
Scanning dependencies of target myloader
[100%] Building C object CMakeFiles/myloader.dir/myloader.c.o
Linking C executable myloader
[100%] Built target myloader
#### 添加环境变量 ####
export PATH=/usr/local/mysql/bin:/usr/local/mydumper:$PATH
2. 参数
[root@test-1 ~]# mydumper --help
Usage:
mydumper [OPTION...] multi-threaded MySQL dumping
Help Options:
-?, --help Show help options
Application Options:
-B, --database Database to dump
-T, --tables-list Comma delimited table list to dump (does not exclude regex option)
-o, --outputdir Directory to output files to
-s, --statement-size Attempted size of INSERT statement in bytes, default 1000000
-r, --rows Try to split tables into chunks of this many rows. This option turns off --chunk-filesize
-F, --chunk-filesize Split tables into chunks of this output file size. This value is in MB
-c, --compress Compress output files
-e, --build-empty-files Build dump files even if no data available from table
-x, --regex Regular expression for 'db.table' matching
-i, --ignore-engines Comma delimited list of storage engines to ignore
-m, --no-schemas Do not dump table schemas with the data
-d, --no-data Do not dump table data
-G, --triggers Dump triggers
-E, --events Dump events
-R, --routines Dump stored procedures and functions
-W, --no-views Do not dump VIEWs
-k, --no-locks Do not execute the temporary shared read lock. WARNING: This will cause inconsistent backups
--no-backup-locks Do not use Percona backup locks
--less-locking Minimize locking time on InnoDB tables.
-l, --long-query-guard Set long query timer in seconds, default 60
-K, --kill-long-queries Kill long running queries (instead of aborting)
-D, --daemon Enable daemon mode
-I, --snapshot-interval Interval between each dump snapshot (in minutes), requires --daemon, default 60
-L, --logfile Log file name to use, by default stdout is used
--tz-utc SET TIME_ZONE='+00:00' at top of dump to allow dumping of TIMESTAMP data when a server has data in different time zones or data is being moved between servers with different time zones, defaults to on use --skip-tz-utc to disable.
--skip-tz-utc
--use-savepoints Use savepoints to reduce metadata locking issues, needs SUPER privilege
--success-on-1146 Not increment error count and Warning instead of Critical in case of table doesn't exist
--lock-all-tables Use LOCK TABLE for all, instead of FTWRL
-U, --updated-since Use Update_time to dump only tables updated in the last U days
--trx-consistency-only Transactional consistency only
--complete-insert Use complete INSERT statements that include column names
-h, --host The host to connect to
-u, --user Username with the necessary privileges
-p, --password User password
-P, --port TCP/IP port to connect to
-S, --socket UNIX domain socket file to use for connection
-t, --threads Number of threads to use, default 4
-C, --compress-protocol Use compression on the MySQL connection
-V, --version Show the program version and exit
-v, --verbose Verbosity of output, 0 = silent, 1 = errors, 2 = warnings, 3 = info, default 2
--defaults-file Use a specific defaults file
常用参数:
-D, --daemon Enable daemon mode ---- 开启后台
--trx-consistency-only Transactional consistency only ---- 一致性导出
[root@test-1 mdata]# myloader --help
Usage:
myloader [OPTION...] multi-threaded MySQL loader
Help Options:
-?, --help Show help options
Application Options:
-d, --directory Directory of the dump to import
-q, --queries-per-transaction Number of queries per transaction, default 1000
-o, --overwrite-tables Drop tables if they already exist
-B, --database An alternative database to restore into
-s, --source-db Database to restore
-e, --enable-binlog Enable binary logging of the restore data
-h, --host The host to connect to
-u, --user Username with the necessary privileges
-p, --password User password
-P, --port TCP/IP port to connect to
-S, --socket UNIX domain socket file to use for connection
-t, --threads Number of threads to use, default 4
-C, --compress-protocol Use compression on the MySQL connection
-V, --version Show the program version and exit
-v, --verbose Verbosity of output, 0 = silent, 1 = errors, 2 = warnings, 3 = info, default 2
--defaults-file Use a specific defaults file
也可以用mysql <来导入,但是开不了并行,使用myloader就很简单,指定directory就可以了,默认并行4.
3. 导出示例
[root@test-1 mdata]# mydumper -B mytest
导出会在当前目录下建立export加上时间戳的目录
[root@test-1 mdata]# ls -ld export-*
drwx------. 2 root root 4096 Jan 28 12:16 export-20170128-121638
[root@test-1 mdata]# cd export-20170128-121638/
[root@test-1 export-20170128-121638]# ls
metadata mytest.c-schema.sql mytest.mytest-schema.sql mytest.t1-schema.sql mytest.t-schema.sql mytest.v-schema.sql
mytest.a-schema.sql mytest.c.sql mytest.mytest.sql mytest.test_load-schema.sql mytest.t.sql mytest.z-schema.sql
mytest.a.sql mytest.d-schema.sql mytest-schema-create.sql mytest.test_load.sql mytest.u-schema.sql mytest.z.sql
mytest.b-schema.sql mytest.mytest1-schema.sql mytest.stock-schema.sql mytest.test-schema.sql mytest.UserInfo-schema.sql mytest.zz-schema.sql
mytest.b.sql mytest.mytest1.sql mytest.stock.sql mytest.test.sql mytest.UserInfo.sql mytest.zz.sql
对于每张表,都有表结构和插入数据两个sql文件
[root@test-1 export-20170128-121638]# cat mytest.a-schema.sql
/*!40101 SET NAMES binary*/;
/*!40014 SET FOREIGN_KEY_CHECKS=0*/;
CREATE TABLE `a` (
`a` int(11) NOT NULL,
`b` int(11) DEFAULT NULL,
`c` int(11) DEFAULT NULL,
`d` int(11) DEFAULT NULL,
PRIMARY KEY (`a`),
UNIQUE KEY `b` (`b`),
KEY `c` (`c`),
KEY `d` (`d`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
[root@test-1 export-20170128-121638]# cat mytest.a.sql
/*!40101 SET NAMES binary*/;
/*!40014 SET FOREIGN_KEY_CHECKS=0*/;
/*!40103 SET TIME_ZONE='+00:00' */;
INSERT INTO `a` VALUES
(1,4,1,1),
(2,2,2,2),
(3,5,3,3),
(5,7,8,9);
还有个metadata的文件,记录当前的filename、postition和GTID
[root@test-1 export-20170128-121638]# cat metadata
Started dump at: 2017-01-28 12:16:38
SHOW MASTER STATUS:
Log: bin.000084
Pos: 194
GTID:713a7f7f-6f53-11e6-b7a9-000c29de5d8b:1-284340
Finished dump at: 2017-01-28 12:16:38
这么导的好处有两点:
- 恢复的时候开多个并行导入。
- 通过一个备份里面恢复指定的表,而并不希望全部恢复,mydumper的情况下非常简单。mysql官方的导入导出工具很难办到。
4. 如何实现一致性的多个线程导出
首先对于单表,需要加上-r的参数来支持并行导出,需要有个自增的主键
mydumper -t 4 -r 1000 --trx-consistency-only -B sbtest -T sbtest1
通过通用日志跟踪的方法
(root@localhost) [mysql]> set global general_log = 0;
Query OK, 0 rows affected (0.02 sec)
(root@localhost) [mysql]> truncate table general_log;
Query OK, 0 rows affected (0.01 sec)
(root@localhost) [mysql]> set global general_log = 1;
Query OK, 0 rows affected (0.00 sec)
(root@localhost) [mysql]> set global log_output = 'TABLE';
Query OK, 0 rows affected (0.01 sec)
另一个session执行导出
[root@test-1 mdata]# mydumper -t 4 -r 10000 --trx-consistency-only -B sbtest -T sbtest1
** (mydumper:57204): WARNING **: Using trx_consistency_only, binlog coordinates will not be accurate if you are writing to non transactional tables.
关闭通用日志
(root@localhost) [mysql]> set global general_log = 0;
Query OK, 0 rows affected (0.00 sec)
查看通用日志
(root@localhost) [mysql]> select thread_id,left(argument,120) from general_log order by event_time limit 64;
+-----------+--------------------------------------------------------------------------------------------------------------+
| thread_id | left(argument,120) |
+-----------+--------------------------------------------------------------------------------------------------------------+
| 43 | /*!40101 SET NAMES binary*/ |
| 44 | /*!40101 SET NAMES binary*/ |
| 45 | /*!40101 SET NAMES binary*/ |
| 46 | /*!40101 SET NAMES binary*/ |
| 44 | SELECT /*!40001 SQL_NO_CACHE */ * FROM `sbtest`.`sbtest1` WHERE `id` IS NULL OR (`id` >= 1 AND `id` < 12501) |
| 45 | SELECT /*!40001 SQL_NO_CACHE */ * FROM `sbtest`.`sbtest1` WHERE (`id` >= 12501 AND `id` < 25001) |
| 43 | SELECT /*!40001 SQL_NO_CACHE */ * FROM `sbtest`.`sbtest1` WHERE (`id` >= 25001 AND `id` < 37501) |
| 46 | SELECT /*!40001 SQL_NO_CACHE */ * FROM `sbtest`.`sbtest1` WHERE (`id` >= 37501 AND `id` < 50001) |
| 44 | SHOW CREATE TABLE `sbtest`.`sbtest1` |
| 43 | |
| 44 | |
| 45 | |
| 46 | |
| 42 | root@localhost on sbtest using Socket |
| 42 | SET SESSION wait_timeout = 2147483 |
| 42 | SET SESSION net_write_timeout = 2147483 |
| 42 | SHOW PROCESSLIST |
| 42 | FLUSH TABLES WITH READ LOCK |
| 42 | START TRANSACTION /*!40108 WITH CONSISTENT SNAPSHOT */ | ---- 一个线程去执行实例级别的读锁,其他线程读到的就是一致性的数据。
| 42 | /*!40101 SET NAMES binary*/ |
| 42 | SHOW MASTER STATUS |
| 42 | SHOW SLAVE STATUS |
| 43 | root@localhost on using Socket |
| 43 | SET SESSION wait_timeout = 2147483 |
| 43 | SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ |
| 43 | START TRANSACTION /*!40108 WITH CONSISTENT SNAPSHOT */ |
| 43 | /*!40103 SET TIME_ZONE='+00:00' */ |
| 44 | root@localhost on using Socket |
| 44 | SET SESSION wait_timeout = 2147483 |
| 44 | SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ |
| 44 | START TRANSACTION /*!40108 WITH CONSISTENT SNAPSHOT */ |
| 44 | /*!40103 SET TIME_ZONE='+00:00' */ |
| 45 | root@localhost on using Socket |
| 45 | SET SESSION wait_timeout = 2147483 |
| 45 | SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ |
| 45 | START TRANSACTION /*!40108 WITH CONSISTENT SNAPSHOT */ |
| 45 | /*!40103 SET TIME_ZONE='+00:00' */ |
| 46 | root@localhost on using Socket |
| 46 | SET SESSION wait_timeout = 2147483 |
| 46 | SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ | ---- 有4个线程开启了RR事务隔离级别
| 46 | START TRANSACTION /*!40108 WITH CONSISTENT SNAPSHOT */ | ---- 创建快照读
| 46 | /*!40103 SET TIME_ZONE='+00:00' */ |
| 42 | UNLOCK TABLES /* trx-only */ | ---- 释放了实例级别的读锁
| 42 | sbtest |
| 42 | SHOW TABLE STATUS |
| 42 | SHOW INDEX FROM `sbtest`.`sbtest1` |
| 42 | SELECT /*!40001 SQL_NO_CACHE */ MIN(`id`),MAX(`id`) FROM `sbtest`.`sbtest1` | ---- 对自增列求最大最小值进行一个分片
| 42 | EXPLAIN SELECT `id` FROM `sbtest`.`sbtest1` |
| 42 | SHOW CREATE DATABASE `sbtest` |
| 42 | |
| 21 | set global general_log = 0 |
+-----------+--------------------------------------------------------------------------------------------------------------+
51 rows in set (0.01 sec)
导出来的是多个文件,所以支持单表的并行导入
[root@test-1 export-20170130-000445]# ls -ltr
total 9728
-rw-r--r--. 1 root root 68 Jan 30 00:04 sbtest-schema-create.sql
-rw-r--r--. 1 root root 2476583 Jan 30 00:04 sbtest.sbtest1.00000.sql
-rw-r--r--. 1 root root 419 Jan 30 00:04 sbtest.sbtest1-schema.sql
-rw-r--r--. 1 root root 2487688 Jan 30 00:04 sbtest.sbtest1.00001.sql
-rw-r--r--. 1 root root 2487689 Jan 30 00:04 sbtest.sbtest1.00002.sql
-rw-r--r--. 1 root root 2487687 Jan 30 00:04 sbtest.sbtest1.00003.sql
-rw-r--r--. 1 root root 175 Jan 30 00:04 metadata
5. 原理参考
开源MYSQL多线程逻辑导出工具MYDUMPER原理与改进: