mydumper

【1】mydumper介绍

(1.1)基本概念与优劣

而你的MySQL版本应该更新到5.5版本以上,mydumper作为一个实用工具,能够良好支持多线程工作,可以并行的多线程的从表中读入数据并同时写到不同的文件里这使得它在处理速度方面快于传统的mysqldump。

mysqldumper其特征之一是在处理过程中需要对列表加以锁定,如果我们需要在工作时段执行备份工作,那么会引起DML阻塞不过现在的MySQL一般都是主从结构,备份也大部分在从上进行,所以锁的问题可以不用考虑因此mydumper能更优秀快速的完成备份任务

mydumper特性

1.多线程备份

2因为是多线程逻辑备份,备份后会生成多个备份文件

3.备份时对MyISAM表施加FTWRL(FLUSH TABLES WITH READ LOCK),会阻塞DML语句

4.保证备份数据的一致性

5.支持文件压缩

6.支持导出binlog

7.支持多线程恢复

8.支持以守护进程模式工作,定时快照和连续二进制日志

9.支持将备份文件切块

mydumper优点

1.轻量级C语言写的

2.执行速度比mysqldump快10倍

3.事务性和非事务性表一致的快照(适用于0.22以上版本)

4.快速的文件压缩

5.支持导出binlog

6.多线程恢复(适用于0.2.1以上版本)

7.以守护进程的工作方式,定时快照和连续二进制日志(适用于0.5.0以上版本)

8.开源(GNUGPLv3)

mydumper缺点

  mydumper的多线程备份是基于表的,当只有一张表或99张是小表,张是超级大表,mydumper不如mysqldump,甚至更慢

对于大表情况注意使用参数-r

(1.2)备份机制原理

对于MySQL官方提供的逻辑备份工具 mysqldump , mydumper最大的特点就是可以采用多线程并行备份,大大提高了数据导出的速度。

这里对mydumper的工作原理做个分析,看一下mydumper如何巧妙的利用Innodb引擎提供的MVCC版本控制的功能,实现多线程并发获取一致性数据。

这里一致性数据指的是在某个时间点,导出的数据与导出的Binlog文件信息相匹配,如果导出了多张表的数据,这些不同表之间的数据都是同一个时间点的数据。

在mydumper进行备份的时候,由一个主线程以及多个备份线程完成。

其主线程的流程是:

  连接数据库

FLUSH TABLES WITH READ LOCK 将脏页刷新到磁盘并获得只读锁

START TRANSACTION /!40108 WITH CONSISTENT SNAPSHOT / 开启事物并获取一致性快照

SHOW MASTER STATUS 获得binlog信息

创建子线程并连接数据库

为子线程分配任务并push到队列中

UNLOCK TABLES / FTWRL / 释放锁

子线程的主要流程是:

  连接数据库

  SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE

  START TRANSACTION /!40108 WITH CONSISTENT SNAPSHOT /

从队列中pop任务并执行

(1.3)图解备份原理

 

  mydumper

 

 

 

【2】mydumper 使用

(2.1)下载安装

源码安装:

[root@mysql-150 ~]# yum -y  install glib2-devel mysql-devel zlib-devel pcre-devel zlib gcc-c++ gcc cmake
[root@mysql-150 ~]# wget https://launchpad.net/mydumper/0.9/0.9.1/+download/mydumper-0.9.1.tar.gz
[root@mysql-150 ~]# tar zxf mydumper-0.9.1.tar.gz
[root@mysql-150 ~]# cd mydumper-0.9.1/
[root@mysql-150 mydumper-0.9.1]# pwd
/root/mydumper-0.9.1
[root@mysql-150 mydumper-0.9.1]# cmake .
[root@mysql-150 mydumper-0.9.1]# make
[root@mysql-150 mydumper-0.9.1]# make install
#安装完成后生成两个二进制文件mydumper和myloader位于/usr/local/bin目录下
[root@mysql-150 bin]# ls /usr/local/bin/
mydumper  myloader

rpm安装:

https://launchpad.net/mydumper
 
[root@localhost ~]# ls
anaconda-ks.cfg   mydumper-0.9.5-2.el7.x86_64.rpm
[root@localhost ~]# rpm -ivh mydumper-0.9.5-2.el7.x86_64.rpm 
Preparing...                          ################################# [100%]
Updating / installing...
   1:mydumper-0.9.5-2                 ################################# [100%]
[root@localhost ~]# which mydumper
/usr/bin/mydumper
[root@localhost ~]# which myloader
/usr/bin/myloade

(2.2)mydumper参数介绍

-B, --database              要备份的数据库,不指定则备份所有库
-T, --tables-list           需要备份的表,名字用逗号隔开
-o, --outputdir             备份文件输出的目录
-s, --statement-size        生成的insert语句的字节数,默认1000000
-r, --rows                  将表按行分块时,指定的块行数,指定这个选项会关闭 --chunk-filesize
-F, --chunk-filesize        将表按大小分块时,指定的块大小,单位是 MB
-c, --compress              压缩输出文件
-e, --build-empty-files     如果表数据是空,还是产生一个空文件(默认无数据则只有表结构文件)
-x, --regex                 是同正则表达式匹配 'db.table'
-i, --ignore-engines        忽略的存储引擎,用都厚分割
-m, --no-schemas            不备份表结构
-k, --no-locks              不使用临时共享只读锁,使用这个选项会造成数据不一致
--less-locking              减少对InnoDB表的锁施加时间(这种模式的机制下文详解)
-l, --long-query-guard      设定阻塞备份的长查询超时时间,单位是秒,默认是60秒(超时后默认mydumper将会退出)
--kill-long-queries         杀掉长查询 (不退出)
-b, --binlogs               导出binlog
-D, --daemon                启用守护进程模式,守护进程模式以某个间隔不间断对数据库进行备份
-I, --snapshot-interval     dump快照间隔时间,默认60s,需要在daemon模式下
-L, --logfile               使用的日志文件名(mydumper所产生的日志), 默认使用标准输出
--tz-utc                    跨时区是使用的选项,不解释了
--skip-tz-utc               同上
--use-savepoints            使用savepoints来减少采集metadata所造成的锁时间,需要 SUPER 权限
--success-on-1146           Not increment error count and Warning instead of Critical in case of table doesn't exist
-h, --host                  连接的主机名
-u, --user                  备份所使用的用户
-p, --password              密码
-P, --port                  端口
-S, --socket                使用socket通信时的socket文件
-t, --threads               开启的备份线程数,默认是4
-C, --compress-protocol     压缩与mysql通信的数据
-V, --version               显示版本号
-v, --verbose               输出信息模式, 0 = silent, 1 = errors, 2 = warnings, 3 = info, 默认为 2

(2.3)myloader参数介绍

-d, --directory                   备份文件的文件夹
-q, --queries-per-transaction     每次事物执行的查询数量,默认是1000
-o, --overwrite-tables            如果要恢复的表存在,则先drop掉该表,使用该参数,需要备份时候要备份表结构
-B, --database                    需要还原的数据库
-e, --enable-binlog               启用还原数据的二进制日志
-h, --host                        主机
-u, --user                        还原的用户
-p, --password                    密码
-P, --port                        端口
-S, --socket                      socket文件
-t, --threads                     还原所使用的线程数,默认是4
-C, --compress-protocol           压缩协议
-V, --version                     显示版本
-v, --verbose                     输出模式, 0 = silent, 1 = errors, 2 = warnings, 3 = info, 默认为2

 

【3】备份恢复实践

(3.1)mydumper备份

如下信息所示:

  我备份test库的所有表,用了多线程,默认4线程

[root@localhost /data/dba]$ mydumper -u root -p b123456 -B test  -o /data/dba/test/  --verbose=3
** Message: 17:41:08.385: Connected to a MySQL server
** Message: 17:41:08.387: Started dump at: 2021-10-08 17:41:08

** Message: 17:41:08.387: Written master status
** Message: 17:41:08.388: Thread 1 connected using MySQL connection ID 1618810
** Message: 17:41:08.389: Thread 2 connected using MySQL connection ID 1618811
** Message: 17:41:08.389: Thread 3 connected using MySQL connection ID 1618812
** Message: 17:41:08.390: Thread 4 connected using MySQL connection ID 1618813
** Message: 17:41:08.398: Thread 3 dumping data for `test`.`game_users`
** Message: 17:41:08.398: Thread 4 dumping data for `test`.`log_test2`
** Message: 17:41:08.398: Thread 2 dumping data for `test`.`agent_cost`
** Message: 17:41:08.398: Thread 1 dumping data for `test`.`t2`
** Message: 17:41:08.399: Thread 3 dumping data for `test`.`agent_cost_game_cnt`
** Message: 17:41:08.399: Thread 1 dumping data for `test`.`agent_money_stock_hourly`
** Message: 17:41:08.399: Empty table test.log_test2
** Message: 17:41:08.399: Thread 4 dumping data for `test`.`game_users2`
** Message: 17:41:08.399: Non-InnoDB dump complete, unlocking tables
** Message: 17:41:08.399: Thread 3 dumping data for `test`.`goinc_variables_manu`
** Message: 17:41:08.399: Thread 2 dumping data for `test`.`log`
** Message: 17:41:08.399: Empty table test.goinc_variables_manu
** Message: 17:41:08.400: Thread 3 dumping data for `test`.`log_test`
** Message: 17:41:08.400: Thread 1 dumping data for `test`.`log_reversecard`
** Message: 17:41:08.400: Thread 4 dumping data for `test`.`priv`
** Message: 17:41:08.400: Empty table test.log
** Message: 17:41:08.400: Empty table test.log_test
** Message: 17:41:08.400: Empty table test.log_reversecard
** Message: 17:41:08.400: Thread 2 dumping data for `test`.`q1`
** Message: 17:41:08.400: Thread 3 dumping data for `test`.`q2`
** Message: 17:41:08.400: Thread 4 dumping data for `test`.`round_bills`
** Message: 17:41:08.400: Thread 1 dumping data for `test`.`sbtest1`
** Message: 17:41:08.400: Thread 3 dumping data for `test`.`sbtest2`
** Message: 17:41:08.400: Thread 2 dumping data for `test`.`sbtest3`
** Message: 17:41:08.400: Thread 4 dumping data for `test`.`sbtest4`
** Message: 17:41:10.283: Thread 4 dumping data for `test`.`sbtest5`
** Message: 17:41:10.566: Thread 3 dumping data for `test`.`t1`
** Message: 17:41:10.566: Thread 3 dumping data for `test`.`tc`
** Message: 17:41:10.566: Empty table test.tc
** Message: 17:41:10.567: Thread 3 dumping data for `test`.`tea_house_bills`
** Message: 17:41:10.567: Thread 3 dumping data for `test`.`tea_house_bills_bak`
** Message: 17:41:10.567: Empty table test.tea_house_bills_bak
** Message: 17:41:10.567: Thread 3 dumping data for `test`.`tea_prop_rich_log`
** Message: 17:41:10.568: Thread 3 dumping data for `test`.`test1`
** Message: 17:41:10.568: Thread 3 dumping data for `test`.`test111`
** Message: 17:41:10.568: Thread 3 dumping data for `test`.`test2`
** Message: 17:41:10.568: Empty table test.test2
** Message: 17:41:10.568: Thread 3 dumping data for `test`.`test20210820`
** Message: 17:41:10.568: Thread 3 dumping data for `test`.`test3`
** Message: 17:41:10.569: Empty table test.test3
** Message: 17:41:10.569: Thread 3 dumping data for `test`.`tmp1`
** Message: 17:41:10.569: Thread 3 dumping data for `test`.`tmp2`
** Message: 17:41:10.569: Thread 3 dumping schema for `test`.`agent_cost`
** Message: 17:41:10.569: Thread 3 dumping schema for `test`.`agent_cost_game_cnt`
** Message: 17:41:10.570: Thread 3 dumping schema for `test`.`agent_money_stock_hourly`
** Message: 17:41:10.570: Thread 3 dumping schema for `test`.`game_users`
** Message: 17:41:10.570: Thread 3 dumping schema for `test`.`game_users2`
** Message: 17:41:10.571: Thread 3 dumping schema for `test`.`goinc_variables_manu`
** Message: 17:41:10.571: Thread 3 dumping schema for `test`.`log`
** Message: 17:41:10.670: Thread 3 dumping schema for `test`.`log_reversecard`
** Message: 17:41:10.671: Thread 3 dumping schema for `test`.`log_test`
** Message: 17:41:10.671: Thread 3 dumping schema for `test`.`log_test2`
** Message: 17:41:10.671: Thread 3 dumping schema for `test`.`priv`
** Message: 17:41:10.671: Thread 3 dumping schema for `test`.`q1`
** Message: 17:41:10.671: Thread 3 dumping schema for `test`.`q2`
** Message: 17:41:10.672: Thread 3 dumping schema for `test`.`round_bills`
** Message: 17:41:10.672: Thread 3 dumping schema for `test`.`sbtest1`
** Message: 17:41:10.672: Thread 3 dumping schema for `test`.`sbtest2`
** Message: 17:41:10.672: Thread 3 dumping schema for `test`.`sbtest3`
** Message: 17:41:10.672: Thread 3 dumping schema for `test`.`sbtest4`
** Message: 17:41:10.673: Thread 3 dumping schema for `test`.`sbtest5`
** Message: 17:41:10.673: Thread 3 dumping schema for `test`.`t1`
** Message: 17:41:10.673: Thread 3 dumping schema for `test`.`t2`
** Message: 17:41:10.673: Thread 3 dumping schema for `test`.`tc`
** Message: 17:41:10.673: Thread 3 dumping schema for `test`.`tea_house_bills`
** Message: 17:41:10.673: Thread 3 dumping schema for `test`.`tea_house_bills_bak`
** Message: 17:41:10.674: Thread 3 dumping schema for `test`.`tea_prop_rich_log`
** Message: 17:41:10.674: Thread 3 dumping schema for `test`.`test1`
** Message: 17:41:10.674: Thread 3 dumping schema for `test`.`test111`
** Message: 17:41:10.674: Thread 3 dumping schema for `test`.`test2`
** Message: 17:41:10.674: Thread 3 dumping schema for `test`.`test20210820`
** Message: 17:41:10.675: Thread 3 dumping schema for `test`.`test3`
** Message: 17:41:10.675: Thread 3 dumping schema for `test`.`tmp1`
** Message: 17:41:10.675: Thread 3 dumping schema for `test`.`tmp2`
** Message: 17:41:10.675: Thread 3 dumping view for `test`.`users2`
** Message: 17:41:10.676: Thread 3 dumping view for `test`.`users3`
** Message: 17:41:10.676: Thread 3 dumping view for `test`.`users4`
** Message: 17:41:10.677: Thread 3 dumping view for `test`.`users5`
** Message: 17:41:10.730: Thread 2 dumping view for `test`.`v_a1`
** Message: 17:41:10.751: Thread 3 dumping view for `test`.`v_a2`
** Message: 17:41:10.752: Thread 3 shutting down
** Message: 17:41:10.754: Thread 1 shutting down
** Message: 17:41:10.787: Thread 2 shutting down
** Message: 17:41:11.333: Thread 4 shutting down
** Message: 17:41:11.333: Finished dump at: 2021-10-08 17:41:11

与 mysqlpump 比较

[root@localhost /data/dba/test]$ time mysqlpump --default-character-set=utf8mb4 --set-gtid-purged=ON --host=localhost  -B test default-parallelism=4>/data/dba/test_mysqlpump.sql
mysqlpump: [WARNING] (1429) Unable to connect to foreign data source: Access denied for user 'leoaprd'@'10.20.53.12' (using password: 
mysqlpump: [WARNING] (1429) Unable to connect to foreign data source: Access denied for user 'leoaprd'@'10.20.53.12' (using password: 
Dump progress: 1/3 tables, 0/300 rows
Dump progress: 14/32 tables, 833250/4932647 rows
Dump progress: 14/32 tables, 1721000/4932647 rows
Dump progress: 16/32 tables, 2625500/4932647 rows
Dump progress: 16/32 tables, 3438750/4932647 rows
Dump progress: 31/32 tables, 4152750/4932647 rows
Dump progress: 31/32 tables, 4591000/4932647 rows
Dump completed in 7672 milliseconds

比它都快了一倍,yyds啊!

(3.2)mydumper 备份出来的文件

  mydumper

 

metadata:里面放的是 show master status

*-schema.sql:里面放的是创建语句,表结构、视图结构、存储过程、函数等

*表名.sql:直接就是 Insert into 语句

 

 

【参考文档】

https://www.cnblogs.com/keme/p/11679446.html

上一篇:色差计算度量


下一篇:Tomcat源码分析环境搭建