Pt-Archiver 使用说明:
工具介绍:
Pt-Archiver 是一款数据库单表数据导出,清理,归档工具
官方文档地址:
https://www.percona.com/doc/percona-toolkit/LATEST/pt-archiver.html
常用配置说明:
-where 'id<3000' 设置操作条件
--limit 10000 每次取1000行数据给pt-archive处理
--txn-size 1000 设置1000行为一个事务提交一次
--progress 5000 每处理5000行输出一次处理信息
--statistics 结束的时候给出统计信息:开始的时间点,结束的时间点,查询的行数,归档的行数,删除的行数,以及各个阶段消耗的总的时间和比例,便于以此进行优化。只要不加上--quiet,默认情况下pt-archive都会输出执行过程的
--charset=UTF8 指定字符集为UTF8,字符集需要对应当前库的字符集来操作
--no-delete 表示不删除原来的数据,注意:如果不指定此参数,所有处理完成后,都会清理原表中的数据
--bulk-delete 批量删除source上的旧数据
--bulk-insert 批量插入数据到dest主机 (看dest的general log发现它是通过在dest主机上LOAD
DATA LOCAL INFILE插入数据的)
--dry-run 模拟执行
--source 源数据
--dest 目标数据
注意事项:
--source --dest 后的DSN之间不能空格出现,否则会出错。
--where条件的值,有字符串的,要用引号括起来。
DSN 是指 --source --dest 的参数
工作原理:
181214 10:55:55 36 Connect root@10.10.11.16 on platform-unitive
36 Query set autocommit=0
36 Query /*!40101 SET NAMES "UTF8"*/
36 Query SHOW VARIABLES LIKE 'wait\_timeout'
36 Query SET SESSION wait_timeout=10
36 Query SELECT @@SQL_MODE
36 Query SET @@SQL_QUOTE_SHOW_CREATE = 1/*!40101, @@SQL_MODE='NO_AUTO_VALUE_ON_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'*/
36 Query SELECT version()
36 Query SHOW VARIABLES LIKE 'version%'
36 Query SHOW ENGINES
36 Query SHOW VARIABLES LIKE 'innodb_version'
36 Query show variables like 'innodb_rollback_on_timeout'
36 Query /*!40101 SET @OLD_SQL_MODE := @@SQL_MODE, @@SQL_MODE := '', @OLD_QUOTE := @@SQL_QUOTE_SHOW_CREATE, @@SQL_QUOTE_SHOW_CREATE := 1 */
36 Query USE `platform-unitive`
36 Query SHOW CREATE TABLE `platform-unitive`.`u_push_message`
36 Query /*!40101 SET @@SQL_MODE := @OLD_SQL_MODE, @@SQL_QUOTE_SHOW_CREATE := @OLD_QUOTE */
36 Query SELECT CONCAT(/*!40100 @@session.character_set_connection, */ "")
37 Connect root@10.10.11.16 on platform-unitive
37 Query set autocommit=0
37 Query /*!40101 SET NAMES "UTF8"*/
37 Query SHOW VARIABLES LIKE 'wait\_timeout'
37 Query SET SESSION wait_timeout=10
37 Query SELECT @@SQL_MODE
37 Query SET @@SQL_QUOTE_SHOW_CREATE = 1/*!40101, @@SQL_MODE='NO_AUTO_VALUE_ON_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'*/
37 Query SELECT version()
37 Query SHOW VARIABLES LIKE 'version%'
37 Query SHOW ENGINES
37 Query SHOW VARIABLES LIKE 'innodb_version'
37 Query show variables like 'innodb_rollback_on_timeout'
37 Query /*!40101 SET @OLD_SQL_MODE := @@SQL_MODE, @@SQL_MODE := '', @OLD_QUOTE := @@SQL_QUOTE_SHOW_CREATE, @@SQL_QUOTE_SHOW_CREATE := 1 */
37 Query USE `platform-unitive`
37 Query SHOW CREATE TABLE `platform-unitive`.`u_push_message_2018`
37 Query /*!40101 SET @@SQL_MODE := @OLD_SQL_MODE, @@SQL_QUOTE_SHOW_CREATE := @OLD_QUOTE */
37 Query SELECT CONCAT(/*!40100 @@session.character_set_connection, */ "")
36 Query SHOW VARIABLES LIKE 'wsrep_on'
36 Query SHOW VARIABLES LIKE 'wsrep_on'
36 Query SHOW VARIABLES LIKE 'version%'
36 Query SHOW ENGINES
36 Query SHOW VARIABLES LIKE 'innodb_version'
36 Query SELECT MAX(`id`) FROM `platform-unitive`.`u_push_message`
36 Query SELECT CONCAT(@@hostname, @@port)
37 Query SELECT CONCAT(@@hostname, @@port)
36 Query SELECT /*!40001 SQL_NO_CACHE */ `id`,`app_codes`,`device_type`,`msg_subject`,`msg_body`,`push_time`,`offline_days`,`expire_time`,`push_type`,`device_ids`,`group_codes`,`uids`,`click_action`,`click_uri`,`msg_content`,`message_roam_type`,`msg_type`,`priority`,`msg_batch_no`,`need_time_scope`,`query_id`,`status`,`push_count`,`creater`,`create_time`,`modifier`,`modify_time`,`version`,`schedule_flag`,`notice_url`,`msg_channel`,`is_need_pull` FROM `platform-unitive`.`u_push_message` FORCE INDEX(`PRIMARY`) WHERE (id <= 640040) AND (`id` < '23928907') ORDER BY `id` LIMIT 2
36 Query SELECT 'pt-archiver keepalive'
37 Query commit
36 Query commit
37 Query LOAD DATA LOCAL INFILE '/tmp/3xbPFCYChYpt-archiver' INTO TABLE `platform-unitive`.`u_push_message_2018`CHARACTER SET UTF8(`id`,`app_codes`,`device_type`,`msg_subject`,`msg_body`,`push_time`,`offline_days`,`expire_time`,`push_type`,`device_ids`,`group_codes`,`uids`,`click_action`,`click_uri`,`msg_content`,`message_roam_type`,`msg_type`,`priority`,`msg_batch_no`,`need_time_scope`,`query_id`,`status`,`push_count`,`creater`,`create_time`,`modifier`,`modify_time`,`version`,`schedule_flag`,`notice_url`,`msg_channel`,`is_need_pull`)
36 Query SELECT 'pt-archiver keepalive'
36 Query DELETE FROM `platform-unitive`.`u_push_message` WHERE (((`id` >= '640021'))) AND (((`id` <= '640022'))) AND (id <= 640040) LIMIT 2
37 Query commit
36 Query commit
使用场景一:
u_push_message 表按照推送时间90天前数据 归档到 u_push_message_2018
1,创建u_push_message_2018 表结构与u_push_message 完全一致,不需要额外索引
2,编写执行脚本
pt-archiver --source h=10.10.11.16,P=3309,u=root,p='kws@1234',D=platform-unitive,t=u_push_message --dest h=10.10.11.16,P=3309,u=root,p='kws@1234',D=platform-unitive,t=u_push_message_2018 --charset=UTF8 --where 'push_time <= 1536854400000' --progress 10000 --limit=10000 --txn-size 10000 --bulk-insert --bulk-delete --statistics --purge
3,增加--dry-run 去服务器检查执行
pt-archiver --source h=10.10.11.16,P=3309,u=root,p='kws@1234',D=platform-unitive,t=u_push_message --dest h=10.10.11.16,P=3309,u=root,p='kws@1234',D=platform-unitive,t=u_push_message_2018 --charset=UTF8 --where 'push_time <= 1536854400000' --progress 10000 --limit=10000 --txn-size 10000 --bulk-insert --bulk-delete --statistics --purge --dry-run
4.检查输出日志是否符合预期:
SELECT /*!40001 SQL_NO_CACHE */ `id`,`app_codes`,`device_type`,`msg_subject`,`msg_body`,`push_time`,`offline_days`,`expire_time`,`push_type`,`device_ids`,`group_codes`,`uids`,`click_action`,`click_uri`,`msg_content`,`message_roam_type`,`msg_type`,`priority`,`msg_batch_no`,`need_time_scope`,`query_id`,`status`,`push_count`,`creater`,`create_time`,`modifier`,`modify_time`,`version`,`schedule_flag`,`notice_url`,`msg_channel`,`is_need_pull` FROM `platform-unitive`.`u_push_message` FORCE INDEX(`PRIMARY`) WHERE (push_time <= 1536854400000) AND (`id` < '23928907') ORDER BY `id` LIMIT 10000
SELECT /*!40001 SQL_NO_CACHE */ `id`,`app_codes`,`device_type`,`msg_subject`,`msg_body`,`push_time`,`offline_days`,`expire_time`,`push_type`,`device_ids`,`group_codes`,`uids`,`click_action`,`click_uri`,`msg_content`,`message_roam_type`,`msg_type`,`priority`,`msg_batch_no`,`need_time_scope`,`query_id`,`status`,`push_count`,`creater`,`create_time`,`modifier`,`modify_time`,`version`,`schedule_flag`,`notice_url`,`msg_channel`,`is_need_pull` FROM `platform-unitive`.`u_push_message` FORCE INDEX(`PRIMARY`) WHERE (push_time <= 1536854400000) AND (`id` < '23928907') AND ((`id` >= ?)) ORDER BY `id` LIMIT 10000
DELETE FROM `platform-unitive`.`u_push_message` WHERE (((`id` >= ?))) AND (((`id` <= ?))) AND (push_time <= 1536854400000) LIMIT 10000
LOAD DATA LOCAL INFILE ? INTO TABLE `platform-unitive`.`u_push_message_2018`CHARACTER SET UTF8(`id`,`app_codes`,`device_type`,`msg_subject`,`msg_body`,`push_time`,`offline_days`,`expire_time`,`push_type`,`device_ids`,`group_codes`,`uids`,`click_action`,`click_uri`,`msg_content`,`message_roam_type`,`msg_type`,`priority`,`msg_batch_no`,`need_time_scope`,`query_id`,`status`,`push_count`,`creater`,`create_time`,`modifier`,`modify_time`,`version`,`schedule_flag`,`notice_url`,`msg_channel`,`is_need_pull`)
5,编写执行计划脚本每天3点执行: 替换 --where 'push_time <= 1536854400000' 中的push_time时间
Shell 脚本示例:
#/bin/bash
#时间变量
month=`date -d "10 day ago" "+%Y-%m-%d 00:00:00"`
lock_time=`date -d "60 day ago" "+%Y-%m-%d"`
timeStamp=`date -d "$month" +%s`
currentTimeStamp=$((timeStamp*1000))
#数据库变量
source_data=platform-unitive
source_table=u_push_message
out_data=platform-unitive
out_table=u_push_message_2018
out_dir=/tmp/"$out_data"
out_time=`date "+%Y_%m_%d"`
out_file="$out_dir"/"$out_table"_archiver_log_"$out_time"
#判断文件夹是否存在,不存在就创建
if [ ! -d "$out_dir" ]; then
/bin/mkdir "$out_dir" -p
fi
#数据归档,判断进程是否存在,不存在就开始,存在就不开始.
count=`ps -ef|grep "/usr/bin/pt-archiver"|grep "t=$out_table,"|grep -v grep|wc -l`
echo $count
if [ "$count" -eq 0 ]
then
echo "start process....."
/usr/bin/pt-archiver --source h=10.10.11.16,P=3309,u=root,p='kws@1234',D=$source_data,t=$source_table --dest h=10.10.11.16,P=3309,u=root,p='kws@1234',t=$out_table,D=$out_data --charset=UTF8 --where "push_time <= '$currentTimeStamp'" --progress 10000 --limit=10000 --txn-size 10000 --bulk-insert --bulk-delete --purge --statistics >>"$out_file" 2>&1 &
else
echo "runing....."
fi