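Official documentation: https://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html#Ongoing+Data+Directory+Cleanup
Monitoring raised a disk alert on a database node in the ClickHouse cluster. Investigation showed that the directory /chdata/zookeeper/data/version-2/ was very large and contained log.* and snapshot.* files. The ZooKeeper configuration file /chdata/zookeeper/apache-zookeeper-3.7.1-bin/conf/zoo.cfg shows that this path is configured as dataDir. The relevant passages from the official ZooKeeper documentation are quoted after the command output below.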
root@CHDB001:~# du -sh /chdata/zookeeper/data/*
4.0K /chdata/zookeeper/data/myid
56G /chdata/zookeeper/data/version-2
4.0K /chdata/zookeeper/data/zookeeper_server.pid
root@CHDB001:~# ll /chdata/zookeeper/data/version-2/ -rt |tail -4
-rw-r--r-- 1 root root 3106008 Oct 21 03:52 snapshot.31ea7df7f
-rw-r--r-- 1 root root 67108880 Oct 21 04:40 log.31ea7df81
-rw-r--r-- 1 root root 3009585 Oct 21 04:40 snapshot.31ea8b8cf
-rw-r--r-- 1 root root 67108880 Oct 21 05:43 log.31ea8b8d1
root@CHDB001:~# cat /chdata/zookeeper/apache-zookeeper-3.7.1-bin/conf/zoo.cfg |grep dataDir
dataDir=/chdata/zookeeper/data
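To see which file type accounts for the space, the totals can be split by file name prefix. The following is a minimal sketch, assuming GNU find and awk are available; the function name zk_usage_by_kind is hypothetical and not part of ZooKeeper.

```shell
#!/bin/bash
# Hypothetical helper: print the file count and total size in bytes of the
# log.* and snapshot.* files directly under the given directory.
zk_usage_by_kind() {
    local dir="$1" kind
    for kind in log snapshot; do
        # -printf '%s\n' (GNU find) prints each file's size in bytes.
        find "$dir" -maxdepth 1 -type f -name "${kind}.*" -printf '%s\n' |
            awk -v k="$kind" '{ n++; s += $1 } END { printf "%s files=%d bytes=%.0f\n", k, n, s }'
    done
}
```

On the node above it could be called as: zk_usage_by_kind /chdata/zookeeper/data/version-2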
The ZooKeeper Data Directory contains files which are a persistent copy of the znodes stored by a particular serving ensemble. These are the snapshot and transactional log files. As changes are made to the znodes these changes are appended to a transaction log, occasionally, when a log grows large, a snapshot of the current state of all znodes will be written to the filesystem. This snapshot supercedes all previous logs.
A ZooKeeper server will not remove old snapshots and log files, this is the responsibility of the operator. Every serving environment is different and therefore the requirements of managing these files may differ from install to install (backup for example).
dataDir: the location where ZooKeeper will store the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database.
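The phrase "unless specified otherwise" refers to the separate dataLogDir parameter: when it is set, transaction logs are written there instead of under dataDir. The following is a minimal sketch for checking where each file type lives, assuming a plain key=value zoo.cfg; the function names zk_cfg_value and zk_data_locations are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the value of an uncommented key in a
# key=value config file; prints nothing if the key is absent.
zk_cfg_value() {
    local cfg="$1" key="$2"
    # Match "key=value" at the start of a line and keep the last match.
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$cfg" | tail -n 1
}

# Hypothetical helper: print where snapshots and transaction logs are stored.
zk_data_locations() {
    local cfg="$1" data log
    data=$(zk_cfg_value "$cfg" dataDir)
    log=$(zk_cfg_value "$cfg" dataLogDir)
    echo "snapshots: ${data}"
    # Transaction logs fall back to dataDir when dataLogDir is not set.
    echo "txn logs: ${log:-$data}"
}
```

In the configuration shown here only dataDir is set, so both lines would report /chdata/zookeeper/data.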
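The official documentation makes clear that old log and snapshot files under the directory set by dataDir can be deleted, provided the most recent snapshot and the transaction logs written after it are kept; the server never removes them itself. A commented-out section of the ZooKeeper configuration file, shown below, further confirms that these files can be purged: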
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
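The commented lines above belong to ZooKeeper's built-in autopurge feature, which is an alternative to deleting files by hand. As a sketch of enabling it, the helper below uncomments the two settings in a zoo.cfg-style file. The function name enable_autopurge is hypothetical, the values 3 and 1 are simply those in the sample comments, and it is assumed that ZooKeeper must be restarted before the change takes effect.

```shell
#!/bin/bash
# Hypothetical helper: uncomment the two autopurge settings in a
# zoo.cfg-style file. The result is built in a temporary file and then
# copied back, so the original file keeps its owner and permissions.
enable_autopurge() {
    local cfg="$1" tmp
    tmp=$(mktemp) || return 1
    # Strip the leading "#" only from the two autopurge.* keys.
    sed -e 's/^#[[:space:]]*\(autopurge\.snapRetainCount=\)/\1/' \
        -e 's/^#[[:space:]]*\(autopurge\.purgeInterval=\)/\1/' \
        "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    rm -f "$tmp"
}
```

It is worth trying this on a copy of zoo.cfg first and comparing the result with diff before touching the live file.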
A crontab entry was configured to run the cleanup daily at 18:00:
0 18 * * * sh /root/script/clickhouse_removelogsnapshot.sh
The script run by the crontab entry, /root/script/clickhouse_removelogsnapshot.sh, is as follows:
#!/bin/bash
# Remove ZooKeeper transaction logs and snapshots older than 90 days,
# recording each removed file in a dated log.
datadir=/chdata/zookeeper/data/version-2
logdir=/root/clickhouse_removelogsnapshot
logfile="$logdir/clickhouse_removelogsnapshot_$(date +%Y%m%d).log"
mkdir -p "$logdir"
{
    echo " "
    echo "++++++++++++++++++++++++++++++++++++++++++++++++"
    echo " Begin to remove..."
} >>"$logfile"
for pattern in 'log.*' 'snapshot.*'; do
    # List the matching files in the log first, then delete them.
    find "$datadir" -type f -name "$pattern" -ctime +90 -exec ls -l {} \; >>"$logfile"
    find "$datadir" -type f -name "$pattern" -ctime +90 -exec rm -f {} \;
done
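An age-only rule can remove every snapshot if the server has not written a new one within 90 days. The following is a minimal sketch of a more conservative selection that only lists candidates and always spares the newest few snapshots. It assumes GNU find and sort, uses modification time rather than -ctime, and the function name list_purgeable_snapshots is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: list snapshot.* files older than <days> days,
# but never the <keep> most recently modified ones. Nothing is deleted.
list_purgeable_snapshots() {
    local dir="$1" days="$2" keep="$3"
    # Sort newest first by mtime, skip the first <keep> entries,
    # then apply the age test to each remaining file.
    find "$dir" -maxdepth 1 -type f -name 'snapshot.*' -printf '%T@ %p\n' |
        sort -rn |
        tail -n +"$((keep + 1))" |
        cut -d' ' -f2- |
        while IFS= read -r f; do
            find "$f" -mtime +"$days" -print
        done
}
```

For example, list_purgeable_snapshots /chdata/zookeeper/data/version-2 90 3 prints the snapshots that are both older than 90 days and not among the three newest. Transaction logs need the same care: logs written after the oldest retained snapshot are still needed for recovery.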