Persistence
RDB
What is it?
Redis DataBase
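At configured intervals, RDB writes a point-in-time snapshot of the in-memory dataset to disk. On recovery, the snapshot file is read straight back into memory.
Redis forks a separate child process (not a thread) to do the persistence. The child first writes the data to a temporary file; once the dump is complete, that temporary file replaces the previously persisted file.
Throughout this process the main process performs no disk I/O, which protects performance.
If you need to restore a large dataset and are not very sensitive to the completeness of the restored data, RDB is more efficient than AOF. The drawback of RDB is that data written after the last snapshot may be lost.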
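The fork, temporary file, and replace sequence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `background_snapshot` and the JSON dump format are made up for the example and are not Redis's actual implementation or RDB file format. It relies on `os.fork`, so it runs on POSIX systems only.

```python
import json
import os
import tempfile


def background_snapshot(data: dict, path: str) -> int:
    """Fork a child that dumps `data` to `path` via a temp file; return the child's pid.

    Hypothetical helper for illustration; JSON stands in for the real RDB format.
    """
    pid = os.fork()  # POSIX only
    if pid == 0:
        # Child: works on the dataset exactly as it was at fork time.
        status = 1
        try:
            directory = os.path.dirname(os.path.abspath(path))
            fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-")
            with os.fdopen(fd, "w") as tmp:
                json.dump(data, tmp)
                tmp.flush()
                os.fsync(tmp.fileno())  # make sure the bytes are on disk
            os.replace(tmp_path, path)  # atomically swap in the new snapshot
            status = 0
        finally:
            os._exit(status)  # never fall back into the parent's code path
    return pid  # Parent: returns immediately and does no disk I/O itself


if __name__ == "__main__":
    dataset = {"user:1": "alice"}
    target = os.path.join(tempfile.mkdtemp(), "dump.json")
    child = background_snapshot(dataset, target)
    dataset["user:2"] = "bob"  # written after the fork, so not in the snapshot
    os.waitpid(child, 0)
    with open(target) as f:
        print(json.load(f))  # {'user:1': 'alice'}
```

Because the old file is only replaced once the new one is complete, a crash in the middle of a dump leaves the previous snapshot intact.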
Fork
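Fork creates a copy of the current process. All data in the new process (variables, environment variables, program counter, and so on) has the same values as in the original process, but it is an entirely new process that runs as a child of the original.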
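A small Python sketch can make the "same values, separate process" point concrete. The helper name `fork_and_mutate` is hypothetical, and the example assumes a POSIX system where `os.fork` is available.

```python
import os


def fork_and_mutate(value: int) -> tuple[int, int]:
    """Return (value as seen by the child after it adds 1, value still held by the parent)."""
    read_end, write_end = os.pipe()
    pid = os.fork()  # POSIX only
    if pid == 0:
        # Child: starts with the same variable values as the parent.
        try:
            os.close(read_end)
            value += 1  # this change lives only in the child's memory
            os.write(write_end, str(value).encode())
        finally:
            os._exit(0)
    # Parent: read what the child computed, then reap the child.
    os.close(write_end)
    with os.fdopen(read_end, "rb") as pipe:
        child_value = int(pipe.read().decode())
    os.waitpid(pid, 0)
    return child_value, value


if __name__ == "__main__":
    print(fork_and_mutate(41))  # (42, 41)
```

The child inherits `value` as 41, but its increment never reaches the parent, which is why a snapshot child can read a frozen view of the data while the parent keeps serving writes.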
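Saving a snapshot
FLUSHALL and SHUTDOWN both write dump.rdb immediately when save points are configured (after FLUSHALL the file is empty, because the data has just been cleared).
- save
SAVE does nothing but save: it blocks the server, so no writes are accepted until it finishes.
- bgsave
Redis performs the snapshot asynchronously in the background and keeps responding to client requests while it runs. The LASTSAVE command returns the time of the last successful snapshot.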
Recovery
Move the backup file (dump.rdb) into the directory set by dir in the Redis configuration, then start the server.
Disabling snapshots
redis-cli config set save ""
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# like in the following example:
#
save ""
Advantages
Official documentation
RDB advantages
- RDB is a very compact single-file point-in-time representation of your Redis data. RDB files are perfect for backups. For instance you may want to archive your RDB files every hour for the latest 24 hours, and to save an RDB snapshot every day for 30 days. This allows you to easily restore different versions of the data set in case of disasters.
- RDB is very good for disaster recovery, being a single compact file that can be transferred to far data centers, or onto Amazon S3 (possibly encrypted).
- RDB maximizes Redis performances since the only work the Redis parent process needs to do in order to persist is forking a child that will do all the rest. The parent instance will never perform disk I/O or alike.
- RDB allows faster restarts with big datasets compared to AOF.
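Summary
- Well suited to large-scale data recovery where the requirements on data completeness and consistency are not strict.
Disadvantages
Official documentation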
RDB disadvantages
- RDB is NOT good if you need to minimize the chance of data loss in case Redis stops working (for example after a power outage). You can configure different save points where an RDB is produced (for instance after at least five minutes and 100 writes against the data set, but you can have multiple save points). However you'll usually create an RDB snapshot every five minutes or more, so in case of Redis stopping working without a correct shutdown for any reason you should be prepared to lose the latest minutes of data.
- RDB needs to fork() often in order to persist on disk using a child process. Fork() can be time consuming if the dataset is big, and may result in Redis to stop serving clients for some millisecond or even for one second if the dataset is very big and the CPU performance not great. AOF also needs to fork() but you can tune how often you want to rewrite your logs without any trade-off on durability.
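Summary
- Backups are taken at intervals, so if Redis goes down unexpectedly, every change made after the last snapshot is lost.
- At fork time the child gets a logical copy of the in-memory data. The pages are shared copy-on-write, so extra memory is consumed only for pages modified while the snapshot runs, but under a heavy write load usage can approach 2x and should be planned for.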
AOF
What is it?
Append Only File
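AOF records every write operation as a log: each write command Redis executes is recorded (reads are not), and the file may only be appended to, never modified in place. On startup Redis reads the file and re-executes the write commands from first to last to rebuild the data.
AOF data is saved to the appendonly.aof file.
When both an AOF file and an RDB file exist (and AOF is enabled), Redis loads the AOF file first on startup.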
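The append-then-replay idea can be sketched with a toy key-value store. Everything here is an assumption made for illustration: the class name `TinyAofStore`, the one-command-per-line text format (`SET key value`, `DEL key`), and the restriction that keys contain no spaces and values no newlines. The real AOF uses the Redis protocol format, not this one.

```python
import os
import tempfile


class TinyAofStore:
    """Toy key-value store that logs every write to an append-only file."""

    def __init__(self, aof_path: str) -> None:
        self.aof_path = aof_path
        self.data: dict[str, str] = {}
        self._replay()  # rebuild state from the log before accepting commands
        self._log = open(aof_path, "a", encoding="utf-8")  # append only

    def _replay(self) -> None:
        """Re-execute the logged write commands from first to last."""
        if not os.path.exists(self.aof_path):
            return
        with open(self.aof_path, encoding="utf-8") as log:
            for line in log:
                parts = line.rstrip("\n").split(" ", 2)
                if parts[0] == "SET" and len(parts) == 3:
                    self.data[parts[1]] = parts[2]
                elif parts[0] == "DEL" and len(parts) == 2:
                    self.data.pop(parts[1], None)

    def _append(self, command: str) -> None:
        self._log.write(command + "\n")
        self._log.flush()

    def set(self, key: str, value: str) -> None:
        self.data[key] = value
        self._append(f"SET {key} {value}")

    def delete(self, key: str) -> None:
        self.data.pop(key, None)
        self._append(f"DEL {key}")

    def get(self, key: str):
        return self.data.get(key)  # reads are never logged

    def close(self) -> None:
        self._log.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "appendonly.aof")
    store = TinyAofStore(path)
    store.set("name", "v1")
    store.set("name", "v2")
    store.set("tmp", "x")
    store.delete("tmp")
    store.close()

    restarted = TinyAofStore(path)  # simulated restart: state comes from the log
    print(restarted.data)  # {'name': 'v2'}
    restarted.close()
```

Note that the log keeps all four commands even though the final state holds a single key, which is exactly why the file grows over time.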
Rewrite
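What is it?
Because AOF works by appending, the file keeps growing. To avoid this, a rewrite mechanism was added: when the AOF file size exceeds the configured threshold, Redis compacts the AOF content, keeping only the minimal set of commands needed to restore the data. A rewrite can also be started manually with the BGREWRITEAOF command.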
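How rewriting works
When the AOF file has grown too large, Redis forks a new process to rewrite it (again writing a temporary file first and renaming it at the end). The child walks the data in its own memory and emits one write command, such as a SET, per record. The rewrite does not read the old AOF file; it writes the entire in-memory database out as commands into a new AOF file.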
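A minimal sketch of the rewrite step, under the same kind of assumptions as any toy model: the function name `rewrite_aof` and the `SET key value` line format are invented for the example, and the fork is left out so the code runs in-process. What it does show is the key property described above: the new file is generated from the in-memory data, not from the old log.

```python
import os
import tempfile


def rewrite_aof(data: dict[str, str], aof_path: str) -> None:
    """Write one SET per key to a temp file, then rename it over the old AOF.

    Hypothetical helper; the old AOF file is never read.
    """
    directory = os.path.dirname(os.path.abspath(aof_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-rewriteaof-")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        for key, value in data.items():
            tmp.write(f"SET {key} {value}\n")  # minimal command for this record
        tmp.flush()
        os.fsync(tmp.fileno())
    os.replace(tmp_path, aof_path)  # atomic rename at the very end


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "appendonly.aof")
    state: dict[str, str] = {}
    with open(path, "w", encoding="utf-8") as log:
        for i in range(1, 101):  # 100 writes to the same key
            state["counter"] = str(i)
            log.write(f"SET counter {i}\n")

    rewrite_aof(state, path)
    with open(path, encoding="utf-8") as log:
        print(log.read(), end="")  # SET counter 100
```

One hundred logged commands collapse into a single line, because only the final value of each key is needed to restore the data.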
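Trigger
Redis records the size of the AOF file after the last rewrite. With the default configuration, a rewrite is triggered when the AOF file has doubled in size since the last rewrite (100% growth) and is larger than 64 MB.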
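The trigger rule can be written as a small predicate. This is a sketch of the condition as described here, not a copy of Redis's source; the function name `should_rewrite` and its parameter names are hypothetical, with defaults mirroring the 100% and 64 MB values.

```python
MB = 1024 * 1024


def should_rewrite(current_size: int, base_size: int,
                   percentage: int = 100, min_size: int = 64 * MB) -> bool:
    """Decide whether an automatic AOF rewrite is due.

    current_size: AOF size now, in bytes.
    base_size:    AOF size right after the last rewrite, in bytes.
    """
    if percentage == 0:
        return False  # a percentage of zero disables automatic rewrites
    base = base_size if base_size > 0 else 1  # avoid dividing by zero
    growth = current_size * 100 // base - 100  # growth since last rewrite, in percent
    return current_size > min_size and growth >= percentage


if __name__ == "__main__":
    print(should_rewrite(130 * MB, 65 * MB))  # True: doubled and above 64 MB
    print(should_rewrite(100 * MB, 65 * MB))  # False: grew by only about 53%
    print(should_rewrite(60 * MB, 10 * MB))   # False: still below the minimum size
```

Both conditions must hold, so the minimum size keeps small files from being rewritten over and over even when they grow quickly in relative terms.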
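Configuration
appendfsync
- always: synchronous persistence. Every data change is written to disk immediately; performance is poor but data integrity is high.
- everysec: the factory default. Asynchronous, syncing once per second; if the server crashes, up to one second of data can be lost.
- no: Redis never calls fsync itself and leaves flushing to the operating system.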
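The difference between the three policies comes down to when fsync is called after a write. The sketch below is a simplified model: `AofWriter` and its `fsync_calls` counter are hypothetical, and the everysec branch syncs inline on the next write rather than from a background thread as Redis does.

```python
import os
import tempfile
import time


class AofWriter:
    """Append commands to a file, calling fsync according to a policy."""

    def __init__(self, path: str, policy: str = "everysec") -> None:
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown appendfsync policy: {policy}")
        self.policy = policy
        self._file = open(path, "ab")
        self._last_fsync = time.monotonic()
        self.fsync_calls = 0  # counter kept only to make the behavior visible

    def append(self, command: bytes) -> None:
        self._file.write(command + b"\n")
        self._file.flush()  # hand the bytes to the OS; not yet guaranteed on disk
        now = time.monotonic()
        due = self.policy == "always" or (
            self.policy == "everysec" and now - self._last_fsync >= 1.0
        )
        if due:
            os.fsync(self._file.fileno())  # force the OS to write to disk
            self._last_fsync = now
            self.fsync_calls += 1
        # policy "no": never fsync here; the OS flushes when it chooses

    def close(self) -> None:
        self._file.close()


if __name__ == "__main__":
    directory = tempfile.mkdtemp()
    for policy in ("always", "everysec", "no"):
        writer = AofWriter(os.path.join(directory, policy + ".aof"), policy)
        for i in range(1000):
            writer.append(b"SET key %d" % i)
        writer.close()
        print(policy, writer.fsync_calls)
```

With always, every append pays for a disk sync; with everysec, at most one sync happens per second no matter how many writes arrive; with no, the count stays at zero.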
no-appendfsync-on-rewrite
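Controls whether appendfsync is still applied while a rewrite is in progress. The default, no, keeps fsync running during rewrites and is the usual choice because it preserves data safety.
Rewrite trigger settings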
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
Advantages
Official documentation
AOF advantages
- Using AOF Redis is much more durable: you can have different fsync policies: no fsync at all, fsync every second, fsync at every query. With the default policy of fsync every second write performances are still great (fsync is performed using a background thread and the main thread will try hard to perform writes when no fsync is in progress.) but you can only lose one second worth of writes.
- The AOF log is an append only log, so there are no seeks, nor corruption problems if there is a power outage. Even if the log ends with an half-written command for some reason (disk full or other reasons) the redis-check-aof tool is able to fix it easily.
- Redis is able to automatically rewrite the AOF in background when it gets too big. The rewrite is completely safe as while Redis continues appending to the old file, a completely new one is produced with the minimal set of operations needed to create the current data set, and once this second file is ready Redis switches the two and starts appending to the new one.
- AOF contains a log of all the operations one after the other in an easy to understand and parse format. You can even easily export an AOF file. For instance even if you flushed everything for an error using a FLUSHALL command, if no rewrite of the log was performed in the meantime you can still save your data set just stopping the server, removing the latest command, and restarting Redis again.
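Summary
- Flexible to configure. With the default fsync-every-second policy, performance is still high and data is well protected (although up to 1 s of data can be lost).
- Because the AOF is an append-only log, the redis-check-aof tool can easily repair it, after which the data can be restored.
- When the AOF file gets too large, the rewrite mechanism produces a new file containing the minimal set of commands needed to restore the current data.
- The AOF file is easy to understand; commands in it can be edited easily before Redis is restarted.
Disadvantages
Official documentation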
AOF disadvantages
- AOF files are usually bigger than the equivalent RDB files for the same dataset.
- AOF can be slower than RDB depending on the exact fsync policy. In general with fsync set to every second performance is still very high, and with fsync disabled it should be exactly as fast as RDB even under high load. Still RDB is able to provide more guarantees about the maximum latency even in the case of an huge write load.
- In the past we experienced rare bugs in specific commands (for instance there was one involving blocking commands like BRPOPLPUSH) causing the AOF produced to not reproduce exactly the same dataset on reloading. These bugs are rare and we have tests in the test suite creating random complex datasets automatically and reloading them to check everything is fine. However, these kind of bugs are almost impossible with RDB persistence. To make this point more clear: the Redis AOF works by incrementally updating an existing state, like MySQL or MongoDB does, while the RDB snapshotting creates everything from scratch again and again, that is conceptually more robust. However - 1) It should be noted that every time the AOF is rewritten by Redis it is recreated from scratch starting from the actual data contained in the data set, making resistance to bugs stronger compared to an always appending AOF file (or one rewritten reading the old AOF instead of reading the data in memory). 2) We have never had a single report from users about an AOF corruption that was detected in the real world.
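Summary
- An AOF file is usually much larger than the RDB file for the same data, and recovery is slower than with RDB.
- AOF runs slower than RDB; with fsync disabled, its efficiency is the same as RDB's.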
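Which one?
If Redis is used purely as a cache, persistence can be left disabled.
If you can tolerate losing a few minutes of data, enabling RDB alone is enough.
The official documentation discourages enabling AOF alone: periodic RDB snapshots are useful for backups and faster restarts, and they guard against potential bugs in the AOF engine.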
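Performance recommendations
Use RDB files only as a fallback backup. It is recommended to persist RDB files only on the Slave, and one backup every 15 minutes is enough: keep only the save 900 1 rule.
With AOF enabled, the benefit is that even in the worst case only about 1 s of data is lost, and loading the AOF file on startup restores the data. The costs are, first, continuous I/O and, second, the unavoidable blocking at the end of an AOF rewrite, when the new data produced during the rewrite is written to the new file. If disk space allows, raise auto-aof-rewrite-min-size; it can be set to 5 GB or more.
Without AOF, relying on Master-Slave Replication alone for high availability is also workable. It saves a large amount of I/O and reduces the system jitter caused by rewrites. The cost is that if the Master and Slave go down at the same time, more than ten minutes of data are lost, and before recovery you must compare the RDB files on the Master and Slave and load the newer one.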