I previously looked at a cluster built on master-slave replication plus sharding. This post looks at a setup based on Sentinel mode.
1. Download Redis
Version: 3.2.100
2. Build the cluster
(1) Make three copies of the Redis directory, named redis-63791, redis-63792 and redis-63793. 63791 is the master node; the other two are slave nodes.
(2) Changes in the redis-63791 directory:
1》 redis.windows.conf, around line 79: change the port number
port 63791
2》 Create startup.bat
title master_63791
redis-server.exe redis.windows.conf
3》 Create sentinel.conf
port 36379
sentinel monitor mymaster 127.0.0.1 63791 1
sentinel down-after-milliseconds mymaster 5000
sentinel config-epoch mymaster 10
4》 Create startup-sentinel.bat
title sentinel_63791
redis-server.exe sentinel.conf --sentinel
(3) Changes in the redis-63792 directory:
1》 redis.windows.conf, around line 79: change the port and declare the node as a slave
port 63792
slaveof 127.0.0.1 63791
2》 Create startup.bat
title slave_63792
redis-server.exe redis.windows.conf
3》 Create sentinel.conf
port 36380
sentinel monitor mymaster 127.0.0.1 63791 1
sentinel down-after-milliseconds mymaster 5000
sentinel config-epoch mymaster 10
4》 Create startup-sentinel.bat
title sentinel_63792
redis-server.exe sentinel.conf --sentinel
(4) Changes in the redis-63793 directory:
1》 redis.windows.conf, around line 79: change the port and declare the node as a slave
port 63793
slaveof 127.0.0.1 63791
2》 Create startup.bat
title slave_63793
redis-server.exe redis.windows.conf
3》 Create sentinel.conf
port 36381
sentinel monitor mymaster 127.0.0.1 63791 1
sentinel down-after-milliseconds mymaster 5000
sentinel config-epoch mymaster 10
4》 Create startup-sentinel.bat
title sentinel_63793
redis-server.exe sentinel.conf --sentinel
title is a Windows batch command that sets the title of the cmd window.
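Since the three directories differ only in their port numbers and in whether slaveof is present, the files can also be generated by a script. The following is only a rough sketch under my own assumptions: the function names and the redis-overrides.conf file name are made up for illustration (it holds the lines you would still apply to the stock redis.windows.conf by hand), while the ports and directives are the ones used in the steps above.

```python
from pathlib import Path

MASTER_PORT = 63791
# (redis port, sentinel port) pairs taken from the steps above.
NODES = [(63791, 36379), (63792, 36380), (63793, 36381)]


def render_node_files(redis_port, sentinel_port, master_port=MASTER_PORT):
    """Return {filename: content} for one node directory."""
    role = "master" if redis_port == master_port else "slave"
    overrides = [f"port {redis_port}"]
    if role == "slave":
        overrides.append(f"slaveof 127.0.0.1 {master_port}")
    sentinel = [
        f"port {sentinel_port}",
        f"sentinel monitor mymaster 127.0.0.1 {master_port} 1",
        "sentinel down-after-milliseconds mymaster 5000",
        "sentinel config-epoch mymaster 10",
    ]
    return {
        # Hypothetical file: lines to apply on top of the stock redis.windows.conf.
        "redis-overrides.conf": "\n".join(overrides) + "\n",
        "sentinel.conf": "\n".join(sentinel) + "\n",
        "startup.bat": f"title {role}_{redis_port}\nredis-server.exe redis.windows.conf\n",
        "startup-sentinel.bat": f"title sentinel_{redis_port}\nredis-server.exe sentinel.conf --sentinel\n",
    }


def write_all(base_dir):
    """Write the files for all three nodes under base_dir/redis-<port>/."""
    for redis_port, sentinel_port in NODES:
        node_dir = Path(base_dir) / f"redis-{redis_port}"
        node_dir.mkdir(parents=True, exist_ok=True)
        for name, content in render_node_files(redis_port, sentinel_port).items():
            (node_dir / name).write_text(content)
```

Calling write_all with a base directory creates redis-63791, redis-63792 and redis-63793 below it. The Redis binaries still have to be copied into each directory separately.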
3. Start the services
1. First start the three Redis servers with startup.bat
Start 63791, 63792 and 63793 in that order. The 63791 server prints the following log:
[8928] 24 Apr 15:16:54.001 # Server started, Redis version 3.2.100
[8928] 24 Apr 15:16:54.002 * DB loaded from disk: 0.001 seconds
[8928] 24 Apr 15:16:54.002 * The server is now ready to accept connections on port 63791
[8928] 24 Apr 15:17:12.189 * Slave 127.0.0.1:63792 asks for synchronization
[8928] 24 Apr 15:17:12.189 * Full resync requested by slave 127.0.0.1:63792
[8928] 24 Apr 15:17:12.190 * Starting BGSAVE for SYNC with target: disk
[8928] 24 Apr 15:17:12.259 * Background saving started by pid 9400
[8928] 24 Apr 15:17:12.661 # fork operation complete
[8928] 24 Apr 15:17:12.664 * Background saving terminated with success
[8928] 24 Apr 15:17:12.674 * Synchronization with slave 127.0.0.1:63792 succeeded
[8928] 24 Apr 15:17:26.079 * Slave 127.0.0.1:63793 asks for synchronization
[8928] 24 Apr 15:17:26.080 * Full resync requested by slave 127.0.0.1:63793
[8928] 24 Apr 15:17:26.080 * Starting BGSAVE for SYNC with target: disk
[8928] 24 Apr 15:17:26.158 * Background saving started by pid 8224
[8928] 24 Apr 15:17:26.759 # fork operation complete
[8928] 24 Apr 15:17:26.761 * Background saving terminated with success
[8928] 24 Apr 15:17:26.767 * Synchronization with slave 127.0.0.1:63793 succeeded
63792 prints the following log:
[7556] 24 Apr 15:17:12.180 # Server started, Redis version 3.2.100
[7556] 24 Apr 15:17:12.181 * DB loaded from disk: 0.001 seconds
[7556] 24 Apr 15:17:12.182 * The server is now ready to accept connections on port 63792
[7556] 24 Apr 15:17:12.182 * Connecting to MASTER 127.0.0.1:63791
[7556] 24 Apr 15:17:12.184 * MASTER <-> SLAVE sync started
[7556] 24 Apr 15:17:12.185 * Non blocking connect for SYNC fired the event.
[7556] 24 Apr 15:17:12.186 * Master replied to PING, replication can continue...
[7556] 24 Apr 15:17:12.188 * Partial resynchronization not possible (no cached master)
[7556] 24 Apr 15:17:12.260 * Full resync from master: f797667812e495c1e89823ee97f454db6a97b51e:1
[7556] 24 Apr 15:17:12.671 * MASTER <-> SLAVE sync: receiving 380 bytes from master
[7556] 24 Apr 15:17:12.834 * MASTER <-> SLAVE sync: Flushing old data
[7556] 24 Apr 15:17:12.834 * MASTER <-> SLAVE sync: Loading DB in memory
[7556] 24 Apr 15:17:12.835 * MASTER <-> SLAVE sync: Finished with success
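The logs show how a slave synchronizes with the master on startup. The slave connects and pings the master, finds that a partial resynchronization is not possible because it has no cached master, and requests a full resync. The master then runs BGSAVE and sends the resulting snapshot (380 bytes here) to the slave, which flushes its old data and loads the snapshot into memory. The log refers to this handshake as SYNC.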
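To make the "Partial resynchronization not possible (no cached master)" line more concrete, here is a simplified model of how the choice between a partial and a full resync can be thought of. It is not Redis source code: MasterState, choose_sync and their parameters are names I made up, and the only rules modelled are that the slave must remember the same master run ID and that the requested offset must still be inside the master's replication backlog.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MasterState:
    run_id: str
    backlog_first_offset: int  # offset of the first byte still in the backlog
    backlog_histlen: int       # number of bytes currently held in the backlog


def choose_sync(master: MasterState, cached_run_id: Optional[str], next_offset: int) -> str:
    """Return 'partial' or 'full' for a slave asking to resume at next_offset."""
    if cached_run_id is None:
        return "full"  # "no cached master", as in the 63792 log
    if cached_run_id != master.run_id:
        return "full"  # the slave was following a different master
    backlog_end = master.backlog_first_offset + master.backlog_histlen
    if not (master.backlog_first_offset <= next_offset <= backlog_end):
        return "full"  # the requested data is no longer in the backlog
    return "partial"


master = MasterState("f797667812e495c1e89823ee97f454db6a97b51e", 2, 26727)
print(choose_sync(master, None, 1))               # freshly started slave
print(choose_sync(master, master.run_id, 20000))  # briefly disconnected slave
```

A freshly started slave, like 63792 above, always falls into the first branch, which is why both slaves trigger a BGSAVE on the master.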
2. Next, start the three sentinel processes with startup-sentinel.bat
3. Connect a client to each of the three servers and check the replication info
(1) Check 63791
E:\redis-cluster\redis-63791>redis-cli -p 63791
127.0.0.1:63791> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=63792,state=online,offset=26578,lag=0
slave1:ip=127.0.0.1,port=63793,state=online,offset=26728,lag=0
master_repl_offset:26728
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:26727
This node is the master, and both slaves are listed.
(2) Check 63792
E:\redis-cluster\redis-63791>redis-cli -p 63792
127.0.0.1:63792> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:63791
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:36862
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
This node is a slave, and slave_read_only is 1, which means the slave only serves reads.
(3) Check 63793
E:\redis-cluster\redis-63791>redis-cli -p 63793
127.0.0.1:63793> info replication
# Replication
role:slave
master_host:127.0.0.1
master_port:63791
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:53022
slave_priority:100
slave_read_only:1
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0
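The info replication output is plain key:value text, so it is easy to check from a script instead of reading it by eye. Below is a small sketch of a parser; parse_info is my own helper name, not part of any Redis client, and the sample text is a shortened copy of the 63791 output above.

```python
def parse_info(text):
    """Parse INFO output into a dict; slaveN entries become nested dicts."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and section headers such as "# Replication"
        key, _, value = line.partition(":")
        if key.startswith("slave") and "=" in value:
            # e.g. slave0:ip=127.0.0.1,port=63792,state=online,offset=26578,lag=0
            value = dict(pair.split("=", 1) for pair in value.split(","))
        info[key] = value
    return info


SAMPLE = """\
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=63792,state=online,offset=26578,lag=0
slave1:ip=127.0.0.1,port=63793,state=online,offset=26728,lag=0
master_repl_offset:26728
"""

info = parse_info(SAMPLE)
slave_ports = sorted(v["port"] for v in info.values() if isinstance(v, dict))
print(info["role"], slave_ports)  # master ['63792', '63793']
```

All values stay strings, so convert offsets or ports with int() before comparing them numerically.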
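4. Simulate a failure of the 63791 master:
Stop both the sentinel and the redis-server of 63791, then check the replication info again from 63792 and 63793.
(1) The sentinel console shows that a new master has been promoted. The log is as follows: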
[11188] 24 Apr 15:29:15.388 # +sdown master mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:15.388 # +odown master mymaster 127.0.0.1 63791 #quorum 1/1
[11188] 24 Apr 15:29:15.388 # +new-epoch 11
[11188] 24 Apr 15:29:15.388 # +try-failover master mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:15.393 # +vote-for-leader 956fe7dd1644c38db24792da8af749847bd3c247 11
[11188] 24 Apr 15:29:15.401 # 10765c1132658cf10050164915ea69a5b77da4eb voted for 956fe7dd1644c38db24792da8af749847bd3c247 11
[11188] 24 Apr 15:29:15.493 # +elected-leader master mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:15.493 # +failover-state-select-slave master mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:15.555 # +selected-slave slave 127.0.0.1:63792 127.0.0.1 63792 @ mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:15.555 * +failover-state-send-slaveof-noone slave 127.0.0.1:63792 127.0.0.1 63792 @ mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:15.623 * +failover-state-wait-promotion slave 127.0.0.1:63792 127.0.0.1 63792 @ mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:16.466 # +promoted-slave slave 127.0.0.1:63792 127.0.0.1 63792 @ mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:16.467 # +failover-state-reconf-slaves master mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:16.524 * +slave-reconf-sent slave 127.0.0.1:63793 127.0.0.1 63793 @ mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:17.482 * +slave-reconf-inprog slave 127.0.0.1:63793 127.0.0.1 63793 @ mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:18.513 * +slave-reconf-done slave 127.0.0.1:63793 127.0.0.1 63793 @ mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:18.583 # +failover-end master mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:18.583 # +switch-master mymaster 127.0.0.1 63791 127.0.0.1 63792
[11188] 24 Apr 15:29:18.585 * +slave slave 127.0.0.1:63793 127.0.0.1 63793 @ mymaster 127.0.0.1 63792
[11188] 24 Apr 15:29:18.585 * +slave slave 127.0.0.1:63791 127.0.0.1 63791 @ mymaster 127.0.0.1 63792
[11188] 24 Apr 15:29:19.291 # +sdown sentinel 71a27c573f9eb89554a1db4dae04c406e19fd85a 127.0.0.1 36379 @ mymaster 127.0.0.1 63792
[11188] 24 Apr 15:29:23.665 # +sdown slave 127.0.0.1:63791 127.0.0.1 63791 @ mymaster 127.0.0.1 63792
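The failover log above is long, but the part that matters most is the sequence of +event names and the final +switch-master line. As a rough sketch, the script below pulls those out of a log. The regular expression is only fitted to the line layout shown in this post, parse_events and find_switch are my own helper names, and the sample is a four-line excerpt of the log above.

```python
import re

# [pid] day month time level +event details
LINE_RE = re.compile(r"^\[(\d+)\] (\d+ \w+ [\d:.]+) [#*] (\+[\w-]+) (.*)$")


def parse_events(log_text):
    """Return (event, details) pairs for every '+event' line in a Sentinel log."""
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            events.append((match.group(3), match.group(4)))
    return events


def find_switch(events):
    """Return (old_port, new_port) from the first +switch-master event, or None."""
    for name, details in events:
        if name == "+switch-master":
            # details: <master-name> <old-ip> <old-port> <new-ip> <new-port>
            _, _, old_port, _, new_port = details.split()
            return int(old_port), int(new_port)
    return None


LOG = """\
[11188] 24 Apr 15:29:15.388 # +sdown master mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:15.388 # +odown master mymaster 127.0.0.1 63791 #quorum 1/1
[11188] 24 Apr 15:29:16.466 # +promoted-slave slave 127.0.0.1:63792 127.0.0.1 63792 @ mymaster 127.0.0.1 63791
[11188] 24 Apr 15:29:18.583 # +switch-master mymaster 127.0.0.1 63791 127.0.0.1 63792
"""

events = parse_events(LOG)
print([name for name, _ in events])
print(find_switch(events))  # (63791, 63792)
```

Read in order, the events tell the story: the master is marked subjectively down (+sdown), then objectively down once the quorum of 1 is reached (+odown), a slave is promoted, and the sentinels switch to the new master.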
(2) Check the replication info from 63792
127.0.0.1:63792> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=63793,state=online,offset=12526,lag=0
master_repl_offset:12526
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:12525
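63792 has been promoted to master, and 63793 is now its slave.
5. Bring 63791 back online
Start its server and its sentinel again, then check the replication info from 63792: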
127.0.0.1:63792> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=63793,state=online,offset=25459,lag=1
slave1:ip=127.0.0.1,port=63791,state=online,offset=25459,lag=0
master_repl_offset:25595
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:25594
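After coming back online, 63791 has become a slave. Now look at 63791's configuration files, redis.windows.conf and sentinel.conf:
(1) redis.windows.conf
slaveof 127.0.0.1 63792
The node is now configured as a slave of 63792.
(2) sentinel.conf: the monitored master has also changed to 63792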
port 36379
sentinel myid 71a27c573f9eb89554a1db4dae04c406e19fd85a
sentinel monitor mymaster 127.0.0.1 63792 1
sentinel down-after-milliseconds mymaster 5000
# Generated by CONFIG REWRITE
dir "E:\\redis-cluster\\redis-63791"
sentinel config-epoch mymaster 11
sentinel leader-epoch mymaster 0
sentinel known-slave mymaster 127.0.0.1 63791
sentinel known-slave mymaster 127.0.0.1 63793
sentinel known-sentinel mymaster 127.0.0.1 36381 956fe7dd1644c38db24792da8af749847bd3c247
sentinel known-sentinel mymaster 127.0.0.1 36380 10765c1132658cf10050164915ea69a5b77da4eb
sentinel current-epoch 11
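The rewritten sentinel.conf is effectively the sentinel's saved view of the topology: which master it monitors and which slaves and other sentinels it knows about. A quick way to confirm that view after a failover is to summarize the file. The sketch below handles only the directives that appear in the file above, and summarize_sentinel_conf is a helper name I invented.

```python
def summarize_sentinel_conf(text):
    """Collect the monitored master, known slaves and known sentinels."""
    summary = {"master": None, "quorum": None, "slaves": [], "sentinels": []}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0] != "sentinel":
            continue  # skip comments, port, dir and blank lines
        directive, args = parts[1], parts[2:]
        if directive == "monitor":
            # sentinel monitor <name> <ip> <port> <quorum>
            summary["master"] = (args[1], int(args[2]))
            summary["quorum"] = int(args[3])
        elif directive == "known-slave":
            # sentinel known-slave <name> <ip> <port>
            summary["slaves"].append((args[1], int(args[2])))
        elif directive == "known-sentinel":
            # sentinel known-sentinel <name> <ip> <port> <run-id>
            summary["sentinels"].append((args[1], int(args[2])))
    return summary


CONF = """\
port 36379
sentinel monitor mymaster 127.0.0.1 63792 1
sentinel known-slave mymaster 127.0.0.1 63791
sentinel known-slave mymaster 127.0.0.1 63793
sentinel known-sentinel mymaster 127.0.0.1 36381 956fe7dd1644c38db24792da8af749847bd3c247
sentinel known-sentinel mymaster 127.0.0.1 36380 10765c1132658cf10050164915ea69a5b77da4eb
"""

summary = summarize_sentinel_conf(CONF)
print(summary["master"], summary["slaves"])
```

For the file above this reports 63792 as the master and 63791 and 63793 as slaves, matching what info replication showed.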
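4. Write data on the master
Note that the master accepts writes while the slaves only serve reads. After the master receives a write command, it propagates the command to its slaves.
A slave can be made writable with the setting below, but writes made on a slave are not synchronized back to the master, so writing to a slave is rarely useful.
slave-read-only no
1. Operate on the 63792 master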
127.0.0.1:63792> set str value
OK
127.0.0.1:63792> keys *
1) "str"
127.0.0.1:63792> get str
"value"
127.0.0.1:63792> ttl str
(integer) -1
2. Check from the 63793 slave
127.0.0.1:63793> keys *
1) "str"
127.0.0.1:63793> get str
"value"
127.0.0.1:63793> set kk kk
(error) READONLY You can't write against a read only slave.
127.0.0.1:63793> flushall
(error) READONLY You can't write against a read only slave.
The 63793 slave can only read data; it cannot modify it.
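The behaviour seen in this section (master writes reach the slaves, a read-only slave rejects writes, and a writable slave's writes never reach the master) can be captured in a tiny in-memory model. This is purely an illustration: Node and ReadOnlyError are invented for the example and nothing here talks to a real Redis server.

```python
class ReadOnlyError(Exception):
    """Stands in for the READONLY error returned by a read-only slave."""


class Node:
    def __init__(self, name, read_only=False):
        self.name = name
        self.read_only = read_only
        self.data = {}
        self.slaves = []

    def set(self, key, value):
        """Handle a client write."""
        if self.read_only:
            raise ReadOnlyError("READONLY You can't write against a read only slave.")
        self.data[key] = value
        # Only a master has slaves attached, so only master writes propagate.
        for slave in self.slaves:
            slave.apply_from_master(key, value)

    def apply_from_master(self, key, value):
        """Apply a replicated write; this bypasses the read-only check."""
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


master = Node("63792")
slave = Node("63793", read_only=True)
master.slaves.append(slave)

master.set("str", "value")
print(slave.get("str"))   # value

try:
    slave.set("kk", "kk")
except ReadOnlyError as exc:
    print(exc)

slave.read_only = False   # comparable to slave-read-only no
slave.set("kk", "kk")     # accepted locally...
print(master.get("kk"))   # ...but it never reaches the master: None
```

The last two lines are the reason a writable slave is rarely worth it: the key exists only on that one slave.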
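Summary:
1. In redis.windows.conf, the master only needs its port changed; each slave must also declare which node it replicates from.
2. The three sentinel.conf files need three different ports, because each sentinel runs as its own process and all three run on the same machine.
3. The sentinels are configured to monitor only the master node.
4. When the master goes down, one of the slaves is automatically promoted to master. The slaveof setting in the affected redis.windows.conf files is rewritten accordingly, and each sentinel.conf is updated to point at the new master.
5. Without sentinel processes, no node can accept writes once the master goes down. With sentinels running, one of the slaves is promoted after the master fails and then accepts writes. The choice is not random: Sentinel ranks the candidate slaves by slave-priority, then by replication offset, then by run ID.
6. After starting the nodes, connect with redis-cli -p 63791 and run info replication to check a node's replication info.
7. Replication works as follows: the master forwards each write command it receives to its slaves, and when a slave starts up it pulls the master's data through the sync handshake (a full resync in the logs above).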
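Regarding point 5, here is a short sketch of the ordering Sentinel uses when it picks which slave to promote: a lower slave-priority wins (0 means never promote), then the higher replication offset, then the lexicographically smaller run ID. The Candidate class and the sample values are made up for illustration, and the sketch leaves out the earlier filtering of slaves that are down or have been disconnected from the master for too long.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    addr: str
    priority: int      # slave-priority; 0 means "never promote"
    repl_offset: int   # how much of the master's stream was processed
    run_id: str


def select_slave(candidates):
    """Pick the slave to promote, or None if no candidate is eligible."""
    eligible = [c for c in candidates if c.priority != 0]
    if not eligible:
        return None
    # Lower priority first, then higher offset, then smaller run ID.
    return min(eligible, key=lambda c: (c.priority, -c.repl_offset, c.run_id))


# Hypothetical values: both slaves use the default priority of 100.
slaves = [
    Candidate("127.0.0.1:63792", 100, 26728, "aaa"),
    Candidate("127.0.0.1:63793", 100, 26578, "bbb"),
]
print(select_slave(slaves).addr)  # 127.0.0.1:63792
```

With equal priorities, the slave that has processed more of the master's data is preferred, which keeps the amount of lost writes as small as possible.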