RabbitMQ Cluster Installation and Configuration (Failure Recovery)

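0. First, install RabbitMQ on at least two nodes by following http://www.cnblogs.com/zhjh256/p/5922562.html (a cluster on a single machine is not recommended; it serves little purpose).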

1. Get familiar with the RabbitMQ cluster architecture first: http://www.cnblogs.com/zhjh256/p/6368288.html

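2. Run vi /etc/hosts on both nodes and add an alias for the peer, then ping each node from the other to confirm connectivity. Make sure the firewall is either disabled or has the required ports open (by default 4369 for epmd, 25672 for inter-node traffic, 5672 for AMQP clients, and 15672 for the management console).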
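
The hosts-file part of this step can be scripted. Below is a minimal sketch, assuming a POSIX shell; add_host_alias is a hypothetical helper name, not part of any RabbitMQ tooling. It appends an alias only when the name is not already listed, so it can be run more than once.

```shell
# add_host_alias FILE IP NAME
# Append "IP NAME" to FILE unless NAME already appears as a host name in it.
add_host_alias() {
    file=$1
    ip=$2
    name=$3
    # Skip comment lines; look for NAME as a whole field after the address column.
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

For example, on dev3 one might run add_host_alias /etc/hosts 172.18.30.192 devel2 (the address that appears in the scp output in step 3), and do the equivalent on devel2 with dev3's real address.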

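3. Set the cookie that the Erlang cluster uses for communication. Every kind of cluster has some such mechanism: a shared disk (e.g. RAC), a session (e.g. Tomcat), or a token (e.g. distributed systems).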

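RabbitMQ nodes and the command-line tools (e.g. rabbitmqctl) authenticate to each other with a cookie, which is a random alphanumeric string. When the RabbitMQ server starts, the Erlang VM automatically creates a cookie file with random content. If RabbitMQ was installed from an rpm, the Erlang cookie file is /var/lib/rabbitmq/.erlang.cookie. If it was installed from source or from an unpacked binary archive, the file is $HOME/.erlang.cookie. Make sure every Erlang node in the cluster has the same cookie.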

[root@dev3 ~]# scp .erlang.cookie devel2:/root/
The authenticity of host 'devel2 (172.18.30.192)' can't be established.
RSA key fingerprint is e3:76:d5:eb:d3:7d:86:43:de:bc:5d:31:cb:21:0d:d2.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'devel2,172.18.30.192' (RSA) to the list of known hosts.
root@devel2's password:
.erlang.cookie 100% 20 0.0KB/s 00:00
[root@dev3 ~]# cat .erlang.cookie
YQSISAMKQWVOZMRQZJDV

[root@dev3 ~]# chmod 400 .erlang.cookie

# The permissions on .erlang.cookie must be 400; otherwise startup fails with: {error_logger,{{2017,2,27},{20,11,25}},"Cookie file /root/.erlang.cookie must be accessible by owner only",[]}

[root@devel2 ~]# chmod 400 .erlang.cookie
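
The copy-then-chmod sequence above can be wrapped so the two actions always happen together. This is only a sketch: install_cookie and cookies_match are hypothetical helper names, and it assumes the source cookie has already been transferred to the local machine (for instance with scp as shown above).

```shell
# install_cookie SRC DEST
# Copy the cookie into place and restrict it to owner read-only (mode 400).
install_cookie() {
    src=$1
    dest=$2
    # An existing cookie is read-only, so remove it before copying.
    rm -f "$dest"
    cp "$src" "$dest" && chmod 400 "$dest"
}

# cookies_match A B
# Succeeds only if the two cookie files are byte-for-byte identical.
cookies_match() {
    cmp -s "$1" "$2"
}
```

A possible use on devel2 would be install_cookie /tmp/cookie.from.dev3 "$HOME/.erlang.cookie", where /tmp/cookie.from.dev3 is an illustrative path for the file received from dev3.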

4. Start RabbitMQ in the background

[root@dev3 ~]# rabbitmq-server -detached
Warning: PID file not written; -detached was passed.

[root@devel2 ~]# rabbitmq-server -detached
Warning: PID file not written; -detached was passed.
[root@devel2 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@devel2 ...
[{nodes,[{disc,[rabbit@devel2]}]},
{running_nodes,[rabbit@devel2]},
{cluster_name,<<"rabbit@devel2">>},
{partitions,[]}]
[root@devel2 ~]# rabbitmqctl stop_app
Stopping node rabbit@devel2 ...

[root@devel2 ~]# rabbitmqctl join_cluster rabbit@dev3
Clustering node rabbit@devel2 with rabbit@dev3 ...
[root@devel2 ~]# rabbitmqctl start_app
Starting node rabbit@devel2 ...
[root@devel2 ~]# rabbitmqctl cluster_status  ## Nodes join as disc nodes by default; to join as a RAM node instead, pass --ram to join_cluster
Cluster status of node rabbit@devel2 ...
[{nodes,[{disc,[rabbit@dev3,rabbit@devel2]}]},
{running_nodes,[rabbit@dev3,rabbit@devel2]},
{cluster_name,<<"rabbit@dev3">>},
{partitions,[]}]
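
The three commands used to join devel2 to dev3 (stop_app, join_cluster, start_app) can be grouped into one function. The sketch below assumes rabbitmqctl is on the PATH and that the local node is already running; join_node is a hypothetical name, and the optional second argument is an illustrative way of choosing between the default disc mode and --ram.

```shell
# join_node TARGET [ram]
# Join the local node to the cluster that TARGET (e.g. rabbit@dev3) belongs to.
join_node() {
    target=$1
    mode=${2:-disc}
    # The RabbitMQ application must be stopped before the node can join.
    rabbitmqctl stop_app || return 1
    if [ "$mode" = ram ]; then
        rabbitmqctl join_cluster "$target" --ram || return 1
    else
        rabbitmqctl join_cluster "$target" || return 1
    fi
    rabbitmqctl start_app
}
```

Running join_node rabbit@dev3 on devel2 would correspond to the transcript above; join_node rabbit@dev3 ram would join as a RAM node instead.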

5. Enable the management console plugin

[root@devel2 ~]# rabbitmq-plugins list
Configured: E = explicitly enabled; e = implicitly enabled
| Status: * = running on rabbit@devel2
|/
[ ] amqp_client 3.5.7
[ ] cowboy 0.5.0-rmq3.5.7-git4b93c2d
[ ] mochiweb 2.7.0-rmq3.5.7-git680dba8
[ ] rabbitmq_amqp1_0 3.5.7
[ ] rabbitmq_auth_backend_ldap 3.5.7
[ ] rabbitmq_auth_mechanism_ssl 3.5.7
[ ] rabbitmq_consistent_hash_exchange 3.5.7
[ ] rabbitmq_federation 3.5.7
[ ] rabbitmq_federation_management 3.5.7
[ ] rabbitmq_management 3.5.7
[ ] rabbitmq_management_agent 3.5.7
[ ] rabbitmq_management_visualiser 3.5.7
[ ] rabbitmq_mqtt 3.5.7
[ ] rabbitmq_shovel 3.5.7
[ ] rabbitmq_shovel_management 3.5.7
[ ] rabbitmq_stomp 3.5.7
[ ] rabbitmq_test 3.5.7
[ ] rabbitmq_tracing 3.5.7
[ ] rabbitmq_web_dispatch 3.5.7
[ ] rabbitmq_web_stomp 3.5.7
[ ] rabbitmq_web_stomp_examples 3.5.7
[ ] sockjs 0.3.4-rmq3.5.7-git3132eb9
[ ] webmachine 1.10.3-rmq3.5.7-gite9359c7
[root@devel2 ~]# rabbitmq-plugins enable rabbitmq_management
The following plugins have been enabled:
mochiweb
webmachine
rabbitmq_web_dispatch
amqp_client
rabbitmq_management_agent
rabbitmq_management

Applying plugin configuration to rabbit@devel2... started 6 plugins.

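6. Create users. By default the built-in guest user can only connect via localhost. RabbitMQ usually runs on a Linux server and is administered from a local client machine, so a dedicated administrative user is needed.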

[root@devel2 ~]# rabbitmqctl add_user admin admin
Creating user "admin" ..

[root@devel2 ~]# rabbitmqctl set_permissions -p / admin '.*' '.*' '.*'  ## Permissions alone are not enough to log in and manage the server; the user must also be given a user tag, which is essentially a role
Setting permissions for user "admin" in vhost "/" ...

The following tags are available: management, policymaker, monitoring, and administrator.

[root@devel2 ~]# rabbitmqctl set_user_tags admin administrator
Setting tags for user "admin" to [administrator] ...
[root@devel2 ~]# rabbitmqctl add_user monitor monitor
Creating user "monitor" ...
[root@devel2 ~]# rabbitmqctl set_user_tags monitor monitoring
Setting tags for user "monitor" to [monitoring] ...
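
Creating a usable management account therefore takes three calls: add_user, set_user_tags, and set_permissions. A minimal sketch that bundles them follows; create_mq_user is a hypothetical name, and granting '.*' for configure, write, and read mirrors the admin example above rather than being a recommendation for every account.

```shell
# create_mq_user USER PASSWORD TAG [VHOST]
# Create a user, assign its tag (role), and grant full permissions on VHOST.
create_mq_user() {
    user=$1
    pass=$2
    tag=$3
    vhost=${4:-/}
    rabbitmqctl add_user "$user" "$pass" || return 1
    rabbitmqctl set_user_tags "$user" "$tag" || return 1
    # The three patterns are configure, write, and read permissions.
    rabbitmqctl set_permissions -p "$vhost" "$user" '.*' '.*' '.*'
}
```

For instance, create_mq_user admin admin administrator reproduces the admin account from this step; a real deployment would presumably use a stronger password.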

7. Log in to the management console to check the status

The management plugin must be enabled on every node for that node's statistics to be displayed.

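8. Set up mirroring. In general, naming conventions for exchanges and queues should be planned per application module or subsystem during design or early development, so that they can be managed uniformly from a maintenance standpoint, rather than treating MQ as an afterthought.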

For example, to mirror every queue whose name starts with sys to all nodes:

[root@dev3 ~]# rabbitmqctl set_policy -p / ha-allqueue "^sys" '{"ha-mode":"all"}'
Setting policy "ha-allqueue" for pattern "^sys" to "{\"ha-mode\":\"all\"}" with priority "0" ...
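
Because the policy is selected by a regular expression on the queue name, it can help to check candidate names against the pattern before relying on it. The sketch below uses grep -E as a stand-in; this is an assumption, since RabbitMQ evaluates the pattern with Erlang's regex engine, but for a simple anchored prefix such as ^sys the two should agree. policy_matches is a hypothetical helper name.

```shell
# policy_matches PATTERN QUEUE_NAME
# Succeeds if QUEUE_NAME matches PATTERN (approximated with POSIX ERE).
policy_matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Example: report which of a few illustrative queue names the policy would cover.
for q in sys.orders sys_audit mysys orders; do
    if policy_matches '^sys' "$q"; then
        echo "$q: mirrored"
    else
        echo "$q: not mirrored"
    fi
done
```

The queue names in the loop are made up for illustration. Names that merely contain sys somewhere other than the start do not match, which is why a consistent prefix convention matters.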

9. Test a node failure by killing the RabbitMQ process on dev3 (19496 is its PID in this example).

[root@dev3 ~]# kill -9 19496

While the node is down, consume some messages and publish new ones.

Restart the failed node:

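[root@dev3 ~]# rabbitmq-server -detached  # The node is already a cluster member, so there is no need to join again; it rejoins automatically on startup. Running join_cluster again fails as follows: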

[root@dev3 ~]# rabbitmqctl join_cluster rabbit@devel2
Clustering node rabbit@dev3 with rabbit@devel2 ...
{"init terminating in do_boot",{function_clause,[{rabbit_ctl_misc,print_cmd_result,[join_cluster,already_member],[]},{rabbit_cli,main,,[]},{init,start_it,,[]},{init,start_em,,[]}]}} Crash dump is being written to: erl_crash.dump...done
init terminating in do_boot ()

After the restart, some queues turn out to be unsynchronized and have to be synchronized manually.

After synchronization the messages are correct: those already consumed are removed, and the newly added ones are copied over.
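
Manual synchronization can be done from the command line as well as from the console, using rabbitmqctl sync_queue. The following is a sketch under the assumption that the names of the unsynchronized queues are already known; sync_queues is a hypothetical wrapper and the queue names in the usage note are illustrative.

```shell
# sync_queues VHOST QUEUE...
# Trigger synchronization of each named mirrored queue in VHOST, one at a time.
sync_queues() {
    vhost=$1
    shift
    for q in "$@"; do
        # Stop at the first failure so it is not hidden by later successes.
        rabbitmqctl sync_queue -p "$vhost" "$q" || return 1
    done
}
```

For example, sync_queues / sys.orders sys.audit. Whether a queue still has unsynchronized mirrors can be checked with rabbitmqctl list_queues name slave_pids synchronised_slave_pids. Setting "ha-sync-mode":"automatic" in the policy is an alternative that avoids the manual step, at the cost of the queue being unresponsive while it synchronizes.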

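10. What if a long time has passed and many messages have accumulated, so that you would rather not synchronize and instead rejoin as a fresh node?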

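In that case, run the following on the target node (after rabbitmqctl stop_app, since reset requires the RabbitMQ application to be stopped):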

rabbitmqctl reset

Then repeat the steps for joining the cluster.
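
Put together, the reset-and-rejoin path might look like the sketch below. It assumes the Erlang node on the target machine is running, that rabbitmqctl is on the PATH, and that losing the node's local data is acceptable; rejoin_fresh is a hypothetical name.

```shell
# rejoin_fresh TARGET
# Wipe the local node's state and join TARGET's cluster as a brand-new member.
rejoin_fresh() {
    target=$1
    rabbitmqctl stop_app || return 1
    # reset discards this node's cluster membership and all of its local data.
    rabbitmqctl reset || return 1
    rabbitmqctl join_cluster "$target" || return 1
    rabbitmqctl start_app
}
```

On dev3 this would be invoked as rejoin_fresh rabbit@devel2. Queues covered by a mirroring policy then gain a new, empty mirror on the node instead of resuming from stale data.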

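11. For routine maintenance considerations and failure recovery of a RabbitMQ cluster, see the article "RabbitMQ Cluster Failure Recovery Explained" (rabbitmq集群故障恢复详解).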
