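Replica sets extend master-slave replication with automatic failover and automatic recovery of member nodes. This post walks through the mechanics of setting up a MongoDB replica set. (In my view the setup itself is the easy part; what really matters is disaster planning.)
1 Create the data directories for the replica set members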
mkdir -p /opt/mongodata/r1
mkdir -p /opt/mongodata/r2
mkdir -p /opt/mongodata/r3
2 Run each of the following commands in its own terminal window (three windows in total):
./mongod --dbpath /opt/mongodata/r1 --port 27018 --rest --replSet myset
./mongod --dbpath /opt/mongodata/r2 --port 27019 --rest --replSet myset
./mongod --dbpath /opt/mongodata/r3 --port 27020 --rest --replSet myset
3 In a fourth window, run the following command:
[mongodb@rac4 bin]$ ./mongo 127.0.0.1:27018 init.js
MongoDB shell version: 2.0.1
connecting to: 127.0.0.1:27018/test
The contents of init.js are as follows:
[mongodb@rac4 bin]$ cat init.js
rs.initiate({
_id : "myset",
members : [
{_id : 0, host : "10.250.7.220:27018"},
{_id : 1, host : "10.250.7.220:27019"},
{_id : 2, host : "10.250.7.220:27020"}
]
})
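The configuration document above can also be generated instead of typed by hand. The following is a minimal sketch, not part of MongoDB itself: buildReplSetConfig is a hypothetical helper name, and it assumes member _id values are simply assigned in list order. The set name must match the value passed to --replSet when the mongod processes were started.

```javascript
// Hypothetical helper (not a MongoDB API): builds the document that
// rs.initiate() expects from a set name and a list of "host:port" strings.
function buildReplSetConfig(setName, hosts) {
  if (!setName || !Array.isArray(hosts) || hosts.length === 0) {
    throw new Error("setName and a non-empty hosts array are required");
  }
  return {
    _id: setName,
    // Assumption: member ids are assigned in list order, starting at 0.
    members: hosts.map(function (host, index) {
      return { _id: index, host: host };
    })
  };
}

var config = buildReplSetConfig("myset", [
  "10.250.7.220:27018",
  "10.250.7.220:27019",
  "10.250.7.220:27020"
]);

// Inside the mongo shell the rs helper exists, so the set can be initiated.
// Anywhere else, just print the document for inspection.
if (typeof rs !== "undefined") {
  rs.initiate(config);
} else {
  console.log(JSON.stringify(config, null, 2));
}
```

Generating the document keeps the member list in one place, which helps when the same hosts are reused in other scripts.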
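With the three nodes running and the set initiated, the log shows the nodes negotiating with each other: the node on port 27018 is elected Primary, and the other two automatically become Secondary nodes.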
Mon Oct 31 20:27:53 [conn2] replSet info saving a newer config version to local.system.replset
Mon Oct 31 20:27:53 [conn2] replSet saveConfigLocally done
Mon Oct 31 20:27:53 [conn2] replSet replSetInitiate config now saved locally. Should come online in about a minute.
Mon Oct 31 20:27:53 [conn2] command admin.$cmd command: { replSetInitiate: { _id: "myset", members: [ { _id: 0.0, host: "10.250.7.220:27018" }, { _id: 1.0, host: "10.250.7.220:27019" }, { _id: 2.0, host: "10.250.7.220:27020" } ] } } ntoreturn:1 reslen:112 12095ms
Mon Oct 31 20:27:53 [conn2] end connection 127.0.0.1:15252
Mon Oct 31 20:27:53 [rsStart] replSet STARTUP2
Mon Oct 31 20:27:53 [rsHealthPoll] replSet info 10.250.7.220:27019 is down (or slow to respond): still initializing
Mon Oct 31 20:27:53 [rsHealthPoll] replSet member 10.250.7.220:27019 is now in state DOWN
Mon Oct 31 20:27:53 [rsHealthPoll] replSet info 10.250.7.220:27020 is down (or slow to respond): still initializing
Mon Oct 31 20:27:53 [rsHealthPoll] replSet member 10.250.7.220:27020 is now in state DOWN
Mon Oct 31 20:27:53 [rsSync] replSet SECONDARY
Mon Oct 31 20:27:55 [initandlisten] connection accepted from 10.250.7.220:44134 #3
Mon Oct 31 20:27:59 [rsHealthPoll] replSet info member 10.250.7.220:27019 is up
Mon Oct 31 20:27:59 [rsHealthPoll] replSet member 10.250.7.220:27019 is now in state STARTUP2
Mon Oct 31 20:27:59 [rsMgr] not electing self, 10.250.7.220:27019 would veto
Mon Oct 31 20:28:01 [initandlisten] connection accepted from 10.250.7.220:44137 #4
Mon Oct 31 20:28:05 [rsMgr] replSet info electSelf 0
Mon Oct 31 20:28:05 [rsMgr] replSet PRIMARY
Mon Oct 31 20:28:07 [rsHealthPoll] replSet member 10.250.7.220:27019 is now in state RECOVERING
Mon Oct 31 20:28:10 [initandlisten] connection accepted from 10.250.7.220:44141 #5
Mon Oct 31 20:28:10 [initandlisten] connection accepted from 10.250.7.220:44142 #6
Mon Oct 31 20:28:11 [slaveTracking] build index local.slaves { _id: 1 }
Mon Oct 31 20:28:11 [slaveTracking] build index done 0 records 0.001 secs
Mon Oct 31 20:28:13 [rsHealthPoll] replSet info member 10.250.7.220:27020 is up
Mon Oct 31 20:28:13 [rsHealthPoll] replSet member 10.250.7.220:27020 is now in state STARTUP2
Mon Oct 31 20:28:14 [conn6] end connection 10.250.7.220:44142
Mon Oct 31 20:28:14 [conn5] end connection 10.250.7.220:44141
Mon Oct 31 20:28:15 [initandlisten] connection accepted from 10.250.7.220:44144 #7
Mon Oct 31 20:28:15 [rsHealthPoll] replSet member 10.250.7.220:27019 is now in state SECONDARY
Mon Oct 31 20:28:15 [rsHealthPoll] replSet member 10.250.7.220:27020 is now in state RECOVERING
Mon Oct 31 20:28:28 [initandlisten] connection accepted from 127.0.0.1:59232 #8
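The state transitions in a log like the one above are easier to follow once they are extracted from the surrounding noise. The sketch below assumes the log format shown here (the "replSet member <host> is now in state <STATE>" wording of this mongod version); parseStateChanges and latestStates are hypothetical helper names, not MongoDB functions.

```javascript
// Hypothetical helper: extracts "member X is now in state Y" transitions
// from mongod log text in the format shown above.
function parseStateChanges(logText) {
  var pattern = /replSet member (\S+) is now in state (\w+)/;
  var changes = [];
  logText.split("\n").forEach(function (line) {
    var match = pattern.exec(line);
    if (match) {
      changes.push({ member: match[1], state: match[2] });
    }
  });
  return changes;
}

// Returns the last state reported for each member.
function latestStates(changes) {
  var latest = {};
  changes.forEach(function (change) {
    latest[change.member] = change.state;
  });
  return latest;
}

// A few lines copied from the log above.
var sampleLog = [
  "Mon Oct 31 20:27:53 [rsHealthPoll] replSet member 10.250.7.220:27019 is now in state DOWN",
  "Mon Oct 31 20:27:59 [rsHealthPoll] replSet member 10.250.7.220:27019 is now in state STARTUP2",
  "Mon Oct 31 20:28:05 [rsMgr] replSet PRIMARY",
  "Mon Oct 31 20:28:07 [rsHealthPoll] replSet member 10.250.7.220:27019 is now in state RECOVERING",
  "Mon Oct 31 20:28:15 [rsHealthPoll] replSet member 10.250.7.220:27019 is now in state SECONDARY",
  "Mon Oct 31 20:28:15 [rsHealthPoll] replSet member 10.250.7.220:27020 is now in state RECOVERING"
].join("\n");

var changes = parseStateChanges(sampleLog);
var latest = latestStates(changes);
console.log(JSON.stringify(latest));
```

Note that the pattern only matches lines about other members; a node's own transitions (such as "replSet PRIMARY") use a different wording and are skipped here.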
Connect to the database from a client:
[mongodb@rac4 bin]$ ./mongo 127.0.0.1:27018
MongoDB shell version: 2.0.1
connecting to: 127.0.0.1:27018/test
PRIMARY> rs.status()  // show the replica set status
{
"set" : "myset",
"date" : ISODate("2011-10-31T12:29:17Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "10.250.7.220:27018",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"optime" : {
"t" : 1320064073000,
"i" : 1
},
"optimeDate" : ISODate("2011-10-31T12:27:53Z"),
"self" : true
},
{
"_id" : 1,
"name" : "10.250.7.220:27019",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 78,
"optime" : {
"t" : 1320064073000,
"i" : 1
},
"optimeDate" : ISODate("2011-10-31T12:27:53Z"),
"lastHeartbeat" : ISODate("2011-10-31T12:29:16Z"),
"pingMs" : 0
},
{
"_id" : 2,
"name" : "10.250.7.220:27020",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 64,
"optime" : {
"t" : 1320064073000,
"i" : 1
},
"optimeDate" : ISODate("2011-10-31T12:27:53Z"),
"lastHeartbeat" : ISODate("2011-10-31T12:29:16Z"),
"pingMs" : 1
}
],
"ok" : 1
}
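Key fields in the status output:
state: 1 means the member is the PRIMARY and accepts both reads and writes; 2 means SECONDARY, which rejects writes and, by default, reads as well
health: 1 means the member is currently healthy; 0 means it is down or unreachable
The read test on a secondary later in this post fails with error: { "$err" : "not master and slaveok=false", "code" : 13435 }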
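The state and health fields can be condensed into a quick overview of the set. The following is a rough sketch: summarizeStatus is a hypothetical helper, and it relies only on the name, state and health fields that appear in the rs.status() output above. The sample document is a trimmed copy of that output.

```javascript
// Hypothetical helper: condenses an rs.status()-shaped document into the
// primary, the secondaries, and any members reported as unhealthy.
function summarizeStatus(status) {
  var summary = { primary: null, secondaries: [], unhealthy: [] };
  status.members.forEach(function (member) {
    if (member.health !== 1) {
      summary.unhealthy.push(member.name);
    } else if (member.state === 1) {
      summary.primary = member.name;
    } else if (member.state === 2) {
      summary.secondaries.push(member.name);
    }
  });
  return summary;
}

// Trimmed copy of the rs.status() output shown above.
// In the mongo shell you would call summarizeStatus(rs.status()) instead.
var sampleStatus = {
  set: "myset",
  members: [
    { _id: 0, name: "10.250.7.220:27018", health: 1, state: 1 },
    { _id: 1, name: "10.250.7.220:27019", health: 1, state: 2 },
    { _id: 2, name: "10.250.7.220:27020", health: 1, state: 2 }
  ]
};

var summary = summarizeStatus(sampleStatus);
console.log(JSON.stringify(summary));
```

Members in other states (for example STARTUP2 or RECOVERING) are deliberately left out of all three lists in this sketch.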
PRIMARY> rs.conf()  // show the replica set configuration
{
"_id" : "myset",
"version" : 1,
"members" : [
{
"_id" : 0,
"host" : "10.250.7.220:27018"
},
{
"_id" : 1,
"host" : "10.250.7.220:27019"
},
{
"_id" : 2,
"host" : "10.250.7.220:27020"
}
]
}
PRIMARY> db.isMaster();  // check whether this node is the primary (the prompt shows it too)
{
"setName" : "myset",
"ismaster" : true,  // this node is the primary
"secondary" : false,
"hosts" : [
"10.250.7.220:27018",
"10.250.7.220:27020",
"10.250.7.220:27019"
],
"primary" : "10.250.7.220:27018",
"me" : "10.250.7.220:27018",
"maxBsonObjectSize" : 16777216,
"ok" : 1
}
Log in to the other two mongod instances:
[mongodb@rac4 bin]$ ./mongo 127.0.0.1:27019
MongoDB shell version: 2.0.1
connecting to: 127.0.0.1:27019/test
SECONDARY>
SECONDARY>
SECONDARY> db.isMaster();
{
"setName" : "myset",
"ismaster" : false,
"secondary" : true,
"hosts" : [
"10.250.7.220:27019",
"10.250.7.220:27020",
"10.250.7.220:27018"
],
"primary" : "10.250.7.220:27018",
"me" : "10.250.7.220:27019",
"maxBsonObjectSize" : 16777216,
"ok" : 1
}
SECONDARY>
[mongodb@rac4 bin]$ ./mongo 127.0.0.1:27020
MongoDB shell version: 2.0.1
connecting to: 127.0.0.1:27020/test
SECONDARY>
SECONDARY> db.isMaster();
{
"setName" : "myset",
"ismaster" : false,
"secondary" : true,
"hosts" : [
"10.250.7.220:27020",
"10.250.7.220:27019",
"10.250.7.220:27018"
],
"primary" : "10.250.7.220:27018",
"me" : "10.250.7.220:27020",
"maxBsonObjectSize" : 16777216,
"ok" : 1
}
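The replica set is now up and running.
Write to the primary, then check from a secondary (at this point the node on port 27020 holds the primary role, as the shell prompt shows):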
[mongodb@rac4 bin]$ ./mongo 127.0.0.1:27020
MongoDB shell version: 2.0.1
connecting to: 127.0.0.1:27020/test
PRIMARY> use test
switched to db test
PRIMARY>
PRIMARY> db.yql.insert({val:"this is a message on 27020 primary !"});
PRIMARY>
Primary log:
Mon Oct 31 21:03:46 [FileAllocator] allocating new datafile /opt/mongodata/r3/test.ns, filling with zeroes...
Mon Oct 31 21:03:46 [FileAllocator] done allocating datafile /opt/mongodata/r3/test.ns, size: 16MB, took 0.256 secs
Mon Oct 31 21:03:46 [FileAllocator] allocating new datafile /opt/mongodata/r3/test.0, filling with zeroes...
Mon Oct 31 21:03:48 [clientcursormon] mem (MB) res:35 virt:2726 mapped:1248
Mon Oct 31 21:03:50 [FileAllocator] done allocating datafile /opt/mongodata/r3/test.0, size: 64MB, took 4.488 secs
Mon Oct 31 21:03:50 [conn6] build index test.yql { _id: 1 }
Mon Oct 31 21:03:50 [conn6] build index done 0 records 0 secs
Mon Oct 31 21:03:50 [conn6] insert test.yql 4759ms
Mon Oct 31 21:03:50 [FileAllocator] allocating new datafile /opt/mongodata/r3/test.1, filling with zeroes...
Mon Oct 31 21:03:51 [conn8] getmore local.oplog.rs query: { ts: { $gte: new Date(5669632022159556609) } } cursorid:6257712144272734285 nreturned:1 reslen:146 5031ms
Mon Oct 31 21:03:51 [conn5] getmore local.oplog.rs query: { ts: { $gte: new Date(5669632022159556609) } } cursorid:423878080662643430 nreturned:1 reslen:146 5631ms
Mon Oct 31 21:03:54 [FileAllocator] done allocating datafile /opt/mongodata/r3/test.1, size: 128MB, took 3.818 secs
Secondary log, which shows the secondary applying operations from the primary and creating the corresponding data files:
Mon Oct 31 20:49:27 [clientcursormon] mem (MB) res:19 virt:2693 mapped:1232
Mon Oct 31 20:54:27 [clientcursormon] mem (MB) res:19 virt:2693 mapped:1232
Mon Oct 31 20:59:27 [clientcursormon] mem (MB) res:19 virt:2693 mapped:1232
Mon Oct 31 21:03:51 [FileAllocator] allocating new datafile /opt/mongodata/r2/test.ns, filling with zeroes...
Mon Oct 31 21:03:54 [FileAllocator] done allocating datafile /opt/mongodata/r2/test.ns, size: 16MB, took 3.396 secs
Mon Oct 31 21:03:54 [FileAllocator] allocating new datafile /opt/mongodata/r2/test.0, filling with zeroes...
Mon Oct 31 21:04:00 [FileAllocator] done allocating datafile /opt/mongodata/r2/test.0, size: 64MB, took 5.79 secs
Mon Oct 31 21:04:00 [rsSync] build index test.yql { _id: 1 }
Mon Oct 31 21:04:00 [rsSync] build index done 0 records 0 secs
Mon Oct 31 21:04:00 [FileAllocator] allocating new datafile /opt/mongodata/r2/test.1, filling with zeroes...
Mon Oct 31 21:04:03 [FileAllocator] done allocating datafile /opt/mongodata/r2/test.1, size: 128MB, took 2.965 secs
Mon Oct 31 21:04:37 [clientcursormon] mem (MB) res:17 virt:2853 mapped:1312
Mon Oct 31 21:04:41 [conn6] end connection 127.0.0.1:44672
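Besides reading the logs, one way to confirm that the secondaries have caught up is to compare the optimeDate values reported by rs.status(). The sketch below is an illustration only: replicationLagSeconds is a hypothetical helper, and the sample timestamps are made up for the example (they are not taken from the session above). It assumes optimeDate is a Date object, as it is in the mongo shell.

```javascript
// Hypothetical helper: for each secondary, returns how many seconds its
// optimeDate trails the primary's. Expects an rs.status()-shaped document.
function replicationLagSeconds(status) {
  var primary = null;
  status.members.forEach(function (member) {
    if (member.state === 1) {
      primary = member;
    }
  });
  if (!primary) {
    throw new Error("no primary in status document");
  }
  var lag = {};
  status.members.forEach(function (member) {
    if (member.state === 2) {
      lag[member.name] =
        (primary.optimeDate.getTime() - member.optimeDate.getTime()) / 1000;
    }
  });
  return lag;
}

// Made-up sample values: one secondary is level with the primary,
// the other is five seconds behind.
var sampleStatus = {
  members: [
    { name: "10.250.7.220:27020", state: 1, optimeDate: new Date("2011-10-31T13:03:50Z") },
    { name: "10.250.7.220:27018", state: 2, optimeDate: new Date("2011-10-31T13:03:50Z") },
    { name: "10.250.7.220:27019", state: 2, optimeDate: new Date("2011-10-31T13:03:45Z") }
  ]
};

var lag = replicationLagSeconds(sampleStatus);
console.log(JSON.stringify(lag));
```

In the mongo shell the same function could be called as replicationLagSeconds(rs.status()); a lag of 0 for every secondary suggests the last write has been replicated.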
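As noted for rs.status() above, state 1 marks the primary, which accepts reads and writes; state 2 marks a secondary, which rejects writes and, by default, reads as well:
{
"_id" : 1,
"name" : "10.250.7.220:27019",
"health" : 1,
"state" : 2,  // a secondary has state 2: no writes, and no reads unless slaveOk is set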
"stateStr" : "SECONDARY",
"uptime" : 78,
"optime" : {
"t" : 1320064073000,
"i" : 1
},
A read on the secondary therefore fails, because slaveOk has not been set on this connection:
[mongodb@rac4 bin]$ ./mongo 127.0.0.1:27019
MongoDB shell version: 2.0.1
connecting to: 127.0.0.1:27019/test
SECONDARY> use test
switched to db test
SECONDARY> db.yql.find();
error: { "$err" : "not master and slaveok=false", "code" : 13435 }
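The error message itself points at the fix: reads on a secondary are refused until slaveOk is enabled on the connection. In the mongo shell of this era that is done with db.getMongo().setSlaveOk() (rs.slaveOk() is the shorthand). The sketch below wraps this in a hypothetical readFromSecondary helper; the fake connection object is a stand-in written for this example so the code can run outside the shell, and it only mimics the behaviour seen above.

```javascript
// Sketch: enable secondary reads on a connection, then query a collection.
// `conn` is expected to behave like the mongo shell connection returned by
// db.getMongo(), which provides setSlaveOk() and getDB().
function readFromSecondary(conn, dbName, collectionName) {
  conn.setSlaveOk();
  return conn.getDB(dbName).getCollection(collectionName).find().toArray();
}

// In the mongo shell, connected to a secondary:
//   readFromSecondary(db.getMongo(), "test", "yql");

// Stand-in connection (an assumption for this example, not a MongoDB API).
// Like a real secondary, it refuses reads until setSlaveOk() is called.
function makeFakeSecondary(documents) {
  var slaveOk = false;
  return {
    setSlaveOk: function () { slaveOk = true; },
    getDB: function () {
      return {
        getCollection: function () {
          return {
            find: function () {
              if (!slaveOk) {
                throw new Error("not master and slaveok=false");
              }
              return { toArray: function () { return documents; } };
            }
          };
        }
      };
    }
  };
}

var fake = makeFakeSecondary([{ val: "this is a message on 27020 primary !" }]);
var docs = readFromSecondary(fake, "test", "yql");
console.log(JSON.stringify(docs));
```

The setting applies per connection, so it has to be repeated in every new shell session that reads from a secondary.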