Let’s follow the given steps to create and configure the 3 nodes replica sets in MongoDB. The configuration should be as mentioned below:
- Port 27017 (default port) – primary node, 1.log as log file name 1.log, data file in rs1
- Port 27018 – secondary node, 2.log as log file name, data file in rs2
- Port 27019 – secondary node, 3.log as log file name, data file in rs3.
The replica sets name is “Testing”.
Steps 1
You need to create three directories to store three mongod instances data separately.
On windows
Open command prompt and give the following command.
C:\>mkdir \data\rs1 \data\rs2 \data\rs3
On unix or mac
mkdir –p /data/rs1 /data/rs2 /data/rs3
Step 2
Now let’s start three mongod instances as follows.
On windows
C:\>start mongod --replSet Testing --logpath \data\rs1\1.log --dbpath \data\rs1 --port 27017 --smallfiles --oplogSize 64
C:\>start mongod --replSet Testing --logpath \data\rs1\2.log --dbpath \data\rs2 --port 27018 --smallfiles --oplogSize 64
C:\>start mongod --replSet Testing --logpath \data\rs1\3.log --dbpath \data\rs3 --port 27019 --smallfiles --oplogSize 64
On unix or mac
mongod --replSet Testing --logpath \data\rs1\1.log --dbpath /data/rs1 --port 27017 --smallfiles --oplogSize 64 --fork
mongod --replSet Testing --logpath \data\rs1\2.log --dbpath /data/rs2 --port 27018 --smallfiles --oplogSize 64 –fork
mongod --replSet Testing --logpath \data\rs1\3.log --dbpath /data/rs3 --port 27019 --smallfiles --oplogSize 64 --fork
Now three mongod servers are running but they are not configured or initialized yet to interconnect to work for replica sets.
Step 3
You need to interconnect all three nodes. You need to open a command prompt and write configuration code and finally you need to call rs.initiate(config) command to start the replica sets as expected. I am giving replica sets name as “acemyskillsrepsets” but you can give it any other name.
C:\>mongo --port 27017 MongoDB shell version: 3.0.3 connecting to: 127.0.0.1:27017/test > config = {_id:"Testing", members: [ {_id: 0, host: "localhost:27017"}, {_id: 1, host: "localhost:27018"}, {_id: 2, host: "localhost:27019"}] }; > rs.initiate(config); { "ok" : 1 }
Now replica sets with three nodes is setup and configured successfully. You can check the status of the replica sets withrs.status().
Testing:PRIMARY> rs.status(); { "set" : "Testing", "date" : ISODate("2015-09-13T10:14:03.474Z"), "myState" : 1, "members" : [ { "_id" : 0, "name" : "localhost:27017", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 1855, "optime" : Timestamp(1442138448, 1), "optimeDate" : ISODate("2015-09-13T10:00:48Z"), "electionTime" : Timestamp(1442138452, 1), "electionDate" : ISODate("2015-09-13T10:00:52Z"), "configVersion" : 1, "self" : true }, { "_id" : 1, "name" : "localhost:27018", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 795, "optime" : Timestamp(1442138448, 1), "optimeDate" : ISODate("2015-09-13T10:00:48Z"), "lastHeartbeat" : ISODate("2015-09-13T10:14:02.580Z"), "lastHeartbeatRecv" : ISODate("2015-09-13T10:14:02.579Z"), "pingMs" : 0, "configVersion" : 1 }, { "_id" : 2, "name" : "localhost:27019", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 795, "optime" : Timestamp(1442138448, 1), "optimeDate" : ISODate("2015-09-13T10:00:48Z"), "lastHeartbeat" : ISODate("2015-09-13T10:14:02.580Z"), "lastHeartbeatRecv" : ISODate("2015-09-13T10:14:02.580Z"), "pingMs" : 0, "configVersion" : 1 } ], "ok" : 1 }
If you observe the above result set its clearly written that there are three nodes and localhost:27017 is the primary and rest two are secondary nodes. Congratulations!!! the replication system named as “Testing” is now ready to use.
Testing
Now we need to insert a document into the primary node and after insertion read the inserted document.
Testing:PRIMARY> db.movies.insert({name: "Up", releasedyear: "2009"}); WriteResult({ "nInserted" : 1 }) Testing:PRIMARY> db.movies.find(); { "_id" : ObjectId("55f54f55524898bbac0e5792"), "name" : "Up", "releasedyear" : "2009" }
Now let’s connect with a secondary node and read the data inserted by the primary node. If we read the data in the secondary node means the secondary nodes are in sync with the primary node and primary node’s replication is available to all secondary nodes.
To connect with the localhost:27018 by opening another command prompt and connect with a secondary node localhost:27018. After connecting with localhost:27018 and once you try to read the data from the node, it gives you an error because by default you cannot read from the secondary node. To read data from the secondary node you need to give the command rs.slaveOk().
C:\>mongo --port 27018 MongoDB shell version: 3.0.3 connecting to: 127.0.0.1:27018/test Testing:SECONDARY> db.movies.find(); Error: error: { "$err" : "not master and slaveOk=false", "code" : 13435 } Testing:SECONDARY> rs.slaveOk(); Testing:SECONDARY> db.movies.find(); { "_id" : ObjectId("55f54f55524898bbac0e5792"), "name" : "Up", "releasedyear" : "2009" }
You can only read data from the secondary nodes but cannot insert document to the secondary nodes. Try the following query which gives you error.
Testing:SECONDARY> db.movies.insert({name: "The Day After Tomorrow", releasedyear: 2004}); WriteResult({ "writeError" : { "code" : undefined, "errmsg" : "not master" } })
Replication Internals
To know how this replication is happening internally, you need to analyze the oplog.rs file in each nodes including primary and secondary.
Let’s connect to the primary node (localhost:27017) and switch to local database. You can use show collections command to see all the collections present in the local database. Finally, you can see the oplog.rs file by giving the query db.oplog.rs.find().pretty().
C:\>mongo --port 27017 MongoDB shell version: 3.0.3 connecting to: 127.0.0.1:27017/test Testing:PRIMARY> use local switched to db local Testing:PRIMARY> show collections me oplog.rs startup_log system.indexes system.replset Testing:PRIMARY> db.oplog.rs.find().pretty(); { "ts" : Timestamp(1442138448, 1), "h" : NumberLong(0), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } } { "ts" : Timestamp(1442139989, 1), "h" : NumberLong("-5383414929107714031"), "v" : 2, "op" : "c", "ns" : "test.$cmd", "o" : { "create" : "movies" } } { "ts" : Timestamp(1442139989, 2), "h" : NumberLong("-3864814140778310651"), "v" : 2, "op" : "i", "ns" : "test.movies", "o" : { "_id" : ObjectId("55f54f55524898bbac0e5792"), "name" : "Up", "releasedyear" : "2009" }
In the above file you can see there are three operations creating replica sets, creating movies collection and inserting a movie “up” into the collection. You can find the similar oplog.rs file in all secondary nodes. This is the file which is synced between the nodes and replication is happened between the nodes.
Now let’s connect to a secondary node localhost:27018 to observe the oplog.rs but you will definitely see the similar file over there too.
C:\>mongo --port 27018 MongoDB shell version: 3.0.3 connecting to: 127.0.0.1:27018/test Testing:SECONDARY> use local switched to db local Testing:SECONDARY> show collections me oplog.rs replset.minvalid startup_log system.indexes system.replset Testing:SECONDARY> db.oplog.rs.find().pretty(); { "ts" : Timestamp(1442138448, 1), "h" : NumberLong(0), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } } { "ts" : Timestamp(1442139989, 1), "h" : NumberLong("-5383414929107714031"), "v" : 2, "op" : "c", "ns" : "test.$cmd", "o" : { "create" : "movies" } } { "ts" : Timestamp(1442139989, 2), "h" : NumberLong("-3864814140778310651"), "v" : 2, "op" : "i", "ns" : "test.movies", "o" : { "_id" : ObjectId("55f54f55524898bbac0e5792"), "name" : "Up", "releasedyear" : "2009" } }
Failover in Replication
To see the failover situation in replica sets, let’s shutdown the primary node. It can be clearly seen in which node you are now in the command prompt. But if you want to check from command, you can give rs.isMaster(). If you are in the secondary node, you need to connect to localhost:27017 i.e. primary node. Once you are inside the primary node, we need to shutdown the primary node giving thedb.shutdownServer() command. Once the primary server is down there will be automatic election process to select the a new primary node which will take only a few milliseconds.
C:\>mongo --port 27017 MongoDB shell version: 3.0.3 connecting to: 127.0.0.1:27017/test Testing:PRIMARY> db.shutdownServer(); shutdown command only works with the admin database; try 'use admin' Testing:PRIMARY> use admin switched to db admin Testing:PRIMARY> db.shutdownServer(); 2015-09-13T17:11:16.247+0545 I NETWORK DBClientCursor::init call() failed server should be down... 2015-09-13T17:11:16.274+0545 I NETWORK trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed 2015-09-13T17:11:16.275+0545 I NETWORK reconnect 127.0.0.1:27017 (127.0.0.1) ok 2015-09-13T17:11:17.031+0545 I NETWORK Socket recv() errno:10054 An existing connection was forcibly closed by the remote host. 127.0.0.1:27017 2015-09-13T17:11:17.032+0545 I NETWORK SocketException: remote: 127.0.0.1:27017 error: 9001 socket exception [RECV_ERROR] server [127.0.0.1:27017] 2015-09-13T17:11:17.032+0545 I NETWORK DBClientCursor::init call() failed 2015-09-13T17:11:17.036+0545 I NETWORK trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed 2015-09-13T17:11:18.038+0545 W NETWORK Failed to connect to 127.0.0.1:27017, reason: errno:10061 No connection could be made because the target machine actively refused it. 2015-09-13T17:11:18.039+0545 I NETWORK reconnect 127.0.0.1:27017 (127.0.0.1) failed failed couldn't connect to server 127.0.0.1:27017 (127.0.0.1), connection attempt failed
Now let’s connect to localhost:27018 and give the rs.status() command . In the result set it can be seen that one of the two secondary nodes becomes primary. Whenever the previously shutdown node becomes up, it will be secondary.
C:\>mongo --port 27018 MongoDB shell version: 3.0.3 connecting to: 127.0.0.1:27018/test Testing:PRIMARY> rs.status(); { "set" : "Testing", "date" : ISODate("2015-09-13T11:27:09.780Z"), "myState" : 1, "members" : [ { "_id" : 0, "name" : "localhost:27017", "health" : 0, "state" : 8, "stateStr" : "(not reachable/healthy)", "uptime" : 0, "optime" : Timestamp(0, 0), "optimeDate" : ISODate("1970-01-01T00:00:00Z"), "lastHeartbeat" : ISODate("2015-09-13T11:27:09.580Z"), "lastHeartbeatRecv" : ISODate("2015-09-13T11:26:15.452Z"), "pingMs" : 0, "lastHeartbeatMessage" : "Failed attempt to connect to localhost:27017; couldn't connect to server localhost:27017 (127.0.0.1), connection attempt failed", "configVersion" : -1 }, { "_id" : 1, "name" : "localhost:27018", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 6208, "optime" : Timestamp(1442139989, 2), "optimeDate" : ISODate("2015-09-13T10:26:29Z"), "electionTime" : Timestamp(1442143577, 1), "electionDate" : ISODate("2015-09-13T11:26:17Z"), "configVersion" : 1, "self" : true }, { "_id" : 2, "name" : "localhost:27019", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 5179, "optime" : Timestamp(1442139989, 2), "optimeDate" : ISODate("2015-09-13T10:26:29Z"), "lastHeartbeat" : ISODate("2015-09-13T11:27:09.476Z"), "lastHeartbeatRecv" : ISODate("2015-09-13T11:27:09.475Z"), "pingMs" : 0, "configVersion" : 1 } ], "ok" : 1 }
To see what happens when the previous node is down but one document is inserted into the newly elected primary node. Its very simple, whenever the previously down server becomes up it becomes as secondary node and sync all the data (oplog.rs) that is in the new elected primary nodes. Let’s see this scenario in the following section.
- Inserting a movie document into the new primary node.
- Testing:PRIMARY> db.movies.insert({name: “The Day After Tomorrow”, releasedyear: 2004});
- WriteResult({ “nInserted” : 1 })
- Make the previously down node to up
- C:\>start mongod –replSet Testing –logpath \data\rs1\1.log –dbpath \data\rs1 –port 27017 –smallfiles –oplogSize 64
Connect to the localhost:27017 (which was down before) and you can see all the documents including the inserted documents when it was down
C:\>mongo --port 27017 MongoDB shell version: 3.0.3 connecting to: 127.0.0.1:27017/test Testing:SECONDARY> rs.slaveOk(); Testing:SECONDARY> db.movies.find(); { "_id" : ObjectId("55f54f55524898bbac0e5792"), "name" : "Up", "releasedyear" : "2009" } { "_id" : ObjectId("55f55e78e57ac7df5b34d4cf"), "name" : "The Day After Tomorrow", "releasedyear"
- Lastly, you can check the oplog.rs file and if you match it primary node. Its similar as primary node.
Testing:SECONDARY> use local switched to db local Testing:SECONDARY> show collections me oplog.rs replset.minvalid startup_log system.indexes system.replset Testing:SECONDARY> db.oplog.rs.find().pretty(); { "ts" : Timestamp(1442138448, 1), "h" : NumberLong(0), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } } { "ts" : Timestamp(1442139989, 1), "h" : NumberLong("-5383414929107714031"), "v" : 2, "op" : "c", "ns" : "test.$cmd", "o" : { "create" : "movies" } } { "ts" : Timestamp(1442139989, 2), "h" : NumberLong("-3864814140778310651"), "v" : 2, "op" : "i", "ns" : "test.movies", "o" : { "_id" : ObjectId("55f54f55524898bbac0e5792"), "name" : "Up", "releasedyear" : "2009" } } { "ts" : Timestamp(1442143864, 1), "h" : NumberLong("2305258910925694865"), "v" : 2, "op" : "i", "ns" : "test.movies", "o" : { "_id" : ObjectId("55f55e78e57ac7df5b34d4cf"), "name" : "The Day After Tomorrow", "releasedyear" : 2004 } }
That’s the end of this article. I believe you understand the basic set up and configuration of replica sets in MongoDB. After reading and exercising this article you can try replica sets with separate computer or if possible geographically separated servers that will reflect the real implementation of the replica sets but don’t forget to manage your firewall configuration in that case. Happy Coding!!! Cheers!!!