Meteor.js and MongoDB Replica Set for Oplog Tailing

Meteor updated its live database driver, which is responsible for real-time updates and is one of Meteor's great features. If someone changes a document in MongoDB through the Meteor application connected to it, that application can easily inform every client subscribed to the data and update its UI. But what if the data changes in the database through another process? The Meteor application does not notice this immediately, but there is a mechanism that lets Meteor detect the change anyway: it checks every 10 seconds whether the data has changed (the poll-and-diff strategy) and, if so, updates the clients that rely on this specific data. 10 seconds is quite good but not really real time, which is why the Meteor team had an optimization on the roadmap from the beginning. This optimization is now released and works with so-called 'oplog tailing'.
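To make the poll-and-diff idea more concrete, here is a minimal sketch in plain JavaScript. It is not Meteor's actual implementation; `diffSnapshots`, `startPolling`, `fetchDocs` and `notify` are hypothetical names I made up for illustration, and the comparison via `JSON.stringify` is deliberately naive.

```javascript
// Compare two snapshots of a query result (arrays of documents with an _id)
// and report which documents were added, changed, or removed.
function diffSnapshots(previous, current) {
  const previousById = new Map(previous.map((doc) => [String(doc._id), doc]));
  const currentById = new Map(current.map((doc) => [String(doc._id), doc]));
  const changes = { added: [], changed: [], removed: [] };

  for (const [id, doc] of currentById) {
    if (!previousById.has(id)) {
      changes.added.push(doc);
    } else if (JSON.stringify(previousById.get(id)) !== JSON.stringify(doc)) {
      // Naive comparison: sensitive to key order, good enough for a sketch.
      changes.changed.push(doc);
    }
  }
  for (const [id, doc] of previousById) {
    if (!currentById.has(id)) changes.removed.push(doc);
  }
  return changes;
}

// Hypothetical polling loop: fetchDocs re-runs the query,
// notify pushes the detected changes to subscribed clients.
function startPolling(fetchDocs, notify, intervalMs = 10000) {
  let previous = fetchDocs();
  return setInterval(() => {
    const current = fetchDocs();
    const changes = diffSnapshots(previous, current);
    if (changes.added.length || changes.changed.length || changes.removed.length) {
      notify(changes);
    }
    previous = current;
  }, intervalMs);
}
```

The weakness is easy to see: a change made right after a poll is only noticed at the next one, and every poll re-runs the whole query even if nothing changed.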

## What is oplog tailing?

This is the wrong question to ask in the first place. Before we answer it, we need to know…

## What is a MongoDB Replica Set?

This basically means that you do not run only one MongoDB process but several, and every MongoDB process keeps a copy of the same data in its databases. This is quite important for production environments. If you only have one MongoDB process running and it crashes, your whole application would not be able to run anymore.

So we want to run multiple MongoDB processes in a replica set, and one process of the replica set is the so-called primary. All other MongoDB processes are called secondaries. Reads and writes from a client typically go only to the primary process of the replica set. There are more options than just a primary and secondaries, but we will only look at this configuration. We also want to have one primary with exactly two secondaries. This way we run three MongoDB processes, which is a good number for redundancy. If the primary becomes unavailable, the replica set holds an election and one of the secondaries becomes the new primary. So this is good so far, but…

## What is oplog tailing?

We now have three MongoDB processes running (I'll show you how this works in practice in this article, too) and each of them has the same data set; we know that now. But how on earth do the secondaries know exactly what the data set of the primary looks like and sync their own data set, all in real time? This is where the oplog comes into play. The oplog is the operations log, a collection that keeps track of all operations that modify data. If someone inserts a new document into a collection, this operation is stored in the oplog collection. The secondary processes simply copy the changes in the oplog from the primary to their own oplog collection and update their data set. And this oplog is exactly what we need, and one of the reasons why we need the replica set, because now our Meteor application, or multiple Meteor applications, can watch the oplog as well and immediately inform clients that rely on the data.
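The replay idea can be sketched in a few lines of plain JavaScript. This is a heavily simplified model and not how mongod works internally: `applyOplog` is a hypothetical helper, the data set is just nested Maps, and updates are treated as full replacements, whereas real oplog updates can also carry modifiers.

```javascript
// Apply a list of simplified oplog entries to an in-memory data set.
// dataSet maps a namespace ("db.collection") to a Map of _id -> document.
function applyOplog(dataSet, entries) {
  for (const entry of entries) {
    if (!dataSet.has(entry.ns)) dataSet.set(entry.ns, new Map());
    const collection = dataSet.get(entry.ns);
    switch (entry.op) {
      case 'i': // insert: "o" holds the new document
        collection.set(entry.o._id, entry.o);
        break;
      case 'u': // update: "o2" identifies the document, "o" is treated as a replacement here
        collection.set(entry.o2._id, { ...entry.o, _id: entry.o2._id });
        break;
      case 'd': // delete: "o" identifies the document
        collection.delete(entry.o._id);
        break;
      default:
        // Other operation types (commands, no-ops) are ignored in this sketch.
        break;
    }
  }
  return dataSet;
}

// Made-up operations as the primary might have logged them.
const primaryOplog = [
  { op: 'i', ns: 'app.posts', o: { _id: 1, title: 'Hello' } },
  { op: 'i', ns: 'app.posts', o: { _id: 2, title: 'World' } },
  { op: 'u', ns: 'app.posts', o2: { _id: 1 }, o: { title: 'Hello again' } },
  { op: 'd', ns: 'app.posts', o: { _id: 2 } },
];

// A "secondary" that starts empty and replays the log ends up with the same data.
const secondary = applyOplog(new Map(), primaryOplog);
console.log([...secondary.get('app.posts').values()]);
```

A Meteor application tailing the oplog is in a similar position to the secondary here: it reads the same stream of operations, but instead of writing them to disk it works out which subscribed clients are affected.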

## Okay! How do I run a replica set?

First we need to start three 'mongod' processes from our terminal, and for each one we have to specify the name of the replica set, an individual port, an individual db path and a log path. Note that only the name of the replica set must be the same for each mongod process. I start my three mongod processes like this:

mongod --replSet msrs --port 27017 --dbpath ~/Workspace/priv/manuel-schoebel.com/db/rs1/ --fork --logpath ~/Workspace/priv/manuel-schoebel.com/db/rs1/rs1.log

mongod --replSet msrs --port 27016 --dbpath ~/Workspace/priv/manuel-schoebel.com/db/rs2/ --fork --logpath ~/Workspace/priv/manuel-schoebel.com/db/rs2/rs2.log

mongod --replSet msrs --port 27015 --dbpath ~/Workspace/priv/manuel-schoebel.com/db/rs3/ --fork --logpath ~/Workspace/priv/manuel-schoebel.com/db/rs3/rs3.log
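Since the three commands differ only in the port and the directory, you could also generate them instead of copying and editing them by hand. The following is just a sketch: `buildMongodCommand` is a hypothetical helper and the base directory `~/db` is a placeholder, not my real path. It only builds the command strings and does not start anything.

```javascript
// Build the mongod start command for one replica set member.
// Data directory and log file are derived from the same member index.
function buildMongodCommand(replSet, baseDir, index, port) {
  const dir = `${baseDir}/rs${index}`;
  return `mongod --replSet ${replSet} --port ${port} --dbpath ${dir}/ --fork --logpath ${dir}/rs${index}.log`;
}

// Placeholder base directory; replace it with your own path.
const commands = [27017, 27016, 27015].map((port, i) =>
  buildMongodCommand('msrs', '~/db', i + 1, port)
);
commands.forEach((command) => console.log(command));
```

Deriving the db path and the log path from the same index makes sure that every member gets its own directory and its own log file.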

You may see a warning in your log, but you can ignore it on your local development machine. Check whether this warning also appears in your production environment; in my case it did not.

** WARNING: soft rlimits too low. Number of files is 256, should be at least 1000

Now that our mongod processes are running, we can double-check this in our terminal by typing:

ps aux | grep mongo

Next we want to connect to our primary mongod process and we choose the first one on port 27017 for that. We connect to it with our mongo shell like this:

mongo localhost:27017

Now we can create our replica set, type this into your mongo shell:

config = {_id: 'msrs', members: [{_id: 0, host: 'localhost:27017'}, {_id: 1, host: 'localhost:27016'}, {_id: 2, host: 'localhost:27015'}]};
 
rs.initiate(config);
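If you have more than a handful of members, typing the config by hand gets error-prone (a single missing quote is enough to break it). Here is a small sketch of how the same document could be built from a list of hosts. `buildReplicaSetConfig` is a hypothetical helper, not part of the mongo shell, and the code is written for Node.js, so the syntax may need adjusting for older shells.

```javascript
// Build the configuration document that rs.initiate() expects:
// the replica set name as _id and one numbered entry per member.
function buildReplicaSetConfig(name, hosts) {
  return {
    _id: name,
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

const generatedConfig = buildReplicaSetConfig('msrs', [
  'localhost:27017',
  'localhost:27016',
  'localhost:27015',
]);
console.log(JSON.stringify(generatedConfig, null, 2));
```

The printed object has the same shape as the config typed into the shell above, so it could be passed to rs.initiate() in the same way.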

Our replica set will be initiated with our specified members and this could take some time. We can check the status with:

rs.status();
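The output of rs.status() is fairly long. What we mostly care about is the `members` array and each member's `stateStr`. As a sketch, this is how that output could be boiled down in plain JavaScript. `summarizeStatus` is a hypothetical helper, and `sampleStatus` is a hand-written, heavily abridged example, not real output from my machine.

```javascript
// Reduce a rs.status()-like document to the information we care about.
function summarizeStatus(status) {
  const members = status.members.map((member) => ({
    name: member.name,
    state: member.stateStr,
    healthy: member.health === 1,
  }));
  const primary = members.find((member) => member.state === 'PRIMARY');
  return {
    set: status.set,
    primary: primary ? primary.name : null,
    secondaries: members
      .filter((member) => member.state === 'SECONDARY')
      .map((member) => member.name),
    // Ready means: there is a primary and every member is a healthy primary or secondary.
    ready:
      Boolean(primary) &&
      members.every(
        (member) => member.healthy && ['PRIMARY', 'SECONDARY'].includes(member.state)
      ),
  };
}

// Abridged, hand-written example: the third member is still starting up.
const sampleStatus = {
  set: 'msrs',
  members: [
    { _id: 0, name: 'localhost:27017', health: 1, stateStr: 'PRIMARY' },
    { _id: 1, name: 'localhost:27016', health: 1, stateStr: 'SECONDARY' },
    { _id: 2, name: 'localhost:27015', health: 1, stateStr: 'STARTUP2' },
  ],
};
console.log(summarizeStatus(sampleStatus));
```

As long as a member reports a state other than PRIMARY or SECONDARY, the set is still settling and it is worth waiting a moment before going on.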

Okay, if everything worked as expected we should now have a replica set up and running. Now we start with the Meteor-related part, and the first step is that we need a special user for Meteor's oplog access that we can call something like 'oplogger'.

Make sure you are logged in on your primary mongod instance. You can verify that by typing 'db.isMaster()' on your mongo shell. You should see on your mongo shell:

msrs:PRIMARY>

and not

msrs:SECONDARY>

If this is not the case, you can configure which member of your replica set is preferred as primary by giving it a higher priority than the others:

config.members[0].priority = 1
config.members[1].priority = 0.5
config.members[2].priority = 0.5
rs.reconfig(config)

Then the first member should become the primary, and this is our mongod process running on port 27017 that we connected to.

We do not really need to, but we check if there are users already:

show users

On a clean MongoDB installation there should be none. This command shows the users of whichever database you are currently using. We want to create our user in the special 'admin' database, so we switch to this database with:

msrs:PRIMARY> use admin

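Again, there will be no users yet. Now we add the oplogger to the admin database. Note that db.addUser is the helper from older MongoDB versions; it was deprecated in MongoDB 2.6 in favor of db.createUser.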

    msrs:PRIMARY> db.addUser({user:'oplogger',pwd:'YOUR_PASSWORD',roles:[],otherDBRoles:{local:["read"]}})

If you now type 'show users' you will see the oplogger. The oplogger has the right to read everything that is written to the local database. Every mongod instance has its own local database, and in this database the oplog lives in the local.oplog.rs collection. Every operation is logged in this collection, and if you inspect it (e.g. with RoboMongo) you will see a document that looks like this:

{
  "ts" : Timestamp(1389778589, 1),
  "h" : -6184987159182693880,
  "v" : 2,
  "op" : "i",
  "ns" : "admin.system.users",
  "o" : {
    "_id" : ObjectId("52d6569ddfa5a7b581dbe963"),
    "user" : "oplogger",
    "pwd" : "552d22594bb530f7c56c087830f9e08d",
    "roles" : [],
    "otherDBRoles" : {
      "local" : [
        "read"
      ]
    }
  }
}

This was the insert ("op": "i") of our 'oplogger' user.
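The "op" and "ns" fields are the ones to look at first when reading the oplog. Here is a small sketch that turns them into something more readable. `describeOplogEntry` and `OPERATION_NAMES` are hypothetical names of mine, and the entry below only keeps the two fields the helper needs.

```javascript
// Short operation codes used in oplog entries.
const OPERATION_NAMES = {
  i: 'insert',
  u: 'update',
  d: 'delete',
  c: 'command',
  n: 'no-op',
};

// Split the namespace ("database.collection") at the first dot
// and translate the operation code into a readable name.
function describeOplogEntry(entry) {
  const dot = entry.ns.indexOf('.');
  return {
    operation: OPERATION_NAMES[entry.op] || 'unknown',
    database: dot === -1 ? entry.ns : entry.ns.slice(0, dot),
    collection: dot === -1 ? '' : entry.ns.slice(dot + 1),
  };
}

// Only the fields relevant for this sketch, taken from the entry above.
console.log(describeOplogEntry({ op: 'i', ns: 'admin.system.users' }));
```

Splitting at the first dot matters because collection names like system.users contain dots themselves.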

Okay, that looks good. So now we can start our Meteor app backed by a MongoDB replica set with oplog tailing enabled. We need to set the environment variable MONGO_OPLOG_URL like this:

MONGO_OPLOG_URL='mongodb://oplogger:YOUR_PASSWORD@localhost:27017,localhost:27016,localhost:27015/local?authSource=admin'
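One thing to watch out for: if your password contains characters like '@', ':' or '/', they have to be percent-encoded inside the URL. A minimal sketch of how the URL could be assembled in JavaScript follows. `buildOplogUrl` is a hypothetical helper; Meteor itself only reads the finished environment variable.

```javascript
// Assemble a MONGO_OPLOG_URL for the "local" database of a replica set.
// User and password are percent-encoded so special characters cannot break the URL.
function buildOplogUrl({ user, password, hosts, authSource = 'admin' }) {
  const credentials = `${encodeURIComponent(user)}:${encodeURIComponent(password)}`;
  const hostList = hosts.join(',');
  return `mongodb://${credentials}@${hostList}/local?authSource=${encodeURIComponent(authSource)}`;
}

const oplogUrl = buildOplogUrl({
  user: 'oplogger',
  password: 'YOUR_PASSWORD',
  hosts: ['localhost:27017', 'localhost:27016', 'localhost:27015'],
});
console.log(`MONGO_OPLOG_URL='${oplogUrl}'`);
```

With the placeholder password this gives the same URL as above; with a password such as `p@ss` the '@' would be written as `%40`.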

If you do all of this you should have your Meteor application up and running, backed by a replica set. If you have a serious project you should run your primary and secondary mongod processes each on a different server, of course.

## Start your mongod processes from an upstart script

If you are on your production server you also want to start the mongod processes for your replica set with an upstart script. I found a good one here.

Make sure that you remove the default /etc/mongodb.conf file so that the default mongod instance is not started as well. Also make sure that the database folders for your mongod instances are owned by the mongodb user. You can do this with chown:

sudo chown -R mongodb:mongodb /path/to/dbs

If you want to dig deeper into all of this, here are some good resources you might want to read:
