During my university course Web and Cloud Computing (WACC) at the RUG, we were given the task of building a distributed and scalable web application that runs in a cluster. Since Docker is quite a nice environment for running single applications, we decided to run the whole application, with all its services, in a Docker swarm. As the new Docker swarm mode was just around the corner, we started using it and really liked it, also because it synchronizes over multiple nodes. The whole application is public at https://github.com/timonback/wacc.

For the application we decided to use multiple technologies, such as MongoDB, Cassandra, Play Framework (for Scala) and a bit of Java.

Database

Because we spent quite a lot of time figuring out how to use Docker swarm mode for the databases, here are the commands to run them on a cluster:

docker service create --name cassandra --network services --endpoint-mode dnsrr --mode global -e 'CASSANDRA_SEEDS=auto' -e "SEEDS_COMMAND=nslookup tasks.cassandra | awk '/^Address: / {print \$2}' | paste -d, -s -" -e 'CASSANDRA_BROADCAST_ADDRESS=auto' -e 'CASSANDRA_LISTEN_ADDRESS_COMMAND=hostname -i' -e 'CASSANDRA_BROADCAST_ADDRESS_COMMAND=hostname -i' --mount type=bind,target=/var/lib/cassandra,source=/vol/cassandra -e MAX_HEAP_SIZE=150M -e HEAP_NEWSIZE=100M webscam/cassandra:3.7

docker service create --mode global --network services --endpoint-mode dnsrr --mount type=bind,source=/vol/mongo/data,target=/data/db --mount type=bind,source=/vol/mongo/config,target=/data/configdb --name mongo mongo mongod --replSet mongoReplica

For both databases, we bind-mount directories under the /vol folder, which you need to create yourself on each node:

sudo mkdir -p /vol
sudo chown ubuntu /vol
sudo chgrp ubuntu /vol
mkdir -p /vol/cassandra
mkdir -p /vol/mongo /vol/mongo/data /vol/mongo/config
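Besides the /vol directories, both service commands above also attach to a network called services, which has to exist before the services are created. A minimal sketch of how that could be set up, assuming an overlay network with default settings is enough (the helper name ensure_overlay_network is made up for this example):

```shell
# Hypothetical helper: create the overlay network only if it is missing.
# Assumes it is run on a swarm manager node.
ensure_overlay_network() {
  name="$1"
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network $name already exists"
  else
    docker network create --driver overlay "$name"
  fi
}
```

Calling ensure_overlay_network services once on a manager node before the two docker service create commands should be sufficient.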

Cassandra

Cassandra now runs out of the box. Requests get distributed in a round-robin way (--endpoint-mode dnsrr). Remark: we reduced Cassandra's memory manually (MAX_HEAP_SIZE and HEAP_NEWSIZE), but this is truly optional and depends on your needs.
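To see what the SEEDS_COMMAND in the Cassandra service does, here is a small sketch that runs the same awk/paste pipeline on canned input. The nslookup output and the IP addresses below are invented for illustration; in the real service the input comes from nslookup tasks.cassandra inside the container.

```shell
# Made-up nslookup output for tasks.cassandra (addresses are examples only).
# The resolver's own "Address:" line uses a tab, so the pattern
# '^Address: ' (with a space) skips it and matches only the task IPs.
sample_nslookup() {
  printf 'Server:\t\t127.0.0.11\n'
  printf 'Address:\t127.0.0.11#53\n'
  printf '\n'
  printf 'Name:\ttasks.cassandra\n'
  printf 'Address: 10.0.0.3\n'
  printf 'Name:\ttasks.cassandra\n'
  printf 'Address: 10.0.0.4\n'
}

# Same pipeline as in SEEDS_COMMAND: pick the IPs, join them with commas.
seeds_from_nslookup() {
  awk '/^Address: / {print $2}' | paste -d, -s -
}

sample_nslookup | seeds_from_nslookup
```

The result is a comma-separated list of all Cassandra task IPs, which is the format used for the seed list.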

MongoDB

MongoDB needs a little extra configuration after the first start, unfortunately. Basically, the replica set needs to be initialized on one instance, and the others need to join it, so that the data is replicated between all MongoDB instances.

MongoDB communicates via the IP addresses of the individual instances, so first the IPs of all MongoDB instances are needed. Luckily, Docker does not make this too hard. Just run the following command for all MongoDB instances (you might have to run it on each specific node, since docker exec only reaches containers on the local node, not on other nodes in the swarm/cluster). Always replace the strings marked with asterisks (*) with the values from your configuration:

docker exec -it *container-name* hostname -i
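If there are several containers to check, the lookup above can be wrapped in a loop. This is only a sketch: the function name list_local_mongo_ips is made up, and it assumes the container names on the node contain "mongo", as they do for a service named mongo.

```shell
# Hypothetical helper: print "<container-name> <ip>" for every
# mongo container running on the local node.
list_local_mongo_ips() {
  docker ps --filter "name=mongo" --format '{{.Names}}' |
    while read -r name; do
      ip="$(docker exec "$name" hostname -i)"
      printf '%s %s\n' "$name" "$ip"
    done
}
```

Like docker exec itself, list_local_mongo_ips only sees the local node, so it still has to be run once per node.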

So on the first node (the master), you create the replica set. For that, first open the Mongo shell and then run the initiate command:

docker exec -it mongo.*container-name* mongo

rs.initiate({ _id: "mongoReplica", members: [{ _id: 0, host: "*ip-of-container*:27017" }], settings: { getLastErrorDefaults: { w: "majority" }}})

Then there are two options: either add the other nodes right away from this shell, or add them from their own systems (which might be interesting for automated scripts). To add the others directly, use this:

rs.add("*ip-of-new-node*")

Or, to add a node from that specific node, open a MongoDB shell there (connected to the master) and then add the IP:

docker exec -it *mongo-container-name* mongo --host *ip-of-prime*

rs.add("*ip-of-new-node*")
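For automated scripts it may be handy to generate the initiate command from a list of IPs instead of typing it. The following is a rough sketch, not what we used in the project: build_rs_initiate is a made-up helper, it lists all members up front instead of adding them one by one, and it leaves out the settings block from the command above.

```shell
# Hypothetical helper: print an rs.initiate(...) call for the given
# replica set name and member IPs (port 27017 is assumed for all).
build_rs_initiate() {
  set_name="$1"
  shift
  members=""
  id=0
  for ip in "$@"; do
    if [ -n "$members" ]; then
      members="$members, "
    fi
    members="$members{ _id: $id, host: \"$ip:27017\" }"
    id=$((id + 1))
  done
  printf 'rs.initiate({ _id: "%s", members: [%s] })\n' "$set_name" "$members"
}
```

For example, build_rs_initiate mongoReplica followed by the collected IPs prints a single line that can be pasted into the Mongo shell on the first node.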

Now MongoDB is all set to run directly in the Docker cluster. However, there might be some trouble when a Mongo node goes down and rejoins with a different IP address. This is rather a problem in MongoDB (since it is no problem in Cassandra), and hopefully they implement a feature for that. Until then, manual reconfiguration might be possible.
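One way such a manual reconfiguration could look is to remove the stale member and add the new address from a shell connected to the current primary. We have not verified this in our setup, so treat it as an untested sketch; build_member_swap is a made-up helper that only prints the two Mongo shell commands (rs.remove and rs.add), assuming the default port 27017.

```shell
# Hypothetical helper: print the Mongo shell commands that swap a
# member's old address for its new one. Nothing is executed here.
build_member_swap() {
  old_ip="$1"
  new_ip="$2"
  printf 'rs.remove("%s:27017")\n' "$old_ip"
  printf 'rs.add("%s:27017")\n' "$new_ip"
}
```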

Conclusion

Docker swarm mode is quite a nice tool, but the first configuration takes a while to figure out. Still, it is definitely worth a try, especially in a clustered environment where docker-compose is no longer an option.