MongoDB: Introduction
MongoDB (from “humongous”) is an open-source document-oriented NoSQL database. Much like relational databases have rows stored in tables, MongoDB has “documents” stored in “collections”. Documents are essentially blobs of JSON data. In a relational database, all rows in the same table have the same set of columns, but in MongoDB the documents in a collection can have different schemas (though quite often you find some minimal structure that is shared among all documents in a single collection; if nothing else, there is at least the mandatory _id field in every document).
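As a minimal sketch of this schema flexibility, using the Python driver (pymongo); the database and collection names here are made up for illustration:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["example"]  # hypothetical database name

# Two documents in the same collection with different fields;
# MongoDB accepts both without any schema declaration.
db.people.insert_one({"name": "Alice", "email": "alice@example.com"})
db.people.insert_one({"name": "Bob", "nickname": "bobby", "age": 42})

# Every document still gets the mandatory _id field.
for doc in db.people.find():
    print(doc["_id"], doc)
```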
As the name suggests, MongoDB is engineered to allow really large databases. MongoDB attempts to improve scaling compared to relational databases using two trade-offs, namely:
- store information in denormalized form to enable horizontal partitioning (shards), and
- offload work from the database servers to the frontend servers wherever possible.
For example, if you’re implementing a social network where each user publishes a stream of 140-char “posts”, in MongoDB you could¹ naively just embed the posts as items in an array on the user document. Instead of JOINing a user table with a posts table, the user document already contains everything needed to render a timeline that shows all the posts for that user. The benefit of this denormalized form, where you have “everything you need” in a single document, is that you can split up the document index among N machines and run queries N times faster (horizontal partitioning / sharding). Notably, the machine that “owns” a particular document can handle both reads from and writes to that document.
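A rough sketch of the embedded design in pymongo (the database name, username and post fields are illustrative assumptions):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["socialnet"]  # hypothetical database name

# The posts live inside the user document itself.
db.users.insert_one({
    "username": "foo",
    "posts": [
        {"text": "my first post"},
        {"text": "another 140-char musing"},
    ],
})

# Rendering the timeline is a single query, no JOIN needed.
user = db.users.find_one({"username": "foo"})
for post in user["posts"]:
    print(post["text"])
```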
Splitting a table among N machines like that is much harder to do in a relational database, because often your primary table is related to a bunch of other secondary tables, possibly in many-to-one mappings, and after splitting the primary table into N shards the database has to figure out which rows in the secondary tables should live on each shard. However, since this was a many-to-one mapping there will be rows that need to live on multiple shards, and then which one of these “copies” is the master copy? And JOINing data from several of the N machines will be slow and defeats the purpose of the N-split.
Traditionally, the easy way to scale up a relational database is to start with read-write splitting so that all writes go to a master node. The master node is then continuously replicated out to N read nodes and all incoming read queries are scheduled round-robin over the N read nodes. However, this only works if the number of writes you’re getting can be handled by one (really high-end) machine. Of course, relational databases also have support for proper sharding where writes are handled by N machines, but it’s quite complicated to set up and, most importantly, the way you set it up depends a lot on your database schema and your workload. This is the point where you pay a lot of money to hire a good (expensive!) DBA.
While the first trade-off listed above was largely about “avoid joins because they make horizontal partitioning hard”, the truth is that you cannot really avoid joins for real-world data. The second trade-off (i.e. that MongoDB offloads more work to the frontend servers compared to relational databases) is about how MongoDB deals with the remaining “joins”.
Suppose we’re still implementing our own social network, but we want to change the design a bit. Now we want the posts to have a title and allow them to be much longer. If we still embedded the posts on the user document, we’d soon run into MongoDB’s maximum document size of 16MB, so we need to move the posts to their own collection. At this point we could use either child-referencing (letting the posts field in the user document be an array of document IDs from the posts collection) or parent-referencing (letting each post in the posts collection have an author field that contains a document ID from the user collection) to connect the collections.
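Sketched with pymongo (the posts and author field names follow the text; everything else is an assumed layout):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["socialnet"]  # hypothetical database name

user_id = db.users.insert_one({"username": "foo", "posts": []}).inserted_id

# Parent-referencing: each post points at its user via an "author" field.
post_id = db.posts.insert_one({
    "title": "Hello world",
    "body": "A much longer post...",
    "author": user_id,
}).inserted_id

# Child-referencing: the user document keeps an array of post IDs.
db.users.update_one({"_id": user_id}, {"$push": {"posts": post_id}})
```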
If you anticipate a very large number of documents in a related collection (perhaps it contains log lines or similar) then you should go with parent-referencing, because the IDs alone might make the parent document bloated. One benefit of the child-referencing model is ordering of the children (especially for many-to-one relationships where the ordering cannot be stored in the child collection). For tree-like hierarchical structures where you need to find ancestors/descendants efficiently, you should look into materialized paths or nested sets instead.
Regardless of whether we do child- or parent-referencing, when we want to render a profile page for user FOO that includes (at least) the titles of his/her posts, we need to do another query to get the titles. For child-referencing we wait until we get the post IDs from the first query, and then we use those IDs in the second query (two serialized roundtrips from the frontend server to the mongo server might be slow). For parent-referencing we can send both queries immediately, but it’s still two queries. In both cases we’re effectively doing a JOIN in application code.
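In pymongo the two variants of this application-level JOIN might look roughly like the sketch below (collection and field names as in the earlier examples):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["socialnet"]  # hypothetical database name

# Child-referencing: two serialized round trips -- the second query
# cannot be issued until the first one has returned the post IDs.
user = db.users.find_one({"username": "foo"}, {"posts": 1})
titles = [p["title"] for p in
          db.posts.find({"_id": {"$in": user["posts"]}}, {"title": 1})]

# Parent-referencing: both queries are known up front, but it is
# still two queries, joined in application code.
user = db.users.find_one({"username": "foo"})
titles = [p["title"] for p in
          db.posts.find({"author": user["_id"]}, {"title": 1})]
```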
On one hand, application JOINs are just business as usual for NoSQL. The MongoDB client libraries even have explicit support for them via the DBRef feature, which stores the collection name along with the foreign document ID so that the client library can load them automatically (i.e. much like a foreign key, except it’s actually the MongoDB client library that resolves the reference and not the database server).
However, if JOINing in application code just isn’t fast enough, one other option is to denormalize further by duplicating data and including copies of the post titles directly in the user document. This means that we can now render a profile page for user FOO that includes the titles of his/her posts without the need for a second query. Of course, whenever we update a title we also have to update the user document’s copy of that title, so this change sacrifices some write performance to get better read performance.
In addition to child and parent-referencing, there is also two-way referencing (another form of denormalization). In the example above, we could do both child-referencing and parent-referencing simultaneously, and we might also include a copy of the username in the post document so that the post itself can be loaded without having to query for the parent user document.
It’s worth pointing out that MongoDB 3.2 and onwards does support left-outer equi-joins via the $lookup operator, but it is only available in aggregation pipelines and these are much slower than regular find queries. The idea is that this functionality should only be used for analytics/reporting and not in the main application workload.
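For completeness, a $lookup aggregation along the lines of the running example might look like this in pymongo (MongoDB 3.2+; database and collection names as assumed above):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["socialnet"]  # hypothetical database name

# Left-outer equi-join: attach every post whose "author" field matches
# the user's _id as an embedded "posts" array on the result.
pipeline = [
    {"$match": {"username": "foo"}},
    {"$lookup": {
        "from": "posts",
        "localField": "_id",
        "foreignField": "author",
        "as": "posts",
    }},
]
for user in db.users.aggregate(pipeline):
    print(user["username"], [p["title"] for p in user["posts"]])
```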
Another important point is that MongoDB only offers atomicity at the document level. So when you’re updating some field and then subsequently updating a copy of that field living in another document, MongoDB might be running several read queries in between and thus exposing the “inconsistent state of the database”. It’s also possible that the database crashes or restarts before the second write is finished. Avoiding this type of inconsistency is a big part of why one tends to normalize relational database schemas in the first place.
For updates of multiple documents within the same collection, the $isolated operator can be used to prevent read operations between two write operations, but this locks the entire collection (even if you’re running the WiredTiger storage engine, which in general has document-level locking), effectively making your MongoDB server single-threaded. Also, $isolated doesn’t work on sharded clusters, and if the second update hits an error, the first update is not rolled back.
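For reference, an $isolated multi-document update in pymongo could look like the sketch below (note that $isolated was later deprecated and eventually removed from MongoDB; the "archived" field is just an illustrative example):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["socialnet"]  # hypothetical database name

user_id = db.users.insert_one({"username": "foo"}).inserted_id

# Including $isolated in the filter keeps other clients from seeing a
# partially applied multi-document update -- at the cost of locking the
# whole collection for the duration of the update.
db.posts.update_many(
    {"author": user_id, "$isolated": 1},
    {"$set": {"archived": True}},
)
```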
Since MongoDB does not offer any way to make a group of writes “all or nothing”, you need to implement two-phase commit “in application code” (another example where problems are pushed from the database server to the frontend server). Also, even if you do implement two-phase commit manually, you still would not get “proper transactions” because other read operations can still see partial results.
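A very stripped-down sketch of what such an application-level two-phase commit could look like (the transactions collection, its state field and the balance-transfer scenario are all illustrative assumptions; a production version also needs recovery and rollback logic):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["bank"]  # hypothetical database name


def transfer(src, dst, amount):
    # Phase 0: record the intent in a dedicated transactions collection.
    txn_id = db.transactions.insert_one(
        {"src": src, "dst": dst, "amount": amount, "state": "initial"}
    ).inserted_id
    db.transactions.update_one({"_id": txn_id}, {"$set": {"state": "pending"}})

    # Phase 1: apply the change to both accounts, tagging each document
    # with the transaction id so an interrupted transfer can be detected.
    db.accounts.update_one(
        {"_id": src, "pending": {"$ne": txn_id}},
        {"$inc": {"balance": -amount}, "$push": {"pending": txn_id}},
    )
    db.accounts.update_one(
        {"_id": dst, "pending": {"$ne": txn_id}},
        {"$inc": {"balance": amount}, "$push": {"pending": txn_id}},
    )
    db.transactions.update_one({"_id": txn_id}, {"$set": {"state": "applied"}})

    # Phase 2: clean up the markers and mark the transaction as done.
    db.accounts.update_many({"pending": txn_id}, {"$pull": {"pending": txn_id}})
    db.transactions.update_one({"_id": txn_id}, {"$set": {"state": "done"}})


db.accounts.insert_many([
    {"_id": "alice", "balance": 100, "pending": []},
    {"_id": "bob", "balance": 0, "pending": []},
])
transfer("alice", "bob", 25)
```

Note how a concurrent reader can still observe the intermediate state between the two account updates, which is exactly the “no proper transactions” caveat above.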
1: If our social network also had a view that rendered all posts from all users interleaved in chronological order, then it would be much better to store "posts" in a separate collection. Another caveat is that an arbitrarily large number of posts taken together will be too big for embedding anyway, because MongoDB cannot handle documents larger than 16MB, so even if the posts were max 140 chars the design would break down at around 100k posts.