Database Performance: n+1 query problem

To make your query more performant we should avoid using n+1 query. It refers to instead of retrieving all data from database with a single query, it executes a separate query for each record.

This ultimately leads significant increase of number of query. Then it will take time to execute each query. Hence latency will be increased.

Let’s take a look of a scenario. We have a users table,

idname
1Muhammad Lahin
2Sadikul Islam
3Shaju Miah

(for example purposes I used two columns)

Then we have a posts table,

idtitlecreator
1This is my awesome post2
2Hello world.1
3I love software engineering3

Now let’s say we want to fetch all the post for each user. First starts with n+1 query,

const users = "SELECT * FROM users";
const posts = [];

users.forEach(user => {
  const post = `SELECT * FROM posts WHERE creator=${user.id}`;
  posts.push({ ... post, ... user });
});

Now, for each user, we are executing a specific query. If we have 2,000,000 users, we are executing 2,000,000 queries. That’s the reason why it is called an n+1 query, where n is the number of queries, in our case 2,000,000, and +1 is the query that we executed to get all users.

Now let’s write the query using JOIN. Join is a technique that helps us to query instead of one by one, it batches the all query and process at once.

const posts = `SELECT posts.title as postTitle, users.name as creatorName
FROM posts 
JOIN users ON posts.creator = users.id`;

This is how we can batch process query. We can different kind of batch processing using different ORM or ODM. For MongoDB you can use $in operator.

const user_ids = db.users.find().map(user => user.id);
const userPosts = db.posts.find({ creator: { $in: user_ids } });

Leave a Comment

Your email address will not be published. Required fields are marked *