In the last blog post we understand how DB Index can improve performance. In this post we will go through Primary Index, Cluster Index.
Primary Index
In a table or a collection a Primary Index in an index on a set of keys that includes the unique primary key. For example, in a user table the id key is unique so database creates a primary index for that key by default. You can create primary index by yourself for key like, username, email etc.
Let’s create a user table.
id | name | username | |
1 | Mizanur Lahin | lahin31 | lahin@gmail.com |
2 | Desh Chowdhury | desh | desh@gmail.com |
3 | Mujammal Ahmed | mujammal | mujammal@gmail.com |
4 | Sadikul Islam | sadik | sadik@gmail.com |
In this table the id is the primary key so the id is set to primary index by default.
How index table look like,
block | id |
1 | 1 |
2 | 3 |
The index table holds block reference and another one is the actual id of the user.
This is also called Sparse.
Cluster Index
Cluster Index is an index that need to be sorted and the index value will be repeated to different rows of the database table. Let’s look at an example of it,
id | name | department |
1 | Lahin | Computer Science |
2 | Shaju | Computer Science |
3 | Sadik | Economics |
4 | Nafisa | Economics |
5 | Rasel | English |
As you can see the Computer Science and Economics department is repeated. Doesn’t matter it is repeatable or not, in index table this will create a single row for Computer Science and Economics as well as rest of them. So the index table will be like this,
block | department |
1 | Computer Science |
1 | Economics |
2 | English |
Our block and index table,
This is also called Dense.
Since the Economics Department exists in two blocks, Block 1 and Block 2, that’s why Block 1 has a pointer, which is called Block Hanker, that points to Block 2.
In Cluster Index the actual value of index attribute can be repeated and thats why there are multiple reference to a single block from index table. This is why it is called Dense.