In this post we are going to understand some basic concepts of database engineering. Those will help us to know what is actually happening at lower level.
Generally Databases identify each row in a table with uniquely identifier called row id. MySQL uses primary key as row id. Postgresql has ctid which acts as row id.
Row based databases usually store their rows in a fixed size location inside the disk. Lets say for example, we have MySQL database and the storage engine is innoDB. Our user table has 9 rows.
Inside the disk we have isolated memory location which is called page, since the storage engine is innoDB each page has fixed size 16kb.
It is a tree based Data Structure where all pages are stored.
Fetching something from heap is generally costly and this is the reason why we have indexing.
Generally data is stored inside disk as permanent storage and heap as temporary storage.
I/O, or input/output, refers to an operation that is a read request to the heap. An i/o can fetch one or more pages. It is better to minimize the expensive i/o or use some optimisation technique like indexing.
Database Indexing is one of the most common technique to make database query faster. In the heap, there is one or more index table. In the table, there is a pointer which points to a particular column’s page number. Rather than traversing the whole pages with the help pointer coming from index table we can fetch particular page.
If you want to know more about indexing.