Building a database engine - Premium package
This contains:
- The 133-page part I
- The 163-page part II
- The 98-page appendix
- Source code to the DB engine (4 different versions)
- Money-back guarantee
You can download a 42-page sample chapter here
In the book, we are building a real database with:
- Storage layer
- WAL
- Data pages
- Buffer pools
- B-Tree and hash-based indexes
Storage layer
Exploring how real database engines store your data.
Starting from a naive CSV-based approach we slowly work our way up to TLV-encoded binary files that can encode any type of data into an efficient, language-agnostic format.
We're going to store column definitions, records, B-Tree indexes, WAL entries, and hash indexes in TLV format.
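To make the TLV (type-length-value) idea concrete, here is a minimal sketch in Go. The type tags, the 1-byte type plus 4-byte little-endian length layout, and the names EncodeTLV and DecodeTLV are assumptions made for illustration, not necessarily the book's format.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Hypothetical type tags; the book's actual values may differ.
const (
	TypeInt64  byte = 1
	TypeString byte = 2
)

// EncodeTLV builds one record: 1-byte type, 4-byte little-endian length, value.
func EncodeTLV(typ byte, value []byte) []byte {
	buf := make([]byte, 5+len(value))
	buf[0] = typ
	binary.LittleEndian.PutUint32(buf[1:5], uint32(len(value)))
	copy(buf[5:], value)
	return buf
}

// DecodeTLV reads one record from the front of buf and returns the unread rest.
func DecodeTLV(buf []byte) (typ byte, value, rest []byte, err error) {
	if len(buf) < 5 {
		return 0, nil, nil, errors.New("tlv: short header")
	}
	n := binary.LittleEndian.Uint32(buf[1:5])
	if uint64(len(buf)-5) < uint64(n) {
		return 0, nil, nil, errors.New("tlv: short value")
	}
	end := 5 + int(n)
	return buf[0], buf[5:end], buf[end:], nil
}

func main() {
	id := make([]byte, 8)
	binary.LittleEndian.PutUint64(id, 42)

	// Two values back to back: an int64 and a string.
	data := append(EncodeTLV(TypeInt64, id), EncodeTLV(TypeString, []byte("alice"))...)

	// The length field tells the decoder where each value ends.
	for len(data) > 0 {
		typ, val, rest, err := DecodeTLV(data)
		if err != nil {
			panic(err)
		}
		fmt.Printf("type=%d len=%d\n", typ, len(val))
		data = rest
	}
}
```

Because every value carries its own type and length, a reader can skip values it does not understand, which is what makes the format variable-length and language-agnostic.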
B-Tree indexes and data pages
Exploring how real database engines handle indexing.
Each table is organized into 4KB data pages. Because the engine reads and writes a whole page at a time, this simple trick reduces the number of I/O operations.
After that, we introduce B-Tree indexes where each node points to a specific page.
Using the index, the engine can locate the right page and read it from the disk without scanning the whole table.
Buffer pools
Exploring how real database engines cache your data.
The buffer pool caches entire data pages instead of individual records or result sets. This technique exploits data locality and requires less I/O.
It is implemented as an LRU (least recently used) cache backed by a linked list and a hash map.
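A rough sketch of such a cache, using Go's standard container/list package for the linked list. The names PageCache, Get, and Put, and the use of integer page IDs, are illustrative assumptions rather than the book's API.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is what each linked-list node holds.
type entry struct {
	pageID int
	data   []byte
}

// PageCache is a hypothetical page-level LRU cache.
type PageCache struct {
	capacity int
	order    *list.List            // front = most recently used
	items    map[int]*list.Element // page ID -> node in order
}

func NewPageCache(capacity int) *PageCache {
	return &PageCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[int]*list.Element),
	}
}

// Get returns a cached page and marks it as most recently used.
func (c *PageCache) Get(pageID int) ([]byte, bool) {
	el, ok := c.items[pageID]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).data, true
}

// Put stores a page and evicts the least recently used one when over capacity.
func (c *PageCache) Put(pageID int, data []byte) {
	if el, ok := c.items[pageID]; ok {
		el.Value.(*entry).data = data
		c.order.MoveToFront(el)
		return
	}
	c.items[pageID] = c.order.PushFront(&entry{pageID: pageID, data: data})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).pageID)
	}
}

func main() {
	cache := NewPageCache(2)
	cache.Put(1, []byte("page 1"))
	cache.Put(2, []byte("page 2"))
	cache.Get(1)                   // page 1 is now the most recently used
	cache.Put(3, []byte("page 3")) // over capacity: evicts page 2

	_, ok := cache.Get(2)
	fmt.Println(ok) // false
}
```

The hash map gives constant-time lookups by page ID, and the linked list gives constant-time reordering and eviction, which is why the two structures are paired.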
Part I
Building a database engine is a 2-part e-book. Part I is 133 pages and contains these chapters:
Storage layer
From a naive CSV-based approach, through a fixed-size format, to a variable-length TLV-based storage system.
TLV implementation
An efficient storage format is the foundation of the entire database engine. In this chapter, we implement reusable TLV encoders and decoders with generics.
Column definitions
Before writing data to tables, the engine needs to handle columns and column options such as nullable.
Project structure
The project will follow package-oriented design. In this chapter, we establish the main packages.
Databases and tables
It's time to create and store databases and tables on the disk. We'll follow Postgres' format. Each database is a folder, each table is a file.
Insert
Implementing insert, which encodes a hash map into a TLV-encoded record and stores it in the table file.
Select
Implementing select. At this stage, it's a full table scan, meaning it reads and decodes the entire table file sequentially.
Delete
Implementing delete, which works the same way as MySQL's delete: it doesn't actually remove the bytes, it only marks the record as "deleted." That is cheaper than physically removing the bytes.
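The mark-as-deleted idea can be sketched as follows. The record layout (a 1-byte flag, a 4-byte length, then the payload) and the helper names are assumptions for illustration, and an in-memory byte slice stands in for the table file.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Hypothetical flag values stored in the first byte of every record.
const (
	flagLive    byte = 0
	flagDeleted byte = 1
)

// appendRecord appends [flag][4-byte length][payload] and returns the
// grown buffer together with the offset where the record starts.
func appendRecord(file, payload []byte) ([]byte, int) {
	off := len(file)
	hdr := make([]byte, 5)
	hdr[0] = flagLive
	binary.LittleEndian.PutUint32(hdr[1:], uint32(len(payload)))
	file = append(file, hdr...)
	file = append(file, payload...)
	return file, off
}

// markDeleted overwrites a single flag byte; the payload stays in place.
func markDeleted(file []byte, off int) {
	file[off] = flagDeleted
}

// scan walks the buffer sequentially and returns only the live payloads.
func scan(file []byte) []string {
	var out []string
	for pos := 0; pos+5 <= len(file); {
		n := int(binary.LittleEndian.Uint32(file[pos+1 : pos+5]))
		if pos+5+n > len(file) {
			break // truncated record, stop scanning
		}
		if file[pos] == flagLive {
			out = append(out, string(file[pos+5:pos+5+n]))
		}
		pos += 5 + n
	}
	return out
}

func main() {
	var file []byte
	file, _ = appendRecord(file, []byte("alice"))
	file, bob := appendRecord(file, []byte("bob"))

	before := len(file)
	markDeleted(file, bob)

	fmt.Println(scan(file))          // [alice]
	fmt.Println(len(file) == before) // true: no bytes were removed
}
```

Deleting touches one byte regardless of record size, and nothing after the record has to be shifted.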
Update
Implementing update, which is a combination of delete and insert.
Part II
Building a database engine is a 2-part e-book. Part II is 163 pages and contains the following chapters. This is where the fun begins.
WAL
Adding a write-ahead log (WAL), which guarantees a certain degree of fault tolerance. Even if the engine fails while inserting, the data won't be lost.
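As a minimal sketch of the write-ahead idea: an entry is appended to the log and synced before the table is touched, and after a crash the log is replayed. The entry layout, the syncWriter interface, and the function names are assumptions made for this example. An in-memory buffer stands in for the log file here, while a real *os.File would satisfy the same interface.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// syncWriter is what the log needs from its file: append bytes, then
// flush them to stable storage. *os.File satisfies this interface.
type syncWriter interface {
	io.Writer
	Sync() error
}

// memLog is an in-memory stand-in for a log file, used only for the demo.
type memLog struct{ bytes.Buffer }

func (m *memLog) Sync() error { return nil }

// appendEntry writes [4-byte length][payload] and syncs before returning,
// so the caller only modifies the table once the entry is durable.
func appendEntry(log syncWriter, payload []byte) error {
	var hdr [4]byte
	binary.LittleEndian.PutUint32(hdr[:], uint32(len(payload)))
	if _, err := log.Write(append(hdr[:], payload...)); err != nil {
		return err
	}
	return log.Sync()
}

// replay calls apply for every complete entry. A torn entry at the tail
// (a crash in the middle of a write) is silently ignored.
func replay(log io.Reader, apply func([]byte)) error {
	r := bufio.NewReader(log)
	for {
		var hdr [4]byte
		if _, err := io.ReadFull(r, hdr[:]); err != nil {
			if err == io.EOF || err == io.ErrUnexpectedEOF {
				return nil
			}
			return err
		}
		payload := make([]byte, binary.LittleEndian.Uint32(hdr[:]))
		if _, err := io.ReadFull(r, payload); err != nil {
			if err == io.EOF || err == io.ErrUnexpectedEOF {
				return nil
			}
			return err
		}
		apply(payload)
	}
}

func main() {
	log := &memLog{}
	appendEntry(log, []byte("insert users id=1"))
	appendEntry(log, []byte("insert users id=2"))

	// Simulate a crash mid-write: the header promises 100 bytes, only 3 follow.
	log.Write([]byte{100, 0, 0, 0, 'i', 'n', 's'})

	replay(bytes.NewReader(log.Bytes()), func(p []byte) {
		fmt.Println("redo:", string(p))
	})
}
```

Running the example replays the two complete entries and drops the torn one.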
Data pages
Organizing each table into 4KB data pages. This is a crucial step to support B-Tree indexes. On top of that, it's more efficient in lots of situations.
B-Tree indexes
In this chapter, we're going to implement a B-Tree based primary index for tables. It will result in blazing fast lookups.
Buffer pool
Adding an LRU-based page-level cache backed by a linked list and a hash map. This will also increase the performance of SELECT queries.
Hash-based full-text indexes
Finally, another type of index. It is backed by a hash map instead of a B-Tree. We'll use this to support some basic full-text search functionality.
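A hash-map-backed index for basic full-text search can be sketched as an inverted index: each word maps to the IDs of the records that contain it. The TextIndex type, the whitespace tokenization, and the lowercasing below are simplifying assumptions, not the book's design.

```go
package main

import (
	"fmt"
	"strings"
)

// TextIndex is a hypothetical inverted index backed by a hash map.
type TextIndex struct {
	postings map[string][]int // word -> record IDs, in insertion order
}

func NewTextIndex() *TextIndex {
	return &TextIndex{postings: make(map[string][]int)}
}

// Add splits text on whitespace and records each distinct word once.
func (ix *TextIndex) Add(recordID int, text string) {
	seen := make(map[string]bool)
	for _, word := range strings.Fields(strings.ToLower(text)) {
		if seen[word] {
			continue
		}
		seen[word] = true
		ix.postings[word] = append(ix.postings[word], recordID)
	}
}

// Search returns the IDs of the records that contain the word (case-insensitive).
func (ix *TextIndex) Search(word string) []int {
	return ix.postings[strings.ToLower(word)]
}

func main() {
	ix := NewTextIndex()
	ix.Add(1, "Building a database engine")
	ix.Add(2, "Database indexing in practice")

	fmt.Println(ix.Search("database")) // [1 2]
	fmt.Println(ix.Search("engine"))   // [1]
}
```

Unlike a B-Tree, a hash map keeps no ordering, so it suits exact-word lookups like these rather than range queries.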
Appendix
There's a 98-page appendix that teaches you MySQL-related concepts.
Database indexing
This massive 60-page chapter teaches everything about database indexing. Theory and practice are both included.
MySQL can do more than you think
This chapter covers the most interesting MySQL features that lots of developers don't know about, such as CTEs, window functions, and partitioning.
Query optimization 101
Discover the fastest and easiest tricks and tips you can apply to optimize your queries.
Understanding ACID
One of the most important properties of a relational database is ACID: atomicity, consistency, isolation, and durability.
Implemented in Golang
You don't need prior Go experience to understand the content. It's a very simple language with only 25 reserved keywords. We'll use 20 of those. I'll explain every Go-specific thing but there aren't many of them (defer is the most "advanced").