What to do when your database grows too big

Let’s be clear: a single monolithic database is big enough for most workloads, and the majority of companies run fine with one.

However, once you outgrow its capacity, here are some ways to work around that.

Compress data#

Most databases now support compression in some form; PostgreSQL compresses large values via TOAST, and MySQL’s InnoDB offers page and table compression, for example.

DoorDash sped up Redis by compressing payloads at the application level: https://doordash.engineering/2019/01/02/speeding-up-redis-with-compression/
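
A minimal sketch of the same idea at the application layer, assuming a local Redis instance, the redis-py client, and made-up key names: compress values with zlib before writing, decompress on read.

```python
import zlib

import redis  # pip install redis

r = redis.Redis(host="localhost", port=6379)


def set_compressed(key: str, value: str) -> None:
    # Store zlib-compressed bytes instead of the raw string.
    r.set(key, zlib.compress(value.encode("utf-8")))


def get_compressed(key: str) -> str | None:
    raw = r.get(key)
    if raw is None:
        return None
    return zlib.decompress(raw).decode("utf-8")


# Large, repetitive payloads shrink considerably,
# at the cost of a little CPU on each access.
set_compressed("feed:user:42", '{"items": ["..."]}' * 100)
print(get_compressed("feed:user:42")[:20])
```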

Delete data#

The way to work with billions of rows is not to work with billions of rows.

Keep the database slim.

Do not keep soft-deleted rows around forever.

Grab uses TTLs to expire old data automatically.

Zendesk removes data from the main database and keeps it in a separate historical database.

In the Laravel world, the spatie/laravel-deleted-models package moves deleted rows to a separate table (see the sketch after this list): https://github.com/spatie/laravel-deleted-models
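
A small periodic archival job can implement the "separate table" approach. A rough sketch using SQLite from the Python standard library; the orders and orders_archive tables and the 90-day cutoff are made-up examples.

```python
import sqlite3

conn = sqlite3.connect("app.db")

# Hypothetical schema: soft-deleted rows carry a deleted_at timestamp.
conn.executescript("""
    CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, payload TEXT, deleted_at TEXT);
    CREATE TABLE IF NOT EXISTS orders_archive (id INTEGER PRIMARY KEY, payload TEXT, deleted_at TEXT);
""")


def archive_soft_deleted(days: int = 90) -> int:
    """Move rows soft-deleted more than `days` ago out of the hot table."""
    cutoff = f"-{days} days"
    with conn:  # one transaction: copy, then delete
        conn.execute(
            "INSERT INTO orders_archive SELECT * FROM orders "
            "WHERE deleted_at IS NOT NULL AND deleted_at < datetime('now', ?)",
            (cutoff,),
        )
        cur = conn.execute(
            "DELETE FROM orders "
            "WHERE deleted_at IS NOT NULL AND deleted_at < datetime('now', ?)",
            (cutoff,),
        )
    return cur.rowcount


# Run this from a nightly cron job so the main table stays slim.
print(f"archived {archive_soft_deleted()} rows")
```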

Shard#

Split the data into chunks (shards), each served by its own database instance.

Cons: choosing the shard key is difficult, and cross-shard operations (e.g. joins, aggregations) no longer work without extra effort in the application.

Robinhood shards its data at the application level (see the sketch after this list).

Shopify splits its data by tenant (shop).

Facebook shards its databases aggressively, running them on its Tupperware orchestrator.

Notion sharded its Postgres fleet: https://www.notion.so/blog/sharding-postgres-at-notion

Vitess (sharding middleware for MySQL)

CockroachDB (distributed SQL with built-in range sharding)

Citus (a Postgres extension for sharding)
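
A minimal sketch of application-level shard routing like the Robinhood approach mentioned above, assuming four placeholder Postgres DSNs and a user_id shard key: the application hashes the key and sends the query to the matching shard.

```python
import hashlib

# Placeholder connection strings, one per shard.
SHARD_DSNS = [
    "postgresql://app@shard0/app",
    "postgresql://app@shard1/app",
    "postgresql://app@shard2/app",
    "postgresql://app@shard3/app",
]


def shard_for(user_id: int) -> str:
    """Deterministically map a shard key to one shard.

    md5 is used instead of Python's built-in hash() so the mapping
    stays stable across processes and restarts.
    """
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    index = int(digest, 16) % len(SHARD_DSNS)
    return SHARD_DSNS[index]


# Every query for a given user goes to the same shard. Queries that span
# users (joins, global aggregations) now have to fan out to all shards and
# merge results in the application -- the main cost of sharding.
print(shard_for(42))
print(shard_for(1337))
```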

Use a bottomless database#

Pros: cheap, durable, and virtually limitless storage.

Cons: extra latency, since cold data has to be retrieved from cloud object storage, and the approach is not widely adopted yet.

Neon (serverless Postgres that separates storage from compute)

ClickHouse storage tiering (older data moved to object-storage-backed disks)

TimescaleDB storage tiering for Postgres

Redpanda tiered storage

Materialize

Dragonfly: can spin up a new replica quickly

sqld (libSQL) bottomless replication
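
None of the products above require you to build tiering by hand, but as a rough illustration of the latency trade-off, here is a hypothetical read path that checks a hot store first and falls back to object storage; the bucket name, keys, and use of boto3 are all assumptions for the sketch.

```python
import boto3  # pip install boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
BUCKET = "my-cold-storage"        # placeholder bucket holding tiered-out records

hot_store: dict[str, bytes] = {}  # stand-in for the hot database / cache


def read(key: str) -> bytes | None:
    # Fast path: recent data still lives in the hot store.
    if key in hot_store:
        return hot_store[key]
    # Slow path: fetch from object storage -- cheap and durable,
    # but every miss pays a network round trip to the cloud.
    try:
        obj = s3.get_object(Bucket=BUCKET, Key=key)
    except ClientError:
        return None
    return obj["Body"].read()


def tier_out(key: str, value: bytes) -> None:
    # Move cold data out of the hot store into object storage.
    s3.put_object(Bucket=BUCKET, Key=key, Body=value)
    hot_store.pop(key, None)
```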

Conclusion#

There are many dimensions along which you may need to scale out your database; besides storage size, scaling reads and scaling writes each come with their own set of techniques.

