Materialized Views: The Read Optimization Pattern
Standard database views are often a disappointment.
You write this beautiful, complex SQL with six JOINs and three aggregations. You save it as a VIEW. You think, “Great, now it’s fast.”
It’s not. A standard view is just a macro. The database still runs that nightmare query every single time you call it.
If you want speed, you need a Materialized View.
The “Cache” Inside Your DB#
A materialized view is basically the result of a query, saved to disk as a physical table. It’s like a cache that understands SQL.
Instead of re-calculating the total sales for every region every time a manager hits the dashboard, you calculate it once and store it.
Why not just use a standard table?#
You could. But materialized views come with a “Refresh” mechanism. The database knows the relationship between the view and the source data. You just tell it when to sync.
Solving the Sharding Headache#
In a sharded database, materialized views are a lifesaver.
If you shard your users by user_id, a query like “Who are the top 10 users in New York?” is expensive. You have to ask every single shard for their top users and then merge the results (scatter-gather).
Instead, you can create a materialized view partitioned by city.
Pre-aggregating across shards makes global dashboards possible without killing performance.
The Catch: Staleness#
The biggest trade-off is freshness.
- Sync Refresh: Updates every time the source data changes. This makes your writes crawlingly slow.
- Async Refresh: Updates on a schedule (e.g., every 5 minutes). Your reads are blazing fast, but the data is always a bit “old.”
What I’m Learning#
I used to think materialized views were “cheating.” I wanted my queries to be so well-optimized that they were fast on their own.
But at scale, sometimes the math just doesn’t work out. You can’t index your way out of a cross-shard aggregation.
The trick is deciding how much staleness your users can tolerate. Does a manager need to see sales data that is accurate to the millisecond? Usually not. 30 seconds of lag is often a fair price to pay for a dashboard that loads in 20ms instead of 20 seconds.
Have you used Materialized Views or did you build your own “caching” tables manually?