CDN and Edge Caching: Serving Content from Next Door

A user in Tokyo requests a video hosted in Virginia. Round trip: 150-200ms. Multiply by every segment, every viewer, every concurrent stream. Your origin server melts.

CDNs solve this by copying content to edge servers worldwide. Tokyo users hit the Tokyo edge. Virginia users hit the Virginia edge. Origin only serves cache misses.

Pull vs Push

Two strategies for getting content to the edge.

Pull CDN: edge server gets a request, doesn’t have the content, fetches from origin, caches it, serves it. First request is slow. Subsequent requests are fast. Good for content with unpredictable popularity.
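The pull flow can be sketched in a few lines. This is a minimal illustration, not how any particular CDN implements it: `PullEdge` and `fake_origin` are hypothetical names, and the origin fetch is a stand-in for a real HTTP request.

```python
from typing import Callable, Dict


class PullEdge:
    """Hypothetical pull-through edge cache: fetch from origin only on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.origin_fetches = 0  # counts misses that had to reach origin

    def get(self, key: str) -> bytes:
        if key in self._cache:
            return self._cache[key]        # hit: served from the edge
        self.origin_fetches += 1           # miss: slow path to origin
        body = self._fetch_from_origin(key)
        self._cache[key] = body            # cache it for later requests
        return body


def fake_origin(key: str) -> bytes:
    # Stand-in for a real fetch from the origin server (assumption).
    return f"content for {key}".encode()


edge = PullEdge(fake_origin)
first = edge.get("video_123/segment_42/720p")   # miss, goes to origin
second = edge.get("video_123/segment_42/720p")  # hit, served locally
```

Only the first request pays the origin round trip; the counter stays at one no matter how many times the same key is requested afterwards.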

Push CDN: you proactively upload content to edge servers before anyone requests it. First request is fast. But you’re paying for storage and bandwidth on every edge server, even for content nobody watches. Good for content you know will be popular.
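To make the storage cost concrete, here is a rough sketch of a push. `EdgeStore` and `push_to_edges` are hypothetical names, the edge list is made up, and the 1000-byte segment is just a placeholder size.

```python
from typing import Dict, List


class EdgeStore:
    """Hypothetical edge server storage, keyed by content path."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.objects: Dict[str, bytes] = {}


def push_to_edges(edges: List[EdgeStore], key: str, body: bytes) -> int:
    """Copy one object to every edge and return the total bytes stored."""
    for edge in edges:
        edge.objects[key] = body
    return len(body) * len(edges)


# Edge locations here are illustrative assumptions.
edges = [EdgeStore("tokyo"), EdgeStore("virginia"), EdgeStore("frankfurt")]
segment = b"x" * 1000  # pretend 1000-byte video segment
stored_bytes = push_to_edges(edges, "video_123/segment_0/720p", segment)
```

The storage bill scales with object size times the number of edges, whether or not anyone ever requests the object.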

Most real systems use both. Push the popular stuff. Let the long tail be pulled on demand.

graph TD
    U["User Request"] --> E["Edge Server"]
    E --> C{Cache Hit?}
    C -->|Yes| R["Serve from Edge (5ms)"]
    C -->|No| O["Fetch from Origin (200ms)"]
    O --> S["Cache at Edge"]
    S --> R
    style U fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style E fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style C fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style R fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style O fill:#000000,stroke:#ff0000,stroke-width:2px,color:#fff
    style S fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff

Cache Hierarchy

Edge servers can’t store everything. When an edge misses, it doesn’t always go straight to origin. A regional cache sits between edge and origin. Edge -> regional -> origin. This is multi-level caching applied globally.
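The edge -> regional -> origin chain might look something like this. It is a simplified sketch with no eviction or TTLs; `CacheTier` is a hypothetical class and the edge names are made up.

```python
from typing import Callable, Dict, List, Optional


class CacheTier:
    """One hypothetical cache level that falls back to its parent on a miss."""

    def __init__(
        self,
        name: str,
        parent: Optional["CacheTier"] = None,
        origin: Optional[Callable[[str], bytes]] = None,
    ) -> None:
        self.name = name
        self.parent = parent    # next cache level up, if any
        self.origin = origin    # only the top tier talks to origin
        self.store: Dict[str, bytes] = {}

    def get(self, key: str) -> bytes:
        if key in self.store:
            return self.store[key]
        if self.parent is not None:
            body = self.parent.get(key)   # ask the next tier up
        elif self.origin is not None:
            body = self.origin(key)       # top tier: go to origin
        else:
            raise KeyError(key)
        self.store[key] = body            # fill this tier on the way back
        return body


origin_calls: List[str] = []


def origin(key: str) -> bytes:
    origin_calls.append(key)              # record every origin hit
    return b"segment bytes"


regional = CacheTier("regional-asia", origin=origin)
tokyo = CacheTier("tokyo", parent=regional)
osaka = CacheTier("osaka", parent=regional)

tokyo.get("video_123/segment_42/720p")    # misses edge and regional
osaka.get("video_123/segment_42/720p")    # misses edge, hits regional
```

The second edge never reaches origin because the regional tier already holds the segment. That shielding effect is the point of the middle layer.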

The cache key matters enormously. For video, it’s typically the video ID plus the segment number plus the quality level, for example video_123/segment_42/720p. This lets different quality levels of the same segment be cached independently.
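A key builder for that format could be as small as the function below. `segment_cache_key` is a hypothetical helper, and lowercasing the quality label is my own assumption so that "720P" and "720p" do not end up as two separate cache entries.

```python
def segment_cache_key(video_id: int, segment: int, quality: str) -> str:
    """Build a key of the form video_<id>/segment_<n>/<quality>."""
    # Normalizing the quality label is an assumption, not something
    # every CDN does for you.
    return f"video_{video_id}/segment_{segment}/{quality.lower()}"


key_720 = segment_cache_key(123, 42, "720p")
key_1080 = segment_cache_key(123, 42, "1080p")
```

Same segment, two qualities, two distinct keys, so each rendition is cached and evicted on its own.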

Cache Invalidation at the Edge

User uploads a video, then immediately watches it. The edge doesn’t have it yet. Worse: user updates a video thumbnail. The old thumbnail is cached at 50 edge servers worldwide. You need to invalidate all of them.

CDN invalidation is usually eventual. You send a purge request, and it propagates to edges over seconds to minutes. For most content this is fine. For user-facing updates, you can version the URL instead: serve thumbnail_v2.jpg rather than invalidating thumbnail.jpg. The new URL is a new cache key, so no edge has a stale copy of it. Same trick as versioned keys in application caches, just at global scale.
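URL versioning can be as simple as rewriting the filename. This is a sketch under assumptions: `versioned_url` is a hypothetical helper, the directory in the second call is made up, and where the version number comes from (a database column, a deploy counter) is left out.

```python
from pathlib import PurePosixPath


def versioned_url(path: str, version: int) -> str:
    """Insert _v<version> before the file extension."""
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}_v{version}{p.suffix}"))


new_name = versioned_url("thumbnail.jpg", 2)
new_path = versioned_url("/img/video_123/thumbnail.jpg", 2)
```

Pages that reference the new URL get the new image right away, and the old object can simply age out of the edges.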

At Salesforce, we had a similar multi-level caching challenge for code generation templates. Templates were cached at the service level (in-process), team level (shared cache), and global level (source of truth). When a template changed, stale copies at lower levels caused incorrect code generation. We added version-based cache keys: the template hash became part of the key. No more invalidation needed. A changed template was just a new key. Stale entries expired naturally via TTL. This reduced template-related generation errors significantly and simplified the cache logic.
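The idea translates to something like the sketch below. This is not the actual Salesforce code: `template_key` and `TTLCache` are hypothetical, SHA-256 truncated to 12 hex characters is an arbitrary choice, and the injected clock exists only so the expiry can be demonstrated without sleeping.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


def template_key(name: str, template_text: str) -> str:
    """Key that changes whenever the template content changes."""
    digest = hashlib.sha256(template_text.encode("utf-8")).hexdigest()[:12]
    return f"{name}:{digest}"


class TTLCache:
    """Tiny cache whose entries expire after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]        # stale entry ages out
            return None
        return value


now = [0.0]                               # fake clock for the demo
cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])

old_key = template_key("controller", "template body, first revision")
cache.put(old_key, "generated code A")

now[0] = 30.0                             # template is edited later
new_key = template_key("controller", "template body, second revision")
cache.put(new_key, "generated code B")

now[0] = 61.0                             # old entry is past its TTL
old_value = cache.get(old_key)            # expired
new_value = cache.get(new_key)            # still fresh
```

Readers that hash the current template never look up the old key, so nothing has to be purged explicitly; the TTL only reclaims the space.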

What I’m Learning

CDNs are just caching taken to geographic scale. The same principles apply: cache hierarchy, key design, invalidation strategy. The hardest part isn’t serving cached content. It’s deciding what to cache, where to cache it, and how to handle changes. If you understand caching patterns and invalidation, you already understand CDNs. The rest is infrastructure.

Have you had to deal with CDN cache invalidation in your systems?