Monitoring on Sohil Ladhani Blog

https://sohilladhani.com/blog/tags/monitoring/
Recent content in Monitoring on Sohil Ladhani Blog
Thu, 19 Mar 2026 00:00:00 +0000

Push vs Pull Metrics Collection: Two Ways to Get the Numbers
https://sohilladhani.com/blog/post/2026-03-19-push-vs-pull-metrics-collection/
Thu, 19 Mar 2026 00:00:00 +0000

You have 200 microservices. Each produces metrics. How do those metrics reach your monitoring system? There are two fundamentally different approaches, and the choice affects service discovery, failure modes, and scalability.

Pull Model (Prometheus-Style)

Each service exposes a /metrics endpoint. The monitoring system knows about all services (via service discovery) and scrapes each one on a schedule: every 15 seconds, hit /metrics, parse the response, store the data.

    // Spring Boot Actuator exposes metrics automatically
    // GET /actuator/prometheus returns:
    // http_server_requests_seconds_count{method="GET",uri="/api/users"} 1523
    // http_server_requests_seconds_sum{method="GET",uri="/api/users"} 45.

Downsampling: Keeping Trends, Not Every Data Point
https://sohilladhani.com/blog/post/2026-03-18-downsampling-and-data-retention/
Wed, 18 Mar 2026 00:00:00 +0000

Your monitoring system stores CPU usage every second. That’s 86,400 data points per day per metric. With 1,000 metrics on each of 200 services, you’re generating roughly 17 billion data points per day. Storage isn’t free, and nobody will ever look at per-second data from three months ago. But you can’t just delete it. “What was our error rate trend last quarter?” is a legitimate question.
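As a rough check on the numbers above, here is a minimal Java sketch that reproduces the daily data volume and shows one simple way to downsample: averaging fixed-size buckets of consecutive samples. The class name, method names, and the 60-sample bucket are illustrative assumptions rather than anything from the post, and the volume figure assumes 1,000 metrics on each of the 200 services.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and the 60-sample bucket are illustrative assumptions.
class DownsampleSketch {

    // Raw points per day = samples per day per metric * metrics per service * services.
    static long rawPointsPerDay(long samplesPerDay, long metricsPerService, long services) {
        return samplesPerDay * metricsPerService * services;
    }

    // Collapse consecutive samples into one average per bucket.
    // A trailing partial bucket is averaged over the samples it actually contains.
    static List<Double> downsampleAverage(double[] samples, int bucketSize) {
        if (bucketSize <= 0) {
            throw new IllegalArgumentException("bucketSize must be positive");
        }
        List<Double> averages = new ArrayList<>();
        for (int start = 0; start < samples.length; start += bucketSize) {
            int end = Math.min(start + bucketSize, samples.length);
            double sum = 0.0;
            for (int i = start; i < end; i++) {
                sum += samples[i];
            }
            averages.add(sum / (end - start));
        }
        return averages;
    }

    public static void main(String[] args) {
        // 86,400 samples/day * 1,000 metrics * 200 services
        System.out.println(rawPointsPerDay(86_400, 1_000, 200)); // 17280000000

        // One day of per-second samples becomes 1,440 per-minute averages.
        double[] oneDay = new double[86_400];
        java.util.Arrays.fill(oneDay, 50.0);
        System.out.println(downsampleAverage(oneDay, 60).size()); // 1440
    }
}
```

Averaging is only one possible aggregate. A real retention policy might also keep the minimum, maximum, or a percentile per bucket so that spikes are not smoothed away.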
You need the trend without the granularity.

Time-Series Databases: Storage Built for Timestamps
https://sohilladhani.com/blog/post/2026-03-17-time-series-databases/
Tue, 17 Mar 2026 00:00:00 +0000

Every second, your system emits CPU usage, memory, request count, error rate, latency percentiles, and queue depth. Multiply by 200 services. That’s hundreds of thousands of data points per second, all append-only, all timestamped, and you mostly query them by time range. Regular databases can handle this, technically. But they weren’t built for it.

What Makes Time-Series Different

The access pattern is extreme. Writes are almost entirely appends: new data comes in, old data never changes.
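
The access pattern described above (append-only writes, reads by time range) can be sketched in a few lines of Java. This is a toy in-memory series, not how any particular time-series database is implemented; the class and method names are hypothetical, and a sorted map stands in for real on-disk storage.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch of an append-only series queried by time range.
class SeriesSketch {
    // Keys are timestamps in seconds; the sorted map keeps them in time order.
    private final TreeMap<Long, Double> points = new TreeMap<>();

    // Accept only timestamps newer than everything stored: old data never changes.
    void append(long timestampSeconds, double value) {
        if (!points.isEmpty() && timestampSeconds <= points.lastKey()) {
            throw new IllegalArgumentException("append-only: timestamp must increase");
        }
        points.put(timestampSeconds, value);
    }

    // Return a view of the points with fromInclusive <= timestamp < toExclusive.
    NavigableMap<Long, Double> range(long fromInclusive, long toExclusive) {
        return points.subMap(fromInclusive, true, toExclusive, false);
    }

    public static void main(String[] args) {
        SeriesSketch cpu = new SeriesSketch();
        for (long t = 0; t < 10; t++) {
            cpu.append(t, 40.0 + t);
        }
        System.out.println(cpu.range(3, 6)); // {3=43.0, 4=44.0, 5=45.0}
    }
}
```

Because writes always land at the end and reads scan a contiguous span of keys, storage built around this pattern can skip much of the machinery a general-purpose database needs for in-place updates.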