<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Stream-Processing on Sohil Ladhani Blog</title><link>https://sohilladhani.com/blog/tags/stream-processing/</link><description>Recent content in Stream-Processing on Sohil Ladhani Blog</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sat, 04 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://sohilladhani.com/blog/tags/stream-processing/index.xml" rel="self" type="application/rss+xml"/><item><title>Lambda and Kappa Architecture</title><link>https://sohilladhani.com/blog/post/2026-04-04-lambda-and-kappa-architecture/</link><pubDate>Sat, 04 Apr 2026 00:00:00 +0000</pubDate><guid>https://sohilladhani.com/blog/post/2026-04-04-lambda-and-kappa-architecture/</guid><description>Real-time results are fast and approximate. Historical results are slow and accurate. The tension between them is where Lambda and Kappa architecture come from.
Lambda: Two Pipelines Lambda runs two parallel systems. The batch layer processes all historical data on a schedule (Spark on HDFS, every few hours) and produces ground truth. The speed layer processes the live stream (Kafka Streams or Flink) for low-latency results. The serving layer merges both: &amp;ldquo;latest batch result plus stream delta since the last batch.</description></item><item><title>Watermarks and Late-Arriving Data</title><link>https://sohilladhani.com/blog/post/2026-04-03-watermarks-and-late-data/</link><pubDate>Fri, 03 Apr 2026 00:00:00 +0000</pubDate><guid>https://sohilladhani.com/blog/post/2026-04-03-watermarks-and-late-data/</guid><description>There are two clocks in any stream processing system. Event time: when the click actually happened, recorded in the payload. Processing time: when your system received it. On a healthy network they&amp;rsquo;re close. In reality they&amp;rsquo;re not.
Mobile clients buffer events when offline. Retries add delay. A click at 10:00:05 might reach your processor at 10:00:47. With short windows, say 10 seconds wide, the window that click belongs to closed long ago.
The Problem With Never Waiting If you never close a window, you never produce output.</description></item><item><title>Stream Processing Windows</title><link>https://sohilladhani.com/blog/post/2026-04-02-stream-processing-windows/</link><pubDate>Thu, 02 Apr 2026 00:00:00 +0000</pubDate><guid>https://sohilladhani.com/blog/post/2026-04-02-stream-processing-windows/</guid><description>Aggregating over an infinite stream sounds easy until you realize you have no idea when it ends. You need to cut it into chunks. That&amp;rsquo;s what windows are.
Three Window Types Tumbling windows are fixed, non-overlapping buckets. &amp;ldquo;Clicks per minute&amp;rdquo; is a tumbling window: minute 1, minute 2, minute 3, no overlap. Simple to implement, but a burst of events that straddles a boundary gets split across two buckets.
Sliding windows overlap. &amp;ldquo;Average clicks in the last 5 minutes, recomputed every minute&amp;rdquo; means each event can appear in up to 5 windows.</description></item></channel></rss>