Posts for: #Kafka

Handling Incompatible Schema Changes

Sometimes the change you need to make breaks compatibility. You can’t add a default. The field type genuinely needs to change. You can’t keep the old schema. Here’s what you do instead.

The New Topic Strategy

The cleanest approach: create a new topic with the new schema. Producers write to both old and new topics in parallel. Consumers migrate to the new topic one by one. When all consumers are on the new topic, stop writing to the old one.
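A minimal sketch of the dual-write phase, assuming an in-memory stand-in for the broker rather than a real Kafka client. The topic names, the `to_v2` conversion (a float of dollars becoming integer cents), and the `write_old` switch are all hypothetical, chosen only to illustrate the steps.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryBroker:
    """Stand-in for a Kafka cluster: topic name -> list of messages."""
    topics: dict = field(default_factory=dict)

    def send(self, topic, message):
        self.topics.setdefault(topic, []).append(message)


def to_v2(order):
    # Hypothetical incompatible change: "amount" (float dollars)
    # becomes "amount_cents" (integer cents).
    return {
        "order_id": order["order_id"],
        "amount_cents": round(order["amount"] * 100),
    }


@dataclass
class DualWriteProducer:
    broker: InMemoryBroker
    old_topic: str
    new_topic: str
    write_old: bool = True  # set to False once every consumer has migrated

    def publish(self, order):
        if self.write_old:
            # Old consumers keep reading the old schema from the old topic.
            self.broker.send(self.old_topic, order)
        # Migrated consumers read the new schema from the new topic.
        self.broker.send(self.new_topic, to_v2(order))
```

Once the last consumer has moved over, setting `write_old` to `False` ends the parallel writes without touching the consumers again.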

Event Schema Evolution

Kafka retains messages for days or weeks. Your consumer code will be updated independently of your producer. That means old messages need to be readable by new consumer code, and new messages need to be readable by old consumer code. You can’t just change a field.

What Backward and Forward Mean

Backward compatibility: a new consumer can read messages written with the old schema. If you add an optional field with a default, old messages (which don’t have that field) are still valid.
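To make the backward-compatible case concrete, here is a small sketch that uses plain JSON and a hand-rolled schema dictionary instead of a real serialization library. The field names, the `REQUIRED` sentinel, and the `decode` helper are assumptions made for illustration.

```python
import json

REQUIRED = object()  # sentinel: the field has no default

# Hypothetical v2 schema. "locale" is the newly added optional field.
SCHEMA_V2 = {"user_id": REQUIRED, "action": REQUIRED, "locale": "en-US"}


def decode(raw, schema):
    """Read a JSON message, filling in defaults for fields it lacks."""
    payload = json.loads(raw)
    record = {}
    for name, default in schema.items():
        if name in payload:
            record[name] = payload[name]
        elif default is REQUIRED:
            raise ValueError(f"missing required field: {name}")
        else:
            record[name] = default
    return record


# A message written before "locale" existed.
old_message = json.dumps({"user_id": 42, "action": "click"})
record = decode(old_message, SCHEMA_V2)
```

The new consumer reads the old message and sees `locale` set to its default. Had `locale` been added without a default, the same call would raise, which is the incompatible case.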

Schema Registry

Service A writes a Kafka message with field user_id. Service B reads it. Service A’s team renames it to userId next sprint. Service B starts throwing deserialization errors at runtime. Neither team knew about the other.

The Problem

In a microservices system passing messages through Kafka, producers and consumers evolve independently. There’s no enforced contract. A producer can change a field name, add a required field, or change a data type, and the consumer finds out when deserialization fails in production.
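The three kinds of breaking change named here can be caught before deployment by comparing the old and new schemas. The sketch below is a toy check over a made-up schema representation (a dict of field name to type and required flag). It is not the rule set of any real schema registry, only an illustration of what an enforced contract would look for.

```python
def breaking_changes(old_fields, new_fields):
    """Compare two schemas given as {name: {"type": str, "required": bool}}.

    Returns a list of human-readable problems; an empty list means
    none of the three checked kinds of break was found.
    """
    problems = []
    for name, spec in old_fields.items():
        if name not in new_fields:
            # A rename looks like a removal plus an addition.
            problems.append(f"removed or renamed field: {name}")
        elif new_fields[name]["type"] != spec["type"]:
            problems.append(f"type changed for field: {name}")
    for name, spec in new_fields.items():
        if name not in old_fields and spec["required"]:
            problems.append(f"new required field: {name}")
    return problems


# Hypothetical schemas for the rename in the story.
v1 = {"user_id": {"type": "long", "required": True}}
v2 = {"userId": {"type": "long", "required": True}}
problems = breaking_changes(v1, v2)
```

Run against the rename from user_id to userId, the check reports both the missing old field and the new required one, so the break surfaces in review rather than in production.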

Lambda and Kappa Architecture

Real-time results are fast and approximate. Historical results are slow and accurate. The tension between them is where the Lambda and Kappa architectures come from.

Lambda: Two Pipelines

Lambda runs two parallel systems. The batch layer processes all historical data on a schedule (Spark on HDFS, every few hours) and produces ground truth. The speed layer processes the live stream (Kafka Streams or Flink) for low-latency results. The serving layer merges both: “latest batch result plus stream delta since the last batch.
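The serving-layer merge can be sketched in a few lines, assuming both layers produce simple per-key counts. The dictionary shapes, key names, and the `serve` function are hypothetical.

```python
def serve(batch_view, speed_delta, key):
    """Answer a query as: latest batch result + stream delta since that batch.

    batch_view:  counts computed by the last completed batch run
    speed_delta: counts accumulated by the speed layer since that run
    """
    return batch_view.get(key, 0) + speed_delta.get(key, 0)


# Hypothetical page-view counts.
batch_view = {"page_a": 1000}
speed_delta = {"page_a": 12, "page_b": 3}

total_a = serve(batch_view, speed_delta, "page_a")
total_b = serve(batch_view, speed_delta, "page_b")
```

A key that only the speed layer has seen, like `page_b` here, still gets an answer. When the next batch run finishes, the delta it now covers would be discarded.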

Watermarks and Late-Arriving Data

There are two clocks in any stream processing system. Event time: when the click actually happened, recorded in the payload. Processing time: when your system received it. On a healthy network they’re close. In reality they’re not. Mobile clients buffer events when offline. Retries add delay. A click at 10:00:05 might reach your processor at 10:00:47. If your windows are shorter than that 42-second gap, say 10 seconds each, the window the click belongs to has long since closed.

The Problem With Never Waiting

If you never close a window, you never produce output.
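Here is the click example worked through in code, assuming 10-second windows aligned to the Unix epoch. The date and the window size are assumptions, and `window_start` is a hypothetical helper.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def window_start(ts, size):
    """Start of the fixed-size, epoch-aligned window that contains ts."""
    return ts - (ts - EPOCH) % size


# The click from the example; the calendar date is arbitrary.
event_time = datetime(2024, 1, 1, 10, 0, 5)        # when it happened
processing_time = datetime(2024, 1, 1, 10, 0, 47)  # when it arrived

size = timedelta(seconds=10)  # assumed window size
lag = processing_time - event_time

# The window the click belongs to, by each clock.
by_event_time = window_start(event_time, size)
by_processing_time = window_start(processing_time, size)
```

By event time the click belongs to the window starting at 10:00:00, but it arrives while the 10:00:40 window is open. Bucketing by processing time puts it in the wrong window, and bucketing by event time means deciding how long to keep old windows open.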

Stream Processing Windows

Aggregating over an infinite stream sounds easy until you realize you have no idea when it ends. You need to cut it into chunks. That’s what windows are.

Three Window Types

Tumbling windows are fixed, non-overlapping buckets. “Clicks per minute” is a tumbling window: minute 1, minute 2, minute 3, no overlap. Simple to implement, but a burst of activity that straddles a boundary gets split across two buckets. Sliding windows overlap. “Average clicks in the last 5 minutes, recomputed every minute” means each event can appear in up to 5 windows.
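A sketch of how an event could be assigned to tumbling and sliding windows, assuming timestamps are integer seconds and windows are half-open intervals `[start, end)` aligned to zero. Both function names are made up for illustration.

```python
def tumbling_window(ts, size):
    """The single [start, end) bucket of width `size` that contains ts."""
    start = ts - ts % size
    return (start, start + size)


def sliding_windows(ts, size, slide):
    """Every [start, end) window of width `size`, starting on a multiple
    of `slide`, that contains ts. Returned in order of start time."""
    windows = []
    start = ts - ts % slide  # latest window start at or before ts
    while start > ts - size:  # window still covers ts
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


ts = 1000  # an event 1000 seconds into the stream

per_minute = tumbling_window(ts, size=60)
last_five_minutes = sliding_windows(ts, size=300, slide=60)
```

With a 5-minute window sliding every minute, the event lands in five windows, which is the window size divided by the slide.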