Kafka retains messages for days or weeks. Your consumer code will be updated independently of your producer. That means old messages need to be readable by new consumer code, and new messages need to be readable by old consumer code. You can’t just change a field.

What Backward and Forward Mean

Backward compatibility: a new consumer can read messages written with the old schema. If you add an optional field with a default, old messages (which don’t have that field) are still valid. The new consumer fills in the default.

Forward compatibility: an old consumer can read messages written with the new schema. If you add an optional field, old consumer code that doesn’t know about it simply ignores it.

Full compatibility is both directions at once. It’s the most restrictive of the three, but the safest for long-running Kafka topics where you don’t control all consumers.
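To make the two directions concrete, here is a minimal sketch in plain Python that mimics how a reader's schema gets applied to an already-decoded record. It is not the Avro library: `resolve`, `MISSING`, and the dict-based schema shape are hypothetical stand-ins I made up for illustration, and real Avro resolution works on binary data using both the writer's and the reader's schema.

```python
# Simplified stand-in for Avro schema resolution; not the real Avro API.
MISSING = object()  # sentinel: distinguishes "no default" from "default is None"

# Hypothetical schemas: field name -> default value (MISSING means required).
SCHEMA_V1 = {"user_id": MISSING, "email": MISSING}
SCHEMA_V2 = {"user_id": MISSING, "email": MISSING, "phone": None}


def resolve(record, reader_schema):
    """Project a decoded record onto the reader's schema."""
    resolved = {}
    for name, default in reader_schema.items():
        if name in record:
            resolved[name] = record[name]
        elif default is not MISSING:
            resolved[name] = default  # reader fills in its own default
        else:
            raise ValueError(f"missing required field: {name}")
    return resolved  # fields the reader doesn't know about are dropped


old_message = {"user_id": 7, "email": "a@example.com"}
new_message = {"user_id": 7, "email": "a@example.com", "phone": "555-0100"}

# Backward: a new consumer (v2) reads an old message; phone comes from the default.
backward = resolve(old_message, SCHEMA_V2)

# Forward: an old consumer (v1) reads a new message; phone is ignored.
forward = resolve(new_message, SCHEMA_V1)
```

In this sketch `backward` ends up with `phone` set to `None`, and `forward` contains only `user_id` and `email`.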

Safe vs Unsafe Changes

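Safe changes: adding an optional field with a default, removing an optional field that has a default, widening a numeric type (int to long). Widening is only safe in the backward direction: Avro promotes an int written by an old producer to a long for the new reader, but an old reader expecting int can’t read a long.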

Unsafe changes: removing a required field, renaming any field, changing a field’s type, adding a required field without a default. These break either backward or forward compatibility.
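The rules above can be approximated with a small checker. This is a rough sketch under simplifying assumptions, not how a schema registry actually implements its checks: `can_read`, `compatibility`, and the `(type, default)` tuple layout are hypothetical, and it ignores type promotion and aliases entirely.

```python
MISSING = object()  # sentinel for "field has no default"


def can_read(reader, writer):
    """True if data written with `writer` can be read using `reader`.

    Simplified rule: every reader field must either exist in the writer
    with the same type, or carry a default.
    """
    for name, (ftype, default) in reader.items():
        if name in writer:
            if writer[name][0] != ftype:
                return False  # type changed (promotion rules ignored here)
        elif default is MISSING:
            return False  # reader needs a value the writer never wrote
    return True


def compatibility(old, new):
    """Check both directions for a change from schema `old` to schema `new`."""
    return {
        "backward": can_read(reader=new, writer=old),
        "forward": can_read(reader=old, writer=new),
    }


# Hypothetical schemas: field name -> (type label, default).
V1 = {"user_id": ("int", MISSING), "email": ("string", MISSING)}
ADD_OPTIONAL = {**V1, "phone": ("optional string", None)}
ADD_REQUIRED = {**V1, "region": ("string", MISSING)}
RENAMED = {"userId": ("int", MISSING), "email": ("string", MISSING)}

add_optional = compatibility(V1, ADD_OPTIONAL)
add_required = compatibility(V1, ADD_REQUIRED)
renamed = compatibility(V1, RENAMED)
```

Under these simplified rules, adding the optional field passes in both directions, adding the required field fails the backward check, and the rename fails both.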

Renaming is the one that catches teams off guard. In Avro, a field’s name is part of its identity. Rename it and you’ve effectively deleted the old field and added a new one. Old messages become unreadable with the new schema unless you add the old name as an alias.
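Here is a rough illustration of what an alias buys you, again in plain Python rather than the Avro library. `resolve_with_aliases` and the list-of-dicts field layout are hypothetical; the only part borrowed from Avro is the idea that the reader's schema lists the old name under `aliases`.

```python
def resolve_with_aliases(record, reader_fields):
    """Look up each reader field by its name, then by any declared alias."""
    resolved = {}
    for field in reader_fields:
        candidates = [field["name"], *field.get("aliases", [])]
        for key in candidates:
            if key in record:
                resolved[field["name"]] = record[key]
                break
        else:
            # No candidate name was present in the written record.
            raise ValueError(f"no value for field: {field['name']}")
    return resolved


# Hypothetical reader schema after the rename, with the old name as an alias.
READER_V2 = [
    {"name": "userId", "aliases": ["user_id"]},
    {"name": "email"},
]
# Same schema without the alias, to show the failure mode.
READER_V2_NO_ALIAS = [{"name": "userId"}, {"name": "email"}]

old_message = {"user_id": 7, "email": "a@example.com"}
renamed = resolve_with_aliases(old_message, READER_V2)
```

Note that the alias lives in the reader's schema, so it only helps consumers that have been updated. Old consumers reading messages written under the new name are still broken.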

graph TD
    A[Schema v1: user_id int, email string] --> B[Add optional field: phone string default null]
    B --> C[Schema v2: backward compatible]
    C --> D[Old consumer reads v2 message - ignores phone]
    C --> E[New consumer reads v1 message - phone is null]
    F[Rename user_id to userId] --> G[Breaks old consumers]
    style A fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style B fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style C fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style D fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style E fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style F fill:#000000,stroke:#ff0000,stroke-width:2px,color:#fff
    style G fill:#000000,stroke:#ff0000,stroke-width:2px,color:#fff

Avro’s Union Type

Avro handles optional fields through union types: ["null", "string"] means the field can be null or a string. The union on its own doesn’t give the field a default; you also have to declare "default": null on the field. This pattern is so common it’s essentially the standard way to add optional fields. The default must match the first type in the union, which is why null comes first.
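As a sketch, this is what such a field declaration looks like, written as a Python dict so it can be serialized to the JSON form Avro schemas use. The `phone` field matches the diagram above; `default_matches_first_branch` is a hypothetical helper that only covers the null and string cases discussed here, not Avro's full validation.

```python
import json

# An optional string field: union with "null" first, plus an explicit default.
PHONE_FIELD = {"name": "phone", "type": ["null", "string"], "default": None}

# The same field with the union reversed; a null default no longer matches.
PHONE_FIELD_REVERSED = {"name": "phone", "type": ["string", "null"], "default": None}


def default_matches_first_branch(field):
    """Check that the default fits the first type in the union.

    Only handles the "null" and "string" branches used in this post.
    """
    first = field["type"][0]
    default = field["default"]
    if first == "null":
        return default is None
    if first == "string":
        return isinstance(default, str)
    raise NotImplementedError(f"unsupported branch in this sketch: {first}")


# Python's None serializes to JSON null, which is what the schema file contains.
as_json = json.dumps(PHONE_FIELD)
```

The check passes for `PHONE_FIELD` and fails for `PHONE_FIELD_REVERSED`, which is the ordering mistake the "null comes first" rule guards against.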

At Salesforce

We had a platform event schema for CRM updates. A team added a required field for the region code, thinking all consumers would be updated the same week. Three consumer services had quarterly deploy schedules. For six weeks, those services crashed on every message with the new required field because they didn’t know about it. The fix was making the field optional with a default. The lesson was that Kafka topics are shared infrastructure, and schema changes need coordination across all consumers before they land in production.

What I’m Learning

Treating Kafka schema changes like database migrations helps. You wouldn’t add a NOT NULL column without checking every application that writes to it. The same care applies to event schemas, with the added problem that you can’t easily roll back messages already written.

How do you coordinate schema changes across teams when a Kafka topic has multiple consumer groups?