Your query is always “give me all events for user X, sorted by time.” A row-oriented database reads back whole rows, so you pay for every column you didn’t ask for. Wide-column stores flip the model: you design the schema around your query, not the other way around.

How It Works#

In a wide-column store like Cassandra or HBase, the primary key has two parts: the partition key and the clustering key. The partition key determines which node stores the data. The clustering key determines the sort order within that partition.

For a time-series query, the partition key is the user ID. The clustering key is the timestamp. All events for a user land on the same node, sorted by time. A range query becomes a sequential read on a single node. No scatter-gather across nodes, no sorting at read time.
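The two-part key can be sketched with a toy in-memory model (illustrative only, nothing like the real storage engine): the partition key selects a bucket of rows, the rows inside it stay sorted by the clustering key, and a time-range query becomes one binary search plus a sequential scan.

```python
import bisect

class Partition:
    """One partition: rows kept sorted by clustering key (timestamp)."""

    def __init__(self):
        self.rows = []  # list of (clustering_key, value), always sorted

    def insert(self, clustering_key, value):
        bisect.insort(self.rows, (clustering_key, value))

    def range(self, lo, hi):
        # One binary search to find the start, then a contiguous slice.
        # chr(0x10FFFF) is a sentinel that sorts above any value string,
        # which makes the upper bound inclusive.
        i = bisect.bisect_left(self.rows, (lo,))
        j = bisect.bisect_right(self.rows, (hi, chr(0x10FFFF)))
        return self.rows[i:j]

store = {}  # partition key -> Partition (in Cassandra: token -> node)

def write(partition_key, clustering_key, value):
    store.setdefault(partition_key, Partition()).insert(clustering_key, value)

write("user:42", 1714000000, "login")
write("user:42", 1714000300, "click")
write("user:42", 1714003600, "logout")
write("user:7", 1714000100, "login")

# "Events for user 42 in a time window" touches exactly one partition.
print(store["user:42"].range(1714000000, 1714000300))
# -> [(1714000000, 'login'), (1714000300, 'click')]
```

User 7’s events live in a different partition entirely, so the query never scans them.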

Wide Rows#

Each partition can hold millions of rows. A user with five years of events has five years of timestamped entries in one partition. This is where the “wide” comes from. The column names themselves can carry semantic meaning, not just “column_1” or “column_2.”
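A rough sketch of what one wide row looks like (the event shapes here are made up for illustration): each entry in the partition is a map of named columns, and different entries can carry different columns.

```python
# One partition's worth of data: clustering key (timestamp) -> columns.
# Columns are sparse; each row carries only the names that apply to it.
partition = {
    1714000000: {"event": "login", "device": "ios"},
    1714000300: {"event": "purchase", "sku": "A-100", "amount": 49.99},
    1714003600: {"event": "logout"},
}

# Column names like "sku" and "device" carry semantic meaning per row;
# five years of a user's events is millions of such entries, all in
# this one partition.
print(partition[1714000300]["sku"])  # -> A-100
```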

```mermaid
graph TD
    A[Write: user_id=42, ts=1714000000, event=login] --> B[Partition Key: user_id=42]
    B --> C[Node responsible for partition 42]
    C --> D[Clustering key: sorted by timestamp]
    D --> E[Sequential read for time range queries]
    F[Query: user 42, last 24 hours] --> B
    style A fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style B fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style C fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style D fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style E fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
    style F fill:#000000,stroke:#00ff00,stroke-width:2px,color:#fff
```

The Data Modeling Difference#

In a relational database, you normalize and let queries figure out access patterns. In a wide-column store, you design for your query patterns first, denormalize aggressively, and accept that some queries become impossible without a secondary index or a separate table.
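Query-first modeling in miniature (table names and event shape are hypothetical): each access pattern gets its own table, and every write fans out to all of them. The duplication is the point.

```python
# Two "tables", one per query pattern. In a relational schema this
# would be a single normalized table plus indexes; here we denormalize.
events_by_user = {}  # partition key: user_id -> [(ts, event_type)]
events_by_type = {}  # partition key: event_type -> [(ts, user_id)]

def record(user_id, ts, event_type):
    # The same event is written twice so that both query patterns
    # are a single-partition lookup.
    events_by_user.setdefault(user_id, []).append((ts, event_type))
    events_by_type.setdefault(event_type, []).append((ts, user_id))

record(42, 1714000000, "login")
record(42, 1714000300, "click")
record(7, 1714000100, "login")

print(events_by_user[42])       # -> [(1714000000, 'login'), (1714000300, 'click')]
print(events_by_type["login"])  # -> [(1714000000, 42), (1714000100, 7)]
```

Both supported queries are direct lookups; an ad-hoc query like “clicks joined to purchases” has no table, which is exactly the trade described above.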

This trips up teams migrating from MySQL. They try to run ad-hoc queries. They try JOINs. That’s not what these stores are built for. You trade query flexibility for predictable, fast access along a specific dimension.

At Oracle#

We evaluated Cassandra for our 5G network event logging, storing session events per subscriber. The access pattern was always “events for subscriber X in the last hour.” MySQL with a composite index on (subscriber_id, event_time) worked at our scale (around 2 million rows), but query time grew as the table grew. The Cassandra data model gave constant-time reads for that query regardless of total event count. That property, not raw throughput, was what we actually needed.

What I’m Learning#

The partition key choice makes or breaks the data model. A bad partition key creates hot partitions where one node receives disproportionate traffic. High-cardinality keys distribute load evenly. Low-cardinality keys concentrate it.
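The cardinality effect is easy to see with a simple hash-based placement sketch (real Cassandra assigns Murmur3 token ranges; MD5-mod-N here is a stand-in with the same qualitative behavior):

```python
import hashlib
from collections import Counter

NODES = 4

def node_for(partition_key):
    # Hash the partition key and map it to one of NODES nodes.
    digest = hashlib.md5(str(partition_key).encode()).hexdigest()
    return int(digest, 16) % NODES

# High cardinality: one partition per user, writes spread evenly.
by_user = Counter(node_for(f"user:{i}") for i in range(10_000))

# Low cardinality: one partition per day, so every write that day
# hammers whichever node owns "day:2024-04-25".
by_day = Counter(node_for("day:2024-04-25") for _ in range(10_000))

print(by_user)  # roughly 2,500 writes per node
print(by_day)   # all 10,000 writes on one node: a hot partition
```

A common fix is to bucket the low-cardinality key, e.g. a compound partition key of (day, shard) with shard drawn from a small random range, at the cost of fanning reads out across the buckets.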

Have you hit hot partition problems, and what did the fix look like?