Manhattan has 50,000 restaurants. Rural Wyoming has 3 per county. A fixed-size grid wastes cells on empty space and overloads dense areas. Quadtrees adapt.
Posts for: #Data-Structures
Trie Data Structures: Prefix Search in Milliseconds
User types three characters and expects instant suggestions. A hash map can’t do prefix lookups. A trie can, in O(k) time where k is the query length.
Inverted Indexes: How Search Actually Works
A normal index maps documents to words. An inverted index maps words to documents. That reversal is why search is fast.
Content Fingerprinting: Detecting Near-Duplicates at Scale
Exact duplicates are easy. Near-duplicates are hard. SimHash turns documents into compact fingerprints where similar content produces similar hashes.
Approximate Counting: HyperLogLog and Count-Min Sketch
Counting unique items across billions of events. A HashSet needs gigabytes. HyperLogLog does it in 12KB. The trick is accepting a little error.
Bloom Filters: Definitely Not Here
Bloom filters skip unnecessary disk reads in LSM trees by saying ‘definitely not here’ with zero false negatives. Learn how Cassandra and RocksDB use them.
LSM Trees vs B-Trees: Write Fast or Read Fast
LSM Trees vs B-Trees: the write-fast or read-fast tradeoff. Learn when to use B-trees (MySQL) vs LSM trees (Cassandra) based on your database workload.