You split work evenly across 4 threads. Two finish in 10ms, two take 10 seconds. Half your CPU sits idle while the other half grinds. Work stealing fixes this.
Posts for: #Java
Delayed Message Delivery: Execute This in 30 Minutes
Send a reminder in 24 hours. Retry this job in 5 minutes. Expire this hold at midnight. Delayed execution is everywhere, and Thread.sleep isn’t the answer.
Leader Election: Picking One Node to Rule
Three nodes, one job. Without leader election, all three run it simultaneously. With leader election, exactly one does the work while the others stand by.
MapReduce: Processing Data That Won’t Fit on One Machine
Your dataset is 10TB. One machine can’t hold it, let alone process it. MapReduce splits the work across hundreds of machines with a deceptively simple API.
Trie Data Structures: Prefix Search in Milliseconds
User types three characters and expects instant suggestions. A hash map can’t do prefix lookups. A trie can, in O(k) time where k is the query length.
Inverted Indexes: How Search Actually Works
A normal index maps documents to words. An inverted index maps words to documents. That reversal is why search is fast.
Checkpointing: Resuming Long-Running Jobs Without Starting Over
A batch job runs for three hours and crashes at hour two. Without checkpointing, you restart from zero. With it, you lose ten minutes of work.
Content Fingerprinting: Detecting Near-Duplicates at Scale
Exact duplicates are easy. Near-duplicates are hard. SimHash turns documents into compact fingerprints where similar content produces similar hashes.
Priority Queues in Distributed Systems
FIFO queues treat every message equally. But urgent config updates shouldn’t wait behind a thousand bulk sync jobs. Priority queues fix this, if you handle starvation.
Reconciliation: When Your Systems Disagree
Your database says one thing. The external system says another. Reconciliation is how you find the drift before your users do.
State Machines: Making Distributed Workflows Predictable
Boolean flags and status strings create impossible states. An explicit state machine tells you exactly where a workflow is, what transitions are valid, and how to recover.
Optimistic vs Pessimistic Concurrency: Locks vs Versions
Two users update the same row. Pessimistic locking blocks one until the other finishes. Optimistic locking lets both try and fails the loser. Choosing wrong kills either throughput or correctness.
Two-Phase Commit: The Original Distributed Transaction
Two-phase commit guarantees atomicity across multiple databases. It also blocks everything if the coordinator dies. Here’s why microservices moved on.
Base62 Encoding: Turning Numbers into Short Strings
A 64-bit integer can run to 19 digits. Encode it in base62 and it fits in 11 characters. The math behind compact, URL-safe identifiers.
Distributed ID Generation: Snowflake and Friends
Auto-increment IDs break the moment you have more than one database. Snowflake IDs, UUIDs, and database sequences each solve this differently.
Event Aggregation: When 47 Notifications Become One
Showing every individual event overwhelms users. Grouping related events into summaries is a distributed systems problem disguised as a UX problem.
Pre-Signed URLs: Uploading Files Without Touching Your Servers
Routing file uploads through your API server is a scaling bottleneck. Pre-signed URLs let clients upload directly to object storage.
Cursor-Based Pagination: Why Offset Breaks at Scale
OFFSET 50000 makes MySQL scan 50,000 rows just to skip them. Cursor pagination stays fast no matter how deep you go.
WebSockets vs Long Polling: Choosing a Real-Time Transport
Your client needs real-time updates from the server. HTTP wasn’t built for this. Here’s how long polling, SSE, and WebSockets solve it differently.
Structured Logging in Distributed Systems
Grep through 50 log files to find one request. Or use structured logging with correlation IDs and find it in seconds.
Making Consumers Idempotent
Exactly-once delivery is impossible across boundaries. Here’s the pattern that actually works: at-least-once delivery with idempotent consumers.
Connection Pooling: Why Opening Connections Is Expensive
The hidden cost of database connections. How connection pools work, why they matter, and how to size them without guessing.
The In-Memory Trap: Why Objects Are Slow
In-memory doesn’t always mean fast. How shifting from object-based to vector-based storage (Apache Arrow) delivered a 13x performance boost.
Rate Limiting: Token Bucket vs Leaky Bucket
Protecting services from overload with rate limiting. Token bucket and leaky bucket algorithms explained with Java implementations and real-world trade-offs.
Backpressure: When Consumers Can’t Keep Up
Handling slow consumers in distributed systems. Queue growth, memory exhaustion, and strategies for applying backpressure: rejection, rate limiting, and flow control.
Retry Strategies: Exponential Backoff and Jitter
How to retry failed requests without overwhelming servers. Exponential backoff, jitter, and when to give up. Java implementations and real-world patterns.
Idempotency: Why Retries Need It
How to make operations safe to retry. Idempotency keys, database patterns, and why retrying non-idempotent operations causes data corruption.
Circuit Breakers: Failing Fast to Stay Alive
How circuit breakers prevent cascading failures in microservices. State transitions, Java implementation with Resilience4j, and real-world thresholds.
Load Balancing Strategies: Picking the Right Server
Comparing load balancing algorithms: Round Robin, Least Connections, Weighted Round Robin, and IP Hash. Java implementations and real-world trade-offs.
My First ‘Self-Made’ Swing Application
Building my first Java Swing application: Click Tester. Learn how to create a simple GUI app that tests clicking speed using JButton, JLabel, and Stopwatch.