Input Validation and Abuse Prevention in Distributed Systems
You build a public API. Users can submit content. Day one, someone submits malicious payloads. Day two, a bot floods your endpoint with 10,000 requests per second. Day three, spam starts showing up in your system.
Every public-facing write path needs defense. Not just authentication. Actual input validation and abuse prevention.
The Layered Defense Pattern
Don’t put all your validation in one place. Layer it.
Layer 1: Syntactic validation. Is the input well-formed? Correct length, valid characters, expected format. Fast, cheap, reject garbage immediately.
public void validate(CreateRequest request) {
    String content = request.getContent();
    if (content == null || content.isEmpty()) {
        throw new BadRequestException("Content is required");
    }
    if (content.length() > 10_000) {
        throw new BadRequestException("Content too long");
    }
    // Guard against a missing URL before dereferencing its scheme
    if (request.getUrl() == null || !ALLOWED_SCHEMES.contains(request.getUrl().getScheme())) {
        throw new BadRequestException("Invalid URL scheme");
    }
}
Layer 2: Rate limiting. Per-user, per-IP, per-endpoint. A legitimate user submits a few requests per minute. A bot submits thousands per second. Rate limiting catches the obvious abuse without complex logic.
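Per-key rate limiting doesn't need much machinery to start. Here's a minimal token-bucket sketch; the class and method names are illustrative, not from any framework:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal per-key token bucket: each key (user ID, IP, endpoint) gets
// `capacity` tokens that refill at `refillPerSecond`. A request spends
// one token; an empty bucket means the request is rejected.
public class TokenBucketLimiter {
    private final long capacity;
    private final double refillPerSecond;
    private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();

    public TokenBucketLimiter(long capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
    }

    public synchronized boolean allow(String key, long nowNanos) {
        Bucket b = buckets.computeIfAbsent(key, k -> new Bucket(capacity, nowNanos));
        // Refill based on elapsed time, capped at capacity
        double refilled = (nowNanos - b.lastRefillNanos) / 1e9 * refillPerSecond;
        b.tokens = Math.min(capacity, b.tokens + refilled);
        b.lastRefillNanos = nowNanos;
        if (b.tokens >= 1.0) {
            b.tokens -= 1.0;
            return true;
        }
        return false;
    }

    private static final class Bucket {
        double tokens;
        long lastRefillNanos;
        Bucket(double tokens, long lastRefillNanos) {
            this.tokens = tokens;
            this.lastRefillNanos = lastRefillNanos;
        }
    }
}
```

In production you'd usually back this with Redis or your gateway's built-in limiter so limits hold across instances, but the bucket math is the same.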
Layer 3: Reputation and blocklists. Check the content against known bad patterns. DNS blocklists, keyword filters, regex patterns. More expensive than syntactic checks but catches targeted abuse.
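The blocklist layer can start as simple as a set of compiled patterns. A sketch, with placeholder patterns; returning the matched rule instead of a bare boolean makes flags auditable later:

```java
import java.util.List;
import java.util.regex.Pattern;

// Checks content against known-bad patterns, compiled once up front
// (per-request regex compilation is too slow for this layer).
public class BlocklistChecker {
    private final List<Pattern> patterns;

    public BlocklistChecker(List<String> regexes) {
        this.patterns = regexes.stream().map(Pattern::compile).toList();
    }

    /** Returns the first matching pattern, or null if the content is clean. */
    public String firstMatch(String content) {
        for (Pattern p : patterns) {
            if (p.matcher(content).find()) {
                return p.pattern();
            }
        }
        return null;
    }
}
```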
Async Deep Scanning
Some checks are too slow for the request path. DNS resolution, third-party reputation APIs, content analysis. Run these asynchronously after accepting the content.
@Async
public void deepScan(String contentId, String content) {
    boolean safe = reputationService.check(content);
    if (!safe) {
        // Flag for review; don't delete automatically
        contentRepo.flag(contentId, "REPUTATION_CHECK_FAILED");
        // Publish to the dead letter queue for manual review
        dlqPublisher.publish(new FlaggedContent(contentId, content));
    }
}
Accept the content immediately (good user experience), scan in the background, flag or remove if it fails the deep check. The dead letter queue pattern works well here: flagged content goes to a review queue where humans or a secondary system can make the final call.
The False Positive Problem
Aggressive validation blocks bad content. It also blocks good content. A legitimate user whose content matches a blocklist pattern gets rejected. They get frustrated. They leave.
The trade-off: strict validation reduces abuse but increases false positives. Lenient validation reduces false positives but lets abuse through.
Most systems err toward lenient on the synchronous path (accept content, rate limit aggressively) and strict on the async path (scan and flag). This way, good users aren’t blocked, but bad content gets caught eventually.
At Oracle, we had an API that accepted network function configuration payloads. No validation initially. A misconfigured client sent malformed JSON that passed deserialization but caused downstream processing failures. We added layered validation: schema validation first (fast, reject malformed), then business rule validation (slower, check constraints), then async consistency checks against the running config. Backpressure on the validation pipeline kept it from overwhelming the consistency checker during bulk updates.
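Backpressure in a pipeline like that amounts to a bounded hand-off between stages: when the consistency checker falls behind, submitters get rejected or blocked instead of piling up unbounded work. A minimal sketch of the idea (the names here are illustrative, not from the actual system):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Bounded hand-off between pipeline stages. offer() with a timeout
// pushes back on producers instead of letting the queue grow unbounded.
public class ValidationStage<T> {
    private final BlockingQueue<T> queue;

    public ValidationStage(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns false if the downstream stage is saturated (backpressure). */
    public boolean submit(T item, long timeoutMillis) {
        try {
            return queue.offer(item, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Called by the consumer thread (e.g. the consistency checker). */
    public T take() {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for work", e);
        }
    }
}
```

A rejected `submit` is the signal to retry later or fail the bulk update loudly; either beats silently overwhelming the checker.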
What I’m Learning
Input validation isn’t a single check. It’s a pipeline. Each layer catches different problems at different costs. The temptation is to put everything in the request path. But slow validation in the hot path kills latency. Fast checks synchronously, deep checks asynchronously.
The rule I follow: never trust input from outside your system boundary. Inside your microservices, trust the data (your own code wrote it). At the edge, validate everything.
What’s your approach to input validation? Do you validate at every service boundary or just at the edge?