<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Machine-Learning on Sohil Ladhani Blog</title><link>https://sohilladhani.com/blog/tags/machine-learning/</link><description>Recent content in Machine-Learning on Sohil Ladhani Blog</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Wed, 22 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://sohilladhani.com/blog/tags/machine-learning/index.xml" rel="self" type="application/rss+xml"/><item><title>Feature Stores</title><link>https://sohilladhani.com/blog/post/2026-04-22-feature-stores/</link><pubDate>Wed, 22 Apr 2026 00:00:00 +0000</pubDate><guid>https://sohilladhani.com/blog/post/2026-04-22-feature-stores/</guid><description>You train a model using yesterday&amp;rsquo;s data. You serve it using today&amp;rsquo;s data. The feature computation logic is slightly different between the two. The model degrades silently and you spend a week figuring out why.
The Training-Serving Skew Problem. ML models are trained on offline batches: historical data, features computed via Spark jobs, labels aggregated over time. At serving time, features are computed online: live data, lower latency budget, different code path.</description></item><item><title>Embedding Vectors and ANN Search</title><link>https://sohilladhani.com/blog/post/2026-04-21-embedding-vectors-and-ann-search/</link><pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate><guid>https://sohilladhani.com/blog/post/2026-04-21-embedding-vectors-and-ann-search/</guid><description>&amp;ldquo;Find the 10 most similar items to this one&amp;rdquo; sounds simple. With millions of items represented as 256-dimensional vectors, exact search is too slow to be useful in production.
What Embeddings Are. An ML model maps an item (a product, a document, a user&amp;rsquo;s history) to a dense numeric vector. The geometry of that vector space encodes semantic similarity: similar items land close together. You train the model on interaction data and the embeddings learn to represent &amp;ldquo;things that users treat similarly.&amp;rdquo;</description></item><item><title>Collaborative Filtering</title><link>https://sohilladhani.com/blog/post/2026-04-20-collaborative-filtering/</link><pubDate>Mon, 20 Apr 2026 00:00:00 +0000</pubDate><guid>https://sohilladhani.com/blog/post/2026-04-20-collaborative-filtering/</guid><description>You don&amp;rsquo;t know what a user wants. But you know what people like them have wanted. That&amp;rsquo;s the intuition behind collaborative filtering.
The Two Approaches. User-based CF finds users similar to you, then recommends what they liked. Item-based CF finds items similar to what you&amp;rsquo;ve already liked. Item-based is generally more stable because user behavior shifts rapidly (you might buy a couch once), while item similarity changes slowly (a couch is similar to other furniture regardless of who buys it).</description></item></channel></rss>