Back in 2021, we found ourselves wrestling with an intriguing challenge: how to build semantic search that could actually work in our production e-commerce environments, where human control and merchandising are the key differentiators our customers expect from us. The promise was compelling, but the practical implementation questions were daunting.
When you type "cozy sweater for winter" into a search box, what are you really looking for? Traditional search engines parse your keywords and match them against product titles and descriptions. But what about that perfect chunky knit cardigan that never mentions "cozy" in its description? What about understanding that "winter" might mean different things in different contexts? This is exactly the kind of understanding semantic search delivers.
Enter vectors
The breakthrough comes from neural networks that can encode products into high-dimensional vector representations. Think of it as creating a unique mathematical fingerprint for each product based on its visual features, textual descriptions, and behavioral patterns.
Here's what's fascinating: once you encode a product as a vector (typically 256 or 512 dimensions), finding similar products becomes a mathematical problem. Products that are conceptually similar end up with vectors that are close together in this high-dimensional space.
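As a minimal sketch of what "close together" means mathematically: the standard measure is cosine similarity, the dot product of two vectors divided by the product of their magnitudes. The toy 4-dimensional vectors below stand in for real 256- or 512-dimensional embeddings, and `cosine_similarity` is our own helper, not a library call.

```python
import math

def cosine_similarity(a, b):
    # Dot product over the product of magnitudes: 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "fingerprints" (real embeddings would be 256 or 512 dimensions).
chunky_cardigan = [0.9, 0.8, 0.1, 0.2]
wool_sweater    = [0.85, 0.75, 0.15, 0.25]
swim_shorts     = [0.1, 0.05, 0.9, 0.8]

print(cosine_similarity(chunky_cardigan, wool_sweater))  # close to 1.0
print(cosine_similarity(chunky_cardigan, swim_shorts))   # much lower
```

Conceptually similar products (the cardigan and the sweater) score near 1.0, while unrelated ones score far lower, without either description sharing a single keyword.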
The process is straightforward. First, feed the neural network massive amounts of product data, customer behavior patterns, and images; the network learns patterns and relationships we might never have explicitly programmed. Next, transform each product into a dense vector representation that captures its essential characteristics. Finally, convert user queries into the same vector space and find the closest matches using similarity calculations.
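The key idea in that last step is that products and queries must be encoded into the *same* vector space by the *same* model. The sketch below illustrates the loop with a deliberately trivial `encode` function (a deterministic bag-of-words hash) standing in for a trained neural model; a real system would call its embedding model here instead.

```python
import math

def encode(text, dims=8):
    # Toy stand-in for a neural encoder: hash each word into a bucket,
    # then normalize to unit length so a dot product equals cosine similarity.
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Products and queries share one vector space because they share one encoder.
catalog = {
    "chunky knit cardigan": encode("chunky knit cardigan warm wool"),
    "linen beach shirt":    encode("linen beach shirt light summer"),
}

query_vec = encode("warm wool cardigan")
best = max(catalog, key=lambda name: sum(q * p for q, p in zip(query_vec, catalog[name])))
print(best)
```

Swap the toy `encode` for a real embedding model and the surrounding logic stays the same: encode once at index time, encode the query at search time, rank by similarity.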
Integrating the two approaches
The real challenge isn't creating these vectors. It's integrating them into a production search system that needs to handle filtering, faceting, pagination, merchandising rules, and real-time performance at scale.
This is where traditional search engines become incredibly powerful. Rather than building a separate vector database and trying to merge results from multiple systems, you can store your AI-generated vectors directly alongside traditional product data in your existing search infrastructure.
The composability this creates is remarkable. You can write queries that pre-filter products based on traditional criteria (price, brand, availability), score and rank the filtered results using vector similarity, apply merchandising rules and business logic, calculate facets and aggregations in a single request, and handle pagination efficiently.
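To make that composability concrete, here is a hypothetical sketch of a single request that pre-filters, vector-scores, facets, and paginates in one pass. It operates on plain Python dictionaries rather than a real search engine's query DSL, but the ordering of the stages is the point: the cheap attribute filter runs before any vector math.

```python
products = [
    {"id": 1, "brand": "Acme",  "price": 40, "in_stock": True,  "vector": [0.9, 0.1]},
    {"id": 2, "brand": "Blanc", "price": 80, "in_stock": True,  "vector": [0.8, 0.2]},
    {"id": 3, "brand": "Acme",  "price": 30, "in_stock": False, "vector": [0.95, 0.05]},
    {"id": 4, "brand": "Blanc", "price": 55, "in_stock": True,  "vector": [0.1, 0.9]},
]

def search(products, query_vec, max_price, page=0, page_size=2):
    # 1. Pre-filter on traditional attributes before any vector math runs.
    candidates = [p for p in products if p["price"] <= max_price and p["in_stock"]]
    # 2. Rank only the survivors by vector similarity (dot product here).
    ranked = sorted(candidates,
                    key=lambda p: sum(q * v for q, v in zip(query_vec, p["vector"])),
                    reverse=True)
    # 3. Facets are computed over the same filtered set, in the same request.
    facets = {}
    for p in candidates:
        facets[p["brand"]] = facets.get(p["brand"], 0) + 1
    # 4. Pagination is just a slice of the ranked list.
    start = page * page_size
    return ranked[start:start + page_size], facets

hits, facets = search(products, query_vec=[1.0, 0.0], max_price=60)
```

In a production engine these four stages would be clauses of one compound query rather than Python loops, which is exactly what avoids merging results across separate systems.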
What this looks like in practice
The architecture that emerges is surprisingly elegant. Product data flows through the AI model to generate vectors, then gets indexed alongside traditional attributes. User input gets encoded using the same AI model, then the search engine executes compound queries that combine traditional filtering with vector similarity scoring. Results come back already ranked, filtered, and paginated with no complex result merging required.
What makes this approach particularly powerful is how it handles different search modalities. "Floral summer dress" gets encoded and matched against products based on meaning, not just literal text matches. Upload an image or click "find similar" on a product, and the system uses the product's existing vector to find visually similar items. If the AI model incorporates user behavior, the vectors themselves become personalized, enabling recommendation systems that understand individual preferences.
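The "find similar" modality is worth spelling out because it needs no encoding step at all: the anchor product's stored vector simply becomes the query. A minimal sketch, assuming products are indexed with their vectors as above:

```python
catalog = [
    {"id": "sku-1", "title": "chunky knit cardigan", "vector": [0.9, 0.8, 0.1]},
    {"id": "sku-2", "title": "wool turtleneck",      "vector": [0.85, 0.7, 0.2]},
    {"id": "sku-3", "title": "swim shorts",          "vector": [0.1, 0.1, 0.9]},
]

def find_similar(catalog, anchor_id, k=2):
    # No query text needed: the anchor product's stored vector *is* the query.
    anchor = next(p for p in catalog if p["id"] == anchor_id)
    others = [p for p in catalog if p["id"] != anchor_id]
    sim = lambda p: sum(a * b for a, b in zip(anchor["vector"], p["vector"]))
    return sorted(others, key=sim, reverse=True)[:k]

similar = find_similar(catalog, "sku-1")
print(similar[0]["title"])
```

Image search works the same way, except the vector comes from encoding the uploaded image with the same model that produced the product vectors.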
The performance reality
Of course, vector operations aren't free. Calculating similarity across millions of products requires careful optimization. CPU-optimized instances often outperform memory-optimized ones for vector calculations, contrary to what you might expect. Pre-filtering with traditional attributes before applying vector scoring dramatically improves performance. How you structure your search indices and shards directly impacts vector query performance at scale.
Beyond similarity
Here's where the integration really shines: you're not just doing similarity search. You're building a complete product discovery engine that can seamlessly blend AI-driven relevance with business requirements.
Need to boost certain brands? Apply query-time boosts. Want to filter out out-of-stock items? Add a filter clause to your compound query. Need to implement complex business rules while maintaining semantic relevance? The composable query structure makes it straightforward.
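As an illustration of how business rules layer on top of semantic relevance, here is a hedged sketch: it takes (product, semantic_score) pairs, drops out-of-stock items, multiplies promoted brands' scores by a query-time boost factor, and re-ranks. The 1.5 boost value is an arbitrary example, and in a real engine this logic would live inside the compound query rather than in application code.

```python
def apply_business_rules(hits, boosted_brands, boost=1.5):
    # Merchandising layered on top of AI relevance at query time:
    # hard-filter stock, multiplicatively boost promoted brands, re-rank.
    results = []
    for product, semantic_score in hits:
        if not product["in_stock"]:
            continue
        factor = boost if product["brand"] in boosted_brands else 1.0
        results.append((product, semantic_score * factor))
    return sorted(results, key=lambda pair: pair[1], reverse=True)

hits = [
    ({"brand": "Acme",  "in_stock": True},  0.80),
    ({"brand": "Blanc", "in_stock": True},  0.90),
    ({"brand": "Acme",  "in_stock": False}, 0.99),
]
reranked = apply_business_rules(hits, boosted_brands={"Acme"})
```

Note that the boosted Acme item overtakes the semantically stronger Blanc item, while the out-of-stock product never reaches the user: business rules and semantic relevance compose rather than compete.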
Several key insights emerge from implementing this approach in production. Start simple with basic vector similarity and gradually add complexity. The AI model quality matters more than sophisticated query orchestration. Traditional search metrics like click-through rates remain important, but you'll need new ways to evaluate semantic relevance. Users won't tolerate slow search, regardless of how semantically brilliant your results are. Always optimize for the 95th percentile response time.
And then?
What's exciting about this approach is how it opens up entirely new search experiences. Visual search becomes natural. Cross-category discovery happens organically. Products can be found through abstract concepts rather than specific keywords.
But perhaps most importantly, it creates a foundation that can evolve with advances in AI. As models get better at understanding context, intent, and preference, the search experience improves automatically without requiring fundamental architectural changes.
The combination of neural network embeddings and robust search infrastructure creates something greater than the sum of its parts: a search engine that finally understands what customers are actually looking for, not just what they're typing.
If you're building e-commerce search today, the question isn't whether to incorporate AI-powered semantic search. It's how quickly you can get started and begin learning from real user behavior.