Vespa AI and Surpassing the Limits of Vector Search

Author: softwareengineeringdaily.com | Date: May 12, 2026 | Duration: 38:34
Vector search has become a foundational tool in modern search and retrieval systems, including the RAG pipelines that power many AI applications. However, the demands on retrieval systems are growing more sophisticated, revealing the limits of relying on a single vector similarity score.

Vespa is a popular open source search and data serving engine. Central to Vespa’s architecture is tensor-based retrieval, an approach that represents data as tensors rather than simple vectors. Tensor-based retrieval enables richer mathematical operations and more flexible ranking functions that can overcome the limitations of a single vector similarity score.

Radu Gheorghe is a software engineer at Vespa with a background spanning nearly 12 years of consulting and training on Elasticsearch and Solr. In this episode, Radu joins Sean Falconer to discuss why vector similarity alone falls short in production, how tensor-based retrieval generalizes to support richer ranking functions, the trade-offs in chunking and multi-stage re-ranking architectures, and where AI search is headed next.

Full Disclosure: This episode is sponsored by Vespa.

Sean’s been an academic, startup founder, and Googler. He has published works covering a wide range of topics from AI to quantum computing. Currently, Sean is an AI Entrepreneur in Residence at Confluent where he works on AI strategy and thought leadership. You can connect with Sean on LinkedIn.

Sponsorship inquiries: sponsor@softwareengineeringdaily.com

For anyone curious about how the code running our world actually gets built, Software Engineering Daily offers a clear and consistent look behind the curtain. This isn't about hype cycles or surface-level news; it's a deep, technical conversation with the engineers, architects, and thinkers who are shaping our digital infrastructure. Each episode focuses on a specific technology, practice, or problem, breaking down complex systems into understandable parts. You'll hear detailed discussions on everything from database architectures and programming language design to the organizational challenges of scaling teams and the real-world trade-offs made in production systems. Hosted by softwareengineeringdaily.com, the podcast serves as a reliable source for developers who want to stay informed and inspired, translating the rapid pace of technological change into substantive, lasting knowledge. It’s for professionals who believe that understanding the "how" and "why" is just as important as knowing the "what." By dedicating time to thorough exploration, this podcast provides context that shorter formats simply cannot, making it an essential resource for anyone building the future, one line of code at a time. Tune in to hear unfiltered insights from the people on the front lines, discussing the tools and decisions that define modern software engineering.
Language: en-us | Episodes: 100

Software Engineering Daily
Podcast Episodes
FastMCP with Adam Azzam and Jeremiah Lowin

Duration: 1:07:03
The Model Context Protocol, or MCP, gives developers a common way to expose tools, data, and capabilities to large language models, and it has quickly become an important standard in agentic AI. FastMCP is an open source…
SED News: OpenCode, AI Code vs. Shipped Code, and the LiteLLM Breach

Duration: 58:42
SED News is a monthly podcast from Software Engineering Daily where hosts Gregor Vand and Sean Falconer unpack the biggest stories shaping software engineering, Silicon Valley, and the broader tech industry. In this epis…
FreeBSD with John Baldwin

Duration: 1:03:50
FreeBSD is one of the longest-running and most influential open-source operating systems in the world. Born from the Berkeley Software Distribution in the early 1990s, it has powered everything from high-performan…
Cilium, eBPF, and Modern Kubernetes Networking with Bill Mulligan

Duration: 59:29
Modern cloud-native systems are built on highly dynamic, distributed infrastructure where containers spin up and down constantly, services communicate across clusters, and traditional networking assumptions break down. L…
Games That Push Back with Bennett Foddy

Duration: 1:08:33
Bennett Foddy is a legendary game designer known for creating wholly distinctive games such as QWOP, Getting Over It with Bennett Foddy, and the recently released Baby Steps. He’s also a former professor at the NYU Game…
Prettier and Opinionated Code Formatting with James Long

Duration: 51:07
Developer tooling shapes how software gets written day to day, but the best tools often disappear into the background once they succeed. Formatting, linting, and build systems can either create friction and endless debat…
Skate Story with Sam Eng

Duration: 58:07
Skateboarding games have long balanced technical precision with a sense of flow and expression, but Skate Story takes the genre in a radically different direction. It has a distinct vaporwave vibe and blends fluid skate…
DeepMind’s RAG System with Animesh Chatterji and Ivan Solovyev

Duration: 40:57
Retrieval-augmented generation, or RAG, has become a foundational approach to building production AI systems. However, deploying RAG in practice can be complex and costly. Developers typically have to manage vector datab…
Reinventing the Python Notebook with Akshay Agrawal

Duration: 49:04
Interactive notebooks were popularized by the Jupyter project and have since become a core tool for data science, research, and data exploration. However, traditional, imperative notebooks often break down as projects gr…
Organizational Context for AI Coding Agents with Dennis Pilarinos

Duration: 49:21
AI agents have taken on a growing share of software development work, so much so that the hardest problems are shifting away from code generation toward something new: context. The challenge is now contextualizing why s…