Postgres for Search + Analytics with Philippe Noël

Author: Software Huddle June 25, 2024 Duration: 42:39
ParadeDB is Postgres for search and analytics. As Postgres continues to rise in popularity, the "Just Use Postgres" movement is getting stronger and stronger. Yet there are still things that standard Postgres doesn't do well, and advanced search and analytics functionality is near the top of the list.

The ParadeDB team provides a pair of Postgres extensions. The first, pg_search, brings a more performant and full-featured search experience to Postgres. It uses Tantivy (think: Lucene, but in Rust) as the search engine and provides advanced ranking and querying functionality. The second, pg_lakehouse, allows you to perform large analytical queries over object store data. Together, these provide compelling new features wrapped in a familiar operational package.

Philippe Noël is one of the founders of ParadeDB. In this episode, we talk about why these extensions were needed, why the "Just Use Postgres" movement exists, and where ParadeDB fits in your architecture.

Follow Philippe: https://x.com/philippemnoel
Follow Alex: https://x.com/alexbdebrie
Follow Sean: https://x.com/seanfalconer
Check Out ParadeDB: https://www.paradedb.com/

Timestamps
01:50:18 Intro
04:30:23 Where does search on Postgres fall down?
05:33:09 BM25 and TF-IDF
07:23:03 Postgres Tipping Point
10:05:08 Tantivy
11:50:14 Tantivy vs Lucene
13:07:06 vs ZomboDB
15:35:21 Just Use Postgres for Everything?
17:57:17 Developing a Postgres Extension
19:26:03 Arvid's Problem
20:27:08 Postgres and Log Data
23:28:01 Separate OLTP and Search Instances
28:32:01 Search Nodes vs OLTP Nodes
30:02:12 ParadeDB Analytics
35:27:05 Hosted Service
39:03:15 Stumbling upon the Idea
39:51:22 Community
41:01:15 Getting Started with ParadeDB

Every week on Software Huddle, Alex DeBrie and Sean Falconer sit down with a different expert from across the tech landscape. The conversations are less about quick tips and more about substantive discussions, digging into the real challenges and decisions behind building software, launching products, and navigating the industry's constant shifts. You'll hear from practitioners who have been in the trenches, offering perspectives that blend deep technical knowledge with hard-won business and entrepreneurial experience. Alex brings his specialized expertise as the author of The DynamoDB Book and an AWS Data Hero, while Sean contributes a unique viewpoint shaped by over two decades as an engineer, founder, and marketing executive, recognized as a Snowflake Data Superhero. Together, they create a space where complex topics in software development and technology trends become accessible and genuinely engaging. This podcast is for anyone who wants to move beyond surface-level news and understand the "why" behind the tools and strategies shaping our digital world. Tune in for a thoughtful huddle that feels more like a candid conversation between colleagues than a formal interview.
Language: en-us Episodes: 79

Software Huddle
Podcast Episodes
Navigating Large Language Models with Vino Duraisamy from Snowflake

Duration: 59:42
In this episode, we spoke with Vino Duraisamy, Developer advocate at Snowflake. Vino has been working as a data and AI engineer for her entire career across companies like Apple, Treeverse, and now Snowflake. And in this…
AGI is Surely Coming with Former Snowflake CEO Bob Muglia

Duration: 59:14
Today we have the former CEO of Snowflake, a 23 year veteran of Microsoft, Bob Muglia on the show. In this interview, we discuss Bob's book, Datapreneurs, which takes you on a journey about the people behind the first re…
reInvent BTS, Sam Altman, SEC on Solarwinds, Apple RCS, and more

Duration: 51:19
Our special episode is back, and we have a special guest this time. Join Sean, Alex & Merritt in this fun conversation. Timestamps: 00:00 Introduction 01:19 What is a CISO 08:10 Balance of Power 13:50 reInvent BTS 19:45…
AI-driven Database Cache with Ben Hagan from PolyScale

Duration: 57:10
PolyScale is a database cache, specifically designed to cache just your database. It is completely Plug and Play and it allows you to scale a database without a huge amount of effort, cost, and complexity. PolyScale curr…
Building for Scale with Mario Žagar from Infobip

Duration: 50:00
In this episode, we spoke with Mario Žagar, a Distinguished Engineer at Infobip. Infobip is a tech unicorn based out of Croatia that is a global leader in omnichannel communication, bootstrapping its way to a staggering…
Distributed Financial Databases with Joran Dirk Greef of TigerBeetle

Duration: 1:03:39
In this episode we spoke with Joran Dirk Greef, who's the co-founder at TigerBeetle. TigerBeetle is a Financial Transactions Database that's focused on correctness and safety while hitting orders of magnitude more perfor…
First Year as a Startup Founder and CEO with Nucleus's Evis Drenova

Duration: 44:42
In this episode, we spoke with Evis Drenova, CEO and co-founder of Nucleus, a Y Combinator graduate from 2022 focused on making it easy to deploy, build, and manage on Kubernetes. Evis left Skyflow, where he was one of t…
Architecting Real-time Analytics with Dhruba Borthakur of Rockset

Duration: 1:09:30
In this episode, we spoke with Dhruba Borthakur, the CTO and co-founder at Rockset. Rockset is a search and analytics database hosted on the cloud. Dhruba was the founding engineer of the RocksDB project at Fac…