Decodable Streaming with Eric Sammer

Author: Data Archives - Software Engineering Daily
Date: June 1, 2022
Duration: 44:58
Streaming data platforms like Kafka, Pulsar, and Kinesis are now common in mainstream enterprise architectures, providing low-latency real-time messaging for analytics and applications. However, stream processing – the act of filtering, transforming, or analyzing the data inside the messages – is still an exercise left to the receiving microservice or datastore, a custom programming exercise

Dive deep into the conversations that shape how we build and understand complex systems with Data Archives - Software Engineering Daily. This podcast pulls from a rich library of technical discussions, each one a focused exploration of the specific tools, architectures, and challenges that define modern software engineering. Rather than surface-level news, these episodes offer sustained, thoughtful dialogues with the engineers and thinkers who are working on the front lines of data infrastructure, distributed systems, and emerging technologies. You'll hear detailed breakdowns of real-world problems and the nuanced solutions teams are implementing, providing a practical sense of how theoretical concepts translate into production code and resilient platforms. The archive serves as an enduring resource, whether you're looking to grasp the fundamentals of a new database or understand the intricate trade-offs in a system design. Tune in for a consistently substantive listen that treats software engineering with the depth and seriousness it deserves, all through the unfiltered lens of expert conversation.
Language: en-us
Episodes: 100

Data Archives - Software Engineering Daily
Podcast Episodes
Accessing Data at Scale with Justin Borgman

Duration: 46:18
The Presto/Trino project makes distributed querying easier across a variety of data sources. As the need for machine learning and other high volume data applications has increased, the need for support, tooling, and clou…
Building on the Data Cloud with Torsten Grabs

Duration: 40:03
Building and managing data-intensive applications has traditionally been costly and complex, and has placed an operational burden on developers to maintain as their organization scales. Today's developers, data scientist…
Serverless Clickhouse for Developers with Jorge Sancha

Duration: 35:14
Data analytics technology and tools have seen significant improvements in the past decade. But it can still take weeks to prototype, build, and deploy new transformations and deployments, usually requiring considerable e…
Data Infrastructure for Finance

Duration: 54:26
Data is becoming a bank’s biggest asset. These complex enterprises have a huge opportunity ahead – to transform themselves to become a trusted hub of a much broader data ecosystem that goes beyond the financial industry…
Faking Data Using Tonic.ai with Ian Coe and Adam Kamor

Duration: 46:49
Ian Coe, CEO; Adam Kamor, Head of Engineering. Companies that gather data about their users have an ethical obligation and legal responsibility to protect the personally identifiable information in their dataset. Ideally, de…
Couchbase with Ravi Mayuram

Duration: 30:08
Couchbase is a distributed NoSQL cloud database. Since its creation, Couchbase has expanded into edge computing, application services, and most recently, a database-as-a-service called Capella. Couchbase started as an in…
Data Delivery with Naqeeb Memon

Duration: 28:14
Data-as-a-service is a company category that is not as common as API-as-a-service, software-as-a-service, or platform-as-a-service. In order to vend data, a data-as-a-service provider needs to define how that data w…
Data Labeling with Michael Malyuk

Duration: 41:51
Data labeling allows machine learning algorithms to find patterns among the data. There are a variety of data labeling platforms that enable humans to apply labels to this data and ready it for algorithms. Heartex is a d…
Pinot and StarTree with Chinmay Soman

Duration: 44:17
Real-time analytics are difficult to achieve because large amounts of data must be integrated into a data set as that data streams in. As the world moved from batch analytics powered by Hadoop into a norm of “real-time”…
Data Loss Prevention with Yasir Ali

Duration: 40:55
Data loss can occur when large data sources such as Slack or Google Drive get leaked. In order to detect and avoid leaks, a data asset graph can be built to understand the risks of a company environment. Polymer is a dat…