Measuring LLMs with Jodie Burchell

Author: Carl Franklin and Richard Campbell
April 3, 2025
Duration: 1:00:44
How do you measure the quality of a large language model? Carl and Richard talk to Dr. Jodie Burchell about her work measuring large language models for accuracy, reliability, and consistency. Jodie talks about the variety of benchmarks that exist for LLMs and the problems they have. A broader conversation about quality digs into the idea that LLMs should be targeted to the particular topic area they are being used for - often, smaller is better! Building a good test suite for your LLM is challenging but can increase your confidence that the tool will work as expected.

Hosted by Carl Franklin and Richard Campbell, .NET Rocks! is a long-running conversation with the people building the future of software. This isn't a dry lecture; it's a lively, technical deep dive where two seasoned developers explore the vast ecosystem around Microsoft .NET, Azure, and modern development practices with a diverse roster of expert guests. Each episode feels like you're pulling up a chair in a room full of brilliant minds, listening to unfiltered discussions about real-world coding challenges, architectural patterns, and the tools that shape our daily work. You'll hear practical advice, war stories from the trenches, and forward-looking insights that go far beyond the documentation. Tuning into this podcast means connecting with a community of professionals who are as passionate about the craft as you are, offering perspectives that can transform how you approach your next project. Whether you're deep into C# or just curious about cloud-native development, these conversations provide a valuable blend of knowledge, humor, and genuine enthusiasm for technology.
Language: English
Episodes: 1000

.NET Rocks!
Podcast Episodes
Fixing Websites with RemoteDebug with Kenneth Auchenberg

Duration: 46:25
How do you debug browser problems in your web apps? Carl and Richard talk to Kenneth Auchenberg about RemoteDebug, an open source tool to bring a common debugging solution across browsers. And Ken really means across bro…
DevOps in a Windows World with Jon Arild Tørresdal

Duration: 58:40
What's missing from the DevOps story in the Windows world? While at NDC in Oslo, Carl and Richard sat down with Jon Arild Tørresdal to talk about his struggles to have an effective DevOps practice in his organization usi…
Lean Functional with Bryan Hunter

Duration: 1:01:07
Isn't all functional programming lean? What does that even mean? While at NDC, Carl and Richard chatted with Bryan Hunter about lean principles and how they apply to functional programming. After a quick debate around th…
Developing ASP.NET on Linux with Mark Rendle

Duration: 1:04:53
So what does it take to develop ASP.NET web sites on Linux? While at NDC in Oslo, Carl and Richard sat down with Mark Rendle to talk through the ins and outs of ASP.NET on Linux. And we're not just talking servers either…
Building a Compiler with Philip Laureano

Duration: 54:06
Why in the world would you want to build a compiler? While at NDC, Carl and Richard talked to Philip Laureano about why he makes compilers! Philip starts out talking how building compilers helps you think about parsers a…
No Estimates with Woody Zuill

Duration: 54:19
How do you estimate your projects? While at NDC, Carl and Richard talk to Woody Zuill about delivering software WITHOUT estimates. Woody starts out with a clarification - it's not zero estimates, just no estimates around…
Elixir and Phoenix with Chris McCord

Duration: 53:51
While at NDC, Carl and Richard talked to Chris McCord about Elixir - the friendly language on top of Erlang. Chris talks about his Phoenix framework which brings an MVC-style development approach to building web sites in…
NuGet, Chocolatey, Boxstarter and Vagrant with Justin James

Duration: 54:02
While at NDC, Carl and Richard chat with Justin James about his deployment tool chain of NuGet, Chocolatey, Boxstarter and Vagrant. Each of these tools builds on the other, starting at the lowest level with specific libr…
Passwords, SQL Injection and WiFi Security with Troy Hunt

Duration: 59:02
While at NDC in Oslo, Carl and Richard talk to Troy Hunt about all the scary stuff going on in security today. The conversation starts out recapping some discussion on passwords - how do we get past them? Troy also digs…
.NET Everywhere with Rocky Lhotka

Duration: 1:00:49
Where will .NET go next? Carl and Richard talk to Rocky Lhotka (who happily is still alive after having his entire aorta replaced) about the resurgence in .NET. Between the open sourcing of .NET creating a common codebas…