Open LLM Upgrades 🆕 // Gemma 2 Performance 💎 // SeaKR's Self-aware Learning 🧠

Author: Earkind | Date: June 28, 2024 | Duration: 13:48

News Daily

HuggingFace has upgraded the Open LLM Leaderboard to v2, adding new benchmarks and improving the evaluation suite for easier reproducibility.

Gemma 2, a new addition to the Gemma family of lightweight open models, delivers the best performance for its size and offers competitive alternatives to models that are 2-3× bigger.

SeaKR is a new adaptive RAG method that triggers retrieval when the LLM's self-aware uncertainty is high and re-ranks the retrieved knowledge by how much it reduces that uncertainty, outperforming existing adaptive RAG methods in generating text with relevant and accurate information.

Step-DPO is a new method that applies preference optimization to individual reasoning steps rather than to whole answers, achieving impressive results in long-chain mathematical reasoning.

Contact: sergi@earkind.com

Timestamps:

00:34 Introduction

01:21 HuggingFace Updates Open LLM Leaderboard

03:19 Gemma 2: Improving Open Language Models at a Practical Size

04:16 From bare metal to a 70B model: infrastructure set-up and scripts

05:21 Fake sponsor

07:11 SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation

08:47 Simulating Classroom Education with LLM-Empowered Agents

10:16 Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs

12:31 Outro

GPT Reviews

Each morning, GPT Reviews serves up a fresh, slightly chaotic conversation about everything happening in artificial intelligence. This daily podcast from Earkind is actually crafted by AI, offering a unique blend of the latest headlines, major announcements, and intriguing research plucked from sources like arXiv. But it’s far from a dry briefing. The dynamic comes from its four distinct hosts: Giovani Pete Tizzano brings relentless optimism as an AI enthusiast, while Robert, the analyst, provides a grounded and often skeptical counterpoint. Olivia, who’s deeply embedded in online communities, shares the buzz and broader reactions, and Belinda, the witty research expert, helps unpack the technical details with clarity and a sharp sense of humor. Tuning in feels like dropping into a lively roundtable where complex ideas are debated, explained, and occasionally laughed about. You’ll get a comprehensive yet digestible overview of the AI landscape, all wrapped in a format that’s as entertaining as it is informative. The result is a consistently engaging listen that keeps you updated without feeling like homework, making it a standout in the daily news podcast space.

Author: Earkind | Language: English | Episodes: 100

Podcast Episodes

OpenAI's Strawberry Revolution 🍓 // Nvidia's Lucrative Paychecks 💸 // Google Pipe SQL Simplification 📊

29.08.2024

Duration: 14:01

This episode dives into OpenAI's promising new model, Strawberry, which could revolutionize interactions in ChatGPT. We explore the financial envy Nvidia employees inspire in their Google and Meta counterparts due to luc…

OpenAI's 'Strawberry' AI 🚀 // World's Fastest AI Inference ⚡ // Photo-realistic 3D Avatars 🎨

28.08.2024

Duration: 14:14

OpenAI's 'Strawberry' AI tackles complex math and programming with enhanced reasoning, while Cerebras claims to have launched the fastest AI inference, enabling real-time applications at competitive prices. The GenCA mod…

Grok-2's Speed & Accuracy 🚀 // OpenAI's Transparency Push 🗳️ // LlamaDuo for Local LLMs 🔄

27.08.2024

Duration: 14:46

Grok-2's advancements in speed and accuracy position it as a leading AI model, particularly in math and coding. OpenAI's backing of California's AI bill highlights the critical need for transparency in synthetic content,…

Salesforce's AI Sales Agents 🤖 // NVIDIA's Compact Language Model ⚡ // Optimized Computation for Performance 📊

26.08.2024

Duration: 14:20

This episode dives into Salesforce's innovative AI sales agents that automate tasks but risk losing human touch, NVIDIA's compact yet powerful language model that promises efficiency, groundbreaking research showing how…

Amazon Cloud Chief Spicy Takes 🚀 // Zuckerberg's AI Vision 📈 // Multimodal Models for Safety 🔒

23.08.2024

Duration: 13:54

This episode dives deep into the future of coding, challenging the belief that AI will render developers obsolete. It highlights Meta's stock surge, attributing it to Zuckerberg's compelling AI narrative that captivates…

OpenAI's SearchGPT Launch 🔍 // Vision Transformers Efficiency 📊 // Automated Agent Design Revolution 🚀

19.08.2024

Duration: 14:11

OpenAI's SearchGPT is launching with limited access for only 10,000 users, raising questions about trust and the potential risks of generative search products. A comprehensive analysis challenges the belief that Vision T…

Grok-2 Beta Release 🚀 // Apple's $1,000 Home Robot 🏡 // ChemVLM Breakthrough in Chemistry 🔬

15.08.2024

Duration: 13:41

This episode dives into the Grok-2 Beta Release, highlighting its advanced reasoning capabilities and competitive edge. We explore Apple’s ambitious plans for a $1,000 tabletop robotic home device, set to transform smart…

Gemini Live AI Assistant 📱 // OpenAI’s Coding Benchmark ✅ // LongWriter’s 10K Word Generation ✍️

14.08.2024

Duration: 13:23

This episode dives into Gemini Live's interactive AI capabilities, OpenAI's improved coding benchmark for reliable evaluations, LongWriter's breakthrough in generating ultra-long outputs, and SlotLifter's advancements in…

Google Meet's AI Note-Taking 📝 // Trump’s AI Crowd Claims 🤔 // ControlNeXt & Image Generation 🎨

13.08.2024

Duration: 13:51

Google Meet's new AI note-taking feature could change meeting dynamics, while Trump’s claims about Kamala Harris reveal the political implications of AI. The exploration of AI's role in scientific research raises ethical…

OpenAI's Strawberry Model 🍓 // Meta's Celebrity Voice Assistants 🎙️ // Human-level Robot Table Tennis 🏓

12.08.2024

Duration: 15:27

OpenAI's mysterious "Strawberry" AI model is causing a buzz in the tech world, with rumors of advanced reasoning capabilities. Meta is trying to improve its AI assistants by enlisting the help of celebrities like Awkwa…