Deep Papers is a podcast series featuring deep dives on today’s most important AI papers and research. Hosted by Arize AI founders and engineers, each episode p...
We cover Anthropic’s groundbreaking Model Context Protocol (MCP). Though it was released in November 2024, we've been seeing a lot of hype around it lately, and thought it was well worth digging into. Learn how this open standard is revolutionizing AI by enabling seamless integration between LLMs and external data sources, fundamentally transforming them into capable, context-aware agents. We explore the key benefits of MCP, including enhanced context retention across interactions, improved interoperability for agentic workflows, and the development of more capable AI agents that can execute complex tasks in real-world environments.Learn more about AI observability and evaluation in our course, join the Arize AI Slack community or get the latest on LinkedIn and X.
--------
15:03
AI Roundup: DeepSeek’s Big Moves, Claude 3.7, and the Latest Breakthroughs
This week, we're mixing things up a little bit. Instead of diving deep into a single research paper, we cover the biggest AI developments from the past few weeks.We break down key announcements, including:DeepSeek’s Big Launch Week: A look at FlashMLA (DeepSeek’s new approach to efficient inference) and DeepEP (their enhanced pretraining method).Claude 3.7 & Claude Code: What’s new with Anthropic’s latest model, and what Claude Code brings to the AI coding assistant space.Stay ahead of the curve with this fast-paced recap of the most important AI updates. We'll be back next time with our regularly scheduled programming. Learn more about AI observability and evaluation in our course, join the Arize AI Slack community or get the latest on LinkedIn and X.
--------
30:23
How DeepSeek is Pushing the Boundaries of AI Development
This week, we dive into DeepSeek. SallyAnn DeLucia, Product Manager at Arize, and Nick Luzio, a Solutions Engineer, break down key insights on a model that have dominating headlines for its significant breakthrough in inference speed over other models. What’s next for AI (and open source)? From training strategies to real-world performance, here’s what you need to know.Read a summary: https://arize.com/blog/how-deepseek-is-pushing-the-boundaries-of-ai-development/Learn more about AI observability and evaluation in our course, join the Arize AI Slack community or get the latest on LinkedIn and X.
--------
29:54
Multiagent Finetuning: A Conversation with Researcher Yilun Du
We talk to Google DeepMind Senior Research Scientist (and incoming Assistant Professor at Harvard), Yilun Du, about his latest paper "Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains." This paper introduces a multiagent finetuning framework that enhances the performance and diversity of language models by employing a society of agents with distinct roles, improving feedback mechanisms and overall output quality.The method enables autonomous self-improvement through iterative finetuning, achieving significant performance gains across various reasoning tasks. It's versatile, applicable to both open-source and proprietary LLMs, and can integrate with human-feedback-based methods like RLHF or DPO, paving the way for future advancements in language model development.Read an overview on the blogWatch the full discussionLearn more about AI observability and evaluation in our course, join the Arize AI Slack community or get the latest on LinkedIn and X.
--------
30:03
Training Large Language Models to Reason in Continuous Latent Space
LLMs have typically been restricted to reason in the "language space," where chain-of-thought (CoT) is used to solve complex reasoning problems. But a new paper argues that language space may not always be the best for reasoning. In this paper read, we cover an exciting new technique from a team at Meta called Chain of Continuous Thought—also known as "Coconut." In the paper, "Training Large Language Models to Reason in a Continuous Latent Space" explores the potential of allowing LLMs to reason in an unrestricted latent space instead of being constrained by natural language tokens.Read a full breakdown of Coconut on our blogLearn more about AI observability and evaluation in our course, join the Arize AI Slack community or get the latest on LinkedIn and X.
Deep Papers is a podcast series featuring deep dives on today’s most important AI papers and research. Hosted by Arize AI founders and engineers, each episode profiles the people and techniques behind cutting-edge breakthroughs in machine learning.