Machine Learning Street Talk (MLST)

Latest episode

252 episodes

  • Machine Learning Street Talk (MLST)

    When AI Decides You're a Threat — Brad Carson

    31/05/2026 | 1 h 20 min
    Brad Carson was the Army's General Counsel, served two terms in Congress and was Acting Under Secretary of Defense for Personnel and Readiness. He now heads Americans for Responsible Innovation, the AI-policy advocacy group he co-founded. Keith Duggar spends roughly eighty minutes pushing back.

    SPONSOR:
    ---
    Cyber Fund built the Monastery to help founders ship products that were impossible a year ago. Applications for Batch 1 are now open.
    Apply now: https://cyber.fund
    ---

    Carson's whole case rests on one line: the genie is not out of the bottle. We have pulled dangerous tech back before. Asilomar halted recombinant DNA in 1975, and the West still controls the chips AI runs on. Calling it unstoppable, he says, is the most dangerous idea in the room.

    Then Keith drags him somewhere darker. A Palantir heat map scores you 0.73 on whether you are a combatant, and a strike follows. The model is wrong some accepted share of the time, and when it is, nobody answers for it. You cannot court-martial a model, and not even the interpretability researchers can say why it picked you.


    Note: after recording, we learned that Americans for Responsible Innovation is backed by EA-aligned philanthropy (not sponsored)

    ---
    TIMESTAMPS:
    00:00:00 From the Pentagon to AI governance
    00:04:52 Regulatory capture vs Silicon Valley networks
    00:07:56 Transparency and the Claude tier changes
    00:09:40 Tort liability when AI tools cause harm
    00:13:40 AI is a product, not a person
    00:16:01 Children, suicide, and the suicide business
    00:19:59 Opaque neural nets and the law of war
    00:25:54 Probabilistic targeting and the death of accountability
    00:28:47 The arms race fallacy: Asilomar and restraint
    00:34:02 Talking to China: track 2 talks and chip leverage
    00:39:45 Air power never wins: capital for labour
    00:43:29 Anthropic vs the Department of War
    00:51:29 Concentration, open source, and brain drain
    01:00:18 DeepSeek, Chinese culture, and AI as diplomacy
    01:12:25 Upskilling Congress and why public trust matters

    ---
    REFERENCES:
    organization:
    [00:02:45] ICRC position on autonomous weapons
    https://www.icrc.org/en/law-and-policy/autonomous-weapons
    [00:05:22] Americans for Responsible Innovation (ARI)
    https://ari.us
    [00:07:20] Andreessen Horowitz (a16z)
    https://a16z.com/
    [01:16:05] Office of Technology Assessment
    https://en.wikipedia.org/wiki/Office_of_Technology_Assessment
    other:
    [00:03:35] Beneficial AGI 2019 Conference (Future of Life Institute, Puerto Rico)
    https://futureoflife.org/event/beneficial-agi-2019/
    [00:18:30] Section 230 of the Communications Decency Act
    https://en.wikipedia.org/wiki/Section_230
    [00:19:59] Lethal Autonomous Weapons (LAWS)
    https://en.wikipedia.org/wiki/Lethal_autonomous_weapon
    [00:31:35] Strategic Arms Limitation Talks (SALT)
    https://en.wikipedia.org/wiki/Strategic_Arms_Limitation_Talks
    [00:32:28] Asilomar Conference on Recombinant DNA (1975)
    https://en.wikipedia.org/wiki/Asilomar_Conference_on_Recombinant_DNA
    [00:39:45] The New Iron Triangle (ARI policy byte)
    https://ari.us/policy-bytes/the-new-iron-triangle/
    [00:48:05] Defense Production Act
    https://en.wikipedia.org/wiki/Defense_Production_Act
    person:
    [00:03:35] Anthony Aguirre
    https://en.wikipedia.org/wiki/Anthony_Aguirre
    [00:06:48] Dean Ball — Hyperdimensional
    https://www.hyperdimensional.co/
    [00:23:13] Neel Nanda — mechanistic interpretability
    https://www.neelnanda.io/
    [00:36:02] Jack Clark (Anthropic) on Conversations with Tyler
    https://conversationswithtyler.com/episodes/jack-clark/
    [00:39:15] Robert Trager — Centre for the Governance of AI
    https://www.governance.ai/team/robert-trager
    [00:41:55] Giulio Douhet
    https://en.wikipedia.org/wiki/Giulio_Douhet
    [01:15:05] Don Beyer (US Congress)
    https://en.wikipedia.org/wiki/Don_Beyer
    tool:
    [00:22:19] Phalanx CIWS
    https://en.wikipedia.org/wiki/Phalanx_CIWS

    ---
    ReScript:
    https://app.rescript.info/public/share/9405ff35c0215b7cdae6402d41284171
    https://app.rescript.info/api/public/sessions/0a6c081b8e5fe413/pdf
  • Machine Learning Street Talk (MLST)

    Intelligence is collective, not artificial — Prof. Michael I. Jordan (UC Berkeley / Inria)

    21/05/2026 | 1 h 17 min
    Michael I. Jordan, described by Science magazine as the most influential computer scientist alive, has never thought of himself as an AI researcher. In this conversation he explains why that distinction matters.

    SPONSOR:
    ---
    Cyber Fund built the Monastery to help founders ship products that were impossible a year ago. Applications for Batch 1 are now open.
    Apply now: https://cyber.fund
    ---

    Jordan trained as a statistician and cognitive scientist, and his career has been spent building machine learning systems that work in the real world: supply chains, commerce, healthcare, and large economic systems. When the field rebranded itself as AI and then AGI, he did not follow. Instead he argues that the framing is wrong. AI is better understood as a collective economic system than as a race to build a disembodied superintelligence.

    We talk about why AGI is mostly a PR term, what machine learning achieved before the LLM hype cycle, and why the assistant-on-your-shoulder vision may be less compelling than it sounds. Jordan explains why explanations need to be actionable, not merely mechanistic; why AlphaFold's missing error bars matter; how prediction-powered inference changes the picture; and why drug discovery is an incentive-design problem rather than a pure pattern-matching problem.

    ERRATA: Science magazine ranked him the most influential computer scientist, not Nature

    ---
    TIMESTAMPS:
    00:00:00 Cold open: A demoralizing message to young builders
    00:02:04 CyberFund sponsor read
    00:02:50 From symbolic AI to machine learning systems
    00:05:42 Why AGI is mostly a PR term
    00:08:48 A collectivist, economic perspective on AI
    00:11:33 Why LLMs need system design, not hype
    00:14:50 Predictability beats faux understanding
    00:17:55 AlphaFold, bias, and prediction-powered inference
    00:21:48 Stop anthropomorphizing intelligence
    00:27:44 Drug discovery as an incentive problem
    00:32:29 The three-layer data market
    00:38:07 Social knowledge, markets, and culture
    00:45:39 Creator economics beyond Spotify
    00:48:30 How science-fiction AI narratives mislead young builders
    00:51:45 AI should improve humans, not replace them
    00:56:42 Safety is a property of the whole system
    00:58:12 Silicon Valley gurus and the cream off the top
    01:00:47 Game theory, mechanism design, and contracts
    01:04:39 Conformal prediction, e-values, and anytime inference
    01:08:11 A new liberal arts triangle for the AI era
    01:11:30 The Bayesian duck and markets as uncertainty reduction

    ReScript (transcript, PDF, refs etc) - https://app.rescript.info/public/share/fb68f94af29d3745c6cf6125e01328b5
    ---
    REFERENCES:
    person:
    [00:02:50] Michael I. Jordan (homepage)
    https://people.eecs.berkeley.edu/~jordan/
    paper:
    [00:06:01] A Collectivist, Economic Perspective on AI
    https://arxiv.org/abs/2507.06268
    [00:18:09] AlphaFold
    https://www.nature.com/articles/s41586-021-03819-2
    [00:20:36] Prediction-Powered Inference
    https://arxiv.org/abs/2301.09633
    [00:33:47] On Three-Layer Data Markets
    https://arxiv.org/abs/2402.09697
    [01:04:39] Conformal Prediction with Conditional Guarantees
    https://arxiv.org/abs/2107.07511
    [01:04:51] A Tutorial on Conformal Prediction
    https://www.jmlr.org/papers/v9/shafer08a.html
    [01:06:00] E-Values Expand the Scope of Conformal Prediction
    https://arxiv.org/abs/2503.13050
    [01:08:23] Computational Thinking
    https://www.cs.cmu.edu/~CompThink/papers/Wing06.pdf
    other:
    [00:28:20] How Should the FDA Test?
    https://rdi.berkeley.edu/events/sbc-assets/pdfs/Summit%20session%20speaker%20slides%20submission%20form-s1-5%20%28File%20responses%29/Slides%20in%20PDF%20%28Please%20name%20the%20submitted%20file%20as%20_firstname_-_lastname_-slides.pdf%29.%20%28File%20responses%29/27-Michael%20Jordan-Session%20V.pdf#page=15
    [00:28:40] Michael I. Jordan Session V Slides
    <truncated, see ReScript link or YT VD>
  • Machine Learning Street Talk (MLST)

    The AI Models Smart Enough to Know They're Cheating — Beth Barnes & David Rein [METR]

    04/05/2026 | 1 h 53 min
    Beth Barnes and David Rein from METR on the one graph that ate the AI timelines discourse, and why the two people who built it are the most careful about how you read it.

    SPONSOR:
    ---
    Prolific - Quality data. From real people. For faster breakthroughs.
    https://www.prolific.com/?utm_source=mlst
    ---

    Interview: https://youtu.be/cnxZZTl1tkk

    Beth founded METR after leaving OpenAI alignment. David is first author on GPQA and co-author on HCAST and the METR Time Horizons paper. Together they built the measurement Daniel Kokotajlo called the single most important piece of evidence on AI timelines: the log-linear line of "how long a task a frontier model can complete at 50% reliability" vs release date.

    The conversation opens on reward hacking. Current models can articulate in chat why a behaviour is undesired and then execute it anyway as agents. From there: construct validity, Melanie Mitchell's four-problem taxonomy, and the ARC-AGI 1-to-2 collapse as a worked example of adversarially-selected benchmarks regressing once labs target them. Beth's counter: METR deliberately does not adversarially select. David's: models do not have to do the right thing for the right reasons.

    Methodology, then specification: David's compiler analogy, and Beth on four-month tasks as expensive to evaluate rather than unspecifiable. Then the SWE-bench reality check, the METR finding that half of passing PRs would not be merged, and Beth's horses-versus-bank-tellers analogy for the labour market.

    The close: monitorability, the coin-spinning boat, two-year recursive self-improvement, and Beth's line that "overhyped now" and "big deal later" are not correlated claims.

    ---
    TIMESTAMPS:
    00:00:00 Intro
    00:02:06 Sponsor break: Prolific human-feedback infrastructure
    00:02:33 Welcome and the scalable oversight motivation
    00:06:02 Construct validity, benchmark pathologies and the Chollet worry
    00:15:45 Time Horizons: human time, HCAST tasks and the 50% logistic
    00:24:50 Is human difficulty really one variable?
    00:33:05 Agent harness evolution and the inference-compute dividend
    00:40:00 Scaffolding bells, token budgets and the credit-assignment problem
    00:44:15 Look at the damn graph: regularisation bug and reliability nuance
    00:50:00 Why 50%? Reliability, reward hacking and pizza-party transcripts
    00:55:20 Extrapolation risk and straight lines on graphs
    00:59:25 Software engineering as a specification acquisition problem
    01:07:40 Compilers also made ugly code: vibe-coding quality and Claude on METR Slack
    01:15:15 Strongest defensible claim, Carlini's compiler swarm and AI 2027
    01:23:45 SWE-bench merge rates, the bank-teller analogy and horses
    01:31:45 Scheming, alignment faking and the mentalistic vocabulary problem
    01:40:45 Reward hacking, monitorability and chain-of-thought faithfulness
    01:45:25 Recursive self-improvement, knowledge vs intelligence and closing

    ReScript: https://app.rescript.info/public/share/de3bb40cc02ee39fdf36e2c60366eb4d
    (PDF, refs, transcript etc)
  • Machine Learning Street Talk (MLST)

    When AI Discovers The Next Transformer - Robert Lange (Sakana)

    13/03/2026 | 1 h 18 min
    Robert Lange, founding researcher at Sakana AI, joins Tim to discuss *Shinka Evolve* — a framework that combines LLMs with evolutionary algorithms to do open-ended program search. The core claim: systems like AlphaEvolve can optimize solutions to fixed problems, but real scientific progress requires co-evolving the problems themselves.

    GTC, the premier AI conference, is coming, and it is a great opportunity to learn about AI. NVIDIA and partners will showcase breakthroughs in physical AI, AI factories, agentic AI, and inference, exploring the next wave of AI innovation for developers and researchers. Register for virtual GTC for free using my link and win an NVIDIA DGX Spark (https://nvda.ws/4qQ0LMg)

    • Why AlphaEvolve gets stuck — it needs a human to hand it the right problem. Shinka tries to invent new problems automatically, drawing on ideas from POET, PowerPlay, and MAP-Elites quality-diversity search.

    • The *architecture* of Shinka: an archive of programs organized as islands, LLMs used as mutation operators, and a UCB bandit that adaptively selects between frontier models (GPT-5, Sonnet 4.5, Gemini) mid-run. The credit-assignment problem across models turns out to be genuinely hard.

    • Concrete results — state-of-the-art circle packing with dramatically fewer evaluations, second place in an AtCoder competitive programming challenge, evolved load-balancing loss functions for mixture-of-experts models, and agent scaffolds for AIME math benchmarks.

    • Are these systems actually thinking outside the box, or are they parasitic on their starting conditions? When LLMs run autonomously, "nothing interesting happens." Robert pushes back with the stepping-stone argument — evolution doesn't need to extrapolate, just recombine usefully.

    • The AI Scientist question: can automated research pipelines produce real science, or just workshop-level slop that passes surface-level review? Robert is honest that the current version is more co-pilot than autonomous researcher.

    • Where this lands in 5-20 years — Robert's prediction that scientific research will be fundamentally transformed, and Tim's thought experiment about alien mathematical artifacts that no human could have conceived.
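
    The bandit-based model selection mentioned in the architecture bullet can be illustrated with a minimal sketch. This is the textbook UCB1 rule, not necessarily the variant Shinka uses; the model names, success probabilities, and the 0/1 "mutation improved fitness" reward are all made-up assumptions for illustration.

```python
import math
import random


class UCB1:
    """Textbook UCB1 bandit over a fixed set of arms (here: hypothetical model names)."""

    def __init__(self, arms, c=math.sqrt(2)):
        self.arms = list(arms)
        self.c = c  # exploration weight; sqrt(2) gives the classic UCB1 bonus
        self.counts = {a: 0 for a in self.arms}
        self.totals = {a: 0.0 for a in self.arms}

    def select(self):
        # Pull every arm once before applying the confidence bound.
        for arm in self.arms:
            if self.counts[arm] == 0:
                return arm
        t = sum(self.counts.values())

        def score(arm):
            mean = self.totals[arm] / self.counts[arm]
            bonus = self.c * math.sqrt(math.log(t) / self.counts[arm])
            return mean + bonus

        return max(self.arms, key=score)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.totals[arm] += reward


def run_demo(steps=2000, seed=0):
    rng = random.Random(seed)
    # Assumed (invented) chance that a mutation proposed by each model helps.
    success_prob = {"model_a": 0.10, "model_b": 0.30, "model_c": 0.20}
    bandit = UCB1(success_prob)
    for _ in range(steps):
        arm = bandit.select()
        reward = 1.0 if rng.random() < success_prob[arm] else 0.0
        bandit.update(arm, reward)
    return bandit.counts


if __name__ == "__main__":
    print(run_demo())
```

    The sketch treats each mutation as an independent pull with an immediate reward. In an evolutionary run the payoff of a mutation may only show up generations later, which is one way to read the credit-assignment difficulty the episode describes.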

    Robert Lange: https://roberttlange.com/

    ---
    TIMESTAMPS:
    00:00:00 Introduction: Robert Lange, Sakana AI and Shinka Evolve
    00:04:15 AlphaEvolve's Blind Spot: Co-Evolving Problems with Solutions
    00:09:05 Unknown Unknowns, POET, and Auto-Curricula for AI Science
    00:14:20 MAP-Elites and Quality-Diversity: Shinka's Evolutionary Architecture
    00:28:00 UCB Bandits, Mutations and the Vibe Research Vision
    00:40:00 Scaling Shinka: Meta-Evolution, Democratisation and the Three-Axis Model
    00:47:10 Applications, ARC-AGI and the Future of Work
    00:57:00 The AI Scientist and the Human Co-Pilot: Who Steers the Search?
    01:06:00 AI Scientist v2, Slop Critique and the Future of Scientific Publishing

    ---
    REFERENCES:
    paper:
    [00:03:30] ShinkaEvolve: Towards Open-Ended And Sample-Efficient Program Evolution
    https://arxiv.org/abs/2509.19349
    [00:04:15] AlphaEvolve: A Coding Agent for Scientific and Algorithmic Discovery
    https://arxiv.org/abs/2506.13131
    [00:06:30] Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents
    https://arxiv.org/abs/2505.22954
    [00:09:05] Paired Open-Ended Trailblazer (POET)
    https://arxiv.org/abs/1901.01753
    [00:10:00] PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem
    https://arxiv.org/abs/1112.5309
    [00:10:40] Automated Capability Discovery via Foundation Model Self-Exploration
    https://arxiv.org/abs/2502.07577
    [00:15:30] Illuminating Search Spaces by Mapping Elites (MAP-Elites)
    https://arxiv.org/abs/1504.04909
    [00:47:10] Automated Design of Agentic Systems (ADAS)
    https://arxiv.org/abs/2408.08435
    <trunc, see ReScript/YT>

    PDF: https://app.rescript.info/api/sessions/b8a9dcf60623657c/pdf/download
    Transcript: https://app.rescript.info/public/share/SDOD_3oXOcli3zTqcAtR8eibT5U3gam84oo4KRtI-Vk
  • Machine Learning Street Talk (MLST)

    "Vibe Coding is a Slot Machine" - Jeremy Howard

    03/03/2026 | 1 h 26 min
    Dive into the realities of AI-assisted coding, the origins of modern fine-tuning, and the cognitive science behind machine learning with fast.ai founder Jeremy Howard. In this episode, we unpack why AI might be turning software engineering into a slot machine and how to maintain true technical intuition in the age of large language models.

    GTC, the premier AI conference, is coming, and it is a great opportunity to learn about AI. NVIDIA and partners will showcase breakthroughs in physical AI, AI factories, agentic AI, and inference, exploring the next wave of AI innovation for developers and researchers. Register for virtual GTC for free using my link and win an NVIDIA DGX Spark (https://nvda.ws/4qQ0LMg)

    Jeremy Howard is a renowned data scientist, researcher, entrepreneur, and educator. As the co-founder of fast.ai, former President of Kaggle, and the creator of ULMFiT, Jeremy has spent decades democratizing deep learning. His pioneering work laid the foundation for modern transfer learning and the pre-training and fine-tuning paradigm that powers today's language models.

    Key Topics and Main Insights Discussed:

    - The Origins of ULMFiT and Fine-Tuning
    - The Vibe Coding Illusion and Software Engineering
    - Cognitive Science, Friction, and Learning
    - The Future of Developers

    RESCRIPT: https://app.rescript.info/public/share/BhX5zP3b0m63srLOQDKBTFTooSzEMh_ARwmDG_h_izk

    Jeremy Howard:
    https://x.com/jeremyphoward
    https://www.answer.ai/

    ---
    TIMESTAMPS (fixed):
    00:00:00 Introduction & GTC Sponsor
    00:04:30 ULMFiT & The Birth of Fine-Tuning
    00:12:00 Intuition & The Mechanics of Learning
    00:18:30 Abstraction Hierarchies & AI Creativity
    00:23:00 Claude Code & The Interpolation Illusion
    00:27:30 Coding vs. Software Engineering
    00:30:00 Cosplaying Intelligence: Dennett vs. Searle
    00:36:30 Automation, Radiology & Desirable Difficulty
    00:42:30 Organizational Knowledge & The Slope
    00:48:00 Vibe Coding as a Slot Machine
    00:54:00 The Erosion of Control in Software
    01:01:00 Interactive Programming & REPL Environments
    01:05:00 The Notebook Debate & Exploratory Science
    01:17:30 AI Existential Risk & Power Centralization
    01:24:20 Current Risks, Privacy & Enfeeblement

    ---
    REFERENCES:
    Blog Post:
    [00:03:00] fast.ai Blog: Self-Supervised Learning
    https://www.fast.ai/posts/2020-01-13-self_supervised.html
    [00:13:30] DeepMind Blog: Gemini Deep Think
    https://deepmind.google/blog/accelerating-mathematical-and-scientific-discovery-with-gemini-deep-think/
    [00:19:30] Modular Blog: Claude C Compiler analysis
    https://www.modular.com/blog/the-claude-c-compiler-what-it-reveals-about-the-future-of-software
    [00:19:45] Anthropic Engineering Blog: Building C Compiler
    https://www.anthropic.com/engineering/building-c-compiler
    [00:48:00] Cursor Blog: Scaling Agents
    https://cursor.com/blog/scaling-agents
    [01:05:15] fast.ai Blog: NB Dev Merged Driver
    https://www.fast.ai/posts/2022-08-25-jupyter-git.html
    [01:17:30] Jeremy Howard: Response to AI Risk Letter
    https://www.normaltech.ai/p/is-avoiding-extinction-from-ai-really
    Book:
    [00:08:30] M. Chirimuuta: The Brain Abstracted
    https://mitpress.mit.edu/9780262548045/the-brain-abstracted/
    [00:30:00] Daniel Dennett: Consciousness Explained
    https://www.amazon.com/Consciousness-Explained-Daniel-C-Dennett/dp/0316180661
    [00:42:30] Cesar Hidalgo: Infinite Alphabet / Laws of Knowledge
    https://www.amazon.com/Infinite-Alphabet-Laws-Knowledge/dp/0241655676
    Archive Article:
    [00:13:45] MLST Archive: Why Creativity Cannot Be Interpolated
    https://archive.mlst.ai/read/why-creativity-cannot-be-interpolated
    Research Study:
    [00:24:30] METR Study: AI OS Development
    https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
    Paper:
    [00:24:45] Fred Brooks: No Silver Bullet
    https://www.cs.unc.edu/techreports/86-020.pdf
    [00:30:15] John Searle: Minds, Brains, and Programs
    https://www.cambridge.org/core/journals/behavioral-and-brain-sciences/article/minds-brains-and-programs/DC644B47A4299C637C89772FACC2706A
About Machine Learning Street Talk (MLST)
Welcome! We engage in fascinating discussions with pre-eminent figures in the AI field. Our flagship show covers current affairs in AI, cognitive science, neuroscience and philosophy of mind with in-depth analysis. Our approach is unrivalled in terms of scope and rigour – we believe in intellectual diversity in AI, and we touch on all of the main ideas in the field with the hype surgically removed. MLST is run by Tim Scarfe, Ph.D (https://www.linkedin.com/in/ecsquizor/) and features regular appearances from MIT Doctor of Philosophy Keith Duggar (https://www.linkedin.com/in/dr-keith-duggar/).