Can Defense in Depth Work for AI? (with Adam Gleave)
Adam Gleave is co-founder and CEO of FAR.AI. In this cross-post from The Cognitive Revolution Podcast, he joins to discuss post-AGI scenarios and AI safety challenges. The conversation explores his three-tier framework for AI capabilities, gradual disempowerment concerns, defense-in-depth security, and research on training less deceptive models. Topics include timelines, interpretability limitations, scalable oversight techniques, and FAR.AI's vertically integrated approach spanning technical research, policy advocacy, and field-building.

LINKS:
Adam Gleave - https://www.gleave.me
FAR.AI - https://www.far.ai
The Cognitive Revolution Podcast - https://www.cognitiverevolution.ai

PRODUCED BY:
https://aipodcast.ing

CHAPTERS:
(00:00) A Positive Post-AGI Vision
(10:07) Surviving Gradual Disempowerment
(16:34) Defining Powerful AIs
(27:02) Solving Continual Learning
(35:49) The Just-in-Time Safety Problem
(42:14) Can Defense-in-Depth Work?
(49:18) Fixing Alignment Problems
(58:03) Safer Training Formulas
(01:02:24) The Role of Interpretability
(01:09:25) FAR.AI's Vertically Integrated Approach
(01:14:14) Hiring at FAR.AI
(01:16:02) The Future of Governance

SOCIAL LINKS:
Website: https://podcast.futureoflife.org
Twitter (FLI): https://x.com/FLI_org
Twitter (Gus): https://x.com/gusdocker
LinkedIn: https://www.linkedin.com/company/future-of-life-institute/
YouTube: https://www.youtube.com/channel/UC-rCCy3FQ-GItDimSR9lhzw/
Apple: https://geo.itunes.apple.com/us/podcast/id1170991978
Spotify: https://open.spotify.com/show/2Op1WO3gwVwCrYHg4eoGyP
--------
1:18:35
--------
How We Keep Humans in Control of AI (with Beatrice Erkers)
Beatrice Erkers works at the Foresight Institute, where she runs its Existential Hope program. She joins the podcast to discuss the AI Pathways project, which explores two alternative scenarios to the default race toward AGI. We examine tool AI, which prioritizes human oversight and democratic control, and d/acc, which emphasizes decentralized, defensive development. The conversation covers trade-offs between safety and speed, how these pathways could be combined, and what different stakeholders can do to steer toward more positive AI futures.

LINKS:
AI Pathways - https://ai-pathways.existentialhope.com
Beatrice Erkers - https://www.existentialhope.com/team/beatrice-erkers

CHAPTERS:
(00:00) Episode Preview
(01:10) Introduction and Background
(05:40) AI Pathways Project
(11:10) Defining Tool AI
(17:40) Tool AI Benefits
(23:10) D/acc Pathway Explained
(29:10) Decentralization Trade-offs
(35:10) Combining Both Pathways
(40:10) Uncertainties and Concerns
(45:10) Future Evolution
(01:01:21) Funding Pilots

PRODUCED BY:
https://aipodcast.ing

SOCIAL LINKS:
Website: https://podcast.futureoflife.org
Twitter (FLI): https://x.com/FLI_org
Twitter (Gus): https://x.com/gusdocker
LinkedIn: https://www.linkedin.com/company/future-of-life-institute/
YouTube: https://www.youtube.com/channel/UC-rCCy3FQ-GItDimSR9lhzw/
Apple: https://geo.itunes.apple.com/us/podcast/id1170991978
Spotify: https://open.spotify.com/show/2Op1WO3gwVwCrYHg4eoGyP
--------
1:06:45
--------
Why Building Superintelligence Means Human Extinction (with Nate Soares)
Nate Soares is president of the Machine Intelligence Research Institute. He joins the podcast to discuss his new book "If Anyone Builds It, Everyone Dies," co-authored with Eliezer Yudkowsky. We explore why current AI systems are "grown not crafted," making them unpredictable and difficult to control. The conversation covers threshold effects in intelligence, why computer security analogies suggest AI alignment is currently nearly impossible, and why we don't get retries with superintelligence. Soares argues for an international ban on AI research toward superintelligence.

LINKS:
If Anyone Builds It, Everyone Dies - https://ifanyonebuildsit.com
Machine Intelligence Research Institute - https://intelligence.org
Nate Soares - https://intelligence.org/team/nate-soares/

PRODUCED BY:
https://aipodcast.ing

CHAPTERS:
(00:00) Episode Preview
(01:05) Introduction and Book Discussion
(03:34) Psychology of AI Alarmism
(07:52) Intelligence Threshold Effects
(11:38) Growing vs Crafting AI
(18:23) Illusion of AI Control
(26:45) Why Iteration Won't Work
(34:35) The No Retries Problem
(38:22) Computer Security Lessons
(49:13) The Cursed Problem
(59:32) Multiple Curses and Complications
(01:09:44) AI's Infrastructure Advantage
(01:16:26) Grading Humanity's Response
(01:22:55) Time Needed for Solutions
(01:32:07) International Ban Necessity

SOCIAL LINKS:
Website: https://podcast.futureoflife.org
Twitter (FLI): https://x.com/FLI_org
Twitter (Gus): https://x.com/gusdocker
LinkedIn: https://www.linkedin.com/company/future-of-life-institute/
YouTube: https://www.youtube.com/channel/UC-rCCy3FQ-GItDimSR9lhzw/
Apple: https://geo.itunes.apple.com/us/podcast/id1170991978
Spotify: https://open.spotify.com/show/2Op1WO3gwVwCrYHg4eoGyP
--------
1:39:38
--------
Breaking the Intelligence Curse (with Luke Drago)
Luke Drago is the co-founder of Workshop Labs and co-author of the essay series "The Intelligence Curse". The series explores what happens if AI becomes the dominant factor of production, thereby reducing incentives to invest in people. We explore pyramid replacement in firms, economic warning signs to monitor, automation barriers like tacit knowledge, privacy risks in AI training, and tensions between centralized AI safety and democratization. Luke discusses Workshop Labs' privacy-preserving approach and advises taking career risks during this technological transition.

LINKS:
"The Intelligence Curse" essay series by Luke Drago & Rudolf Laine: https://intelligence-curse.ai/
Luke's Substack: https://lukedrago.substack.com/
Workshop Labs: https://workshoplabs.ai/

CHAPTERS:
(00:00) Episode Preview
(00:55) Intelligence Curse Introduction
(02:55) AI vs Historical Technology
(07:22) Economic Metrics and Indicators
(11:23) Pyramid Replacement Theory
(17:28) Human Judgment and Taste
(22:25) Data Privacy and Control
(28:55) Dystopian Economic Scenario
(35:04) Resource Curse Lessons
(39:57) Culture vs Economic Forces
(47:15) Open Source AI Debate
(54:37) Corporate Mission Evolution
(59:07) AI Alignment and Loyalty
(01:05:56) Moonshots and Career Advice
--------
1:09:38
--------
What Markets Tell Us About AI Timelines (with Basil Halperin)
Basil Halperin is an assistant professor of economics at the University of Virginia. He joins the podcast to discuss what economic indicators reveal about AI timelines. We explore why interest rates might rise if markets expect transformative AI, the gap between strong AI benchmarks and limited economic effects, and bottlenecks to AI-driven growth. We also cover market efficiency, automated AI research, and how financial markets may signal progress.

LINKS:
Basil's essay on "Transformative AI, existential risk, and real interest rates": https://basilhalperin.com/papers/agi_emh.pdf
Read more about Basil's work here: https://basilhalperin.com/

CHAPTERS:
(00:00) Episode Preview
(00:49) Introduction and Background
(05:19) Efficient Market Hypothesis Explained
(10:34) Markets and Low Probability Events
(16:09) Information Diffusion on Wall Street
(24:34) Stock Prices vs Interest Rates
(28:47) New Goods Counter-Argument
(40:41) Why Focus on Interest Rates
(45:00) AI Secrecy and Market Efficiency
(50:52) Short Timeline Disagreements
(55:13) Wealth Concentration Effects
(01:01:55) Alternative Economic Indicators
(01:12:47) Benchmarks vs Economic Impact
(01:25:17) Open Research Questions

SOCIAL LINKS:
Website: https://future-of-life-institute-podcast.aipodcast.ing
Twitter (FLI): https://x.com/FLI_org
Twitter (Gus): https://x.com/gusdocker
LinkedIn: https://www.linkedin.com/company/future-of-life-institute/
YouTube: https://www.youtube.com/channel/UC-rCCy3FQ-GItDimSR9lhzw/
Apple Podcasts: https://geo.itunes.apple.com/us/podcast/id1170991978
Spotify: https://open.spotify.com/show/2Op1WO3gwVwCrYHg4eoGyP

PRODUCED BY:
https://aipodcast.ing
The Future of Life Institute (FLI) is a nonprofit working to reduce global catastrophic and existential risk from powerful technologies. In particular, FLI focuses on risks from artificial intelligence (AI), biotechnology, nuclear weapons, and climate change. The Institute's work is made up of three main strands: grantmaking for risk reduction, educational outreach, and advocacy within the United Nations, US government, and European Union institutions. FLI has become one of the world's leading voices on the governance of AI, having created one of the earliest and most influential sets of governance principles: the Asilomar AI Principles.