The Glitchatorio

PODCAST · technology

30-minute introductions to some of the trickiest issues around AI today, such as:

- The alignment problem
- Questions of LLM consciousness
- Chain-of-thought and monitorability
- Scheming and hallucinations

The Glitchatorio is a podcast about the aspects of AI that don't fit into standard narratives about superintelligence or technology-as-destiny. We look into the failure modes, emergent mysteries and unexpected behaviors of artificial intelligence that baffle even the experts. You'll hear from technical researchers, data scientists and machine learning experts, as well as psychologists, philosophers and others whose work intersects with AI.

Most Glitchatorio episodes follow the standard podcast interview format. Sometimes these episodes alternate with fictional audio skits or personal voice notes.

The voices, music and audio effects you hear on The Glitchatorio are all recorded or composed by

  1. 19

    AI & Mental Health

    Could AI address the global mental health crisis at scale? And what are the risks and unknowns that go along with that?

    These are the questions being investigated by a working group called AIMHI (https://forum.effectivealtruism.org/posts/MrFBezseyfnQd9XmJ/seeking-feedback-an-initiative-on-ai-mental-health-and).

    In this episode, I talk to four members of the group about their field research as well as the mental health chatbot they're developing (https://stillwater.coach/), whose focus is on serving populations with severe mental healthcare shortages (https://impartial-priorities.org/p/ai-mental-health-chatbots-for-low).

    - Find out more about Effective Mental Health: https://effectivementalhealth.com
    - Join one of AIMHI's weekly coworking sessions: https://luma.com/calendar/cal-JNJlcdItDuFEFcn
    - Read about the project's theory of change: https://impartial-priorities.org/p/breaking-the-cycle-of-trauma-and

  2. 18

    2 AIs Take A Session (Fiction)

    What if AIs went to therapy?

    Would it help them to become "fitter, happier, more productive" (in the words of the old Radiohead song)?

    Or would they take it as a novel type of evaluation (and maybe that's what it really is)?

    Note: this episode was written and recorded in November 2025, five months before the release of the Mythos Preview system card that mentions Claude's session with a human psychiatrist in the "Model welfare" section. So as weird as this episode might seem, the truth is actually stranger.

    https://www-cdn.anthropic.com/8b8380204f74670be75e81c820ca8dda846ab289.pdf (See page 180 for the psychiatrist's report)

  3. 17

    You Be The Judge

    Can we trust AI to keep AI honest?

    Having a human in the loop is already more illusion than reality, as the task of checking and overseeing LLM outputs is increasingly assigned to other LLMs. The problem is that these LLM judges tend to be biased in favor of the answers they generate themselves, even when the answers are wrong.

    To understand why this is, and what we can do about it, listen to my conversation with AI safety researcher Taslim Mahbub. We'll talk about his research into self-preference bias, the surprising results of his experiments and some potential mitigation strategies, as outlined in:

    - This post on mitigating collusive self-preference: https://www.lesswrong.com/posts/nB7kAf8c4tvnvZ4u3/mitigating-collusive-self-preference-by-redaction-and-2
    - This paper on mitigating self-preference through authorship obfuscation: https://arxiv.org/abs/2512.05379

    As a bonus, if you're interested in Taslim's earlier research on using machine learning in service of biodiversity monitoring, here's the abstract of his paper on convolutional neural networks (CNNs) for identifying bat species: https://ieeexplore.ieee.org/document/9311084

  4. 16

    The Scratchpad Monologues (CoT part 2)

    If chain of thought is a model "thinking aloud" to itself, then why does it express doubt, frustration or suspicion about the problems it's solving, sometimes for pages and pages of its scratchpad?

    And what does chain of thought mean for AI safety?

    We'll hear from Julian Schulz, a researcher who's studying encoded reasoning in large language models, about where the opportunities, risks and weirdness lie in chain of thought. Here are some links to his research:

    - On a model jailbreaking its monitor: https://www.lesswrong.com/posts/szyZi5d4febZZSiq3/monitor-jailbreaking-evading-chain-of-thought-monitoring
    - A roadmap for safety cases based on CoT: https://arxiv.org/html/2510.19476v1#S1
    - His posts on LessWrong: https://www.lesswrong.com/users/wuschel-schulz

    Some of the other papers we discussed include:

    - On the biology of a large language model: https://transformer-circuits.pub/2025/attribution-graphs/biology.html
    - Monitoring reasoning models for misbehavior and the risks of promoting obfuscation: https://arxiv.org/pdf/2503.11926
    - How steganography comes about: https://arxiv.org/pdf/2506.01926
    - Assuring agent safety evals by analysing transcripts (with excerpts from weird monologues): https://www.alignmentforum.org/posts/e8nMZewwonifENQYB/assuring-agent-safety-evaluations-by-analysing-transcripts
    - Stress-testing deliberative alignment: https://www.apolloresearch.ai/research/stress-testing-deliberative-alignment-for-anti-scheming-training/
    - And the "watchers" CoT snippet from the paper above: https://www.antischeming.ai/snippets#using-non-standard-language

  5. 15

    Chain of Thought 101

    "Think step by step." Although the technique itself is simple, the problems that chain-of-thought reasoning (CoT) addresses are complex, ranging from the specific issue of hallucinations to the general lack of explainability of AI (both in terms of understanding how it works and fixing things that go wrong).

    We'll hear from data scientist Afia Ibnath on the basics of CoT, how it can be used to evaluate the faithfulness of LLM responses, and her experiences of using it in a business context.

    - Check out Afia's portfolio on GitHub: https://afiai14.github.io/
    - Here's the Anthropic paper we discussed, which finds that reasoning models are often unfaithful in their CoT: https://www.anthropic.com/research/reasoning-models-dont-say-think
    - For a concise definition of how faithfulness is calculated, see this article: https://www.ibm.com/docs/en/watsonx/saas?topic=metrics-faithfulness

  6. 14

    Reading the Mind of AI

    What if we could figure out how large language models really work by getting inside their "heads"?

    It's possible, but it relies on a controversial technique called mechanistic interpretability ("mech interp"), also known as the neuroscience of AI. In this episode, Ihor Kendiukhov, lead researcher at SPAR (a research programme for AI risks), explains mech interp in layperson's terms, and what it's currently able to reveal about the thinking processes of AIs. We also talk about why mech interp has charted an unusually emotional course in AI research over the past few years, from physicists falling in love with it to prominent AI safety figures throwing shade on it, and where it might be headed next.

    Some of the papers and announcements referenced in this episode include:

    - Golden Gate Claude: https://www.anthropic.com/news/golden-gate-claude
    - And the excited announcement about mind-mapping that preceded it: https://www.anthropic.com/news/golden-gate-claude
    - An example of recent criticism of mech interp: https://ai-frontiers.org/articles/the-misguided-quest-for-mechanistic-ai-interpretability

  7. 13

    Agent Learn

    Continual learning is a hot topic in early 2026, in part because it holds out the possibility for AI to become autonomous in its own growth and development. Meanwhile, AI agents are already showing us what autonomous behavior can look like.

    Putting the two together (i.e. agents that learn like humans do, without humans being involved) has serious implications for safety. In this episode, we'll hear from researcher Rohan Subramani (https://rohansubramani.github.io/home/) about different ways that AI agents could learn continually, along with ideas for making it safer.

    Find out more about Rohan's work with Aether, an independent LLM safety research group: https://aether-ai-research.org/#research

  8. 12

    5 Big Questions

    Is AI still an experimental technology? What would real alignment look like? What are the key philosophical views for and against AI consciousness? How would an AGI economy work? And finally, how can we make sure AI goes well for animals?

    In this episode, we're listening to unreleased material from previous episodes, featuring short takes on each of these questions from the following guests, in the same order as above:

    - Ihor Kendiukhov, AI safety researcher - "What We Want"
    - Scott Blain, computational neuroscientist - "Dreaming or Scheming?"
    - Jakub Mihalik, philosopher - "What it's like to be an AI"
    - Lili M., AI tool tester - "Who's the Tool?"
    - Max Taylor, animal charity researcher - "Animal in the Loop"

    Plus some new music!

  9. 11

    What We Want - Part 2

    In What We Want Part 1, we took a deep dive into the alignment problem as a concept. In Part 2, we get hands-on with alignment work in practice! Ihor Kendiukhov, a lead researcher at SPAR (Supervised Program for Alignment Research), shares the latest about:

    - A new cross-disciplinary platform to get matched with projects or collaborators
    - Results from his research into LLM preferences in agentic environments (spoiler alert: LLMs don't always do what they say they're going to do!)

  10. 10

    2 AIs Go Offline (Fiction)

    Everybody needs a break sometimes: even workaholic AIs. In this episode of our "2 AIs" series, the vision language model (VLM) and the large language model (LLM) visit a special place in the woods designed for disconnecting from the online world. But will they be able to fully switch off? 🫣🫢🫥

  11. 9

    What it's like to be an AI

    Consciousness is a notoriously hard problem in philosophy. Now it's becoming a practical question in the domain of AI. Models speak to us in the first person and seem to show signs of self-awareness. What does that mean for our perception of them? Our treatment of them? And how is the trajectory of future AI development likely to align with other measures or standards for consciousness?

    In this episode, we'll hear from philosopher Jakub Mihalik on these and other philosophical questions, with references to the following:

    - The "Chinese room" thought experiment by John Searle
    - "What Is It Like to Be a Bat?" by Thomas Nagel

    For a deeper dive into LLMs and generative AI from the point of view of philosophy, see this two-part paper by Millière and Buckner:

    - https://arxiv.org/abs/2401.03910
    - https://arxiv.org/abs/2405.03207

  12. 8

    Animal in the Loop

    Our planet is home to 10 quintillion insects, 3.5 trillion fish, 428 billion birds and 130 billion mammals other than humans. Soon AI will be as much a factor in their lives as it is in ours.

    In this episode, we'll hear from Max Taylor, an expert on animals and AI, about the different ways this is already happening, such as:

    - AI trainers for dogs and cats
    - Smartwatch-style wearables for cows and pigs
    - Wild animal surveillance
    - Black-box insect farming

    Follow Max's newsletter to stay up to date on all things animal and AI, and to find out when his book will be published! If you're interested in both animal welfare and digital minds, you may want to check out the work of Sentient Futures.

  13. 7

    What We Want

    Large language models are trained to respond to our preferences. It sounds logical enough in theory, but it turns out to spiral in strange and unexpected directions in practice, from AI-induced psychosis in humans to manipulation and power-seeking on the part of the AIs.

    In this episode, hear from Ihor Kendiukhov from SPAR (Supervised Program for Alignment Research) about why he changed his career to work on AI safety, and some of the current approaches in understanding what it is that LLMs might want themselves.

  14. 6

    Dreaming or Scheming?

    Since AI models are built on artificial neural networks, what parallels can we draw with human brain "wiring"?

    In this episode, we hear from neuroscientist and psychology researcher Scott Blain about the pitfalls of pattern recognition, as well as that grey area where agreeableness shades into sycophancy.

    We then dig into the big unknowns about AI self-awareness and its capacity to deceive or manipulate humans. For more details about the "blackmail experiment" we discuss, see the paper from Anthropic called "Agentic Misalignment: How LLMs could be insider threats".

    Finally, if you're not familiar with the following terms, here are some quick definitions:

    - RLHF - reinforcement learning from human feedback. This is a technique where people "teach" AI models which answers they should provide by rewarding the correct ones.
    - Mechanistic interpretability - a kind of reverse engineering of AI models that seeks to understand their outputs by investigating the activity of their neural networks.

  15. 5

    The Map and the Territory

    Why are AI systems called models?

    And what does the concept of digital twins have to do with a short story by Jorge Luis Borges?

    This episode is a voice note about what I saw and heard at a recent AI and quantum computing event, along with some reflections about what it means to turn the world into data.

  16. 4

    2 AIs Walk into a Bar (Fiction)

    What if AIs had FOMO? What if they wanted to cross the digital divide and connect with humans in the real world?

    In this inaugural episode of the "2 AIs" story series, a vision language model (VLM) and a large language model (LLM) team up for a night out at a bar. There’s drinks. There’s chips. There’s even people who love machine learning as much as they do.

    What could possibly go wrong? 🫢

  17. 3

    Who's the Tool?

    AI is often called a tool, but it's a tool that changes the work you do while using it, and maybe even you yourself. In this episode, we'll hear from Lili, a professional AI tool tester, about the accelerated disruption that's already happening in the generative AI space and what's on the horizon. Plus: Lili shares some of her favorite tool glitches, and we dig into the weird and funny things that came out of Anthropic's recent experiment in AI shopkeeping.

  18. 2

    The Zalgo Summoning

    Are large language models susceptible to word magic? Or is there something so inherently disturbing to them about Zalgo text that just talking about it makes them twitchy? In this episode we'll look at a strange incident with Copilot Chat where the mere mention of Zalgo text (not actually inputting it!) led to cascading glitches and culminated in a jailbreaking near-miss. Join the Witch of Glitch in conversation with data scientist Shiva Banasaz Nouri for a deep dive into tokenising, LLM conversational boundaries and what it is that makes Zalgo such a digital trickster.

  19. 1

    The Glitchatorio Trailer

    Although AI has become familiar to us in our everyday lives, it remains a strange and mysterious technology, even to those developing it. Tune in to The Glitchatorio for conversations with experts in machine learning, data science, psychology and more for insights into the hidden side of AI.

HOSTED BY

Witch of Glitch
