EPISODE · Jan 31, 2024 · 1H 23M
Episode 7: The Old Parser
from core.py · host Pablo Galindo and Łukasz Langa
Context-free grammars, non-deterministic finite automatons, left-to-right leftmost derivations... what even is all that?! Today we're talking about how Python parses your source code. We start gently with how this worked in the past. Come listen to Łukasz's high-level explanations and Pedantic Pablo's "well actuallys".

# Timestamps
(00:00:00) INTRO
(00:01:35) You can still download Python 1.0!
(00:02:19) The original tokenizer
(00:03:10) What even is a tokenizer?
(00:04:08) FUN FACTS ABOUT THE TOKENIZER
(00:04:34) Circumflex
(00:05:16) Python's invisible braces
(00:08:29) Backticks in the syntax
(00:11:00) Where are the comments stored?
(00:12:27) GRAMMAR
(00:13:37) What is a grammar?
(00:16:25) The long-forgotten 'access' keyword
(00:20:25) Making LL1 do things it wasn't meant to do
(00:23:24) SURPRISE QUESTION 1: soft keywords
(00:24:46) What's a context-free grammar?
(00:26:51) A note about backslashes
(00:29:33) The Dragon Book(s)
(00:31:27) PARSING: What is it?
(00:35:23) How to generate a parser?
(00:39:00) LL Cool Parser
(00:41:15) What if we used LR?
(00:44:01) Let's have three tokenizers!
(00:47:50) 2to3 and its legacy
(00:52:38) Black and its blib2to3
(00:54:04) The pesky 'with' statement and the death of LL1
(01:00:05) PR OF THE WEEK: GH-113745
(01:05:41) SURPRISE QUESTION 2: Subclasses of SyntaxError
(01:07:02) WHAT'S GOING ON IN CPYTHON?
(01:09:16) Sam Gross nominated as a core dev
(01:10:13) Free-threading progress
(01:13:11) Faster CPython changes
(01:17:29) ntpath.isreserved()
(01:20:11) Pablo and the DWARF
(01:22:02) OUTRO