EPISODE · Dec 8, 2020 · 36 MIN
The Case of the Blah Blah Blahs
from Underunderstood · host Select Works
A famous dataset of Reuters articles from the 1980s includes “Blah blah blah.” in place of some stories. Why?

We have a Patreon now! Sign up to support the show and get access to our bonus podcast, Overunderstood.

Show notes:
00:31 - The link Jess sent
8:31 - SGML
8:46 - This is what the blahs look like and this is what all the entries look like.
24:00 - FTP
24:34 - Linguistic Data Consortium
29:00 - RCV1 at NIST and David D. Lewis’s README
30:22 - Construe-TIS: A System for Content-Based Indexing of a Database of News Stories (Phil Hayes and Steven Weinstein)