EPISODE · Dec 8, 2020 · 36 MIN

The Case of the Blah Blah Blahs

from Underunderstood · host Select Works

A famous dataset of Reuters articles from the 1980s includes “Blah blah blah.” in place of some stories. Why?

We have a Patreon now! Sign up to support the show and get access to our bonus podcast, Overunderstood.

Show notes:
00:31 - The link Jess sent
8:31 - SGML
8:46 - This is what the blahs look like and this is what all the entries look like.
24:00 - FTP
24:34 - Linguistic Data Consortium
29:00 - RCV1 at NIST and David D. Lewis’s README
30:22 - Construe-TIS: A System for Content-Based Indexing of a Database of News Stories (Phil Hayes and Steven Weinstein)
