EPISODE · Aug 19, 2019 · 22 MIN
Building the howto100m Video Corpus
from Data Skeptic
Video annotation is an expensive and time-consuming process. As a consequence, the available video datasets are useful but small. The availability of machine-transcribed explainer videos offers a unique opportunity to rapidly develop a useful, if dirty, corpus of videos that are "self-annotating", as hosts explain the actions they are taking on the screen. This episode is a discussion of the HowTo100M dataset, a project which has assembled a video corpus of 136M video clips with captions covering 23k activities.
Related Links
The paper will be presented at ICCV 2019
@antoine77340
Antoine on GitHub
Antoine's homepage