How We Investigated the YouTube Videos Swiped to Train AI

YouTubers have long wondered whether AI companies have scraped their work to train models, and Proof News investigative reporter Annie Gilbertson has now confirmed it. She found that big companies, including Apple, Anthropic, Nvidia, and Bloomberg, have all used a dataset containing the transcripts of more than 170,000 YouTube videos, including videos by megastars like MrBeast, Marques Brownlee, and PewDiePie.

In this interview, Proof founder Julia Angwin talks to Annie about the investigation and what went into it.

The interview is the first in our new series, Proof Ingredients. In this series, Julia will talk to journalists, researchers, and content creators about what their investigations are made of, walking through the hypothesis, sample size, techniques, key findings, and limitations. We hope these ingredients help you evaluate our work and give you a framework for judging other news, too.
