Runway, another AI company making waves in the tech industry, has a powerful video generation model called Gen-3 Alpha. Some folks, however, aren’t happy with how the company obtained the videos used to train it. According to a new report, Runway may have pirated a ton of videos to train its AI model, and that includes YouTube videos.
Let’s not play dumb; pretty much any bit of media we see on the internet has probably been scraped and used to train an AI model. This includes articles, books, social media posts, images, podcasts, videos, etc. Companies are scraping all of this content under our noses, and no one knows until stories like these come to the surface. It’s pretty sad.
There was a bit of drama a few months ago regarding whether OpenAI scraped YouTube to train Sora, its video generation tool. That episode showed that YouTube and Google won’t tolerate companies scraping data from YouTube. Since then, the feud has gone quiet.
Runway may have pirated videos to train its AI model
Runway’s model is impressive, but training it would require a ton of video data. That video data had to come from somewhere, and 404 Media has revealed where it could have come from. The publication uncovered a spreadsheet containing links to a ton of YouTube channels. These channels include MrBeast, MKBHD, The Try Guys, Nintendo, BuzzFeed, Netflix, Linus Tech Tips, Sam Kolder, and many more.
The list doesn’t stop at YouTube. The spreadsheet also contains links to sites like KissCartoon, which is a piracy website. All in all, the spreadsheet contains nearly 4,000 links. Each row for a YouTube channel contains information about it, like the number of videos and the type of content it makes.
According to reports, the company used a crawler to download these videos and feed them into the model. As if that wasn’t bad enough, Runway allegedly used a proxy to avoid being detected by Google. If that’s true, it suggests the company knew that Google would be miffed at it scraping video data.
We’re not sure just how much of the data in the spreadsheet was actually used to train the model. We may never know, unfortunately.
The legal ramifications
This might have some rather heavy legal consequences. Microsoft and OpenAI are already being dragged into court by the New York Times over the use of its articles as training data. YouTube may have the legal ground to sue Runway depending on how much raw video data the company scraped.
Also, the YouTube channels on the list include channels from some rather big companies like Disney, Netflix, and Nintendo. We’re sure that those companies have some copyrighted videos on their channels. History has taught us that, if you’re messing with Nintendo, you’re just begging for a lawsuit.
Lastly, we can’t overlook the fact that Runway may have downloaded videos from a pirate website. If that’s true, then that would be a direct violation of the law.
Now that this information is public, we’re just going to have to see what happens to the company and its video model.
2024-07-27 15:04:55