- By JeffkomStory Team
Zuckerberg Turns to YouTube for Help with AI Copyright
Meta CEO Cites Fair Use in AI Training Data Dispute
Meta CEO Mark Zuckerberg has drawn parallels to YouTube’s handling of pirated content to defend Meta’s use of copyrighted e-books in training its AI models. Newly released snippets of Zuckerberg’s deposition in the Kadrey v. Meta copyright case shed light on the company’s stance on the contentious issue.
The lawsuit, filed in the U.S. District Court for the Northern District of California, accuses Meta of using copyrighted works from the LibGen data set to train its Llama AI models. LibGen, often referred to as a “pirated content aggregator,” hosts copyrighted materials from major publishers, including Pearson and McGraw Hill. Despite multiple lawsuits, LibGen has continued to provide access to these works.
Zuckerberg’s Argument
During his deposition, Zuckerberg compared Meta’s position to that of YouTube, which works to remove pirated content while continuing to host legitimate material. “YouTube may end up hosting some stuff that people pirate for some period of time, but YouTube is trying to take that stuff down,” he said. He also emphasized that the majority of YouTube’s content is licensed and legitimate.
Zuckerberg denied direct knowledge of LibGen but argued against blanket bans on data sets like it. “Would I want to have a policy against people using YouTube because some of the content may be copyrighted? No,” he said, while acknowledging the need for caution when using potentially infringing materials.
New Allegations Against Meta
The plaintiffs, including authors Sarah Silverman and Ta-Nehisi Coates, allege that Meta knowingly trained its AI models on pirated data. Internal Meta communications reportedly described LibGen as a “data set we know to be pirated” and warned that its use could “undermine [Meta’s] negotiating position with regulators.”
According to amended legal filings, Meta allegedly cross-referenced LibGen’s pirated books with copyrighted books available for licensing, using the comparison to decide whether to pursue agreements with publishers. The complaint further claims that Meta researchers attempted to obscure the use of copyrighted materials by inserting “supervised samples” during model fine-tuning. Meta also reportedly trained its AI on material from Z-Library, another platform known for hosting pirated content, as recently as April 2024.
Implications for AI and Copyright Law
The Kadrey v. Meta case is one of many lawsuits testing the boundaries of “fair use” in AI training. While AI companies argue that training on copyrighted materials constitutes fair use, copyright holders strongly disagree. The outcome of these cases could have significant implications for AI development and intellectual property law.
As the case unfolds, Meta’s reliance on controversial data sets like LibGen and Z-Library could face intensified scrutiny from courts and regulators. For now, Zuckerberg’s YouTube analogy highlights the complex interplay between innovation and intellectual property rights in the digital age.