Nvidia Paid ‘Tens of Thousands’ for Pirated Books After Being Warned They Were Illegal

NVIDIA Corp. (Nasdaq: NVDA) contacted a controversial online repository of pirated books to obtain high-speed access to copyrighted material for training artificial intelligence models, according to internal company documents filed in federal court.

The correspondence, revealed in an amended complaint filed January 16 in US District Court for the Northern District of California, shows an Nvidia data strategy team member wrote to Anna’s Archive stating the company was “exploring including Anna’s Archive in pre-training data for our LLMs,” or large language models.

Anna’s Archive warned Nvidia about the illegal nature of its collections, according to the court filing. The pirate library asked whether the company had obtained internal authorization before proceeding, noting it had “wasted too much time on people who could not get internal buy-in.”

Within a week of receiving that warning, Nvidia management gave “the green light” to proceed, the complaint alleges. Anna’s Archive then offered access to approximately 500 terabytes of data, which included millions of copyrighted books.

The shadow library charged tens of thousands of dollars for high-speed access to its collections, according to court documents.

Five authors — Abdi Nazemian, Brian Keene, Stewart O’Nan, Andre Dubus III and Susan Orlean — filed the expanded class-action lawsuit. The authors claim Nvidia used their copyrighted works without permission to train AI models including NeMo Megatron and Nemotron-4.

The lawsuit also alleges Nvidia downloaded copyrighted material from other shadow libraries, including LibGen, Sci-Hub and Z-Library. The complaint further claims Nvidia provided scripts and tools that allowed corporate customers to automatically download datasets containing pirated books.

Nvidia previously trained its AI models on the Books3 dataset, which contains approximately 196,640 books copied from the pirate site Bibliotik, according to the complaint. Books3 forms part of a larger dataset called The Pile.

The chip manufacturer defends its actions as fair use under copyright law. Nvidia has argued that AI training on copyrighted material differs from traditional copying because the models use books as statistical data rather than reproducing them directly.

The case marks the first time correspondence between a major US technology company and Anna’s Archive has been publicly revealed in court proceedings, according to copyright news site TorrentFreak, which first reported the internal emails.

The authors seek statutory damages, actual damages and compensation for what they describe as willful copyright violations. Hundreds of additional authors whose works appear in the pirated libraries could join the class-action suit.

Anna’s Archive describes itself as a preservation project aiming to catalog all books in existence and make them freely available. Copyright holders and publishers characterize the site as a piracy operation that undermines intellectual property rights.

Other major AI companies including Meta and Anthropic have also faced lawsuits alleging they trained models on pirated books from shadow libraries.

The case is Nazemian et al. v. NVIDIA Corporation, Case No. 4:24-cv-01454-JST, in the US District Court for the Northern District of California.



Information for this story was found via the sources and companies mentioned. The author has no securities or affiliations related to the organizations discussed. Not a recommendation to buy or sell. Always do additional research and consult a professional before purchasing a security. The author holds no licenses.
