Nvidia Paid ‘Tens of Thousands’ for Pirated Books After Being Warned They Were Illegal

Nvidia Corp. (Nasdaq: NVDA) contacted a controversial online repository of pirated books to obtain high-speed access to copyrighted material for training artificial intelligence models, according to internal company documents filed in federal court.

The correspondence, revealed in an amended complaint filed January 16 in US District Court for the Northern District of California, shows an Nvidia data strategy team member wrote to Anna’s Archive stating the company was “exploring including Anna’s Archive in pre-training data for our LLMs,” or large language models.

Anna’s Archive warned Nvidia about the illegal nature of its collections, according to the court filing. The pirate library asked whether the company had obtained internal authorization before proceeding, noting it had “wasted too much time on people who could not get internal buy-in.”

Within a week of receiving that warning, Nvidia management gave “the green light” to proceed, the complaint alleges. Anna’s Archive then offered access to approximately 500 terabytes of data, which included millions of copyrighted books.

The shadow library charged tens of thousands of dollars for high-speed access to its collections, according to court documents.

Five authors — Abdi Nazemian, Brian Keene, Stewart O’Nan, Andre Dubus III and Susan Orlean — filed the expanded class-action lawsuit. The authors claim Nvidia used their copyrighted works without permission to train AI models including NeMo Megatron and Nemotron-4.

The lawsuit also alleges Nvidia downloaded copyrighted material from other shadow libraries, including LibGen, Sci-Hub and Z-Library. The complaint further claims Nvidia provided scripts and tools that allowed corporate customers to automatically download datasets containing pirated books.

Nvidia previously trained its AI models on the Books3 dataset, which contains approximately 196,640 books copied from the pirate site Bibliotik, according to the complaint. Books3 forms part of a larger dataset called The Pile.

The chip manufacturer defends its actions as fair use under copyright law. Nvidia has argued that AI training on copyrighted material differs from traditional copying because the models use books as statistical data rather than reproducing them directly.

The case marks the first time correspondence between a major US technology company and Anna’s Archive has been publicly revealed in court proceedings, according to copyright news site TorrentFreak, which first reported the internal emails.

The authors seek statutory damages, actual damages and compensation for what they describe as willful copyright violations. Hundreds of additional authors whose works appear in the pirated libraries could join the class-action suit.

Anna’s Archive describes itself as a preservation project aiming to catalog all books in existence and make them freely available. Copyright holders and publishers characterize the site as a piracy operation that undermines intellectual property rights.

Other major AI companies including Meta and Anthropic have also faced lawsuits alleging they trained models on pirated books from shadow libraries.

The case is Nazemian et al. v. NVIDIA Corporation, Case No. 4:24-cv-01454-JST, in the US District Court for the Northern District of California.



Information for this story was found via the sources and companies mentioned. The author has no securities or affiliations related to the organizations discussed. Not a recommendation to buy or sell. Always do additional research and consult a professional before purchasing a security. The author holds no licenses.
