Reddit Signs $60 Million Per Year Deal For AI Training Data

$60 million per year, so far: that’s how much licensing Reddit’s vast trove of content is worth for artificial intelligence training.

Bloomberg reported on Friday that the San Francisco-based company has signed a contract with a still unnamed “large AI company,” citing sources familiar with the matter. The deal comes as Reddit inches closer to the launch of its initial public offering (IPO).

“The Reddit corpus of data is really valuable,” Reddit co-founder and chief executive Steve Huffman told The New York Times in April. “But we don’t need to give all of that value to some of the largest companies in the world for free.”

“Crawling Reddit, generating value, and not returning any of that value to our users is something we have a problem with,” Huffman said at the time. “It’s a good time for us to tighten things up.”

The reported deal adds to the excitement around the company’s highly anticipated IPO, which could launch as early as next month, according to Bloomberg. The outlet also said that Reddit has been “advised to consider a valuation of at least $5 billion.” Reddit brought in $800 million in revenue in 2023, up 20% from the year before.

Monetizing Data

In a broader context, the deal lays the groundwork for similar AI training agreements, according to Bloomberg’s source. Large language models (LLMs) like OpenAI’s ChatGPT can do what they do because of the vast amounts of information they’re trained on. To function well (meaning to be accurate and relevant, and to return up-to-date information), they’ll need to be constantly retrained on new content.

AI companies were able to access vast amounts of content for free before they launched their models, but that is no longer the case. Social media companies like Reddit and media publishers are finding ways to stop AI companies from crawling and freeloading off their data.

The New York Times recently filed a federal lawsuit against OpenAI and Microsoft (NASDAQ: MSFT), alleging that the companies unlawfully used the media organization’s copyrighted content to train ChatGPT. OpenAI has since made updates to respond to this issue. OpenAI is also reportedly in talks with publishers including CNN, Fox Corp and Time for licensing deals.

Information for this story was found via Bloomberg, and the sources and companies mentioned. The author has no securities or affiliations related to the organizations discussed. Not a recommendation to buy or sell. Always do additional research and consult a professional before purchasing a security. The author holds no licenses.
