New Study Suggests ChatGPT Is Getting Dumber—But, Is It?

A new study released on Tuesday by researchers from Stanford University and the University of California, Berkeley has ignited debate within the AI community over the performance of OpenAI’s GPT-4 language model.

The paper, titled “How Is ChatGPT’s Behavior Changing over Time?” and published on arXiv by Lingjiao Chen, Matei Zaharia, and James Zou, investigates changes in GPT-4’s outputs over a span of a few months, suggesting a potential decline in coding and compositional task abilities.

The study used API access to test the March and June 2023 versions of GPT-3.5 and GPT-4 on various tasks, including math problem-solving, answering sensitive questions, code generation, and visual reasoning.

[Figure omitted. Source: Chen, Zaharia, Zou]

Notably, the research found a significant drop in GPT-4’s ability to identify prime numbers, plummeting from an accuracy of 97.6 percent in March to just 2.4 percent in June. Surprisingly, GPT-3.5 displayed improved performance during the same period.
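
To make the accuracy figures concrete, here is a minimal Python sketch of how yes/no answers on a prime-identification task could be scored. It is not the authors’ evaluation code: the function names, the sample numbers, and the sample replies are all hypothetical and chosen only for illustration.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Ground-truth primality check by trial division."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def accuracy(numbers, answers):
    """Fraction of yes/no answers that match the ground truth."""
    correct = sum(
        (ans.strip().lower() == "yes") == is_prime(n)
        for n, ans in zip(numbers, answers)
    )
    return correct / len(numbers)


# Hypothetical inputs and model replies, invented for illustration only.
numbers = [7, 9, 13, 15]
answers = ["Yes", "No", "No", "No"]
print(f"{accuracy(numbers, answers):.1%}")
```

Running the same question set against two model snapshots and comparing the two scores is the kind of before-and-after comparison the reported percentages describe.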

This investigation comes amid growing concern from users who say they have noticed a decline in GPT-4’s performance over the past few months. Speculation about the cause abounds: OpenAI may be distilling its models to improve efficiency or fine-tuning them to reduce harmful outputs, and some unfounded conspiracy theories hold that GPT-4’s coding capabilities were deliberately reduced to promote GitHub Copilot.

OpenAI has consistently denied the alleged decrease in GPT-4’s capabilities. Peter Welinder, OpenAI’s VP of Product, recently took to Twitter to counter the claims, asserting that each new version of the AI language model is more advanced than its predecessor. He posits that intensive usage may lead to heightened awareness of these perceived issues.

But the company, unlike its large language model, isn’t completely closed to the possibility. Logan Kilpatrick, OpenAI’s head of developer relations, confirmed on Twitter that the team is aware of the reported regressions and is actively investigating the matter.

Arvind Narayanan, a computer science professor at Princeton, also pointed out some problems in the study, arguing that the research’s findings do not definitively prove a decline in GPT-4’s performance and could align with fine-tuning adjustments made by OpenAI. 


Information for this story was found via Twitter, Gizmodo, Ars Technica, and the sources and companies mentioned.
