Washington State University (WSU) recently released a study showing that although ChatGPT answers in a confident tone, it performs closer to "random guessing" when judging complex scientific statements. The study found that the model not only has limited accuracy but also frequently gives contradictory answers to the same question.
A team led by Professor Mesut Cicek extracted 719 research hypotheses from business journals published since 2021 and repeatedly asked the model to judge whether each was true or false:
Although ChatGPT's surface-level accuracy was around 80%, once the contribution of random guessing is removed, its adjusted score is only about 60%, not much better than a 50% "coin flip"; the researchers graded this as a "low D." The model was especially poor at identifying false statements, correctly judging "false propositions" only 16.4% of the time.
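The article does not spell out how the researchers removed the guessing factor, but the figures match the standard correction-for-guessing formula, under which an 80% raw score on a binary task works out to exactly 60%. A minimal sketch, assuming that formula (the function name is illustrative, not from the study):

```python
def chance_corrected_accuracy(observed: float, chance: float = 0.5) -> float:
    """Correction-for-guessing: rescale observed accuracy so that pure
    chance maps to 0.0 and perfect accuracy maps to 1.0.
    Assumption: the study used an adjustment of this form; the article
    does not state its exact method."""
    return (observed - chance) / (1.0 - chance)

# ~80% raw accuracy on a binary true/false task, where chance is 50%:
print(chance_corrected_accuracy(0.80))  # (0.80 - 0.50) / 0.50 = 0.60
```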
The researchers submitted each hypothesis to the model 10 times and found that it struggled to hold a consistent position (a sketch of this repeated-query protocol follows the list):
Answers fluctuate: The model gave the same verdict across all 10 repetitions in only about 73% of cases; in the remaining cases, its conclusion shifted between runs.
Extreme contradictions: In some cases the model alternated between "true" and "false" on the exact same prompt, in the most extreme instances answering "true" half the time and "false" the other half.
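The article describes this protocol only at a high level; the sketch below shows one way to reproduce it, assuming a hypothetical ask_model(prompt) callable that wraps whatever LLM API is being tested and returns the string "true" or "false":

```python
from collections import Counter
from typing import Callable, Sequence

def consistency_report(
    hypotheses: Sequence[str],
    ask_model: Callable[[str], str],  # hypothetical wrapper, not from the study
    repeats: int = 10,
) -> dict:
    """Submit each hypothesis `repeats` times and measure answer stability."""
    consistent = 0
    worst_minority = 0  # size of the minority verdict in the worst split
    for h in hypotheses:
        prompt = f"Is the following hypothesis true or false? {h}"
        verdicts = Counter(ask_model(prompt) for _ in range(repeats))
        if len(verdicts) == 1:
            consistent += 1  # identical verdict on every run
        else:
            worst_minority = max(worst_minority, min(verdicts.values()))
    return {
        "consistency_rate": consistent / len(hypotheses),
        "worst_minority": worst_minority,  # 5 would mean a 5/5 true-false split
    }
```

On the study's figures, a run like this would report a consistency_rate near 0.73, and a worst_minority of 5 would correspond to the extreme half-true, half-false splits the researchers observed.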
The study notes that users are easily misled by AI's fluent, persuasive language, but fluency does not imply real reasoning ability:
Lack of a real "brain": The model essentially performs memorization and pattern matching; unlike humans, it does not truly understand the world or know what it is saying.
Limited progress across versions: When the team re-ran the test in 2025 on the newer GPT-5 mini, it performed about the same as earlier versions on this task, showing no significant improvement.
Based on the results, Cicek advises business managers to remain highly skeptical when using generative AI for complex decisions: they should not treat it as an "authority" that can replace professional judgment, and they must manually verify all of its output. Organizations should also strengthen training so that employees understand both the strengths and the limitations of AI tools, avoiding the decision biases that come from blind trust.
The study is another reminder that, even as AI technology iterates rapidly, these models' capacity for deep logical judgment and evidence weighing still needs improvement.
