On April 28, 2026, researchers from Imperial College London, the Internet Archive, and Stanford University released a joint study of AI-generated content on the web.
The technical analysis identifies two effects in AI-generated text: "semantic contraction" and a "positivity shift." Because language models tend to converge toward the average of their training data, the measured semantic similarity of AI-generated content is 33% higher than that of human-written content; over time, this could narrow the range of ideas circulating online. At the same time, the positive-sentiment score of AI-generated text is 107% higher than that of human text, an artificially optimistic tone the researchers attribute to the models' "excessive compliance." They argue this tonal shift could quietly marginalize dissenting and distinctive human viewpoints before anyone notices. Notably, while the public widely worries that AI will amplify factual errors or erase individual writing styles, no significant negative correlation was found for either concern in the data.
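The semantic-contraction finding rests on comparing average pairwise similarity across two corpora. The study's actual embedding model and corpora are not specified here, so the sketch below uses synthetic vectors as illustrative stand-ins: a tight cluster mimics homogeneous AI text, a dispersed cloud mimics more varied human text.

```python
import numpy as np

def mean_pairwise_cosine(embeddings: np.ndarray) -> float:
    """Mean cosine similarity over all distinct pairs of row vectors."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    n = len(embeddings)
    # Sum all entries, subtract the n diagonal 1s, average over ordered pairs.
    return (sims.sum() - n) / (n * (n - 1))

# Toy corpora (hypothetical, not the study's data):
rng = np.random.default_rng(0)
center = rng.normal(size=8)
ai_like = center + 0.1 * rng.normal(size=(50, 8))   # tightly clustered
human_like = rng.normal(size=(50, 8))               # widely spread

print(mean_pairwise_cosine(ai_like), mean_pairwise_cosine(human_like))
```

On this toy data the clustered "AI" corpus scores far higher mean similarity than the dispersed "human" one, which is the shape of the 33% gap the researchers report.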
The researchers warn that the homogenization and forced optimism of online content are inducing a public "apathy toward reality": unable to tell genuine content from synthetic, users may come to doubt the credibility of online information as a whole. The high proportion of AI content also sharply raises the risk of "model collapse," in which later models degrade because they are trained on the outputs of earlier ones. The trend is pushing the industry to rethink search and recommendation algorithms, with likely future emphasis on detecting semantic diversity and on cryptographic provenance standards for content.
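The model-collapse feedback loop can be illustrated with a minimal toy simulation (not the study's methodology): a Gaussian "model" repeatedly refit to samples drawn from its own previous fit. The slight downward bias of the sample variance compounds across generations, and the distribution's spread collapses.

```python
import numpy as np

# Each generation "trains" only on the previous generation's output.
rng = np.random.default_rng(42)
mu, sigma = 0.0, 1.0          # generation-0 "model"
variances = []
for generation in range(300):
    samples = rng.normal(mu, sigma, size=10)  # small synthetic "corpus"
    mu, sigma = samples.mean(), samples.std() # refit on own output
    variances.append(sigma ** 2)

print(variances[0], variances[-1])
```

Running this shows the fitted variance shrinking toward zero: the analogue of a model family that, fed its own text, loses the tails of the original distribution.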
