Kevin Weil, head of the OpenAI Science team, recently stated that just as 2025 was the year AI profoundly transformed software engineering, 2026 will be a pivotal year for breakthroughs in science. Weil pointed out that the latest GPT-5.2 model has already shown remarkable potential for improving research efficiency, evolving from a simple tool into a "digital brainstorming partner" for researchers.
On GPQA, a benchmark that measures doctoral-level scientific knowledge, the earlier GPT-4 scored only 39%, far below the 70% human expert baseline, whereas GPT-5.2, released at the end of last year, scored 92%. This means AI now stands at the edge of human capability. Weil believes that within the next year, scientists who do not make deep use of AI will miss an excellent opportunity to improve the quality of their thinking and the speed of their research.
To better serve the rigorous scientific community, OpenAI is working to instill "epistemological humility" in the model. Rather than acting as an all-knowing prophet, the AI should be a modest discussion partner. When scientists present ideas, it will respond in a "here are some suggestions for reference" spirit, offering interdisciplinary analogies and parallel lines of reasoning that help researchers uncover connections humans would find difficult to detect.
Despite earlier controversies in which OpenAI executives misreported AI as having solved mathematical problems, Weil took a more pragmatic view in his latest interview: the core mission of AI is not to replace Einstein, but to help humanity stand on the shoulders of giants by digesting all the academic papers of the past 30 years and thereby accelerating scientific discovery.
Key Points:
🧪 OpenAI predicts 2026 will be the breakout year for AI in scientific research, and scientists who do not use AI will be at a disadvantage in the research competition.
📈 GPT-5.2 scored 92% on the GPQA test of advanced scientific knowledge, well above the 70% human expert benchmark.
🤝 The focus of development has shifted toward "humble AI," aiming to make the model a supportive sparring partner for scientists rather than a tutor who hands down absolute answers.
