Renowned mathematician Terence Tao has said that although artificial intelligence (AI) has demonstrated impressive capabilities in many fields, it still lacks a crucial "sense": the human intuition that flags wrong directions and flawed proofs. He believes this intuition is something AI currently cannot replicate, which keeps humans irreplaceable in mathematical judgment.

Tao pointed out that AI-generated mathematical proofs often appear "flawless" on the surface even when they contain flaws. The errors, however, are usually subtle, and once discovered they look "stupid": mistakes a human mathematician would not make in practice. He called the human ability to detect them a "metaphorical mathematical smell" that immediately warns that something is off, and emphasized that it is unclear how AI could ultimately replicate this capability.


He further explained that current AI, especially generative models, tends to get stuck once it has committed to an incorrect approach. In his view, the real challenge for AI is recognizing "when it has taken a wrong turn", a challenge distinct from the one addressed by hybrid AI systems that combine neural networks with symbolic reasoning.

Despite this, Tao acknowledged that systems like AlphaZero have made significant progress in games such as Go and chess. In his view, these systems have developed a "smell" for the board, an ability to judge whether a given position favors one side. Although they cannot articulate the reasons, this sense allows them to formulate strategies. Tao envisioned that if AI could similarly perceive the feasibility of proof strategies, it could offer constructive suggestions when breaking down problems, such as: "Hmm, this looks promising; these two tasks seem simpler than your main task and still have a high probability of being correct."

During both play and training, AlphaZero selects moves via Monte Carlo Tree Search (MCTS), a "symbolic framework" that explores possible game continuations as discrete states. At its core, however, it remains a deep reinforcement learning system: a neural network with millions of parameters trained through self-play.
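To make the MCTS loop concrete, here is a minimal sketch on a toy take-away game (players alternately remove 1 or 2 stones; whoever takes the last stone wins). This is only an illustration of the select / expand / rollout / backpropagate cycle, not AlphaZero's actual implementation: it uses random playouts in place of the neural network that guides AlphaZero's search, and the game, function names, and constants are all invented for the example.

```python
import math
import random

def legal_moves(pile):
    """In this toy game, a player removes 1 or 2 stones per turn."""
    return [m for m in (1, 2) if m <= pile]

def simulate(pile):
    """Random playout; True if the player to move from `pile` wins
    (the player who removes the last stone wins)."""
    player = 0
    while True:
        pile -= random.choice(legal_moves(pile))
        if pile == 0:
            return player == 0
        player ^= 1

class Node:
    def __init__(self, pile, parent=None, move=None):
        self.pile = pile        # stones left for the player to move here
        self.parent = parent
        self.move = move        # move that led from parent to this node
        self.children = []
        self.visits = 0
        self.wins = 0.0         # wins for the player who made `move`

    def ucb1(self, c=1.4):
        """Upper confidence bound balancing exploitation and exploration."""
        return (self.wins / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))

def mcts_best_move(pile, iterations=3000):
    root = Node(pile)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend via UCB1 while the node is fully expanded.
        while node.children and len(node.children) == len(legal_moves(node.pile)):
            node = max(node.children, key=Node.ucb1)
        # 2. Expansion: add one untried child, unless the game is over.
        if node.pile > 0:
            tried = {c.move for c in node.children}
            move = random.choice(
                [m for m in legal_moves(node.pile) if m not in tried])
            child = Node(node.pile - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Rollout, scored for the player to move at `node`. A terminal
        #    node (pile == 0) is a loss for that player, since the previous
        #    mover just took the last stone.
        result = simulate(node.pile) if node.pile > 0 else False
        # 4. Backpropagation: flip the perspective at every level.
        while node is not None:
            node.visits += 1
            if not result:      # the player who moved into `node` won
                node.wins += 1
            result = not result
            node = node.parent
    # Recommend the most-visited move, the standard MCTS choice.
    return max(root.children, key=lambda c: c.visits).move
```

In this game a position is lost for the player to move exactly when the pile size is a multiple of 3, so from a pile of 4 the winning move is to take 1 stone; the search's visit counts concentrate on such moves, which is the statistical "smell" for the board that Tao describes.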

Some researchers believe that combining large language models with the strengths of symbolic reasoning could drive significant breakthroughs in mathematical AI. Pure LLMs, even those with some reasoning capability, can hit dead ends on complex mathematical problems. Tao previously described OpenAI's reasoning model o1 as "mediocre but not entirely incapable," comparing it to a research assistant that can handle routine tasks but still lacks creativity and flexibility. He also participated in developing the FrontierMath benchmark, which poses highly challenging mathematical problems to AI systems in order to spur progress in the field.