
Looking back at the technical path, Yang Zhilin divided the evolution of large models into three stages. The first, three years ago, relied mainly on natural internet data plus a small amount of manually annotated data for value alignment. The second, last year, centered on large-scale reinforcement learning, with researchers selecting high-quality tasks to improve model performance. In the third, beginning in 2026, AI research methods are changing fundamentally, and the researcher's role is shifting toward that of an "AI compute scheduler." In this new stage, the research process is driven by AI, which spends large numbers of tokens to autonomously synthesize new tasks and environments, define the most suitable reward parameters, and even take a deep part in exploring new network architectures.
This trend suggests that the efficiency of AI research and development will enter a period of exponential acceleration. Moonshot stated that its core product
