On March 4, Ant Group, jointly with Tsinghua University, released the stable v1.0 of AReaL, an open-source reinforcement learning training framework. This release focuses on "one-click RL training for agents": it requires no code changes, is compatible with a variety of agent frameworks, and lets agents train with reinforcement learning out of the box.
Since the beginning of 2026, agents have continued to gain momentum. Agent frameworks such as LangChain, Claude Code, and OpenClaw have developed rapidly, but two major bottlenecks have emerged. First, the cost of connecting to training is high: existing agent frameworks expose different interfaces, and connecting to each one often requires writing a full set of adaptation code. Second, agents lack the ability to evolve continuously: most agents rely on fixed weights learned by the underlying model during the training phase, and once deployed they cannot be further optimized for specific scenarios, so their capability ceiling is fixed at delivery.
AReaL is the first large-model reinforcement learning training system with fully asynchronous, decoupled training and inference, allowing agents to receive feedback from real task interactions and continuously improve their decisions. The v1.0 release makes it possible for any agent to connect to RL training without modification: a Proxy Worker layer sits between the agent and the training system, so developers only need to change one request address to start training.
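The Proxy Worker idea can be sketched as a thin pass-through layer: the agent's requests are forwarded unchanged to the inference backend, while each request/response pair is recorded as a rollout for later training. This is a minimal illustrative sketch; the class and method names here are hypothetical and do not reflect AReaL's actual API.

```python
# Hypothetical sketch of a Proxy Worker: forwards requests transparently
# and collects trajectories as a side effect. Names are illustrative only.
class ProxyWorker:
    def __init__(self, backend):
        self.backend = backend           # the real inference endpoint
        self.trajectory_buffer = []      # rollouts collected for RL training

    def chat(self, request):
        # The agent calls this exactly as it would call a model provider,
        # so no agent-side code changes are needed.
        response = self.backend(request)
        # Side effect: log the interaction so the trainer can later attach
        # task-level rewards and update the policy.
        self.trajectory_buffer.append({"request": request, "response": response})
        return response

# Usage with a stubbed backend standing in for the inference engine.
echo_backend = lambda req: {"content": "ok", "echoed": req["messages"]}
proxy = ProxyWorker(echo_backend)
reply = proxy.chat({"messages": [{"role": "user", "content": "hello"}]})
print(len(proxy.trajectory_buffer))  # 1 rollout recorded
```

Because the proxy is transparent to the agent, feedback collection and weight updates can happen entirely on the training side.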

(Figure: AReaL's asynchronous training architecture seamlessly integrated into agents)
Take the currently popular OpenClaw as an example. Developers simply point the base_url and api_key in the OpenClaw configuration file at the AReaL gateway, and their OpenClaw instance is connected to reinforcement learning training. The agent performs tasks as usual while users periodically rate how well it completes them; AReaL automatically collects training data and updates the model in the background, so the agent evolves through continuous use.
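Concretely, the agent-side change might look like the following. This is an illustrative sketch only: the gateway address, key value, and field names are hypothetical and are not taken from AReaL's or OpenClaw's actual documentation.

```python
# Hypothetical sketch of redirecting an OpenAI-compatible agent client to
# an RL training gateway. All values below are illustrative placeholders.
openclaw_config = {
    # Before: the agent talked straight to a model provider, e.g.
    #   "base_url": "https://api.provider.example/v1"
    # After: the same client talks to the AReaL gateway instead.
    "base_url": "http://localhost:8080/v1",  # hypothetical gateway address
    "api_key": "areal-training-key",         # hypothetical key
}

# Nothing else in the agent changes: requests flow through the gateway,
# which forwards them to the model and records rollouts for training.
print(sorted(openclaw_config))  # ['api_key', 'base_url']
```

The key point is that the edit is confined to the endpoint configuration; the agent's prompts, tools, and control flow are untouched.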
AReaL v1.0 also introduces Archon, a native training engine. Built on native PyTorch capabilities, it implements full 5D parallelism (data parallelism, pipeline parallelism, tensor parallelism, context parallelism, and expert parallelism), lowering the installation and debugging barrier, and it offers multiple backend options for training and inference so it can be deployed flexibly across environments. Remarkably, this complex distributed system was implemented from scratch and verified for correctness in only one person-month: within 32 days, nearly one million lines of code were modified to fully implement the Archon engine, enabling it to train billion-parameter MoE models.
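To make the five dimensions concrete, here is a simplified sketch of how parallelism degrees combine into a device mesh. The degrees below are example values, not AReaL defaults, and the sketch treats all five dimensions as independent, whereas real systems often overlap expert parallelism with the data-parallel ranks.

```python
import math

# Hypothetical example degrees for the five parallelism dimensions.
parallelism = {
    "data": 4,       # replicate the model, split the batch
    "pipeline": 2,   # split layers into sequential stages
    "tensor": 2,     # split individual weight matrices
    "context": 2,    # split long sequences across devices
    "expert": 4,     # split MoE experts across devices
}

# In this simplified view each GPU holds one coordinate in the 5D mesh,
# so the total device count is the product of the degrees.
world_size = math.prod(parallelism.values())
print(world_size)  # 128 GPUs for this configuration
```

Exposing each dimension as an independent degree is what lets one engine scale from a single node to large clusters by changing configuration rather than code.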
The secret behind this efficiency lies in AReaL's integrated AI-assisted development system, which enables highly automated development of complex engineering work.

AReaL v1.0 introduces an AI-assisted development process that supports developers end to end, from planning and coding through verification to PR creation. For core modules such as MoE parallelism, memory optimization, and algorithm implementation, a dedicated AI programming assistant acts like a senior expert, stepping in when code changes occur and providing targeted guidance to safeguard the quality of every change. AReaL's AI-assisted programming is not just an efficiency tool; it can take on "deliverable" R&D work in complex infrastructure engineering, pointing toward the next generation of AI infrastructure engineering paradigms.
The AReaL team stated that it will continue to iterate on the training engine, usability, and multimodal agent training. The code and documentation of AReaL v1.0 are open-sourced in the inclusionAI community.
· GitHub repository: https://github.com/inclusionAI/AReaL
· Related paper: https://arxiv.org/abs/2505.24298
