On March 4, Ant Group, jointly with Tsinghua University, released the stable v1.0 of AReaL, an open-source reinforcement learning training framework. This release focuses on "one-click RL training for agents": it requires no code changes, is compatible with a variety of agent frameworks, and lets agents train with reinforcement learning out of the box.
Since the beginning of 2026, agents have continued to gain momentum. Agent frameworks such as LangChain, Claude Code, and OpenClaw have developed rapidly, but two major bottlenecks have emerged. First, the cost of connecting to training is high: existing agent frameworks expose different interfaces, and connecting to each one often requires writing a full set of adaptation code. Second, agents lack the ability to evolve continuously: most agents rely on fixed weights learned by the underlying model during the training phase, and once deployed they cannot be further optimized for specific scenarios, so their capability ceiling is fixed at delivery.
AReaL is the first large-model reinforcement learning training system with fully asynchronous, decoupled training and inference, allowing agents to receive feedback from real task interactions and continuously improve their decisions. The v1.0 release makes it possible for any agent to connect to RL training without modification: a Proxy Worker layer sits between the agent and the training system, so developers only need to change one request address to start training.
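The Proxy Worker idea can be sketched as a thin pass-through layer: the agent's requests are forwarded unchanged to the inference backend, while each request/response pair is recorded as a rollout for later training. This is a minimal illustrative sketch; the class and method names here are hypothetical and do not reflect AReaL's actual API.

```python
# Hypothetical sketch of a Proxy Worker: forwards requests transparently
# and collects trajectories as a side effect. Names are illustrative only.
class ProxyWorker:
    def __init__(self, backend):
        self.backend = backend           # the real inference endpoint
        self.trajectory_buffer = []      # rollouts collected for RL training

    def chat(self, request):
        # The agent calls this exactly as it would call a model provider,
        # so no agent-side code changes are needed.
        response = self.backend(request)
        # Side effect: log the interaction so the trainer can later attach
        # task-level rewards and update the policy.
        self.trajectory_buffer.append({"request": request, "response": response})
        return response

# Usage with a stubbed backend standing in for the inference engine.
echo_backend = lambda req: {"content": "ok", "echoed": req["messages"]}
proxy = ProxyWorker(echo_backend)
reply = proxy.chat({"messages": [{"role": "user", "content": "hello"}]})
print(len(proxy.trajectory_buffer))  # 1 rollout recorded
```

Because the proxy is transparent to the agent, feedback collection and weight updates can happen entirely on the training side.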

(Figure: AReaL's asynchronous training architecture seamlessly integrated into agents)
Take the currently popular OpenClaw as an example. Developers simply point the base_url and api_key in the OpenClaw configuration file at the AReaL gateway, and their OpenClaw instance is connected to reinforcement learning training. The agent performs tasks as usual while users periodically rate how well it completes them; AReaL automatically collects training data and updates the model in the background, so the agent evolves through continuous use.
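Concretely, the agent-side change might look like the following. This is an illustrative sketch only: the gateway address, key value, and field names are hypothetical and are not taken from AReaL's or OpenClaw's actual documentation.

```python
# Hypothetical sketch of redirecting an OpenAI-compatible agent client to
# an RL training gateway. All values below are illustrative placeholders.
openclaw_config = {
    # Before: the agent talked straight to a model provider, e.g.
    #   "base_url": "https://api.provider.example/v1"
    # After: the same client talks to the AReaL gateway instead.
    "base_url": "http://localhost:8080/v1",  # hypothetical gateway address
    "api_key": "areal-training-key",         # hypothetical key
}

# Nothing else in the agent changes: requests flow through the gateway,
# which forwards them to the model and records rollouts for training.
print(sorted(openclaw_config))  # ['api_key', 'base_url']
```

The key point is that the edit is confined to the endpoint configuration; the agent's prompts, tools, and control flow are untouched.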
AReaL v1.0 also introduces Archon, a native training engine. Built on native PyTorch capabilities, it implements full 5D parallelism (data parallelism, pipeline parallelism, tensor parallelism, context parallelism, and expert parallelism), lowering the installation and debugging barrier, and it offers multiple backend options for training and inference so it can be deployed flexibly across environments. Remarkably, this complex distributed system was implemented from scratch and verified for correctness in only one person-month: within 32 days, nearly one million lines of code were modified to fully implement the Archon engine, enabling it to train billion-parameter MoE models.
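To make the five dimensions concrete, here is a simplified sketch of how parallelism degrees combine into a device mesh. The degrees below are example values, not AReaL defaults, and the sketch treats all five dimensions as independent, whereas real systems often overlap expert parallelism with the data-parallel ranks.

```python
import math

# Hypothetical example degrees for the five parallelism dimensions.
parallelism = {
    "data": 4,       # replicate the model, split the batch
    "pipeline": 2,   # split layers into sequential stages
    "tensor": 2,     # split individual weight matrices
    "context": 2,    # split long sequences across devices
    "expert": 4,     # split MoE experts across devices
}

# In this simplified view each GPU holds one coordinate in the 5D mesh,
# so the total device count is the product of the degrees.
world_size = math.prod(parallelism.values())
print(world_size)  # 128 GPUs for this configuration
```

Exposing each dimension as an independent degree is what lets one engine scale from a single node to large clusters by changing configuration rather than code.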
The secret behind this efficiency lies in AReaL's integrated AI-assisted development system, which enables highly automated development of complex engineering work.

AReaL v1.0 introduces an AI-assisted development process that supports developers end to end, from planning and coding through verification to PR creation. For core modules such as MoE parallelism, memory optimization, and algorithm implementation, a dedicated AI programming assistant acts like a senior expert, stepping in when code changes occur and providing targeted guidance to safeguard the quality of every change. AReaL's AI-assisted programming is not just an efficiency tool; it can take on "deliverable" R&D work in complex infrastructure engineering, pointing toward the next generation of AI infrastructure engineering paradigms.
The AReaL team stated that it will continue to iterate on the training engine, usability, and multimodal agent training. The code and documentation of AReaL v1.0 are open-sourced in the inclusionAI community.
· GitHub repository: https://github.com/inclusionAI/AReaL
· Related paper: https://arxiv.org/abs/2505.24298
