With the rapid development of large language models (LLMs) and vision-language models (VLMs), agents are changing how knowledge is discovered and problems are solved. However, many existing open-source agent frameworks depend heavily on expensive paid tools, which limits their reproducibility and general applicability. To address this, Tencent AI Lab has introduced a new open-source agent framework, Cognitive Kernel-Pro, which aims to minimize external dependencies so that more researchers and developers can take part in developing and training agents.
Cognitive Kernel-Pro adopts a multi-module, hierarchical design, mainly consisting of a main agent and multiple sub-agents. The main agent is responsible for task decomposition and information integration, while the sub-agents focus on specific tasks such as web browsing and file processing. This modular structure ensures the independence and scalability of each part.
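The division of labor described above can be illustrated with a minimal Python sketch. It is not the Cognitive Kernel-Pro API: the names `MainAgent`, `web_agent`, `file_agent`, and `decompose` are hypothetical, and the hard-coded plan stands in for what would be an LLM planning call in a real system.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical sub-agent type: takes a text instruction, returns a text result.
SubAgent = Callable[[str], str]


def web_agent(instruction: str) -> str:
    # Placeholder for a real web-browsing sub-agent.
    return f"[web] result for: {instruction}"


def file_agent(instruction: str) -> str:
    # Placeholder for a real file-processing sub-agent.
    return f"[file] result for: {instruction}"


class MainAgent:
    """Decomposes a task, delegates each step to a sub-agent, merges results."""

    def __init__(self, sub_agents: Dict[str, SubAgent]) -> None:
        self.sub_agents = sub_agents

    def decompose(self, task: str) -> List[Tuple[str, str]]:
        # Assumption: a fixed two-step plan replaces LLM-based planning.
        return [
            ("web", f"search for sources on '{task}'"),
            ("file", f"extract tables relevant to '{task}'"),
        ]

    def run(self, task: str) -> str:
        results = [
            self.sub_agents[name](instruction)
            for name, instruction in self.decompose(task)
        ]
        # Information integration is reduced to joining the text results.
        return "\n".join(results)


agent = MainAgent({"web": web_agent, "file": file_agent})
answer = agent.run("GAIA")
```

Because each sub-agent is just a function from text to text, a new capability can be added by registering another entry in the dictionary without touching the main agent.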
To handle complex tasks more efficiently, Cognitive Kernel-Pro introduces a "progress status" mechanism that lets agents record completed steps and pending tasks. The main agent and sub-agents communicate through a simple text interface, which makes collaboration and debugging easier. Reflection and voting mechanisms further improve the quality of task completion, especially in high-randomness tasks such as web browsing.
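As a rough illustration of two of these ideas, the sketch below keeps a progress record that can be rendered as plain text and picks a final answer by simple majority vote over repeated attempts. The names `ProgressState` and `majority_vote` are hypothetical, and plain majority voting is an assumption made for the example; the framework's actual reflection and voting procedure may work differently.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ProgressState:
    """Hypothetical record of completed steps and pending tasks."""

    completed: List[str] = field(default_factory=list)
    todo: List[str] = field(default_factory=list)

    def finish_next(self, note: str) -> None:
        # Move the first pending step to the completed list with a short note.
        step = self.todo.pop(0)
        self.completed.append(f"{step}: {note}")

    def to_text(self) -> str:
        # Render as plain text so it can be passed through a text interface.
        done = "; ".join(self.completed) or "none"
        pending = "; ".join(self.todo) or "none"
        return f"Completed: {done}\nTodo: {pending}"


def majority_vote(run_once: Callable[[int], str], attempts: int = 3) -> str:
    # Run a noisy task several times and keep the most frequent answer.
    answers = [run_once(i) for i in range(attempts)]
    return Counter(answers).most_common(1)[0][0]


state = ProgressState(todo=["open page", "read table"])
state.finish_next("ok")

# Simulated noisy runs: the second attempt disagrees with the other two.
simulated = ["42", "41", "42"]
final = majority_vote(lambda i: simulated[i])
```

Repeating a run and voting trades extra compute for stability, which is why it suits tasks such as web browsing where a single attempt can fail for reasons unrelated to the plan.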
In terms of performance, Cognitive Kernel-Pro performs well on the GAIA benchmark, surpassing other open-source frameworks such as SmolAgents and approaching frameworks that rely on paid tools. This result is attributed to its training methods, which cover web navigation, file processing, and reasoning.
Beyond the framework design, Tencent AI Lab also provides a training recipe for the Agent Foundation Model to support further research and development in the community. The code and technical reports are publicly available on GitHub.
Project URL: https://github.com/Tencent/CognitiveKernel-Pro