
Tsinghua University Develops New Visual Language Model CogAgent to Enhance GUI Understanding and Navigation

Published in Latest AI News

Time: Dec 27, 2023

Read: 1 minute

A joint team from Tsinghua University and Zhipu AI has introduced CogAgent, a visual language model designed to understand and navigate graphical user interfaces (GUIs). The model uses a dual-encoder design to handle complex GUI elements and high-resolution inputs. It can navigate GUIs on both PC and Android platforms and perform text-based and visual question-answering tasks. Potential applications include automating GUI operations, providing GUI assistance and guidance, and informing new GUI designs and interaction methods. Although CogAgent is still at an early stage of development, it is expected to significantly change how people interact with computers.

