Tsinghua University Develops New Visual Language Model CogAgent to Enhance GUI Understanding and Navigation


Tsinghua University and the Kuaishou Kling team have jointly launched SVG, a model that replaces the VAE and addresses the problem of semantic entanglement, with a reported 6200% improvement in training efficiency and a 3500% increase in generation speed, which may signal the gradual phase-out of VAEs in image generation.
Recently, Tencent Charity officially launched the "Ask AI" function, the first time that large artificial intelligence models have been applied to charity on this platform. The feature lets users ask questions about the projects and organizations on Tencent Charity, with the aim of improving interaction and transparency between the public and charitable organizations. The launch of "Ask AI" marks another step for Tencent in the charity field. Users simply enter their questions, and the system instantly provides relevant information, helping them better understand and participate in charitable activities. This convenient communication method...
On January 23, 2025, the world's first publicly accessible, ready-to-use computer intelligent agent, GLM-PC, was upgraded again, attracting widespread attention. GLM-PC is based on the multimodal large model CogAgent, capable of 'observing' and 'operating' the computer like a human, assisting users in efficiently completing various computer tasks.
Recently, researchers from ByteDance Research Institute and Tsinghua University jointly released a new study pointing out that current AI video generation models, such as OpenAI's Sora, can create stunning visual effects but have significant flaws in understanding basic physical laws. The study has sparked widespread discussion about how well AI can simulate reality. The research team tested AI video generation models under three different scenarios, including predictions under known patterns and predictions under unknown patterns.
AI startup Moondream has officially announced the completion of $4.5 million in seed funding and is putting forward a disruptive view: in the world of AI models, smaller models may hold advantages. The company, backed by Felicis Ventures, Microsoft's M12 GitHub Fund, and Ascend, has launched a visual language model with only 1.6 billion parameters that it says can compete in performance with models four times its size.