SenseTime Open Sources SenseNova U1, Achieving a Multimodal Native Unified Architecture

SenseTime officially released and open-sourced the SenseNova U1 series of native unified understanding-and-generation models on the 28th. The series is built on the NEO-unify architecture that SenseTime developed in-house in March this year. It integrates multimodal understanding, reasoning, and generation within a single model framework, marking a significant shift in the multimodal AI paradigm from "integrated" to "natively unified."

The NEO-unify architecture adopted by SenseNova U1 discards the modular design common in mainstream models. By removing the visual encoder (VE) and the variational autoencoder (VAE), it rebuilds a single unified representation space. Multimodal processing is integrated into every layer of computation, so language and visual information are modeled as one unified composite, preserving pixel-level visual fidelity along with semantic richness. With this technology, the model shows strong performance in logical reasoning and spatial intelligence, accurately understanding complex layouts and intricate relationships in the physical world.

SenseTime Enters the Intelligent Agent Arena: New All-Modal Base Model is Ready to Launch

Competition in large models is shifting toward agents. SenseTime is developing the industry's first natively multimodal agent base model, which integrates a unified core of "understanding, generation, and action," benchmarks directly against GPT-Image 2, and pushes AI from passive Q&A to active execution...

SenseTime Secretly Developing Multimodal Model U1Pro: Led by Lin Dahua, Expected to Launch Internal Testing in July, Targeting OpenAI

SenseTime is secretly developing U1Pro, a multimodal large model targeting design scenarios, in an effort led by Chief Scientist Lin Dahua. The model belongs to the "Ri Ri Xin" family and aims to compete with OpenAI's GPT-Image2, emphasizing long-range logic and thinking capabilities. It is expected to enter internal testing and commercial use in July.

SenseNova U1 - SenseTime's Native Understanding and Generation Unified Model, Say Goodbye to Plugin-Based AI

On April 28, SenseTime open-sourced the "SenseNova U1" series, a "native understanding and generation unified model" that overcomes traditional multimodal models' reliance on modular splicing. It achieves deep integration of vision and language through a unified architecture, marking a significant breakthrough for domestic AI in multimodal technology...

SenseTime Launches the Industry's First Multi-Series Generative AI Agent Seko2.0, Domestic AI Chip Successfully Integrates the Full Multimodal AIGC Pipeline

SenseTime has launched Seko2.0, the world's first AI agent for multi-scene video generation, extending generation from single clips to continuous narratives. It keeps characters, scenes, and style highly consistent, improving plot coherence and visual uniformity, and scales to short videos, ads, and education. It is powered by SenseTime's proprietary multimodal model...

New GoT-R1 Multimodal Model Released: Making AI Drawing Smarter, the New Era of Image Generation!

Recently, a research team from the University of Hong Kong, The Chinese University of Hong Kong, and SenseTime released a new framework, GoT-R1. By introducing reinforcement learning (RL), this multimodal large model significantly enhances the semantic and spatial reasoning capabilities of AI in visual generation tasks, generating high-fidelity, semantically consistent images from complex text prompts. This advancement marks another leap in image generation technology. Currently, although existing multimodal large models have made significant progress in generating images based on text prompts