Alphabet's Google has officially launched Gemini 3, which offers a context window of up to 1 million tokens and supports native multimodal reasoning across text, images, video, and code. According to Google, Gemini 3 Pro achieved 91.9% accuracy on the graduate-level GPQA Diamond benchmark and scored 1501 Elo on LMArena, surpassing GPT-5.1 and Claude 4.5 to become the highest-scoring model on the current public leaderboard.

Gemini 3 features a new Deep Think enhanced reasoning mode, which productizes the reasoning chain through "thought signatures" and "thinking levels." It scored 45.1% on ARC-AGI-2, setting a new state of the art in multi-step logic, factual accuracy, and understanding of scientific charts. Google also launched the Google Antigravity development platform, which supports "agent-based coding" and "visual coding." The model reached an Elo of 2439 on LiveCodeBench Pro and 54.2% accuracy on the Terminal-Bench 2.0 terminal operation benchmark, and it can autonomously complete an entire workflow of data crawling, analysis, reporting, and deployment.

Gemini 3 is now available to Google AI Ultra subscribers and will be gradually rolled out to the Gemini app, AI Mode in Search, and the enterprise-grade Vertex AI over the next few weeks. Google stated that the model was trained on its in-house TPU v6 Pods. Combined with Google's 90% share of the search market and the 2 billion monthly active users of "AI Overviews," this is expected to accelerate the transition of AI from laboratories to production lines.