Microsoft Launches the Small Multimodal AI Model Phi-4: The Perfect Combination of Thinking and Perception!

Microsoft recently released a new open-source AI model, Phi-4-Reasoning-Vision-15B (referred to below as Phi-4), in its developer community. The model combines high-resolution visual perception with deep reasoning, marking an important step forward for the Phi-4 series. Described as the first "small language model" (SLM) that can both "see clearly" and "think deeply," it opens up new application scenarios for developers.

Unlike traditional vision models, Phi-4 does not merely identify the content of an image; it performs structured, multi-step reasoning. It can interpret the visual structure of an image and combine it with textual context to draw actionable conclusions. This lets developers build applications ranging from chart analysis to user interface automation.

A key design feature of Phi-4 is its flexible reasoning mode. For tasks that require in-depth analysis, such as math problems or logical reasoning, the model switches to "reasoning mode" and activates a multi-step reasoning chain. For tasks that need a quick response, such as OCR (optical character recognition) or locating UI elements, it outputs results directly to reduce latency. This flexibility makes the model practical and efficient across both kinds of workload.

[Image: non-reasoning mode]

Phi-4 also shows significant potential for computer-use agents. Given only a screenshot and a natural-language instruction, the model outputs normalized bounding box coordinates for the relevant UI elements. Other agent models can then use these coordinates to perform interactions such as clicking or scrolling, which makes it simpler to build agents that operate software on a user's behalf.
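To illustrate the hand-off from bounding box to click, here is a minimal sketch of how a downstream agent might turn a box into a pixel position. It assumes the coordinates are fractions of the screenshot size in the range [0, 1]; the article does not specify the exact output format, and the `NormalizedBox` type, the `box_to_click_point` helper, and the sample values are hypothetical names chosen for illustration, not part of any Microsoft API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedBox:
    """Bounding box with coordinates in [0, 1] (assumed format, not confirmed by the source)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def box_to_click_point(box: NormalizedBox, width: int, height: int) -> tuple[int, int]:
    """Return the pixel at the center of the box for a screenshot of the given size."""
    valid_x = 0.0 <= box.x_min <= box.x_max <= 1.0
    valid_y = 0.0 <= box.y_min <= box.y_max <= 1.0
    if not (valid_x and valid_y):
        raise ValueError(f"not a valid normalized box: {box}")
    center_x = (box.x_min + box.x_max) / 2 * width
    center_y = (box.y_min + box.y_max) / 2 * height
    # Clamp so a box touching the right or bottom edge still maps to a valid pixel.
    return min(int(center_x), width - 1), min(int(center_y), height - 1)


# Hypothetical box for a "Submit" button on a 1920x1080 screenshot.
submit_button = NormalizedBox(x_min=0.25, y_min=0.5, x_max=0.75, y_max=0.75)
click_x, click_y = box_to_click_point(submit_button, width=1920, height=1080)
print(click_x, click_y)  # prints "960 675"
```

An agent would pass the resulting pixel position to whatever input-automation layer it uses. Clicking the center of the box is one simple policy; a real agent might choose a different point depending on the element type.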

[Image: reasoning mode]

In summary, Phi-4-Reasoning-Vision-15B is both a technical advance and a solid foundation for building intelligent applications. With its release, more developers can be expected to put its capabilities to work in new application scenarios.

Fully Recovered After 12 Hours! Notion Clarifies the Rumors of Anthropic Model Shutdown: It Was Just a Technical Glitch

Notion has officially announced that access to Anthropic's Claude series of AI models is fully restored. Performance degradation in the Opus 4.7 and 4.8 models had increased the failure rate of user requests, leading Notion to temporarily disable all Anthropic models. After roughly 12 hours spent fixing underlying infrastructure issues, the integration has returned to normal.

NVIDIA Spends $400 Million to Acquire Kumo, Further Strengthening Its Full-Stack AI Ecosystem with Customized Forecasting Tools

NVIDIA has acquired AI startup Kumo for at least $400 million, bringing in its customized model technology and top talent. Founded in 2022, Kumo focuses on enterprise-level AI model customization. The deal has been finalized, and NVIDIA executives briefly welcomed the Kumo team in public.

Reject Space Anxiety! Microsoft Hides AI Uninstall Option in Win11, Frees Over 2.5GB of Hard Drive Space in One Click

Microsoft has quietly added an AI model uninstallation feature in Windows 11 Insider Preview build 26300.8553, allowing users to directly remove AI components such as the Phi Silica model (which occupies over 2.59GB). The feature is not mentioned in the update log and is particularly useful for users with limited storage space.

35 Billion Parameters Competing with Industry Leaders! Microsoft Build Conference Introduces Multiple Self-Developed MAI Models

At the Build 2026 conference, Microsoft released its first advanced reasoning model, MAI-Thinking-1, which has 35 billion parameters and achieved top performance on software engineering benchmarks. The model was trained from scratch on clean data without external sources, marking a significant step for Microsoft in self-developed AI and toward a model lineup that covers a full range of scenarios.

Microsoft to Launch New Self-Developed Code and Multi-Scenario AI Models at Next Week's Build Conference

Microsoft plans to launch multiple self-developed AI models at next week's Build conference in San Francisco. The focus is a cost-effective, code-specific model intended to counter the erosion of GitHub Copilot's market share by Cursor and Claude Code. By reducing operating costs, the model aims to attract price-sensitive developers. Microsoft will also launch models at several parameter sizes to round out its own AI ecosystem and win more developer support.