On July 11, Moonshot AI officially released and open-sourced the Kimi K2 model, which offers stronger coding capabilities and improved handling of general agent tasks. Built on a mixture-of-experts (MoE) architecture, the base model has 1T total parameters with 32B activated parameters, and it has attracted widespread attention since its release.
Recently, however, some users have reported that the Kimi K2 API service is slow. In response, Moonshot AI stated that the slowdown is mainly due to high traffic and the sheer size of the model. To address the issue, the company is working to optimize inference efficiency and is accelerating the addition of hardware resources such as compute cards and servers, and it expects the API service speed to improve significantly in the coming days.
Moonshot AI also noted that the Kimi K2 model is fully open-source. Users can access it not only through Moonshot AI's official channels but also through other model providers such as SiliconFlow and Wuwen Xinqiong, and the company welcomes users who have the resources to deploy the model themselves.
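For readers considering self-deployment, the following is a minimal sketch of loading the open-source weights with Hugging Face Transformers. The repository name "moonshotai/Kimi-K2-Instruct" and the use of trust_remote_code are assumptions not taken from this article, and serving a 1T-parameter MoE model in practice requires a multi-GPU cluster; this is an illustration, not a deployment guide.

```python
# Self-hosting sketch (assumptions: repo name, custom architecture code in the checkpoint).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "moonshotai/Kimi-K2-Instruct"  # assumed Hugging Face repository name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,  # the MoE architecture ships as custom code with the repo (assumed)
    device_map="auto",       # shard the 1T-parameter model across available GPUs
    torch_dtype="auto",
)

messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```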
The Kimi K2 API service is currently fully available, supports a maximum context length of 128K tokens, and offers improved general-purpose and tool-calling capabilities. Billing is 4 yuan per million input tokens and 16 yuan per million output tokens. Moonshot AI says it will continue to optimize its services to provide users with a faster and more stable experience.
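Below is a minimal usage sketch of calling the Kimi K2 API through Moonshot AI's OpenAI-compatible endpoint, with a rough cost estimate based on the published prices. The base URL and model identifier are assumptions drawn from Moonshot's platform conventions, not from this article.

```python
# API usage sketch (assumptions: base URL, model name; prices from the article).
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MOONSHOT_API_KEY",        # issued on the Moonshot AI platform
    base_url="https://api.moonshot.cn/v1",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="kimi-k2-0711-preview",  # assumed model identifier for Kimi K2
    messages=[{"role": "user", "content": "Summarize the MoE architecture in one paragraph."}],
    max_tokens=512,  # well within the 128K context window
)
print(response.choices[0].message.content)

# Rough cost estimate from the published prices:
# 4 yuan per 1M input tokens, 16 yuan per 1M output tokens.
usage = response.usage
cost_yuan = usage.prompt_tokens / 1e6 * 4 + usage.completion_tokens / 1e6 * 16
print(f"Approximate cost: {cost_yuan:.6f} yuan")
```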