On July 11, Moonshot AI officially released and open-sourced the Kimi K2 model, which offers stronger coding capabilities and improved handling of general agent tasks. Built on a mixture-of-experts (MoE) architecture, the base model has 1T total parameters with 32B activated parameters, and it has attracted widespread attention since its release.
Recently, however, some users have reported that the Kimi K2 API service is slow. In response, Moonshot AI stated that the slowdown is mainly due to high traffic and the sheer size of the model. To address the issue, the company is working to optimize inference efficiency and is accelerating the addition of hardware resources such as compute cards and servers, and it expects the API service speed to improve significantly in the coming days.
Moonshot AI also noted that the Kimi K2 model is fully open-source. Users can access it not only through Moonshot AI's official channels but also through other model providers such as SiliconFlow and Wuwen Xinqiong, and the company welcomes users who have the resources to deploy the model themselves.
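For readers considering self-deployment, the following is a minimal sketch of loading the open-source weights with Hugging Face Transformers. The repository name "moonshotai/Kimi-K2-Instruct" and the use of trust_remote_code are assumptions not taken from this article, and serving a 1T-parameter MoE model in practice requires a multi-GPU cluster; this is an illustration, not a deployment guide.

```python
# Self-hosting sketch (assumptions: repo name, custom architecture code in the checkpoint).
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "moonshotai/Kimi-K2-Instruct"  # assumed Hugging Face repository name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,  # the MoE architecture ships as custom code with the repo (assumed)
    device_map="auto",       # shard the 1T-parameter model across available GPUs
    torch_dtype="auto",
)

messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```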
The Kimi K2 API service is currently fully available, supports a maximum context length of 128K tokens, and offers improved general-purpose and tool-calling capabilities. Billing is 4 yuan per million input tokens and 16 yuan per million output tokens. Moonshot AI says it will continue to optimize its services to provide users with a faster and more stable experience.
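Below is a minimal usage sketch of calling the Kimi K2 API through Moonshot AI's OpenAI-compatible endpoint, with a rough cost estimate based on the published prices. The base URL and model identifier are assumptions drawn from Moonshot's platform conventions, not from this article.

```python
# API usage sketch (assumptions: base URL, model name; prices from the article).
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MOONSHOT_API_KEY",        # issued on the Moonshot AI platform
    base_url="https://api.moonshot.cn/v1",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="kimi-k2-0711-preview",  # assumed model identifier for Kimi K2
    messages=[{"role": "user", "content": "Summarize the MoE architecture in one paragraph."}],
    max_tokens=512,  # well within the 128K context window
)
print(response.choices[0].message.content)

# Rough cost estimate from the published prices:
# 4 yuan per 1M input tokens, 16 yuan per 1M output tokens.
usage = response.usage
cost_yuan = usage.prompt_tokens / 1e6 * 4 + usage.completion_tokens / 1e6 * 16
print(f"Approximate cost: {cost_yuan:.6f} yuan")
```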