DeepSeek, a leading domestic large model vendor, has announced a significant price cut, lowering the cache-hit input price across its entire API lineup to one tenth of the original. The move marks a new phase in cost control for domestic AI, aiming to attract more developers and enterprises with aggressive value for money.

Core Reductions Address Industry Pain Points

The adjustment covers the entire V4-Pro and V4-Flash series. The cache-hit input price for V4-Pro has been cut to 0.1 yuan per million tokens, and a limited-time promotion brings the effective price down to just 0.025 yuan. That cache-hit input price is roughly 1/700 of GPT-5.5Pro's, underscoring the series' competitiveness against overseas rivals.

Beyond cache hits, cache-miss input and output prices have also been cut to a quarter of their previous levels. The pricing strategy squarely targets high-frequency scenarios such as RAG knowledge bases, intelligent customer service, and document analysis, where it could cut enterprise operating costs by more than 90%.


DeepSeek attributes the room for these cuts to its self-developed sparse attention architecture. The technology supports context windows of up to 160k, improving long-text processing efficiency while reducing underlying compute consumption and storage costs.
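The article does not detail DeepSeek's architecture, so the sketch below is only a generic illustration of the sparse-attention idea, using a toy top-k variant: each query scores against all keys but aggregates values from only its k strongest matches, which is one way such designs reduce the work done per token on long contexts.

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k):
    """Toy top-k sparse attention (illustrative only, not DeepSeek's design):
    each query row keeps its k highest scores and masks the rest, so the
    softmax and value aggregation effectively involve k keys per query."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])            # (n_q, n_k) raw scores
    # Indices of everything EXCEPT each row's top-k scores.
    drop = np.argpartition(scores, -k, axis=-1)[:, :-k]
    np.put_along_axis(scores, drop, -np.inf, axis=-1)  # mask non-top-k entries
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over k survivors
    return weights @ V
```

Dense attention grows quadratically with context length; by bounding how many keys each query attends to, sparse variants keep long-context inference cheaper in both compute and KV-cache storage, which is the cost lever the article describes.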