Google has announced that Gemini 2.5 Flash-Lite is now generally available (GA). Google positions it as the fastest and most cost-effective model in the Gemini 2.5 lineup. Flash-Lite is designed to balance performance against cost, and it natively supports a context window of up to 1 million tokens.


Pricing is a headline feature: Gemini 2.5 Flash-Lite costs $0.10 per million input tokens and $0.40 per million output tokens, which is comparable to OpenAI's competing GPT-4.1 Nano. Audio input is also 40% cheaper than it was in the preview version, a change that reflects both user demand and competitive pressure.
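To make the quoted rates concrete, here is a minimal sketch that estimates the cost of a text workload. It uses only the two per-million-token prices stated above; the function name `estimate_cost_usd` is illustrative, and audio pricing is left out because the article gives only the percentage reduction, not the rate.

```python
# Rates quoted in the article, in US dollars per one million tokens.
INPUT_USD_PER_MILLION = 0.10
OUTPUT_USD_PER_MILLION = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a text-only workload at the quoted rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a job that reads 250,000 tokens and writes 50,000 tokens.
    print(f"${estimate_cost_usd(250_000, 50_000):.4f}")
```

Because output tokens cost four times as much as input tokens, workloads that generate long responses are dominated by the output side of this estimate.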

Across benchmarks covering coding, mathematics, reasoning, and multimodal understanding, Gemini 2.5 Flash-Lite outperforms its 2.0 predecessor. The model supports a 1-million-token context window and a controllable thinking budget, and it offers several native tools, including grounding with Google Search, code execution, and URL context.
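The thinking budget and native tools are set per request. The sketch below builds a JSON request body that enables them. The field names (`generationConfig.thinkingConfig.thinkingBudget`, `google_search`, `code_execution`, `url_context`) reflect the public Gemini REST API as best understood at the time of writing and are not taken from the announcement, so they should be checked against the current API reference. The helper `build_body` is hypothetical, and whether a particular combination of tools is accepted may depend on the model.

```python
import json


def build_body(
    prompt: str,
    thinking_budget: int = 0,
    use_search: bool = False,
    use_code_execution: bool = False,
    use_url_context: bool = False,
) -> dict:
    """Build a generateContent request body (assumed field names)."""
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        # Assumption: a budget of 0 turns thinking off; larger values
        # allow the model to spend more tokens reasoning.
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget}
        },
    }
    tools = []
    if use_search:
        tools.append({"google_search": {}})
    if use_code_execution:
        tools.append({"code_execution": {}})
    if use_url_context:
        tools.append({"url_context": {}})
    if tools:
        body["tools"] = tools
    return body


if __name__ == "__main__":
    example = build_body("Summarize today's AI news.", 512, use_search=True)
    print(json.dumps(example, indent=2))
```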

Developers can start using the model by setting the model name to gemini-2.5-flash-lite in their code. Note that the alias for the earlier preview version is scheduled for removal on August 25, so developers still using it should migrate to the GA model name as soon as possible.
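As an illustration of what specifying the model looks like, here is a minimal sketch that prepares an HTTP request for the Gemini REST API using only the Python standard library. Only the model name gemini-2.5-flash-lite comes from the announcement. The endpoint path, the `x-goog-api-key` header, and the helper `build_request` are assumptions based on the public API as commonly documented. The sketch constructs the request without sending it.

```python
import json
import urllib.request

MODEL_ID = "gemini-2.5-flash-lite"  # GA model name from the announcement
# Assumption: base URL of the public Gemini REST API.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a generateContent request."""
    url = f"{API_ROOT}/models/{MODEL_ID}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_request("Say hello in one sentence.", "YOUR_API_KEY")
    print(request.get_method(), request.full_url)
```

Sending the request, for example with `urllib.request.urlopen(request)`, requires a valid API key. Code that still points at the preview alias would only need its model name constant changed.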

With this release, Google continues to refine its AI lineup and gives developers a more efficient, lower-cost option. The model is likely to see use across a widening range of application scenarios.

Key points:

🌟 Gemini 2.5 Flash-Lite, which Google positions as its fastest and most cost-effective Gemini 2.5 model, is now generally available (GA).

💰 The model is priced at $0.10 per million input tokens and $0.40 per million output tokens, and audio input costs 40% less than in the preview version.

🔧 Developers can use the new version by setting the model name to gemini-2.5-flash-lite. The alias for the earlier preview version will be removed on August 25.