Google recently released an update to Gemini 2.5 Flash Native Audio, significantly enhancing the capabilities of its voice assistant. The new version is designed to handle complex workflows better, follow instructions more accurately, and make conversations feel more natural. According to Google, adherence to developer instructions has risen from 84% to 90%, a sign of real progress in the voice assistant's ability to understand and carry out requests.
The update also brings noticeable improvements in the quality of multi-step conversations. Interactions with the voice assistant should feel smoother, and the assistant adapts better to complex questions and tasks.
Google also revealed that the updated audio model achieved a function-calling accuracy of 71.5% on the ComplexFuncBench benchmark, compared to 66.5% for OpenAI's gpt-realtime. However, it should be noted that Google may not have tested against OpenAI's latest model.
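To put the quoted figures in perspective, the short sketch below separates the absolute gap (in percentage points) from the relative change. The helper name `compare` is made up for illustration, and the only inputs are the percentages cited in this article; nothing here comes from Google's own evaluation code.

```python
def compare(baseline: float, value: float) -> tuple[float, float]:
    """Return (gap in percentage points, gap relative to baseline in percent)."""
    points = value - baseline
    relative = points / baseline * 100
    return points, relative


# Figures as quoted in the article (percentages).
adherence = compare(84.0, 90.0)   # instruction adherence: previous vs. updated model
benchmark = compare(66.5, 71.5)   # ComplexFuncBench: gpt-realtime vs. updated Gemini model

for label, (points, relative) in [("Instruction adherence", adherence),
                                  ("ComplexFuncBench", benchmark)]:
    print(f"{label}: +{points:.1f} points ({relative:.1f}% relative)")
```

Under these numbers, the instruction-adherence gain is 6 percentage points (about 7.1% relative), and the benchmark lead over gpt-realtime is 5 points (about 7.5% relative). The second comparison is between two different models rather than two versions of the same one.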
This update is already available in Google AI Studio, Vertex AI, Gemini Live, and Search Live, and Google Cloud customers have started using this new technology. Developers can test the model through the Gemini API to further explore its potential.
Beyond the improved features, the update reflects Google's continued push to advance in artificial intelligence and offer users a better experience.
Key Points:
🌟 The updated voice assistant's adherence to developer instructions has improved from 84% to 90%.
📈 The new version achieved a function-calling accuracy of 71.5% on the ComplexFuncBench benchmark, compared to 66.5% for OpenAI's gpt-realtime.
💻 Developers can test the new model through the Gemini API and experience its enhanced features.
