In AI circles, Musk's pace keeps surprising people. On March 16 local time, his artificial intelligence startup xAI announced on a social platform that the text-to-speech (TTS) API for its large model Grok has officially launched.
This means developers can now build Grok's distinctive tone, including its slightly sarcastic and humorous style, into their own applications. From smart assistants to immersive podcast generation, Grok is no longer confined to text on a screen; it now has a real "voice".
As a key part of xAI's ecosystem strategy, the addition of voice marks Grok's evolution from a text-only interaction engine into a more human-like multimodal assistant. OpenAI's GPT-4o had earlier impressed the world with its remarkably fluid voice interaction, and Musk clearly does not want to fall behind in this "auditory competition".
Beyond the frequent API updates, competition in the large model industry has also grown increasingly intense. On 36Kr's 24-hour hot list, two topics remain near the top: the black-market business of "poisoning" large models, exposed by the 315 (March 15) consumer rights program, and the suspense over the still-unreleased DeepSeek V4. While the industry is still wrestling with data authenticity and model iteration speed, xAI has chosen to push aggressively on the interaction experience.
When your app starts echoing Grok's distinctively sharp remarks, that may become one of the most recognizable personal signatures of the AI era. With the release of the voice API, a competition over "whose AI sounds better and communicates better" is already in full swing.