Stability AI Launches AI Music Generation Tool 'Stable Audio'


AI music generation tool Udio has released version 1.5, which significantly improves sound quality and introduces key and pitch control for more precise music creation. The new model supports multiple languages, broadening its potential audience. Udio has also added product features, including dedicated creation pages, music segment downloads, audio-to-audio remixing, and shareable lyric videos. The update improves both music quality and the user experience for music creators.
July 10, 2024: Stability AI has announced that its user-friendly chatbot, Stable Assistant, has added two new features: Search and Replace, and music generation through Stable Audio. These features further expand the capabilities of Stable Assistant in image editing and creative production. New feature highlights: Search and Replace: Users can now specify an object in an uploaded image and seamlessly replace it with another. This feature is particularly useful
The robotics research team at Google DeepMind recently released a robotics project called RT-2. The project took 7 months to develop and is trained using a large model. RT-2 has capabilities such as symbol understanding, reasoning, and human recognition, and can reason about and complete tasks based on human instructions. By combining the large model with the robot's operational capabilities, RT-2 can accomplish tasks that involve logical leaps, such as going from 'extinct animals' to 'plastic dinosaurs'. The project performed well in various sub-category tests, with performance up to three times that of the previous generation of robot models. The result demonstrates the potential of large models in robotics research and is expected to drive future robot development.
Meta Intelligence OS is a startup founded by Peng Bo. It has developed a series of large models based on the open-source model RWKV and aims to become the Android of the large-model era. The RWKV model offers strong performance and low cost in inference tasks, which has attracted customers from industries such as finance, law, and smart hardware. The company's business model is model customization based on private data, along with internal AI Agent development. It hopes to solve the problems of API call latency and data security by deploying large models on terminal devices. Currently, RWKV versions are available for Windows, Mac, and Linux computers, and Android and iOS versions are in development. Meta Intelligence OS is raising funds and collaborating with chip companies and computing power platforms to create benchmark customers. Luo Xuan said that the decisive battlefield for large models is hardware, and that both terminal devices and the cloud require dedicated chips.
Imagine being able to see a person speaking, moving, and even performing, all from just a single photo within seconds. This is the allure of OmniHuman-1, recently launched by ByteDance. This artificial intelligence model, which has gone viral online, can create highly realistic videos that bring static images to life. Combined with audio clips, it achieves lip synchronization, full-body movements, and rich facial expressions. Unlike traditional deepfake technologies, OmniHuman-1 is not limited to facial replacement; it can animate the entire person.