AI Daily: Hedra Launches Free Text-to-Speech Video; DeepMind Unveils Advanced Auto Video Dubbing Tech V2A; Meitu's WHEE V2 Goes Live; Open-Source Sora Enables One-Click 720p HD Video Generation

Welcome to the AI Daily section! Here, you'll find your daily guide to exploring the world of artificial intelligence. Each day, we bring you the hottest topics in the AI field, focusing on developers to help you understand technological trends and discover innovative AI product applications.

Explore fresh AI products. Click to learn more: https://top.aibase.com/

1. Hedra's Character-1 Now Available

Hedra's Character-1 is now open for use. It lets creators generate videos of talking and singing characters from text and images, lowering the barrier to video creation.

AiBase Summary:

⭐️ Dynamic Video Generation: Upload photos and audio to make characters talk or sing vividly.

⭐️ Multi-platform Compatibility: Works seamlessly on both desktop and mobile devices.

⭐️ High-Quality Assurance: Synchronized expressions, gestures, and voices for realistic results.

Details Link: https://top.aibase.com/tool/hedra

2. DeepMind's V2A Technology: Automated Video to Audio Conversion

Google DeepMind has introduced V2A (video-to-audio) technology, which generates rich audio tracks from video pixels and text prompts so that sound and picture are created in sync. Users can steer the audio output with text descriptions. The system uses autoregressive and diffusion methods to keep the audio synchronized with the video content, and AI-generated annotations are used in training to help the model learn the relationship between audio events and visual scenes. Lip-sync remains a challenge, and V2A will be rigorously tested before any public release.

AiBase Summary:

🔊 Automated Video to Audio Conversion

🎶 Generates rich audio tracks from video pixels and text prompts

🤖 Uses AI-generated annotations during training

Details Link: https://top.aibase.com/tool/deepmind-v2a

3. Bilibili Open-Sources Lightweight AI Language Model Index-1.9B

Bilibili's newly open-sourced Index-1.9B model has attracted significant attention. The release includes a base model, a control model, and a dialogue model. With 1.9 billion non-embedding parameters, it leads models in its class on multiple evaluation benchmarks.

AiBase Summary:

🔍 Index-1.9B base: The base model has 1.9 billion non-embedding parameters, pre-trained on 2.8T of Chinese and English corpora, leading in its class.

🔍 Index-1.9B pure: The control model is identical to the base model except that instruction-related data is filtered out of its training corpus, to measure the effect of that data on benchmark scores.

🔍 Index-1.9B chat: The dialogue model is built on the base model and aligned through SFT and DPO; internet community corpora were added to make its conversations more entertaining.
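
For readers unfamiliar with DPO (Direct Preference Optimization), the alignment step named in the chat model's summary, here is a minimal sketch of the standard DPO loss for a single preference pair. It is not taken from Bilibili's training code; the function name `dpo_loss`, the example log-probabilities, and the `beta` value are assumptions chosen for illustration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    either the model being trained (policy) or the frozen reference model.
    beta is a hypothetical setting, not a value from the Index-1.9B report.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# Illustrative numbers only: the policy has shifted toward the chosen reply.
loss = dpo_loss(-8.0, -14.0, -10.0, -12.0)
print(f"DPO loss: {loss:.3f}")
```

The loss falls as the policy raises the probability of the preferred reply relative to the reference model, and rises when it favors the rejected one.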

Details Link: https://top.aibase.com/tool/index-1-9b

4. Meitu's WHEE V2 Officially Launched

Meitu has launched WHEE V2, its new AI-powered image editor, which aims to provide a convenient, efficient one-stop editing experience. New AI painting and AI editing functions give users more editing options and support creative work with many types of material. Intelligent selection and prompt functions make modifications look natural, and the editor supports custom image sizes, custom layer contents, and several forms of image expansion. Other features include visual multi-layer management, precise semantic recognition, diverse styles, and detailed control.

AiBase Summary:

✨ New AI painting and AI editing functions enrich user choices, supporting creative presentation of various materials.

💡 Intelligent selection and prompt functions facilitate natural modifications, supporting custom image sizes and layer contents.

🎨 Features visual multi-layer management, precise semantic recognition, diverse styles, and detailed control.

5. Luchen Open-Sora Team Achieves Breakthrough in 720p HD Video Quality and Generation Duration

The Luchen Open-Sora team has made a significant advance in 720p HD video quality and generation duration. Their open-source project simplifies video generation and has been warmly received by the community. Lambda Labs, an NVIDIA-backed AI company, has also built a digital LEGO universe on the Open-Sora model weights, opening up new creative possibilities. The technical report covers the core details of model training and addresses the pain points of training video models, improving generation quality and speed.

AiBase Summary:

⚙️ Open-Sora team achieves breakthrough in 720p HD video quality and generation duration, simplifying video generation process.

🌟 Lambda Labs creates a digital LEGO universe based on Open-Sora model weights, offering limitless creativity.

🔬 Technical report reveals core details of model training, addressing pain points and enhancing quality and speed.

Details Link: https://github.com/hpcaitech/Open-Sora

6. Baidu's Xiling Digital Human Platform Upgrades to Support Text-to-3D Digital Humans and Voice Cloning

Baidu Intelligent Cloud's Xiling Digital Human Platform is set for a major upgrade, offering efficient, low-cost 2D/3D digital human generation and covering live streaming, short videos, dialogue, and other scenarios. The platform generates realistic digital humans quickly and accurately, opening new IP creation possibilities for enterprises, cultural tourism, entertainment, and other fields.

AiBase Summary:

🌟 Efficient and low-cost 2D/3D digital human generation, enhancing user experience.

🎨 Quickly and accurately generates realistic digital humans, offering IP creation possibilities in multiple fields.

🔊 Provides voice cloning functions, generating custom voices for digital human broadcasting and content production.

7. Meta Releases Multiple Models: Multimodal Model Chameleon, Text-to-Music Model JASCO, Audio Watermarking Technology AudioSeal

Meta has recently released several research results, including the multimodal model Chameleon, the text-to-music model JASCO, and the audio watermarking technology AudioSeal, opening up new application prospects for AI.

AiBase Summary:

🌟 Meta releases the multimodal model Chameleon, supporting mixed text and image input-output, offering new solutions.

🎶 New language model training method Multi-Token Prediction improves model capabilities and training efficiency.

🔊 Text-to-music model JASCO can accept various conditional inputs, providing better and more flexible music control.

Details Link: https://top.aibase.com/tool/meta-chameleon

8. Google Introduces Alphabet Generator GenType for Creating Cover Art Fonts

GenType is an experimental Google product driven by the Imagen 2 model. It lets users create personalized letterforms for various content and is especially suited to titles and cover art. The interface is simple and intuitive, so users can get started quickly. Users can save and share the alphabet images they generate and browse other users' works in the online gallery for inspiration.

AiBase Summary:

🎨 Personalized Letter Creation: Users can input any prompt, and GenType transforms it into a unique alphabet, showcasing personal creativity.

🖌 Art Creation Tool: GenType is not just a generator but a tool for creating letter art in a wide range of styles.

📷 Share and Save: Offers convenient sharing and saving options, allowing users to save alphabets as PNG format images and share on social media.

Details Link: https://top.aibase.com/tool/gentype

9. NVIDIA Surpasses Microsoft as the World's Most Valuable Company

NVIDIA's stock has soared, and the company has surpassed Microsoft, Apple, and Google to become the world's most valuable company. NVIDIA plans to launch its new Blackwell GPU architecture, which the CEO says will be the world's most powerful chip, and to release new AI chips annually. Its stock price has risen 160% in 2024, and its market cap stands at $3.335 trillion.

AiBase Summary:

📈 NVIDIA surpasses Microsoft, Apple, and Google to become the world's most valuable company.

💻 NVIDIA plans to launch the Blackwell GPU architecture, which the CEO claims will be the world's most powerful chip, and to release new AI chips annually.

💰 NVIDIA's stock price has risen 160% in 2024, with a market cap of $3.335 trillion.
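
To put the two figures above in perspective, the short calculation below backs out the market cap implied at the start of 2024. It is a rough sketch only: it assumes the share count stayed constant, so that market cap moved in step with the stock price, and the helper name `implied_start_value` is made up for this example.

```python
def implied_start_value(current_value, pct_gain):
    """Value before a gain of pct_gain percent, assuming nothing else changed."""
    return current_value / (1 + pct_gain / 100)

market_cap_now = 3.335  # trillion USD, as reported above
ytd_gain_pct = 160      # percent rise in 2024, as reported above

# Assumption: constant share count, so market cap scales with the stock price.
start_cap = implied_start_value(market_cap_now, ytd_gain_pct)
print(f"Implied start-of-2024 market cap: ${start_cap:.2f} trillion")
# prints: Implied start-of-2024 market cap: $1.28 trillion
```

The estimate will differ from reported start-of-year figures to the extent that the share count or the quoted percentage changed.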

10. Apple Announces New AI Training for Developers Following AI Feature Launch

Apple has announced new AI training courses for students, mentors, and alumni of its developer academies. The move reflects Apple's growing openness about, and emphasis on, AI technology.

AiBase Summary:

🍎 Apple introduces new AI training courses, focusing on cultivating students' professional programming skills.

📚 New courses will teach how to build, train, and deploy machine learning models on Apple devices.

💡 Apple's AI tools will be integrated into multiple platforms, including Xcode, helping developers code more intelligently.

11. Luma AI's Dream Machine Generates Works Accused of Plagiarizing Disney IP

Luma's Dream Machine video generation tool has raised questions about model transparency and data sources, particularly after its output was accused of plagiarizing Disney works. The accusations have sparked concerns about the lack of transparency in such models.

AiBase Summary:

🔍 Questions raised about model transparency and data sources, including whether output was created in the style of Disney.

🚫 Characters in the video accused of plagiarizing Disney Pixar works, sparking controversy.

💡 Dream Machine touted as the future of film production, offering high-quality, realistic shot creation.

12. AI Painter Caught Taking Orders, XHS Blogger's "AI Detection" Video Gets 29K Likes
