Tencent HunYuan Open-Sources Customizable Image Generation Plugin InstantCharacter

Tencent HunYuan has open-sourced InstantCharacter, a customizable image generation plugin that is compatible with the open-source text-to-image model Flux. The release is positioned as a significant advance in character consistency and image generation accuracy, giving content creators more efficient and flexible tools.

InstantCharacter's core advantage is that it keeps a character consistent and realistic across different scenes while maintaining high image quality, precision, and flexible text-based editing. Users can place a character in any desired pose with a simple prompt. For example, given just one image and a description such as "a rabbit in the kitchen drinking soup with a spoon," it generates the corresponding image. This capability matters most in multi-round text-to-image scenarios, where keeping a character consistent from one generation to the next is the main challenge.


Technically, InstantCharacter is a novel framework built on the Diffusion Transformer (DiT) architecture. It introduces a scalable adapter that uses multiple transformer encoders to process open-domain character features and feed them into the latent space of modern diffusion transformers. This design lets the system adapt to different character traits while maintaining high consistency.
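The idea of character features interacting with the latent space can be illustrated with a toy cross-attention step. This is a minimal sketch and not the InstantCharacter implementation: the function `cross_attend`, the two "encoder" outputs, and the 2-dimensional vectors are all hypothetical, and the learned projection layers a real adapter would have are omitted.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(latents, char_tokens, scale=1.0):
    """Each latent token (query) attends over character tokens (used as keys and values)."""
    dim = len(latents[0])
    out = []
    for q in latents:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in char_tokens]
        weights = softmax(scores)
        context = [sum(w * v[i] for w, v in zip(weights, char_tokens)) for i in range(dim)]
        # Residual injection: keep the base latent and add the character context.
        out.append([x + scale * c for x, c in zip(q, context)])
    return out

# Hypothetical outputs of two image encoders for the same reference character.
encoder_a_tokens = [[1.0, 0.0], [0.0, 1.0]]
encoder_b_tokens = [[0.5, 0.5]]
char_tokens = encoder_a_tokens + encoder_b_tokens  # merge features from several encoders

latents = [[1.0, 0.0], [0.0, 0.0]]  # stand-in for diffusion transformer latent tokens
conditioned = cross_attend(latents, char_tokens)
```

Setting `scale=0.0` returns the latents unchanged, which shows how an adapter of this kind can leave the base model untouched while adding character conditioning on top.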

To effectively train this framework, the Tencent HunYuan team constructed a large-scale character dataset containing tens of millions of samples. The dataset is systematically organized into paired (multi-view characters) and unpaired (text-image combinations) subsets, enabling simultaneous optimization of identity consistency and text editability through different learning pathways. This dual data structure design further enhances the model's generalization ability and image quality.
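One way to picture a dataset split into paired and unpaired subsets is a sampler that draws from both when building training batches. The sketch below is illustrative only: the function `mixed_batches`, the `paired_fraction` ratio, the file names, and the choice to reuse the same image as reference and target for unpaired samples are assumptions, not details given in the article.

```python
import random

def mixed_batches(paired, unpaired, batch_size, paired_fraction, seed=0):
    """Yield batches that mix paired (multi-view) and unpaired (text-image) samples."""
    rng = random.Random(seed)
    while True:
        batch = []
        for _ in range(batch_size):
            if rng.random() < paired_fraction:
                views, caption = rng.choice(paired)
                # Two different views of the same character.
                reference, target = rng.sample(views, 2)
                subset = "paired"
            else:
                image, caption = rng.choice(unpaired)
                # Assumption: the single image serves as both reference and target.
                reference, target = image, image
                subset = "unpaired"
            batch.append({"subset": subset, "reference": reference,
                          "target": target, "caption": caption})
        yield batch

# Hypothetical toy data standing in for the two subsets.
paired_data = [(["rabbit_front.png", "rabbit_side.png", "rabbit_back.png"],
                "a rabbit in the kitchen")]
unpaired_data = [("robot.png", "a robot on a beach")]

batch = next(mixed_batches(paired_data, unpaired_data, batch_size=8, paired_fraction=0.5))
```

Each sample carries a `subset` tag, so a training loop could route paired and unpaired samples to different loss terms if it needed to.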

In practical evaluations, InstantCharacter's performance rivals that of leading models such as GPT-4o. It can handle images of various styles and complexities, making it suitable for applications such as comic creation and film production. With it, content creators can maintain high character consistency and produce the visual works they need more efficiently.

- Project Website: https://instantcharacter.github.io/

- Code: https://github.com/Tencent/InstantCharacter

- Hugging Face Demo: https://huggingface.co/spaces/InstantX/InstantCharacter

- Paper: https://arxiv.org/abs/2504.12395

Tencent Hunyuan New Technology Makes Large Models 'Less Oily' to Make AI-Generated Images More Realistic!

Recently, the Tencent Hunyuan team announced its latest research on its official WeChat account: SRPO (Semantic Relative Preference Optimization), which aims to improve the realism of AI-generated images, in particular the 'oily' skin texture produced by the open-source text-to-image model Flux. The technique is expected to bring major changes to the image generation field. As digital art becomes increasingly popular, the quality of AI-generated images matters more than ever. The Flux model, as a popular foundation model in the open-source text-to-image community, is often criticized for its

X Launches Aurora Image Generator, Mysteriously Goes Offline Hours Later

The social platform X, owned by Elon Musk, recently added a new image generator named Aurora to its AI assistant Grok. However, the feature disappeared from some users' interfaces just a few hours after launch, drawing public interest. Like Flux, X's first image generator, launched in October, Aurora has almost no generation limits and can be accessed through the Grok tab on both the mobile app and the website. It can generate images featuring public figures and copyrighted content.

The Mysterious "Blueberry" Model Emerges: A New Dominator in the AI Image Generation Realm or Just a Marketing Gimmick?

The image generation field has a striking new arrival. A mysterious model named "Blueberry" has suddenly appeared and quickly risen to the top, drawing widespread attention and discussion in the industry. It has outperformed well-known competitors, including OpenAI's "Strawberry," Flux.1, Ideogram v2, and Midjourney v6.1, to become the new champion in image generation. However, the true identity of "Blueberry" remains unclear.

Opening the Book! Midjourney Fully Opens Its Website, Offering All Users 25 Free Images Daily

Midjourney, an early leader in the AI text-to-image generation field, has recently faced challenges from new competitors like xAI, Grok2, and Ideogram2. In response to this competitive pressure, Midjourney announced that its official website is now open to all users, offering 25 free AI image generations per day. The move aims to expand its user base. The new website supports registration with Google and Discord accounts, which keeps the sign-up process simple. Users can access it through the top of the website.

AI Surveillance Footage Stuns Netizens! Musk Caught Shoplifting at Supermarket? Grok+Flux=Unlimited Prank

Tech mogul Elon Musk has become the target of a wave of AI-generated internet pranks. The AI 'golden duo' Grok and Flux have produced a series of exaggerated, absurd scenarios, including Musk 'shoplifting' at a supermarket, interacting with Trump and Obama, and having bizarre encounters with various other figures. Musk's reaction has been surprisingly calm: he views the incident as a transitional phase in technology development and emphasizes the need to stay reasonably mindful of safety while pursuing fun. Grok is not only skilled in pranks but also capable of...