Who says "small models" can't beat "giants"? Recently, Apple demonstrated exactly that with a new study.
This time, Apple is targeting UI (user interface) development, a headache for all developers.
Although AI-generated code is powerful, it often performs poorly in UI design. The reason is simple: traditional reinforcement learning from human feedback (RLHF) is too crude. Previously, the AI learned design by hearing, in effect, that "this interface isn't good," but it had no idea why, or how to improve it.
To train an AI with "on-point aesthetics," Apple brought in 21 senior external experts.
These experienced design professionals didn't just rate designs. They rolled up their sleeves and got involved: writing comments, drawing sketches, and modifying code directly.
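To make the idea of this richer feedback concrete, here is a minimal sketch, purely illustrative and not from the paper, of what one expert-feedback record might look like: a written comment, a reference to a hand-drawn sketch, and a concrete code edit. All field names and values are assumptions.

```python
from dataclasses import dataclass

@dataclass
class ExpertFeedback:
    comment: str       # written critique from the designer
    sketch_path: str   # hand-drawn annotation over the rendered UI (hypothetical field)
    code_before: str   # UI code the model originally produced
    code_after: str    # designer-edited version showing the intended fix

# Invented example record, for illustration only
example = ExpertFeedback(
    comment="Primary button is lost against the background; increase contrast and spacing.",
    sketch_path="feedback/sketch_017.png",
    code_before='<button class="btn">Submit</button>',
    code_after='<button class="btn btn-primary mt-4">Submit</button>',
)
```

The point of such a record is that, unlike a bare thumbs-down, it tells the model both what is wrong and what the fix looks like.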
The result was astonishing: the fine-tuned Qwen3-Coder beat GPT-5.
Experimental data showed that, after fine-tuning on just 181 high-quality pieces of "sketch feedback," this model, which is not particularly large in parameter count, surpassed GPT-5 in UI generation.
The study also revealed a painful truth: aesthetics are indeed subjective.
The research found that laypeople and professional designers agreed on whether a UI was good only 49.2% of the time, essentially a coin flip. But when designers expressed specific revision intent through "sketches," agreement jumped to 76.1%. This means future AI design tools will no longer have to blindly guess your preferences; they can actually understand your visual language.
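For readers unfamiliar with how such a figure is computed, here is a minimal sketch of a pairwise agreement rate: the fraction of UI samples on which two raters gave the same good/bad verdict. The data below are invented placeholders, not the study's actual ratings.

```python
def agreement_rate(judgments_a, judgments_b):
    """Fraction of items where both raters gave the same verdict."""
    assert len(judgments_a) == len(judgments_b)
    matches = sum(a == b for a, b in zip(judgments_a, judgments_b))
    return matches / len(judgments_a)

# Toy example: a layperson and a designer rate six UIs
layperson = ["good", "bad", "good", "good", "bad", "bad"]
designer  = ["bad", "bad", "good", "bad", "bad", "good"]
print(f"Agreement: {agreement_rate(layperson, designer):.1%}")  # 50.0% on this toy data
```

An agreement rate near 50% on a binary judgment means the two groups are, in effect, not sharing a standard at all, which is exactly what made the sketch-based feedback so valuable.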
If
