
Apple's Open-Source AI Model Beats GPT-5 at UI Design Using Designer Feedback

February 6, 2026


Apple researchers have demonstrated that a fine-tuned open-source model can outperform GPT-5 at generating user interfaces. By collecting feedback from professional designers through natural workflows like sketching and direct editing, rather than simple rankings, the team produced higher-quality training data that enabled a smaller model to surpass its larger proprietary rival.

A New Approach to AI-Generated Interfaces

Apple has published research showing that professional designers can dramatically improve AI-generated user interfaces by providing feedback through their natural creative workflows, rather than the conventional rating systems typically used in AI training.

The study, titled "Improving User Interface Generation Models from Designer Feedback" and published on Apple's machine learning research portal, recruited 21 professional designers with between two and more than 30 years of experience. Participants critiqued AI-generated interfaces using four feedback methods: pairwise ranking, natural-language commenting, visually grounded sketching, and direct revision in design software.

Sketches and Edits Beat Rankings

The results were striking. When independent evaluators assessed the quality of the feedback, they agreed with designers' direct revisions 76 percent of the time and with sketch-based improvements 64 percent of the time. By contrast, traditional pairwise rankings achieved just 49 percent agreement, essentially no better than a coin flip.

The designers generated 1,460 annotations in total, which were converted into preference data for training.
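Preference data of this kind typically pairs a preferred output against a dispreferred one for the same prompt. The sketch below shows one plausible way to build such pairs from designer annotations; the field names and record structure are illustrative assumptions, not Apple's actual data schema.

```python
# Hypothetical sketch: converting designer annotations into preference pairs.
# Assumption: each annotation records the prompt, the model's original UI code,
# and the designer's improved version (from a sketch or direct revision).

def to_preference_pairs(annotations):
    """Pair the designer-improved UI code (chosen) against the
    original model output (rejected) for preference training."""
    pairs = []
    for ann in annotations:
        pairs.append({
            "prompt": ann["prompt"],           # UI description given to the model
            "chosen": ann["revised_code"],     # designer's improved version
            "rejected": ann["original_code"],  # raw model output
        })
    return pairs

# Usage with one illustrative annotation:
annotations = [{
    "prompt": "A login screen with email and password fields",
    "original_code": "<VStack><TextField/><TextField/></VStack>",
    "revised_code": "<VStack spacing=8><TextField/><SecureField/></VStack>",
}]
pairs = to_preference_pairs(annotations)
```

With 1,460 annotations, this yields a preference dataset small by pre-training standards but, per the study, rich enough to shift model behaviour.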

Small Model, Big Results

Using Odds Ratio Preference Optimization (ORPO), Apple's team fine-tuned the open-source Qwen2.5-Coder model on this designer feedback. In human evaluations, the resulting model outperformed all tested baselines, including OpenAI's proprietary GPT-5, while using only 181 sketch annotations.
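ORPO combines a standard fine-tuning loss on the preferred output with an odds-ratio penalty that pushes the model away from the rejected output. The sketch below illustrates that penalty with sequence-level probabilities for clarity; a real implementation works on token log-probabilities inside the training loop, and the weighting hyperparameter shown here is an assumption.

```python
import math

def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)

def orpo_penalty(p_chosen, p_rejected):
    """Odds-ratio term: -log sigmoid(log odds(chosen) - log odds(rejected)).
    Small when the model already prefers the chosen output, large otherwise."""
    log_odds_ratio = math.log(odds(p_chosen)) - math.log(odds(p_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-log_odds_ratio)))

def orpo_loss(nll_chosen, p_chosen, p_rejected, lam=0.1):
    """Total ORPO objective: the usual negative log-likelihood on the
    preferred output plus a weighted odds-ratio penalty (lam is a
    hyperparameter, illustrative value here)."""
    return nll_chosen + lam * orpo_penalty(p_chosen, p_rejected)

# A model that already favours the designer-revised output is penalised less:
low = orpo_penalty(0.8, 0.2)   # chosen far more likely than rejected
high = orpo_penalty(0.2, 0.8)  # chosen far less likely than rejected
```

Because the penalty attaches to the ordinary fine-tuning loss rather than requiring a separate reference model, ORPO is comparatively cheap to run, which suits a small dataset like the 181 sketch annotations used here.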

The research builds on Apple's earlier UICoder project, which used automated tools like code compilers and vision-language models to improve AI-generated interfaces. This new work demonstrates that even small amounts of high-quality expert feedback can enable smaller open-source models to exceed larger proprietary systems.

Implications for AI Training

The findings challenge prevailing approaches to AI alignment. Apple's researchers argue that simple like-or-dislike feedback methods fail to capture the rich rationale and tacit domain knowledge that experts bring to their craft. The team has released the trained models publicly on GitHub under the repository name ml-rldf.

