Tencent Hunyuan Releases PhoneBuddy Phone Agent Model


Tencent Hunyuan has released PhoneBuddy, an open line of 4B-parameter agent models for controlling phones, alongside a technical paper and code on GitHub. The models are built on a Qwen3.5-4B backbone, and the weights went up on Hugging Face earlier in June.

The pitch is a training recipe, not just a model. PhoneBuddy runs reinforcement learning across two environments: real apps on physical devices, plus PhoneWorld, a mock-app setup that reconstructs runnable Android apps from real GUI traces so tasks can be reset and auto-verified. On a 150-task human evaluation across single apps, WeChat mini-apps, and cross-app workflows, task success climbs from 36.67% after supervised fine-tuning to 40.67% with real-app RL, then to 45.33% once mock-app training is mixed in. On AndroidWorld, the same progression goes from 60.3% to 77.2% to 83.2%.
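To put the human-evaluation percentages in concrete terms, here is a minimal Python sketch that converts them into implied task counts out of 150 and prints the gain at each stage. The percentages and the 150-task total come from the figures above; the whole-task counts are inferred by rounding and are not numbers Tencent reported, and the stage labels and the `implied_successes` helper are illustrative names.

```python
# Reported success rates (percent) on the 150-task human evaluation.
# The stage labels are shorthand chosen for this sketch.
TOTAL_TASKS = 150

reported = {
    "SFT": 36.67,
    "SFT + real-app RL": 40.67,
    "SFT + real-app RL + mock-app RL": 45.33,
}


def implied_successes(percent: float, total: int = TOTAL_TASKS) -> int:
    """Nearest whole number of tasks consistent with a reported percentage."""
    return round(percent / 100 * total)


# Inferred counts: an assumption based on rounding, not a reported figure.
counts = {stage: implied_successes(p) for stage, p in reported.items()}

previous = None
for stage, percent in reported.items():
    gain = "" if previous is None else f" (+{percent - previous:.2f} points)"
    print(f"{stage}: {percent:.2f}% ~ {counts[stage]}/{TOTAL_TASKS} tasks{gain}")
    previous = percent
```

Under that reading, the three stages correspond to 55, 61, and 68 successful tasks out of 150, so each RL stage adds only six or seven tasks on this evaluation.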

All numbers are Tencent's own, and the headline comparisons are worth reading closely. The team says the final model beats GPT-5.4 and Seed 2.0 Pro on the four-setting average (54.8 vs 48.2 and 51.4) but sits under Gemini 3.1 Pro at 59.1. Cross-app tasks are where it stalls: success actually drops to 18% after the full recipe, worse than the SFT baseline, because the mock task pool is mostly single-app.

The reward setup leans on other models too. Real-app rollouts were scored using Gemini-3.1-Pro-Preview to write rubrics and a 122B Qwen model to grade them, which is a lot of external judging, some of it proprietary, behind an "open" result. The cross-app gap is the number to watch next.

Bottom Line

PhoneBuddy's 4B model hits 83.2% on AndroidWorld and beats GPT-5.4 on Tencent's average, but cross-app success sits at just 18%.

Quick Facts

Model: PhoneBuddy, 4B parameters, Qwen3.5-4B backbone
Real-phone eval: 36.67% to 45.33% success (company-reported)
AndroidWorld: 60.3% to 83.2% (company-reported)
Four-setting average: 54.8, above GPT-5.4 (48.2), below Gemini 3.1 Pro (59.1)
Cross-app tasks drop to 18% after full training

Tags: Tencent, Hunyuan, AI agents, phone-use agents, open models, AndroidWorld, reinforcement learning

Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.
