I specialize in building and studying MLLM-based Graphical User Interface (GUI) Agents. My work involves both their practical development and evaluation, but my core research passion is to elevate their cognitive capabilities toward human-like reasoning and generalization. To this end, my research also extends to synergistic fields such as RL4LLM, LLM Reasoning, and Tool-Use Agents, which I consider crucial for pushing the boundaries of what GUI agents can achieve.

Currently a third-year undergraduate student in Communication Engineering at Xidian University and Heriot-Watt University, I am actively seeking PhD and internship opportunities to contribute to advancing this exciting field.

Curriculum Vitae

📝 Publications

AAAI 2026 (Under Review)

You Don’t Know Until You Click: Automated GUI Testing for Production-Ready Software Evaluation

Yutong Bian*, Xianhao Lin, Yupeng Xie et al.

Project

  • As the lead for AppEvalPilot, I designed and implemented a system to dynamically assess software functionality through UI interaction, moving beyond the limitations of static analysis for LLM-based software engineers.
  • My work involved creating automated test case generation and a test execution agent capable of complex GUI interactions.
  • Experiments demonstrated a high correlation (0.91) between AppEvalPilot’s assessments and those of human experts, while also being 55% faster and 94.8% cheaper.

📖 Education

  • 2021.09 - 2026.06, B.Eng. in Communication Engineering, Xidian University, China & Heriot-Watt University, UK.
    • GPA: 3.8/4.0

💻 Internships

🚀 Projects

  • OSAgent: Cross-platform Intelligent Assistant (Sep. 2024 - May 2025)
    • Focused on developing a universal, stable, and efficient GUI agent framework.
    • Contributed to the architecture design, perception, planning, and execution modules.
    • Achieved state-of-the-art performance on SpaBench (mobile) cross-application tasks (26.7% vs. 13.3% by the previous SOTA).
  • R1-Like GUI Agent Training (Apr. 2024 - May 2025)
    • Focused on enhancing the core element grounding capability of GUI agents using GRPO.
    • Designed a data collection and refinement pipeline and a multi-component reward function.
    • Demonstrated that training on just 1k meticulously selected samples achieves performance comparable to SOTA models trained on millions of samples, significantly improving GUI grounding accuracy on benchmarks such as ScreenSpot and ScreenSpot-Pro.