University of Reims Champagne Ardenne
ASSISTGUI Benchmark for Desktop GUI Automation
Pages
14
Time to read
49 mins
Publication
Language
English
This technical report introduces ASSISTGUI, a benchmark for evaluating how well models can automate tasks on desktop graphical user interfaces (GUIs). It outlines the challenges of desktop automation, particularly in comparison with mobile and web applications. The benchmark consists of 100 tasks drawn from nine widely used software applications, including After Effects and MS Word; each task is accompanied by instructional videos and project files to support evaluation. The authors propose an Actor-Critic Embodied Agent framework that combines a GUI parser with a reasoning mechanism designed to handle complex procedural tasks. Experiments show that the proposed methods are promising but leave significant room for improvement: the best-performing model achieved a success rate of only 46%. The report concludes with an analysis of the limitations of current methods and suggests future directions for desktop GUI automation.
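To make the headline metric concrete, the following is a minimal sketch of how a per-task success rate could be computed for a benchmark of this kind. The record layout (`TaskResult`), the function name `success_rate`, and the sample outcomes are all hypothetical and chosen for illustration; the summary does not describe the report's actual evaluation code or per-task results.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """Outcome of one benchmark task (hypothetical record layout)."""
    application: str   # e.g. "After Effects" or "MS Word"
    task_id: str       # illustrative identifier, not taken from the report
    succeeded: bool    # whether the agent's final output passed the check


def success_rate(results: list[TaskResult]) -> float:
    """Return the share of successful tasks as a percentage."""
    if not results:
        raise ValueError("no results to score")
    passed = sum(1 for r in results if r.succeeded)
    return 100.0 * passed / len(results)


# Placeholder outcomes for illustration only; these are not the report's data.
demo = [
    TaskResult("After Effects", "demo-1", True),
    TaskResult("After Effects", "demo-2", False),
    TaskResult("MS Word", "demo-3", True),
    TaskResult("MS Word", "demo-4", False),
]
print(success_rate(demo))  # prints 50.0
```

Under this definition, a 46% success rate on a 100-task benchmark would correspond to 46 tasks judged successful, assuming every task counts equally.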