arXiv AI recent: Trust the Right Teacher: Quality-Aware Self-Distillation for GUI Grounding
The authors propose a quality-aware self-distillation method for vision‑language models (VLMs) applied to graphical user interface (GUI) grounding. The method introduces soft correctness‑aware g...
GUI grounding requires VLMs to locate small target elements in high‑resolution screenshots and predict precise screen coordinates. On‑policy self‑distillation (OPSD) can provide dense token‑level teacher signals, but naive OPSD can produce unreliable signals when the student‑generated prefix devi...
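To make the idea of a dense token-level teacher signal concrete, the following is a minimal sketch of a generic per-token distillation loss computed over a student-generated sequence. It is not the authors' quality-aware method, whose details are not given in this summary. The function name `token_level_kl`, the plain-list logit format, and the choice of forward KL (teacher relative to student) are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_level_kl(student_logits, teacher_logits):
    """Return one KL(teacher || student) value per token position.

    Both arguments are lists of per-position logit vectors, scored on the
    same student-generated token sequence (hypothetical input format).
    The KL direction is an illustrative assumption.
    """
    losses = []
    for s_row, t_row in zip(student_logits, teacher_logits):
        p_s = softmax(s_row)
        p_t = softmax(t_row)
        # Sum over the vocabulary; skip entries where the teacher assigns zero mass.
        kl = sum(
            pt * (math.log(pt) - math.log(ps))
            for pt, ps in zip(p_t, p_s)
            if pt > 0
        )
        losses.append(kl)
    return losses


# Two positions over a three-token vocabulary.
student = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
teacher = [[2.0, 0.0, 0.0], [0.0, 0.0, 3.0]]
per_token = token_level_kl(student, teacher)
```

In this sketch every position contributes its own loss term, which is what makes the signal dense. The first position, where teacher and student agree, contributes zero; the second, where they disagree, contributes a positive value regardless of whether the teacher is actually right at that position.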