arXiv AI recent: WorldLines: Benchmarking and Modeling Long-Horizon Stateful Embodied Agents
The authors introduced WorldLines, a benchmark for long‑horizon embodied household assistance that creates temporally extended household traces with dialogues, actions, execution feedback...
WorldLines constructs temporally extended household traces that include dialogues, actions, execution feedback, and changes to object and device states, and it generates evidence‑linked samples for two tasks: Memory Question Answering and Embodied Task Planning. ObsMem is described as an observer...