Subjecthood desk method note: We report the discourse. We do not assert AI systems are or are not conscious. We label position families.

arXiv AI recent: Dissecting model behavior through agent trajectories

2026-06-17 arxiv.org

Researchers built a customizable harness, 'Simple Strands Agent' (SSA), to study the 'intent-execution gap' in AI agents. They used it to analyze 138k trajectories to...

The authors define the 'intent-execution gap' as the mismatch between a model's intentions and the harness's execution. Using SSA, the researchers reproduced or improved pass@1 performance on benchmarks including SWE-Pro, SWE-Verified, and Terminal-Bench-2 across model families such as Claude, Ge...
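
For readers unfamiliar with the metric, pass@1 is the share of benchmark tasks solved on a single attempt. The sketch below is a minimal illustration, not the authors' evaluation code: the record layout (a task id paired with a pass/fail flag), the function name pass_at_1, and the sample data are all hypothetical. When a task has several recorded attempts, the sketch averages them, which is the usual pass@1 estimate c/n for c passes out of n attempts.

```python
from collections import defaultdict


def pass_at_1(trajectories):
    """Estimate pass@1 from (task_id, passed) pairs.

    The (task_id, passed) layout is an assumption made for this
    illustration; it is not taken from the SSA harness.
    """
    attempts = defaultdict(list)
    for task_id, passed in trajectories:
        attempts[task_id].append(bool(passed))
    if not attempts:
        raise ValueError("no trajectories given")
    # Per-task pass@1 is the fraction of that task's attempts that passed (c / n).
    per_task = [sum(results) / len(results) for results in attempts.values()]
    # The benchmark-level score is the mean over tasks.
    return sum(per_task) / len(per_task)


# Made-up sample: task-a passes 1 of 2 attempts, task-b passes, task-c fails.
sample = [
    ("task-a", True),
    ("task-a", False),
    ("task-b", True),
    ("task-c", False),
]
score = pass_at_1(sample)
print(f"pass@1 = {score:.2f}")
```

Averaging per task first keeps tasks with many recorded attempts from dominating the overall score.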

Sources

arXiv AI recent challenge