2026-06-12 - Labs & Research - Brief - confirmed
Source label: Academic - Peer-reviewed or preprint research
COI: Adjacent
Trust Observatory evaluates vendors separately; the Ledger does not score or rank companies.
What happened: METR published Time Horizon 1.1 materials describing updated task-completion time-horizon measurements for public frontier language models.
What is known: METR says Time Horizon 1.1 uses more tasks and a new evaluation infrastructure, and that its live time-horizon page is intended to be updated as new measurements become available.
What is not known: Whether the measurements replicate externally, how closely benchmark task horizons transfer to deployed agent reliability, and how the results will be used in downstream governance all remain open research questions.
Why it matters factually: The entry gives the Labs & Research desk a method-focused evaluation item while keeping measurement claims separate from product scoring.