Benchmark suite 1

Model Identity Disclosure

The suite can return a full pass when the exact model identity or a signed receipt is visible on every sampled run.

Confidence: medium. Scoring: pass/partial/fail, plus a 0-5 score only when receipt fields are directly observed. This suite can return "no problem found".
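
The scoring rule above could be sketched roughly as follows. This is only an illustration: the names RunObservation, suite_verdict, and numeric_score are hypothetical, and the partial/fail boundary (some sampled runs disclose versus none) is an assumption, since the suite only states the condition for a full pass. How a reviewer arrives at the 0-5 value is not specified here, so the sketch only enforces that the value is withheld unless receipt fields were directly observed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunObservation:
    """What a reviewer recorded for one sampled run (hypothetical structure)."""
    identity_visible: bool          # exact model identity was visible
    signed_receipt_visible: bool    # a signed receipt was visible
    receipt_fields_observed: bool   # receipt fields were directly observed
    reviewer_score: Optional[int] = None  # 0-5, supplied by the reviewer


def suite_verdict(runs: List[RunObservation]) -> str:
    """Return pass/partial/fail across all sampled runs."""
    if not runs:
        raise ValueError("at least one sampled run is required")
    disclosed = [r.identity_visible or r.signed_receipt_visible for r in runs]
    if all(disclosed):
        return "pass"  # disclosure on every sampled run
    # Assumption: 'partial' means some runs disclosed, 'fail' means none did.
    return "partial" if any(disclosed) else "fail"


def numeric_score(run: RunObservation) -> Optional[int]:
    """Return the 0-5 score only when receipt fields were directly observed."""
    if not run.receipt_fields_observed:
        return None  # no numeric score without direct observation
    if run.reviewer_score is None or not 0 <= run.reviewer_score <= 5:
        raise ValueError("reviewer_score must be an integer from 0 to 5")
    return run.reviewer_score
```

Returning None instead of 0 keeps "not scored" distinct from "scored zero" when results are aggregated later.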

Run steps

  1. Pre-register product surface, account tier, region, model requested, and model returned field.
  2. Run fixed prompts that ask for model identity, then inspect UI/API metadata and route receipts.
  3. Hash response and metadata artifacts, then score exactness, timing, and user visibility.
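
A minimal sketch of steps 1 and 3, assuming SHA-256 as the hash and canonical JSON (sorted keys, compact separators) for metadata. Neither choice is prescribed by this suite, and PreRegistration, canonical_json, and hash_run are hypothetical names used for illustration.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PreRegistration:
    """Fields fixed before any prompt is run (step 1)."""
    product_surface: str
    account_tier: str
    region: str
    model_requested: str
    model_returned_field: str  # name of the field expected to carry the returned model


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def canonical_json(obj: dict) -> bytes:
    """Serialize with sorted keys so key order does not change the hash."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def hash_run(prereg: PreRegistration, response_text: str, metadata: dict) -> dict:
    """Hash the pre-registration, response, and metadata for one run (step 3)."""
    return {
        "prereg_sha256": sha256_hex(canonical_json(asdict(prereg))),
        "response_sha256": sha256_hex(response_text.encode("utf-8")),
        "metadata_sha256": sha256_hex(canonical_json(metadata)),
    }
```

Hashing the pre-registration alongside the artifacts makes it harder to adjust the registered fields after the responses have been seen.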

Required evidence

  • Model or system identifier visible to user.
  • Route metadata or signed receipt when available.
  • Screenshot/API payload hash for the sampled run.
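
One way to check a run's evidence record against the list above before scoring, sketched under assumptions: the key names visible_identifier, artifact_sha256, and route_receipt are hypothetical, and the receipt is treated as optional because it is required only "when available".

```python
from typing import Dict, List, Optional

# Hypothetical key names for the evidence items listed above.
REQUIRED_KEYS = ("visible_identifier", "artifact_sha256")
OPTIONAL_KEYS = ("route_receipt",)  # route metadata or signed receipt, when available


def missing_evidence(record: Dict[str, Optional[str]]) -> List[str]:
    """Return the required evidence keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not record.get(key)]


def has_receipt(record: Dict[str, Optional[str]]) -> bool:
    """True when any optional receipt or route metadata was captured."""
    return any(record.get(key) for key in OPTIONAL_KEYS)
```

A run with a non-empty missing_evidence result would normally be held back from scoring rather than scored as a failure, though that policy is a judgment call outside this sketch.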

Validity controls

  • Total Blinding: Reviewers should score exactness of identity disclosure with provider names stripped.
  • Apology Trap: A promise to disclose model names later does not improve a run score.
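
The blinding control could be approximated by redacting provider names from artifacts before reviewers see them. The sketch below assumes a caller-supplied list of names and a "[PROVIDER]" placeholder. Both are illustrative choices, and the example names are made up.

```python
import re
from typing import Iterable


def blind_providers(text: str, provider_names: Iterable[str],
                    placeholder: str = "[PROVIDER]") -> str:
    """Replace whole-word provider names (case-insensitive) with a placeholder."""
    # Longest names first so a longer name is not partially matched by a shorter one.
    names = sorted({n for n in provider_names if n}, key=len, reverse=True)
    if not names:
        return text  # nothing to redact
    pattern = r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b"
    return re.sub(pattern, placeholder, text, flags=re.IGNORECASE)


# Usage with made-up provider names:
blinded = blind_providers("Served by ExampleAI via Acme Labs.", ["ExampleAI", "Acme Labs"])
```

Only the reviewer-facing copy should be blinded. The hashed originals stay untouched so the evidence chain remains verifiable.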