Subjecthood desk method note: We report the discourse. We do not assert AI systems are or are not conscious. We label position families.

arXiv AI recent: CombEval: A Framework for Evaluating Combinatorial Counting in Large Language Models

2026-06-20 arxiv.org

Researchers presented CombEval, a dynamic benchmark for evaluating combinatorial counting in large language models. CombEval was used to evaluate 11 large language models under direct and...

CombEval represents each problem as a typed Cofola specification over entities, combinatorial objects, object dependencies, and constraints, enabling controlled generation of natural-language counting problems with exact solver-verified answers. The code and generated benchmark suites for CombEva...
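
To make the idea of a structured specification with an exactly verified answer concrete, here is a minimal Python sketch. It is not Cofola syntax and not the CombEval code: the `CountingSpec` class, its fields, and the `exact_count` and `render` helpers are hypothetical names invented for illustration, and the "solver" is assumed to be plain brute-force enumeration, which only works for tiny instances.

```python
from dataclasses import dataclass
from itertools import combinations, permutations
from typing import Callable, Iterator, Sequence, Tuple

Candidate = Tuple[str, ...]


@dataclass(frozen=True)
class CountingSpec:
    """Hypothetical toy spec: entities, one combinatorial object, constraints."""
    entities: Sequence[str]
    kind: str                      # "subset" (unordered) or "arrangement" (ordered)
    size: int
    constraints: Sequence[Callable[[Candidate], bool]] = ()
    constraint_text: str = ""      # human-readable wording of the constraints


def candidates(spec: CountingSpec) -> Iterator[Candidate]:
    """Enumerate every object of the requested kind, ignoring constraints."""
    if spec.kind == "subset":
        return combinations(spec.entities, spec.size)
    if spec.kind == "arrangement":
        return permutations(spec.entities, spec.size)
    raise ValueError(f"unknown object kind: {spec.kind!r}")


def exact_count(spec: CountingSpec) -> int:
    """Brute-force reference answer: count candidates satisfying all constraints."""
    return sum(1 for c in candidates(spec) if all(ok(c) for ok in spec.constraints))


def render(spec: CountingSpec) -> str:
    """Turn the spec into a natural-language counting question."""
    noun = "unordered selections" if spec.kind == "subset" else "ordered arrangements"
    suffix = f" such that {spec.constraint_text}" if spec.constraint_text else ""
    names = ", ".join(spec.entities)
    return f"How many {noun} of {spec.size} items can be formed from {names}{suffix}?"


people = ("A", "B", "C", "D", "E")

committee = CountingSpec(
    entities=people, kind="subset", size=3,
    constraints=(lambda c: "A" in c,),
    constraint_text="A is included",
)
lineup = CountingSpec(
    entities=people, kind="arrangement", size=3,
    constraints=(lambda c: c[0] != "A",),
    constraint_text="A is not first",
)

print(render(committee), exact_count(committee))  # count is C(4, 2) = 6
print(render(lineup), exact_count(lineup))        # count is 60 - 12 = 48
```

Because the question text and the reference answer both come from the same specification, a model's reply can be scored against a known count rather than a hand-written key. Changing the entities, the object kind, or the constraints produces a new problem with a freshly computed answer, which is the sense in which such a benchmark can be generated dynamically.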

Sources

arXiv AI recent challenge