arXiv AI recent: SuperThoughts: Reasoning Tokens in Superposition
Researchers proposed SuperThoughts, a method to improve long chain-of-thought (CoT) reasoning in large language models (LLMs) by compressing pairs of consecutive CoT tokens into single la...
The SuperThoughts method uses a lightweight Multi-Token Prediction (MTP) module to decode two tokens per step, preserving discrete token supervision at training time while doubling throughput at inference time. The method was applied to Qwen2.5-Math-1.5B-Instruct, Qwen2.5-Math-7B-Instruct, and Qw...
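
To make the "two tokens per step" claim concrete, here is a minimal sketch of the step-count arithmetic only. It is not the authors' model or MTP module: the helper names `decode_steps` and `pair_tokens`, the toy token list, and the choice to leave an odd trailing token unpaired are all assumptions made for illustration.

```python
import math


def decode_steps(num_tokens: int, tokens_per_step: int = 2) -> int:
    """Number of decoding steps needed to emit num_tokens tokens.

    Hypothetical helper: it only counts steps and models nothing else.
    """
    if num_tokens < 0 or tokens_per_step < 1:
        raise ValueError("num_tokens must be >= 0 and tokens_per_step >= 1")
    return math.ceil(num_tokens / tokens_per_step)


def pair_tokens(tokens):
    """Group consecutive tokens into pairs.

    Assumption: an odd trailing token is kept as a group of one.
    """
    return [tuple(tokens[i:i + 2]) for i in range(0, len(tokens), 2)]


# Toy chain of thought, invented for this example.
cot = ["Let", "x", "=", "3", ";", "then", "x^2", "=", "9"]
pairs = pair_tokens(cot)

baseline_steps = decode_steps(len(cot), tokens_per_step=1)  # one token per step
paired_steps = decode_steps(len(cot), tokens_per_step=2)    # two tokens per step

print(f"{len(cot)} tokens: {baseline_steps} steps vs {paired_steps} steps")
```

Halving the step count translates into roughly doubled throughput only if each two-token step costs about the same as a single-token step, which is the role the summary assigns to the lightweight MTP module.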