Transformers Provably Learn to Internalize Chain-of-Thought
Abstract
Chain-of-Thought (CoT) prompting can substantially improve the sample efficiency of Large Language Models (LLMs), reducing the complexity of tasks like parity learning from exponential to polynomial. However, this benefit comes at the cost of generating explicit reasoning steps, which is computationally expensive during inference. Implicit CoT (ICoT) has been proposed to mitigate this cost by training models to internalize these intermediate steps. In this work, we provide a theoretical analysis showing that ICoT retains the sample efficiency gains of explicit CoT, enabling models to solve complex tasks efficiently without sacrificing inference speed. Moreover, we propose Log-ICoT, a provably efficient curriculum that reduces training stages from linear dependency on the problem complexity to a logarithmic one. Our theoretical results are verified with numerical experiments, confirming that ICoT offers a path to robust reasoning without the high computational overhead.