CAO-LLM: Catching, Adapting and Operating Under Distribution Drift for Large Language Models
Abstract
Large language models deployed in real-world environments inevitably encounter distribution drift, such as temporal shifts in data characteristics, emerging domains and evolving user requirements. Existing adaptation methods either ignore drift entirely, react post hoc without anticipation or employ coupled optimization that fails to produce drift-specific responses. We propose CAO-LLM, a unified three-stage framework that Catches drift through representation-based monitoring, Adapts via calibrated parameter alignment with forgetting prevention and Operates at scale using test-time strategy selection. By temporally separating these complementary objectives, CAO-LLM avoids the interference that plagues joint optimization approaches. Experiments on Qwen2.5 models across 12 benchmarks spanning common-sense, coding, logic, social, medical and mathematical reasoning domains demonstrate that CAO-LLM outperforms reactive and amortized adaptation baselines, achieving consistent gains across model sizes. Through controlled behavioral analysis experiments and ablation studies on the full pipeline, we validate how and why all three stages are essential for robust operation under drift.