Cross-Family Speculative Prefill: Training-Free Long-Context Compression with Small Draft Models

Shubhangi Upasani ⋅ Guangtao Wang ⋅ Ravi Raju ⋅ Bo Li ⋅ Urmish Thakker ⋅ Mengmeng Ji ⋅ John Long ⋅ Chen Wu

Abstract
