Recursive Speculative Decoding: Accelerating LLM Inference via Sampling Without Replacement

Wonseok Jeon ⋅ Mukul Gagrani ⋅ Raghavv Goel ⋅ Junyoung Park ⋅ Mingu Lee ⋅ Christopher Lott

Abstract
