

Poster Sat, Apr 25, 2026 • 11:15 AM – 1:45 PM PDT Pavilion 3 P3-#1625

Cascadia: An Efficient Cascade Serving System for Large Language Models

Youhe Jiang ⋅ Fangcheng Fu ⋅ Wanru Zhao ⋅ Stephan Rabanser ⋅ Jintao Zhang ⋅ Nic Lane ⋅ Binhang Yuan

Abstract
