RouterArena: An Open Platform for Comprehensive Comparison of LLM Routers
Abstract
Today's LLM ecosystem comprises a wide spectrum of models that differ in size, capability, and cost. No single model is optimal for all scenarios; hence, LLM routers have become essential for selecting the most appropriate model under varying circumstances. However, the rapid emergence of various routers has led to fragmented evaluation practices and inconsistent metrics, making it difficult to systematically assess progress in this space. Addressing this problem requires a comprehensive router comparison and a standardized leaderboard, similar to those available for models. In this work, we introduce RouterArena, the first open platform enabling comprehensive comparison of LLM routers. RouterArena features (1) a principled dataset construction with broad knowledge-domain coverage, (2) distinguishable difficulty levels within each domain, (3) an extensive list of evaluation metrics, and (4) an automated framework for evaluation and leaderboard updates. Leveraging this framework, we have produced the initial leaderboard with a detailed comparison of metrics. Figure 1 provides a preview of the leaderboard. The complete framework and the latest router leaderboard are publicly available at https://routeworks.github.io/.