Poster
in
Workshop: 2nd Workshop on Navigating and Addressing Data Problems for Foundation Models (DATA-FM)

Robust In-Context Learning via Multi-Armed Bandit-Based Partition Selection

Varul Srivastava · Sankarshan Damle · Manisha Padala


Abstract:

In-context learning (ICL) enables Large Language Models (LLMs) to adapt to new tasks without parameter updates, relying solely on exemplar selection. However, in real-world scenarios, data partitions may contain corrupted labels, degrading ICL performance. We address this challenge by formulating partition selection as a multi-armed bandit (MAB) problem, where each evaluation sample serves as a pull, allowing the model to iteratively identify the most reliable partitions. Using an Upper Confidence Bound (UCB) strategy, we progressively refine exemplar selection to mitigate the impact of noisy data. Empirical results demonstrate that UCB-based partition selection recovers performance comparable to settings without label noise, highlighting its effectiveness in improving ICL robustness.
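The abstract does not spell out the bandit loop, but the setup it describes (partitions as arms, evaluation samples as pulls, UCB for selection) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the exploration constant, and the simulated per-partition accuracies (standing in for a real LLM's ICL accuracy with exemplars drawn from that partition) are all assumptions.

```python
import math
import random

def ucb_select_partition(counts, rewards, t, c=2.0):
    """Pick the partition (arm) with the highest UCB1 score.

    counts[i]  -- number of times partition i has been pulled so far
    rewards[i] -- cumulative reward for partition i (e.g. 1 per correct
                  ICL prediction on an evaluation sample)
    t          -- total number of pulls so far
    c          -- exploration constant (assumed value, not from the paper)
    """
    # Play each arm once before the confidence bound is defined.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    scores = [
        rewards[i] / counts[i] + math.sqrt(c * math.log(t) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)

def run_bandit(partition_accuracy, n_pulls, seed=0):
    """Simulate UCB over data partitions; each pull is one evaluation sample.

    partition_accuracy[i] is the (unknown to the learner) probability that
    ICL with exemplars from partition i answers correctly -- a stand-in for
    an actual LLM call, which is not reproduced here.
    """
    rng = random.Random(seed)
    k = len(partition_accuracy)
    counts, rewards = [0] * k, [0.0] * k
    for t in range(1, n_pulls + 1):
        arm = ucb_select_partition(counts, rewards, t)
        reward = 1.0 if rng.random() < partition_accuracy[arm] else 0.0
        counts[arm] += 1
        rewards[arm] += reward
    return counts, rewards

# Hypothetical scenario: partition 2 is clean; 0 and 1 have corrupted
# labels, so exemplars drawn from them yield lower ICL accuracy.
counts, rewards = run_bandit([0.4, 0.5, 0.9], n_pulls=2000)
best = max(range(len(counts)), key=counts.__getitem__)
```

After enough pulls, the clean partition accumulates the most selections, which is the mechanism by which the abstract's method "recovers performance comparable to settings without label noise": exemplars end up being drawn almost exclusively from uncorrupted data.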