ICLR Dual Operating Modes of In-Context Learning

Poster in Workshop: 2nd Workshop on Mathematical and Empirical Understanding of Foundation Models

Dual Operating Modes of In-Context Learning

Ziqian Lin · Kangwook Lee

Abstract:

In-context learning (ICL) exhibits dual operating modes: task learning, i.e., acquiring a new skill from in-context samples, and task retrieval, i.e., locating and activating a relevant pretrained skill. Recent theoretical work proposes various mathematical models to analyze ICL, but they cannot fully explain the duality. In this work, we analyze the dual operating modes by leveraging assumptions on the pretraining data. Based on our analysis, we obtain a quantitative understanding of the two operating modes of ICL. We first explain a previously unexplained phenomenon observed with real-world large language models (LLMs), where the ICL risk initially increases and then decreases with more in-context examples. We also analyze ICL with biased labels, e.g., zero-shot ICL, where in-context examples are assigned random labels, and predict the bounded efficacy of such approaches. We corroborate our analysis and predictions with extensive experiments with real-world LLMs.
