Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers

Damai Dai · Yutao Sun · Li Dong · Yaru Hao · Shuming Ma · Zhifang Sui · Furu Wei
