Skip to yearly menu bar Skip to main content


Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers

Damai Dai ⋅ Yutao Sun ⋅ Li Dong ⋅ Yaru Hao ⋅ Shuming Ma ⋅ Zhifang Sui ⋅ Furu Wei

Abstract

Video

Chat is not available.