In-Person Poster presentation / poster accept

Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 Small

Kevin Wang · Alexandre Variengien · Arthur Conmy · Buck Shlegeris · Jacob Steinhardt
2023

Abstract
