Abstract:
Our event aims to gather researchers and practitioners to discuss recent advances and promising directions in deep learning interpretability research, with a specific focus on Transformer-based language models and mechanistic approaches that aim to reverse-engineer their behaviors.