

Poster

How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions

Lorenzo Pacchiardi ⋅ Alex Chan ⋅ Sören Mindermann ⋅ Ilan Moscovitz ⋅ Alexa Pan ⋅ Yarin Gal ⋅ Owain Evans ⋅ Jan Brauner
2024 Poster
