Spotlight Poster

Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game

Sam Toyer ⋅ Olivia Watkins ⋅ Ethan Mendes ⋅ Justin Svegliato ⋅ Luke Bailey ⋅ Tiffany Wang ⋅ Isaac Ong ⋅ Karim Elmaaroufi ⋅ Pieter Abbeel ⋅ Trevor Darrell ⋅ Alan Ritter ⋅ Stuart Russell
2024 Spotlight Poster
