

Poster

Do not write that jailbreak paper

Javier Rando


Abstract:

Jailbreak research is becoming a new ImageNet-style competition instead of helping us better understand LLM security. This blog post surveys the jailbreak literature to extract its most important contributions, and encourages the community to revisit its choices and focus on research that can uncover new security vulnerabilities.
