Poster Thu, Apr 23, 2026 • 11:15 AM – 1:45 PM PDT

Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols

Mikhail Terekhov · Alexander Panfilov · Daniil Dzenhaliou · Caglar Gulcehre · Maksym Andriushchenko · Ameya Prabhu · Jonas Geiping

Abstract
