Poster Thu, Apr 23, 2026 • 11:15 AM – 1:45 PM PDT Pavilion 4 P4-#4113

Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols

Mikhail Terekhov ⋅ Alexander Panfilov ⋅ Daniil Dzenhaliou ⋅ Caglar Gulcehre ⋅ Maksym Andriushchenko ⋅ Ameya Prabhu ⋅ Jonas Geiping

Abstract
