

How does information access affect LLM monitors' ability to detect sabotage?

Rauno Arike ⋅ Raja Moreno ⋅ Rohan Subramani ⋅ Shubhorup Biswas ⋅ Francis Ward

Abstract
