

Poster

Robust LLM safeguarding via refusal feature adversarial training

Lei Yu · Virginie Do · Karen Hambardzumyan · Nicola Cancedda
2025 Poster

