CASteer: Cross-Attention Steering for Controllable Concept Erasure
Abstract
Diffusion models have transformed image generation, yet controlling their outputs for diverse applications, including content moderation and creative customisation, remains challenging. Existing approaches usually require task-specific training and struggle to generalise across both concrete (e.g., objects) and abstract (e.g., styles) concepts. We propose CASteer (Cross-Attention Steering), a training-free framework for controllable image generation that uses steering vectors to dynamically influence a diffusion model's hidden representations. CASteer precomputes concept-specific steering vectors by averaging neural activations from images generated for each target concept. During inference, it dynamically applies these vectors to modify outputs only when necessary, either removing undesired concepts from images where they appear or adding desired concepts to images where they are absent. This selective activation ensures precise, context-aware adjustments without altering unaffected regions. The approach enables fine-grained control over a wide range of tasks, including removing harmful content, interpolating between desired attributes, and replacing objects, all without model retraining. CASteer outperforms state-of-the-art techniques while preserving unrelated content and minimising unintended effects. Code is provided in the supplementary material.
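To make the core idea concrete, the following minimal PyTorch sketch illustrates the two stages described above: averaging cross-attention activations into a steering vector, and conditionally applying it at inference. This is an illustrative assumption, not the paper's exact implementation; the function names and the projection-based update are hypothetical.

```python
import torch

def compute_steering_vector(activations: list[torch.Tensor]) -> torch.Tensor:
    """Average cross-attention activations collected while generating
    images of the target concept (hypothetical helper)."""
    return torch.stack(activations).mean(dim=0)

def steer(hidden: torch.Tensor, v: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Remove the concept direction from a hidden state only where the
    concept is present (a sketch of the 'selective activation' idea)."""
    v_hat = v / v.norm()                                   # unit steering direction
    # Positive projection onto v_hat signals the concept is present;
    # clamping to zero leaves concept-free regions untouched.
    proj = (hidden * v_hat).sum(dim=-1, keepdim=True).clamp(min=0.0)
    return hidden - alpha * proj * v_hat                   # subtract to erase;
                                                           # add (negate alpha) to inject
```

Under this reading, erasure and addition are the same operation with opposite signs, and the clamp is what keeps the intervention from altering images where the concept never appeared.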