ICLR Poster HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models

Hayk Manukyan · Andranik Sargsyan · Barsegh Atanyan · Zhangyang Wang · Shant Navasardyan · Humphrey Shi

Hall 3 + Hall 2B #151

Sat 26 Apr midnight PDT — 2:30 a.m. PDT

Abstract: Recent progress in text-guided image inpainting, based on the unprecedented success of text-to-image diffusion models, has led to exceptionally realistic and visually plausible results. However, there is still significant potential for improvement in current text-to-image inpainting models, particularly in better aligning the inpainted area with user prompts. Therefore, we introduce $\textit{HD-Painter}$, a $\textbf{training-free}$ approach that $\textbf{accurately follows prompts}$. To this end, we design the $\textit{Prompt-Aware Introverted Attention (PAIntA)}$ layer, which enhances self-attention scores with prompt information, resulting in better text-aligned generations. To further improve prompt coherence, we introduce the $\textit{Reweighting Attention Score Guidance (RASG)}$ mechanism, which seamlessly integrates a post-hoc sampling strategy into the general form of DDIM to prevent out-of-distribution latent shifts. Our experiments demonstrate that HD-Painter surpasses existing state-of-the-art approaches quantitatively and qualitatively across multiple metrics and a user study. Code is publicly available at: [https://github.com/Picsart-AI-Research/HD-Painter](https://github.com/Picsart-AI-Research/HD-Painter)
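
The abstract refers to the general form of DDIM without writing it out. As a rough illustration only, the sketch below implements one reverse step of that general form (as given in the DDIM paper by Song et al.) on plain Python lists. The `direction` argument and the `rescale_to_unit_std` helper are hypothetical names introduced here, not taken from the HD-Painter code. They show one way a guidance signal could fill the slot normally occupied by Gaussian noise while staying at unit scale; the actual RASG update rule is not specified in the abstract and may differ.

```python
import math
import random
import statistics


def rescale_to_unit_std(values):
    """Divide a list of floats by its population standard deviation.

    Hypothetical helper: keeps an injected signal at the same scale as
    standard Gaussian noise.
    """
    std = statistics.pstdev(values)
    if std == 0.0:
        raise ValueError("cannot rescale a constant signal")
    return [v / std for v in values]


def ddim_step(x_t, eps_pred, alpha_t, alpha_prev, sigma_t,
              direction=None, rng=None):
    """One reverse step of the general DDIM form.

    x_t, eps_pred: current latent and predicted noise (lists of floats).
    alpha_t, alpha_prev: cumulative schedule products at steps t and t-1.
    sigma_t: stochasticity level; 0 gives the deterministic sampler.
    direction: optional signal used in place of Gaussian noise
               (an assumption made for this sketch).
    """
    if direction is None:
        rng = rng if rng is not None else random.Random()
        direction = [rng.gauss(0.0, 1.0) for _ in x_t]

    coef_x0 = math.sqrt(alpha_prev)
    # Clamp at zero to guard against round-off when sigma_t is large.
    coef_eps = math.sqrt(max(1.0 - alpha_prev - sigma_t ** 2, 0.0))

    out = []
    for x, e, d in zip(x_t, eps_pred, direction):
        # Predicted clean sample recovered from the noisy latent.
        x0_pred = (x - math.sqrt(1.0 - alpha_t) * e) / math.sqrt(alpha_t)
        out.append(coef_x0 * x0_pred + coef_eps * e + sigma_t * d)
    return out
```

With `sigma_t = 0` the step is deterministic. With `sigma_t > 0`, passing `direction=rescale_to_unit_std(g)` for some guidance signal `g` substitutes that signal for the random noise term without changing its overall scale.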
