

Poster

InstantPortrait: One-Step Portrait Editing via Diffusion Multi-Objective Distillation

Zhixin Lai · Keqiang Sun · Fu-Yun Wang · Dhritiman Sagar · Erli Ding

Hall 3 + Hall 2B #113
Sat 26 Apr midnight PDT — 2:30 a.m. PDT

Abstract:

Real-time instruction-based portrait image editing is crucial in many applications, including filters, augmented reality, and video communication. However, real-time portrait editing presents three significant challenges: identity preservation, fidelity to editing instructions, and fast model inference. Because these objectives often trade off against one another, addressing them simultaneously is even more challenging. While diffusion-based image editing methods have shown promising capabilities in personalized image editing in recent years, they lack a dedicated focus on portrait editing and thus suffer from the same problems. To address this gap, this paper introduces the Instant-Portrait Network (IPNet), the first one-step diffusion-based model for portrait editing. We train the network in two stages. We first train an Identity Enhancement Network (IDE-Net) with an annealing identity loss to ensure robust identity preservation. We then train the IPNet using a novel diffusion Multi-Objective Distillation approach that integrates an adversarial loss, an identity distillation loss, and a novel Facial-Style Enhancing loss. This approach reduces inference to a single step, ensures identity consistency, and improves the precision of instruction-based editing. Extensive comparisons with prior models demonstrate that IPNet is superior in identity preservation, text fidelity, and inference speed.
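The abstract outlines a two-stage recipe: an annealing identity loss for IDE-Net, then a Multi-Objective Distillation objective combining adversarial, identity-distillation, and Facial-Style Enhancing terms. Below is a minimal PyTorch sketch of how such a combined objective might be assembled. The linear annealing schedule, the non-saturating adversarial form, the L2 style term, and all loss weights are assumptions for illustration; the abstract does not specify them.

```python
import torch
import torch.nn.functional as F


def annealed_identity_loss(id_pred, id_ref, step, total_steps,
                           w_start=1.0, w_end=0.1):
    """Identity loss with an annealed weight (stage 1, IDE-Net training).

    The linear schedule and weight range are assumptions; the abstract
    only states that the identity loss is annealed.
    """
    w = w_start + (w_end - w_start) * min(step / total_steps, 1.0)
    cos = F.cosine_similarity(id_pred, id_ref, dim=-1)
    return w * (1.0 - cos).mean()


def multi_objective_distillation_loss(adv_logits_fake,
                                      id_student, id_teacher,
                                      style_student, style_teacher,
                                      lambda_adv=0.5, lambda_id=1.0,
                                      lambda_style=0.5):
    """Combined objective for one-step distillation (stage 2, IPNet).

    All weights and the exact form of each term are hypothetical.
    """
    # Adversarial term (non-saturating GAN form) on the one-step output.
    loss_adv = F.softplus(-adv_logits_fake).mean()
    # Identity distillation: match face embeddings between the student's
    # one-step output and the multi-step teacher's output.
    loss_id = (1.0 - F.cosine_similarity(id_student, id_teacher, dim=-1)).mean()
    # Facial-style term sketched as an L2 match of style features; the
    # paper's Facial-Style Enhancing loss is not detailed in the abstract.
    loss_style = F.mse_loss(style_student, style_teacher)
    return (lambda_adv * loss_adv
            + lambda_id * loss_id
            + lambda_style * loss_style)
```

In practice the identity embeddings (`id_pred`, `id_ref`, `id_student`, `id_teacher`) would come from a pretrained face-recognition encoder and the style features from a feature extractor over the facial region; both choices are assumptions here.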
