Poster
AlphaEdit: Null-Space Constrained Model Editing for Language Models
Junfeng Fang · Houcheng Jiang · Kun Wang · Yunshan Ma · Jie Shi · Xiang Wang · Xiangnan He · Tat-Seng Chua
Hall 3 + Hall 2B #568
Sat 26 Apr 12:30 a.m. PDT — 2 a.m. PDT
Large language models (LLMs) often exhibit hallucinations, producing incorrect or outdated knowledge. Hence, model editing methods have emerged to enable targeted knowledge updates. To achieve this, a prevailing paradigm is the locating-then-editing approach, which first locates influential parameters and then edits them by introducing a perturbation. While effective, current studies have demonstrated that this perturbation inevitably disrupts the originally preserved knowledge within LLMs, especially in sequential editing scenarios. To address this, we introduce AlphaEdit, a novel solution that projects the perturbation onto the null space of the preserved knowledge before applying it to the parameters. We theoretically prove that this projection ensures the output of post-edited LLMs remains unchanged when queried about the preserved knowledge, thereby mitigating the issue of disruption. Extensive experiments on various LLMs, including LLaMA3, GPT2-XL, and GPT-J, show that AlphaEdit boosts the performance of most locating-then-editing methods by an average of 36.7%, using only a single additional line of code for the projection.
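As a rough illustration of the core idea (not the authors' released code), the sketch below builds a null-space projector from a toy preserved-knowledge key matrix and applies it to an editing perturbation before updating a weight matrix; all names, shapes, and the tolerance are hypothetical.

```python
import numpy as np

def null_space_projector(K0, rtol=1e-6):
    """Projector P onto the null space of the preserved-knowledge key matrix K0
    (one key per row), so that K0 @ P is (numerically) zero."""
    # Eigen-directions of K0^T K0 with near-zero eigenvalues span the null space.
    U, S, _ = np.linalg.svd(K0.T @ K0)
    null_basis = U[:, S < rtol * S.max()]
    return null_basis @ null_basis.T

# --- Hypothetical toy setup (shapes and values are illustrative only) ---
rng = np.random.default_rng(0)
d_key, d_val, n_preserved = 64, 64, 200

# Rank-deficient preserved keys so that a non-trivial null space exists.
K0 = rng.standard_normal((n_preserved, 32)) @ rng.standard_normal((32, d_key))

W = rng.standard_normal((d_val, d_key))       # weight matrix being edited
Delta = rng.standard_normal((d_val, d_key))   # raw editing perturbation

# The extra projection step: constrain the perturbation before applying it.
Delta_proj = Delta @ null_space_projector(K0)
W_edited = W + Delta_proj

# Outputs for the preserved keys are left (numerically) unchanged.
print(np.abs((W_edited - W) @ K0.T).max())    # close to 0
```

Under these assumptions, any key lying in the row space of K0 is mapped to the same output by W and W_edited, which is the property the projection is meant to guarantee for preserved knowledge.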