Patching LLMs Like Software: A Lightweight Method for Improving Safety Policies in Large Language Models

Huzaifa Arif ⋅ Pin-Yu Chen ⋅ Keerthiram Murugesan ⋅ Alex Gittens ⋅ Payel Das ⋅ Ching-Yun Ko

