Oral in Workshop: Secure and Trustworthy Large Language Models

BEYOND FINE-TUNING: LORA MODULES BOOST NEAR-OOD DETECTION AND LLM SECURITY

Etienne Salimbeni · Francesco Craighero · Renata Khasanova · Milos Vasic · Pierre Vandergheynst


Abstract

Under resource constraints, LLMs are usually fine-tuned on additional knowledge with Parameter-Efficient Fine-Tuning (PEFT), in particular Low-Rank Adaptation (LoRA) modules. LoRA injects a new set of small trainable matrices to adapt an LLM to a new task, while keeping the LLM itself frozen. At deployment, the LoRA weights are then merged with the LLM weights to speed up inference. In this work, we show how to exploit the unmerged LoRA’s embedding to boost the performance of Out-Of-Distribution (OOD) detectors, especially in the more challenging near-OOD scenarios. We also demonstrate that improving OOD detection helps in characterizing wrong predictions in downstream tasks, a fundamental aspect of improving the reliability of LLMs. Moreover, we present a use case in which the sensitivity of LoRA modules and OOD detection are employed together to alert stakeholders about new model updates. This scenario is particularly important when LLMs are outsourced: test functions should be applied as soon as the model version changes, so that prompts in the downstream applications can be adapted. To validate our method, we ran experiments on Multiple Choice Question Answering datasets, with the medical domain as the fine-tuning task. Our results motivate keeping LoRA modules unmerged even after deployment, since they provide strong features for OOD detection on fine-tuning tasks and can be employed to improve the security of LLMs.
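The distinction between merged and unmerged LoRA weights can be illustrated with a minimal sketch. It assumes the standard LoRA formulation, in which a frozen weight W is adapted by a low-rank product B A, and it treats the intermediate vector z = A x as the "unmerged LoRA embedding". The toy distance score is only a stand-in for an OOD detector; the abstract does not specify which detectors the authors use, and every name and number below is hypothetical.

```python
# Illustrative sketch only (pure Python, no dependencies).
# Function names, matrices, and the distance-based score are assumptions,
# not the authors' implementation.

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(P, Q):
    """Multiply matrix P by matrix Q."""
    cols = list(zip(*Q))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in P]

def lora_forward(W, A, B, x, scale=1.0):
    """Unmerged forward pass: h = W x + scale * B (A x).

    Also returns z = A x, the low-rank feature that is only
    observable while the LoRA module is kept separate.
    """
    z = matvec(A, x)          # r-dimensional LoRA embedding
    base = matvec(W, x)       # frozen model path
    delta = matvec(B, z)      # low-rank update path
    h = [b + scale * d for b, d in zip(base, delta)]
    return h, z

def merge(W, A, B, scale=1.0):
    """Merged weights W' = W + scale * B A (z is no longer exposed)."""
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, BA)]

def distance_score(z, reference):
    """Toy OOD score: Euclidean distance from z to the mean of
    in-distribution LoRA features (larger means more OOD-like)."""
    n = len(reference)
    mean = [sum(col) / n for col in zip(*reference)]
    return sum((a - m) ** 2 for a, m in zip(z, mean)) ** 0.5

W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]  # frozen weight, 3x3
A = [[1.0, 0.0, -1.0]]                                   # rank r = 1, shape 1x3
B = [[0.5], [0.0], [1.0]]                                # shape 3x1

x = [1.0, 2.0, 3.0]
h_unmerged, z = lora_forward(W, A, B, x)
h_merged = matvec(merge(W, A, B), x)

# Hypothetical in-distribution features collected on the fine-tuning task.
in_dist_features = [[-2.0], [-1.0], [-3.0]]
score_in = distance_score(z, in_dist_features)
score_far = distance_score([4.0], in_dist_features)
```

Both paths produce the same output `h`, which is why merging is attractive for inference speed. Only the unmerged path exposes `z`, and that is the kind of feature the abstract proposes feeding to an OOD detector.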
