

Poster
in
Workshop: 2nd Workshop on Mathematical and Empirical Understanding of Foundation Models

Investigating Uncertainty Calibration of Aligned Language Models under the Multiple-Choice Setting

Guande He · Peng Cui · Jianfei Chen · Wenbo Hu · Jun Zhu


Abstract:

Despite the significant progress made in practical applications of aligned language models (LMs), they tend to be overconfident in their output answers compared to the corresponding pre-trained LMs. In this work, we systematically evaluate the impact of the alignment process on the logit-based uncertainty calibration of LMs under the multiple-choice setting. We first conduct a thorough empirical study of how aligned LMs differ in calibration from their pre-trained counterparts. Experimental results reveal two distinct uncertainties in LMs under the multiple-choice setting, which are responsible for the answer decision and the format preference of the LMs, respectively. We then investigate the role of these two types of uncertainty in aligned LMs' calibration through fine-tuning under synthetic alignment schemes, and conclude that one reason for aligned LMs' overconfidence is the alteration of their answer uncertainty. We hope our findings can provide insights into the design of more reliable alignment processes for LMs.
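The abstract refers to logit-based uncertainty calibration under the multiple-choice setting: the model's confidence in its chosen option is read off the softmax over the logits of the answer tokens (e.g. "A"–"D"), and calibration is summarized by a metric such as expected calibration error (ECE). The following is a minimal sketch of that evaluation recipe, not the authors' code; the function names and the 10-bin ECE are illustrative assumptions.

```python
import math

def choice_confidence(choice_logits):
    """Softmax over the logits of the candidate answer tokens
    (e.g. the options "A".."D") to obtain the predicted choice
    and the model's confidence in it."""
    m = max(choice_logits)                       # stabilize the exponentials
    exps = [math.exp(l - m) for l in choice_logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    pred = max(range(len(probs)), key=probs.__getitem__)
    return pred, probs[pred]

def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE: group predictions by confidence,
    compare average confidence to accuracy within each bin,
    and weight the gaps by bin size."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / n) * abs(avg_conf - accuracy)
    return ece
```

Under this reading, an overconfident aligned LM is one whose per-bin average confidence systematically exceeds its accuracy, yielding a larger ECE than its pre-trained counterpart on the same questions.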
