

Poster in Workshop: Representational Alignment

Training Large Language Models for Self-Explanation Faithfulness

Yeoktatt Cheah ⋅ Maria Perez-Ortiz ⋅ Noah Y Siegel ⋅ Oana-Maria Camburu

Abstract
