

Poster

Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models

Ashutosh Baheti · Ximing Lu · Faeze Brahman · Ronan Le Bras · Maarten Sap · Mark Riedl
2024 Poster
