Think Outside the Bot: Automating Evaluation of Creativity in LLMs for Physical Reasoning with Semantic Entropy and Efficient Multi-Agent Judge

Min Sen Tan ⋅ Zachary Choy ⋅ Swaagat Saikia ⋅ Syed Ali Redha Alsagoff ⋅ Banerjee Mohor ⋅ Nadya Wangsajaya ⋅ Alvin Chan

Abstract
