ICLR 2024

97 Results

Workshop		Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation	Seongyun Lee · Seungone Kim · Sue Park · Geewook Kim · Minjoon Seo
Workshop		Don't Label Twice: Quantity Beats Quality when Comparing Binary Classifiers on a Budget	Florian Eddie Dorner · Moritz Hardt
Workshop		Evaluating predictive patterns of antigen specific B cells by single cell transcriptome and antibody repertoire sequencing	Lena Erlach · Raphael Kuhn · Andreas Agrafiotis · Danielle Shlesinger · Alexander Yermanos · Sai Reddy
Workshop		Squeezing Lemons with Hammers: An Evaluation of AutoML and Tabular Deep Learning for Data-Scarce Classification Applications	Ricardo Knauer · Erik Rodner
Workshop		Evaluating Large Language Models in an Emerging Domain: A Pilot Study in Decentralized Finance	Joshua Pearlson · Xiaoyuan Liu · Chengsong Huang · Kripa George · Dawn Song · Chenguang Wang
Workshop	Sat 2:40	DARKIN: A zero-shot classification benchmark and an evaluation of protein language models	Emine Ayşe Sunar · Zeynep Işık · Mert Pekey · Ramazan Gokberk Cinbis · Oznur Tastan
Workshop		TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness	Danna Zheng · Danyang Liu · Mirella Lapata · J Pan
Workshop		CatCode: A Comprehensive Evaluation Framework for LLMs On the Mixture of Code and Text	Zhenru Lin · Yiqun Yao · Yang Yuan
Workshop		Enhancing and Evaluating Logical Reasoning Abilities of Large Language Models	Shujie Deng · Honghua Dong · Xujie Si
Workshop		On Fairness Implications and Evaluations of Low-Rank Adaptation of Large Models	Ken Liu · Zhoujie Ding · Berivan Isik · Sanmi Koyejo