107 Results
Workshop
|
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression Junyuan Hong · Jinhao Duan · Chenhui Zhang · Zhangheng Li · Chulin Xie · Kelsey Lieberman · James Diffenderfer · Brian Bartoldson · Ajay Jaiswal · Kaidi Xu · Bhavya Kailkhura · Dan Hendrycks · Dawn Song · Zhangyang Wang · Bo Li |
||
Workshop
|
A closer look at adversarial suffix learning for Jailbreaking LLMs Zhe Wang · Yanjun Qi |
||
Workshop
|
I'm not familiar with the name Harry Potter: Prompting Baselines for Unlearning in LLMs Pratiksha Thaker · Yash Maurya · Virginia Smith |
||
Workshop
|
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs Fengqing Jiang · Zhangchen Xu · Luyao Niu · Zhen Xiang · Bhaskar Ramasubramanian · Bo Li · Radha Poovendran |
||
Workshop
|
Exploring Iterative Enhancement for Improving Learnersourced Multiple-Choice Question Explanations with Large Language Models Qiming Bao · Juho Leinonen · Alex Peng · Wanjun Zhong · Gaël Gendron · Timothy Pistotti · Alice Huang · Paul Denny · Michael Witbrock · Jiamou Liu |
||
Workshop
|
Steering LLMs Towards Unbiased Responses: A Causality-Guided Debiasing Framework Jingling Li · Zeyu Tang · Xiaoyu Liu · Peter Spirtes · Kun Zhang · Liu Leqi · Yang Liu |
||
Workshop
|
MARS: Meaning-Aware Response Scoring for Uncertainty Estimation in Generative LLMs Yavuz Faruk Bakman · Duygu Nur Yaldiz · Baturalp Buyukates · Chenyang Tao · Dimitrios Dimitriadis · Salman Avestimehr |
||
Workshop
|
Probing the Multi-turn Planning Capabilities of LLMs via 20 Question Games Yizhe Zhang · Jiarui Lu · Navdeep Jaitly |