(7 events)
Oral
Thu Apr 23 06:30 AM -- 06:40 AM (PDT) @ 201 C
Benchmarking Empirical Privacy Protection for Adaptations of Large Language Models
Bartłomiej Marek ⋅ Lorenzo Rossi ⋅ Vincent Hanke ⋅ Xun Wang ⋅ Michael Backes ⋅ Franziska Boenisch ⋅ Adam Dziedzic
Oral
Thu Apr 23 06:42 AM -- 06:52 AM (PDT) @ 201 C
Invisible Safety Threat: Malicious Finetuning for LLM via Steganography
Guangnian Wan ⋅ Xinyin Ma ⋅ Gongfan Fang ⋅ Xinchao Wang
Oral
Thu Apr 23 06:54 AM -- 07:04 AM (PDT) @ 201 C
The Shape of Adversarial Influence: Characterizing LLM Latent Spaces with Persistent Homology
Aideen Fay ⋅ Inés García-Redondo ⋅ Qiquan Wang ⋅ Haim Dubossarsky ⋅ Anthea Monod
Oral
Thu Apr 23 07:06 AM -- 07:16 AM (PDT) @ 201 C
Watch Your Steps: Dormant Adversarial Behaviors that Activate upon LLM Finetuning
Thibaud Gloaguen ⋅ Mark Vero ⋅ Robin Staab ⋅ Martin Vechev
Oral
Thu Apr 23 07:18 AM -- 07:28 AM (PDT) @ 201 C
LLM Fingerprinting via Semantically Conditioned Watermarks
Thibaud Gloaguen ⋅ Robin Staab ⋅ Nikola Jovanović ⋅ Martin Vechev
Oral
Thu Apr 23 07:30 AM -- 07:40 AM (PDT) @ 201 C
Steering the Herd: A Framework for LLM-based Control of Social Learning
Raghu Arghal ⋅ Kevin He ⋅ Shirin Saeedi Bidokhti ⋅ Saswati Sarkar
Oral
Thu Apr 23 07:42 AM -- 07:52 AM (PDT) @ 201 C
Every Language Model Has a Forgery-Resistant Signature
Matthew Finlayson ⋅ Xiang Ren ⋅ Swabha Swayamdipta