Track: Oral 7D

Oral

Oral 7D

Moderator: Grigorios Chrysos

Abstract:

Chat is not available.

Fri 10 May 1:00 - 1:15 PDT

One-shot Empirical Privacy Estimation for Federated Learning

Galen Andrew ⋅ Peter Kairouz ⋅ Sewoong Oh ⋅ Alina Oprea ⋅ H. Brendan McMahan ⋅ Vinith Suriyakumar

Privacy estimation techniques for differentially private (DP) algorithms are useful for comparing against analytical bounds, or to empirically measure privacy loss in settings where known analytical bounds are not tight. However, existing privacy auditing techniques usually make strong assumptions on the adversary (e.g., knowledge of intermediate model iterates or the training data distribution), are tailored to specific tasks, model architectures, or DP algorithm, and/or require retraining the model many times (typically on the order of thousands). These shortcomings make deploying such techniques at scale difficult in practice, especially in federated settings where model training can take days or weeks. In this work, we present a novel “one-shot” approach that can systematically address these challenges, allowing efficient auditing or estimation of the privacy loss of a model during the same, single training run used to fit model parameters, and without requiring any a priori knowledge about the model architecture, task, or DP algorithm. We show that our method provides provably correct estimates for the privacy loss under the Gaussian mechanism, and we demonstrate its performance on a well-established FL benchmark dataset under several adversarial threat models.

Fri 10 May 1:15 - 1:30 PDT

On the Humanity of Conversational AI: Evaluating the Psychological Portrayal of LLMs

Jen-tse Huang ⋅ Wenxuan Wang ⋅ Eric John Li ⋅ Man Ho LAM ⋅ Shujie Ren ⋅ Youliang Yuan ⋅ Wenxiang Jiao ⋅ Zhaopeng Tu ⋅ Michael Lyu

Large Language Models (LLMs) have recently showcased their remarkable capacities, not only in natural language processing tasks but also across diverse domains such as clinical medicine, legal consultation, and education. LLMs become more than mere applications, evolving into assistants capable of addressing diverse user requests. This narrows the distinction between human beings and artificial intelligence agents, raising intriguing questions regarding the potential manifestation of personalities, temperaments, and emotions within LLMs. In this paper, we propose a framework, PsychoBench, for evaluating diverse psychological aspects of LLMs. Comprising thirteen scales commonly used in clinical psychology, PsychoBench further classifies these scales into four distinct categories: personality traits, interpersonal relationships, motivational tests, and emotional abilities. Our study examines five popular models, namely text-davinci-003, ChatGPT, GPT-4, LLaMA-2-7b, and LLaMA-2-13b. Additionally, we employ a jailbreak approach to bypass the safety alignment protocols and test the intrinsic natures of LLMs. We have made PsychoBench openly accessible via https://github.com/CUHK-ARISE/PsychoBench.