Over the last decade, progress in machine learning has resulted in a surge of data-driven services affecting our daily lives. Conversational agents, healthcare providers, online retailers, and social networks continually access and jointly process vast amounts of data about their geographically distributed customers. The progress in distributed machine learning that has enabled this widespread adoption and personalization has also raised issues of privacy, accountability, and fairness. This tension is particularly apparent in the context of the Covid-19 pandemic, and it motivates the need to address distributed and private machine learning jointly.

Fri 8:30 a.m. - 8:40 a.m. | Introduction and Opening Remarks (Opening Remarks) | Gautam Kamath

Fri 8:40 a.m. - 9:05 a.m. | Biased Client Selection for Improved Convergence of Federated Learning (Invited Talk) | Gauri Joshi

Abstract: Federated learning is a distributed optimization paradigm that enables a large number of resource-limited client nodes to cooperatively train a model without data sharing. Previous works have analyzed the convergence of federated learning by assuming unbiased client participation, where clients are selected such that the aggregated model update is unbiased. In this paper, we present the first convergence analysis of federated optimization for biased client selection and quantify how the selection skew affects convergence speed. We discover from the convergence analysis that biasing client selection towards clients with higher local loss yields faster error convergence. Using this insight, we propose the power-of-choice client selection framework that can flexibly span the trade-off between convergence speed and solution bias. Extensive experiments demonstrate that power-of-choice strategies can converge up to 3x faster and give 10% higher test accuracy than the baseline random selection.

Bio: Gauri Joshi has been an assistant professor in the ECE department at Carnegie Mellon University since September 2017. Previously, she worked as a Research Staff Member at IBM T. J. Watson Research Center. Gauri completed her Ph.D. at MIT EECS in June 2016, advised by Prof. Gregory Wornell. She received her B.Tech and M.Tech in Electrical Engineering from the Indian Institute of Technology (IIT) Bombay in 2010. Her awards and honors include the NSF CAREER Award (2021), ACM Sigmetrics Best Paper Award (2020), NSF CRII Award (2018), IBM Faculty Research Award (2017), Best Thesis Prize in Computer Science at MIT (2012), and the Institute Gold Medal of IIT Bombay (2010).

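A minimal sketch of the power-of-choice idea described in the abstract above: sample a candidate set of d clients, query their current local losses, and keep the k with the highest loss for the next round. The function and the loss-evaluation callback are illustrative assumptions, not the authors' implementation.

```python
import random

def power_of_choice(clients, local_loss, k, d, rng=random):
    """Select k clients for the next round, biased towards clients with high local loss.

    clients    : list of client identifiers
    local_loss : callable mapping a client id to its current local loss under the global model
    k          : number of clients to train in the next round
    d          : candidate-set size, with k <= d <= len(clients)
    """
    candidates = rng.sample(clients, d)            # uniformly sampled candidate set
    candidates.sort(key=local_loss, reverse=True)  # rank candidates by local loss
    return candidates[:k]                          # keep the k highest-loss clients

# d = k behaves like plain random selection; increasing d increases the selection
# skew, trading a larger solution bias for faster error convergence.
```
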
Fri 9:06 a.m. - 9:10 a.m. | Q&A for Biased Client Selection for Improved Convergence of Federated Learning (Q&A)

Fri 9:10 a.m. - 9:35 a.m. | Frequency Estimation in Local and Multiparty Differential Privacy (Invited Talk) | Graham Cormode

Abstract: Computing the frequency of items in a way that is both private and accurate is at the heart of many computational tasks. This talk presents recent results on this task in two distributed models: the local differential privacy model and the more general multiparty differential privacy model. By combining sampling, sketching, and random noise techniques, we can obtain private estimators that are unbiased, accurate, and have low communication overheads.

Bio: Graham Cormode works on topics in privacy and data summarization. He is a Fellow of the ACM and recipient of the 2017 Adams Prize for Mathematics. He is co-author of the book "Small Summaries for Big Data".

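For context, a minimal sketch of the textbook locally private frequency oracle (k-ary randomized response with an unbiased debiasing step). This is the standard baseline, not the estimator from the talk; the sampling and sketching refinements mentioned above are omitted, and all names are illustrative.

```python
import math
import random
from collections import Counter

def krr_report(value, domain, epsilon, rng=random):
    """k-ary randomized response: report the true value with probability p, else a uniform lie."""
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    return rng.choice([v for v in domain if v != value])

def estimate_frequencies(reports, domain, epsilon):
    """Debias the noisy reports to obtain unbiased estimates of the true counts."""
    k, n = len(domain), len(reports)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    q = (1.0 - p) / (k - 1)          # probability of reporting any specific other value
    counts = Counter(reports)
    return {v: (counts[v] - n * q) / (p - q) for v in domain}

# Example: 10,000 users, each holding one of four items, epsilon = 1.
domain = ["a", "b", "c", "d"]
truth = ["a"] * 6000 + ["b"] * 3000 + ["c"] * 900 + ["d"] * 100
reports = [krr_report(v, domain, 1.0) for v in truth]
print(estimate_frequencies(reports, domain, 1.0))
```
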
Fri 9:35 a.m. - 9:40 a.m. | Q&A for Frequency Estimation in Local and Multiparty Differential Privacy (Q&A)

Fri 9:40 a.m. - 10:05 a.m. | Inference Risks for Machine Learning (Invited Talk) | David Evans

Abstract: When models are trained on private data, such as medical records or personal emails, there is a risk that those models not only learn the hoped-for patterns, but will also learn and expose sensitive information about their training data. Several different types of inference attacks on machine learning models have been found, and we will characterize inference risks according to whether they expose statistical properties of the distribution used for training or specific information in the training dataset. Differential privacy provides formal guarantees bounding some (but not all) types of inference risk, but providing substantive differential privacy guarantees with state-of-the-art methods requires adding so much noise to the training process for complex models that the resulting models are useless. Experimental evidence, however, suggests that in practice inference attacks have limited power, and in many cases a very small amount of privacy noise seems to be enough to defuse inference attacks. In this talk, I will give an overview of a variety of different inference risks for machine learning models and report on some experiments to better understand the power of inference attacks in more realistic settings.

Bio: David Evans is a Professor of Computer Science at the University of Virginia, where he leads a research group focusing on security and privacy (https://uvasrg.github.io). He won the Outstanding Faculty Award from the State Council of Higher Education for Virginia, and was Program Co-Chair for the 24th ACM Conference on Computer and Communications Security (CCS 2017) and the 30th (2009) and 31st (2010) IEEE Symposia on Security and Privacy, where he initiated the Systematization of Knowledge (SoK) papers. He is the author of an open computer science textbook (https://computingbook.org) and a children's book on combinatorics and computability (https://dori-mic.org), and co-author of a book on secure multi-party computation (https://securecomputation.org/). He has SB, SM, and PhD degrees from MIT and has been a faculty member at the University of Virginia since 1999.

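As a concrete instance of the simplest inference risk surveyed above, a hedged sketch of a loss-threshold membership inference test: a sample is guessed to be a training member if the model's loss on it falls below a threshold calibrated on data known to be outside the training set. This is a generic baseline attack, not the speaker's method; all names are illustrative.

```python
import numpy as np

def loss_threshold_membership(loss_fn, model, candidates, known_nonmembers, quantile=0.5):
    """Guess 'member' when the per-example loss is below a threshold calibrated on non-members.

    loss_fn          : callable (model, x, y) -> scalar loss
    candidates       : iterable of (x, y) pairs whose membership we want to guess
    known_nonmembers : (x, y) pairs the attacker knows were not used for training
    """
    reference = np.array([loss_fn(model, x, y) for x, y in known_nonmembers])
    threshold = np.quantile(reference, quantile)      # e.g. the median non-member loss
    return [loss_fn(model, x, y) < threshold for x, y in candidates]
```
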
Fri 10:05 a.m. - 10:10 a.m. | Q&A for Inference Risks for Machine Learning (Q&A)

Fri 10:10 a.m. - 10:28 a.m. | Coffee Break

Fri 10:28 a.m. - 10:30 a.m. | Introduction for the Contributed Talks (Intro) | Fatemehsadat Mireshghallah

Fri 10:30 a.m. - 10:42 a.m. | Federated Learning with Taskonomy (Contributed Talk) | Hadi Jamali-Rad · Mohammad Abdizadeh · Attila Szabó

Classical federated learning approaches incur significant performance degradation in the presence of non-IID client data. A possible direction to address this issue is forming clusters of clients with roughly IID data. We introduce federated learning with taskonomy (FLT), which generalizes this direction by learning the task-relatedness between clients for more efficient federated aggregation of heterogeneous data. In a one-off process, the server provides the clients with a pretrained encoder to compress their data into a latent representation, and the clients transmit the signature of their data back to the server. The server then learns the task-relatedness among clients via manifold learning and performs a generalization of federated averaging. We demonstrate that FLT not only outperforms the existing state-of-the-art baselines but also offers improved fairness across clients.

Fri 10:42 a.m. - 10:54 a.m. | Privacy Amplification via Iteration for Shuffled and Online PNSGD (Contributed Talk) | Matteo Sordello · Zhiqi Bu · Jinshuo Dong · Weijie J Su

We consider the framework of privacy amplification via iteration, originally proposed by Feldman et al. and then simplified by Asoodeh et al. in their analysis via the contraction coefficient. This line of work studies the privacy guarantees obtained by the projected noisy stochastic gradient descent (PNSGD) algorithm with hidden intermediate updates. Here, we first prove a privacy guarantee for shuffled PNSGD, which is investigated asymptotically when the noise is fixed for each individual but reduced as the sample size n grows. We then provide a faster decaying scheme for the magnitude of the injected noise that also guarantees the convergence of privacy loss when new data are received in an online fashion.

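For readers unfamiliar with the base algorithm, a minimal sketch of projected noisy SGD (PNSGD), whose shuffled and online variants are analysed above. The shuffling schedule and the decaying-noise scheme from the paper are not reproduced here, and all names are illustrative.

```python
import numpy as np

def pnsgd(grad_fn, project, data, w0, eta, sigma, rng=None):
    """One pass of projected noisy SGD (PNSGD) with hidden intermediate updates.

    grad_fn : callable (w, example) -> gradient of the loss on that example
    project : callable w -> projection of w onto the feasible convex set
    eta     : step size;  sigma : standard deviation of the injected Gaussian noise
    """
    rng = rng or np.random.default_rng()
    w = np.array(w0, dtype=float)
    for example in data:                          # shuffling `data` first gives shuffled PNSGD
        noisy_grad = grad_fn(w, example) + rng.normal(0.0, sigma, size=w.shape)
        w = project(w - eta * noisy_grad)         # noisy gradient step, then projection
    return w
```
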
Fri 10:54 a.m. - 11:06 a.m. | TenSEAL: A Library for Encrypted Tensor Operations Using Homomorphic Encryption (Contributed Talk) | Ayoub Benaissa

Machine learning algorithms have achieved remarkable results and are widely applied in a variety of domains. These algorithms often rely on sensitive and private data such as medical and financial records. Therefore, it is vital to draw further attention to privacy threats and the corresponding defensive techniques applied to machine learning models. In this paper, we present TenSEAL, an open-source library for Privacy-Preserving Machine Learning using Homomorphic Encryption that can be easily integrated within popular machine learning frameworks. We benchmark our implementation using MNIST and show that an encrypted convolutional neural network can be evaluated in less than a second, using less than half a megabyte of communication.

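A small usage sketch of the kind of encrypted tensor operation TenSEAL exposes through its public CKKS vector API. The encryption parameters below are illustrative tutorial-style defaults, not the benchmark configuration from the paper.

```python
import tenseal as ts

# Client side: create a CKKS context and encrypt a feature vector.
context = ts.context(ts.SCHEME_TYPE.CKKS,
                     poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
context.global_scale = 2 ** 40
context.generate_galois_keys()

enc_x = ts.ckks_vector(context, [0.1, 0.2, 0.3])

# Server side: compute directly on the ciphertext, e.g. a plaintext-weight dot product.
weights = [0.5, -1.0, 2.0]
enc_score = enc_x.dot(weights)

# Back on the client (the key holder): decrypt the approximate result (about 0.45).
print(enc_score.decrypt())
```
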
Fri 11:06 a.m. - 11:18 a.m. | Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization (Contributed Talk) | Mher Safaryan · Filip Hanzely · Peter Richtarik

Large scale distributed optimization has become the default tool for the training of supervised machine learning models with a large number of parameters and training data. Recent advancements in the field provide several mechanisms for speeding up the training, including compressed communication, variance reduction and acceleration. However, none of these methods is capable of exploiting the inherently rich data-dependent smoothness structure of the local losses beyond standard smoothness constants. In this paper, we argue that when training supervised models, smoothness matrices (information-rich generalizations of the ubiquitous smoothness constants) can and should be exploited for further dramatic gains, both in theory and practice. In order to further alleviate the communication burden inherent in distributed optimization, we propose a novel communication sparsification strategy that can take full advantage of the smoothness matrices associated with local losses. To showcase the power of this tool, we describe how our sparsification technique can be adapted to three distributed optimization algorithms, DCGD [KFJ], DIANA [MGTR] and ADIANA [AccCGD], yielding significant savings in terms of communication complexity. The new methods always outperform the baselines, often dramatically so.

Fri 11:18 a.m. - 11:30 a.m. | Leveraging Public Data for Practical Private Query Release (Contributed Talk) | Terrance Liu · Giuseppe Vietri · Thomas Steinke · Jonathan Ullman · Steven Wu

In many statistical problems, incorporating priors can significantly improve performance. However, using prior knowledge in differentially private query release has remained underexplored, despite such priors commonly being available in the form of public data, such as previous US Census releases. With the goal of releasing statistics about a private dataset, we present PMW^Pub, which, unlike existing baselines, leverages public data drawn from a related distribution as prior information. We provide a theoretical analysis and an empirical evaluation on the American Community Survey, showing that PMW^Pub outperforms state-of-the-art methods. Furthermore, our method scales well to high-dimensional data domains, where running many existing methods would be computationally infeasible.

Fri 11:30 a.m. - 12:13 p.m. | Poster Session and Break (Poster Session)

Join with Google Chrome: https://eventhosts.gather.town/app/6VP8iLAgUO3EdXsx/dpml-iclr (the posters are in poster rooms 1 and 2).

Fri 12:13 p.m. - 12:15 p.m. | Introduction for A Better Bound Gives a Hundred Rounds: Enhanced Privacy Guarantees via f-Divergences (Intro)

Fri 12:15 p.m. - 12:40 p.m. | A Better Bound Gives a Hundred Rounds: Enhanced Privacy Guarantees via f-Divergences (Invited Talk) | Lalitha Sankar

Abstract: Differential privacy (DP) has come to be accepted as a strong definition and approach to designing privacy mechanisms and assuring privacy guarantees. In practice, variants of differential privacy such as (ε,δ)-DP and Rényi DP are used for better utility. In machine learning applications wherein differentially private noise is added to each iteration of stochastic gradient descent (SGD), for a fixed choice of overall DP parameters, privacy deteriorates with each iteration. In this talk, we present a novel way of using information-theoretic methods to tighten the conversion between (ε,δ)-DP and Rényi DP, where the latter is used in the intermediate steps to add noise to SGD iterations. Compared to the state of the art, we show that our bounds can allow 100 or more SGD iterations for training deep learning models under the same privacy budget.

Bio: Lalitha Sankar is an Associate Professor in the School of Electrical, Computer, and Energy Engineering at Arizona State University. She received her doctorate from Rutgers University, her masters from the University of Maryland, and her Bachelors degree from the Indian Institute of Technology, Bombay. Her research is at the intersection of information theory and learning theory and its applications to identifying meaningful metrics for information privacy and algorithmic fairness. She received the NSF CAREER award in 2014 and currently leads an NSF- and Google-funded effort on using learning techniques to assess COVID-19 exposure risk in a secure and privacy-preserving manner.

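For orientation, a sketch of the standard Rényi-DP accounting pipeline that the talk's f-divergence bounds tighten: compose the per-step Gaussian-mechanism RDP guarantee over T iterations, then convert to (ε,δ) with the usual bound ε = ε_RDP(α) + log(1/δ)/(α−1), minimized over α. This is the baseline conversion without subsampling, written as an assumption for illustration, not the improved bound from the talk.

```python
import math

def gaussian_rdp(alpha, sigma):
    """RDP of the Gaussian mechanism (sensitivity 1, noise multiplier sigma) at order alpha."""
    return alpha / (2.0 * sigma ** 2)

def eps_from_rdp(T, sigma, delta, alphas=range(2, 256)):
    """Baseline conversion: compose T Gaussian steps in RDP, then map to (eps, delta)."""
    best = float("inf")
    for alpha in alphas:
        eps_rdp = T * gaussian_rdp(alpha, sigma)           # RDP composes additively
        eps = eps_rdp + math.log(1.0 / delta) / (alpha - 1)
        best = min(best, eps)
    return best

# Example: 100 SGD iterations with noise multiplier 1.0 and delta = 1e-5.
print(eps_from_rdp(T=100, sigma=1.0, delta=1e-5))
```
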
Fri 12:40 p.m. - 12:45 p.m. | Q&A for A Better Bound Gives a Hundred Rounds: Enhanced Privacy Guarantees via f-Divergences (Q&A)

Fri 12:45 p.m. - 1:00 p.m. | Concluding Remarks and Awards (Concluding Remarks) | Fatemehsadat Mireshghallah · Praneeth Vepakomma

Poster | Federated Learning's Blessing: FedAvg has Linear Speedup | Zhaonan Qu · Kaixiang Lin · Zhaojian Li · Jiayu Zhou · Zhengyuan Zhou

Federated learning (FL) learns a model jointly from a set of participating devices without sharing each other's privately held data. The characteristics of non-i.i.d. data across the network, low device participation, high communication costs, and the mandate that data remain private bring challenges in understanding the convergence of FL algorithms, particularly with regard to how convergence scales with the number of participating devices. In this paper, we focus on Federated Averaging (FedAvg), arguably the most popular and effective FL algorithm class in use today, and provide a unified and comprehensive study of its convergence rate. Although FedAvg has recently been studied by an emerging line of literature, it remains open how FedAvg's convergence scales with the number of participating devices in the fully heterogeneous FL setting, a crucial question whose answer would shed light on the performance of FedAvg in large FL systems. We fill this gap by providing a unified analysis that establishes convergence guarantees for FedAvg under three classes of problems: strongly convex smooth, convex smooth, and overparameterized strongly convex smooth problems. We show that FedAvg enjoys linear speedup in each case, although with different convergence rates and communication efficiencies. While there have been linear speedup results from distributed optimization that assume full participation, ours are the first to establish linear speedup for FedAvg under both statistical and system heterogeneity. For strongly convex and convex problems, we also characterize the corresponding convergence rates for the Nesterov accelerated FedAvg algorithm, which are the first linear speedup guarantees for momentum variants of FedAvg in the convex setting. Empirical studies of the algorithms in various settings support our theoretical results.

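Since FedAvg is the object of this analysis (and of several other entries on this page), a minimal sketch of one FedAvg round: each sampled client runs E local SGD steps from the current global model and the server averages the resulting local models. The data layout and gradient callback are illustrative assumptions, not the paper's code.

```python
import numpy as np

def fedavg_round(w_global, clients, grad_fn, local_steps, lr, num_sampled, rng=None):
    """One round of Federated Averaging (FedAvg) over a sampled subset of clients.

    clients : list of per-client datasets (any object grad_fn understands)
    grad_fn : callable (w, dataset) -> stochastic gradient computed on that client's data
    """
    rng = rng or np.random.default_rng()
    sampled = rng.choice(len(clients), size=num_sampled, replace=False)
    local_models = []
    for i in sampled:
        w = w_global.copy()
        for _ in range(local_steps):              # E local SGD steps on client i
            w = w - lr * grad_fn(w, clients[i])
        local_models.append(w)
    return np.mean(local_models, axis=0)          # server averages the local models
```
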
Poster | Distributed Gaussian Differential Privacy Via Shuffling | Kan Chen · Qi Long

Traditionally, there are two models for implementing differential privacy: the local model and the centralized model. The shuffled model is a relatively new model that aims to provide greater accuracy while preserving privacy by shuffling batches of similar data. In this paper, we consider the analytic privacy study of a shuffled model for f-differential privacy (f-DP), a new relaxation of traditional (ε,δ)-differential privacy. We provide a powerful technique to import the existing shuffled-model results proven for (ε,δ)-DP to f-DP, with which we derive a simple and easy-to-interpret theorem of privacy amplification by shuffling for f-DP. Furthermore, we prove that compared with the original shuffled model of Cheu et al. (2019), f-DP provides a tighter upper bound in terms of the privacy analysis of sum queries. The approach of f-DP can be applied to broader classes of models to achieve more accurate privacy analysis.

Poster | AsymmetricML: An Asymmetric Decomposition Framework for Privacy-Preserving DNN Training and Inference | Yue Niu · Salman Avestimehr

Leveraging specialized parallel hardware, such as GPUs, to conduct DNN training and inference significantly reduces time. However, data on these platforms is visible to any party, which, in certain circumstances, raises great concerns about data misuse. Trusted execution environments (TEEs) protect data privacy by performing training/inference in a secure environment, but at the cost of serious performance degradation. To bridge the gap between privacy and computing performance, we propose an asymmetric model-splitting framework, AsymmetricML, to (1) exploit the computing power of specialized parallel hardware and (2) preserve data privacy in TEEs during DNN training/inference. AsymML asymmetrically splits a DNN model into two parts: the first part features most of the sensitive data information but less computation, while most of the computation is performed in the second part. Evaluations on typical models (VGG, ResNet) show that the framework delivers a 5.9x speedup in model inference and a 5.4x speedup in model training compared with TEE-only execution.

Poster | Layer-wise Characterization of Latent Information Leakage in Federated Learning | Fan Mo · Anastasia Borovykh · Mohammad Malekzadeh · Hamed Haddadi · Soteris Demetriou

Training deep neural networks via federated learning allows clients to share, instead of the original data, only the model trained on their data. Prior work has demonstrated that in practice a client's private information, unrelated to the main learning task, can be discovered from the model's gradients, which compromises the promised privacy protection. However, there is still no formal approach for quantifying the leakage of private information via the shared updated model or gradients. In this work, we analyze property inference attacks and define two metrics based on (i) an adaptation of the empirical V-information, and (ii) a sensitivity analysis using Jacobian matrices, allowing us to measure changes in the gradients with respect to latent information. We show the applicability of our proposed metrics in localizing private latent information in a layer-wise manner and in two settings where (i) we have or (ii) we do not have knowledge of the attackers' capabilities. We evaluate the proposed metrics for quantifying information leakage on three real-world datasets using three benchmark models.

Poster | Causally Constrained Data Synthesis for Private Data Release | Varun Chandrasekaran · Darren Edge · Somesh Jha · Amit Sharma · Cheng Zhang · Shruti Tople

Making evidence-based decisions requires data. However, for real-world applications, the privacy of data is critical. Using synthetic data that reflects certain statistical properties of the original data preserves the privacy of the original data. To this end, prior works utilize differentially private data release mechanisms to provide formal privacy guarantees. However, such mechanisms have unacceptable privacy vs. utility trade-offs. We propose incorporating causal information into the training process to favorably modify the aforementioned trade-off. We theoretically prove that generative models trained with additional causal knowledge provide stronger differential privacy guarantees. Empirically, we evaluate our solution comparing different models based on variational auto-encoders (VAEs), and show that causal information improves resilience to membership inference, with improvements in downstream utility.

Poster | MPCLeague: Robust 4-party Computation for Privacy-Preserving Machine Learning | Nishat Koti · Arpita Patra · Ajith Suresh

Secure computation has demonstrated its potential in several practical use-cases, particularly in privacy-preserving machine learning (PPML). Robustness, the property that guarantees output delivery irrespective of adversarial behaviour, and efficiency are the two first-order asks of a successfully deployable PPML framework. Towards this, we propose the first robust, highly efficient mixed-protocol framework, MPCLeague, that works with four parties, offers malicious security, and supports computation over rings. MPCLeague offers a multifold improvement over ABY3 (Mohassel et al., CCS'18), a 3-party framework achieving security with abort, and improves upon Trident (Chaudhari et al., NDSS'20), a 4-party framework achieving security with fairness. MPCLeague's competence is tested with extensive benchmarking for deep neural networks such as LeNet and VGG16, and for support vector machines.

Poster | Privacy Amplification via Iteration for Shuffled and Online PNSGD | Matteo Sordello · Zhiqi Bu · Jinshuo Dong · Weijie J Su

We consider the framework of privacy amplification via iteration, originally proposed by Feldman et al. and then simplified by Asoodeh et al. in their analysis via the contraction coefficient. This line of work studies the privacy guarantees obtained by the projected noisy stochastic gradient descent (PNSGD) algorithm with hidden intermediate updates. Here, we first prove a privacy guarantee for shuffled PNSGD, which is investigated asymptotically when the noise is fixed for each individual but reduced as the sample size n grows. We then provide a faster decaying scheme for the magnitude of the injected noise that also guarantees the convergence of privacy loss when new data are received in an online fashion.

Poster | Computing Differential Privacy Guarantees for Heterogeneous Compositions Using FFT | Antti Koskela · Antti Honkela

The recently proposed Fast Fourier Transform (FFT)-based accountant for evaluating (ε,δ)-differential privacy guarantees using the privacy loss distribution formalism has been shown to give tighter bounds than commonly used methods such as Rényi accountants when applied to compositions of homogeneous mechanisms. This approach is also applicable to certain discrete mechanisms that cannot be analysed with Rényi accountants. In this paper, we extend this approach to compositions of heterogeneous mechanisms. We carry out a full error analysis that allows choosing the parameters of the algorithm such that a desired accuracy is obtained. Using our analysis, we also give a bound for the computational complexity in terms of the error, which is analogous to and slightly tightens the one given by Murtagh and Vadhan (2018). We also show how to speed up the evaluation of tight privacy guarantees using the Plancherel theorem at the cost of increased pre-computation and memory usage.

Poster | Membership Inference Attack on Graph Neural Networks | Iyiola Emmanuel Olatunji · Wolfgang Nejdl · Megha Khosla

We focus on how trained Graph Neural Network (GNN) models could leak information about the member nodes that they were trained on. We introduce two realistic inductive settings for carrying out membership inference attacks on GNNs. While choosing the simplest possible attack model that utilizes the posteriors of the trained model, we thoroughly analyze the properties of GNNs which dictate the differences in their robustness towards membership inference (MI) attacks. The surprising and worrying fact is that the attack is successful even if the target model generalizes well. While in traditional machine learning models overfitting is considered the main cause of such leakage, we show that in GNNs the additional structural information is the major contributing factor. We support our findings by extensive experiments on four representative GNN models. On a positive note, we identify properties of certain models which make them less vulnerable to MI attacks than others.

Poster | Privacy and Integrity Preserving Training Using Trusted Hardware | Seyedeh Hanieh Hashemi · Yongqin Wang · Murali Annavaram

Privacy and security-related concerns are growing as machine learning reaches diverse application domains. Data holders want to train with private data while exploiting accelerators, such as GPUs, that are hosted in the cloud. However, cloud systems are vulnerable to attackers who compromise the privacy of data and the integrity of computations. This work presents DarKnight, a framework for large DNN training that protects input privacy and computation integrity. DarKnight relies on cooperative execution between trusted execution environments (TEEs) and accelerators, where the TEE provides privacy and integrity verification, while accelerators perform the computation-heavy linear algebraic operations.

Poster | Practical Defences Against Model Inversion Attacks for Split Neural Networks | Tom Titcombe · Adam Hall · Pavlos Papadopoulos · Daniele Romanini

We describe a threat model under which a split network-based federated learning system is susceptible to a model inversion attack by a malicious computational server. We demonstrate that the attack can be successfully performed with limited knowledge of the data distribution by the attacker. We propose a simple additive noise method to defend against model inversion, finding that the method can significantly reduce attack efficacy at an acceptable accuracy trade-off on MNIST. Furthermore, we show that NoPeekNN, an existing defensive method, protects different information from exposure, suggesting that a combined defence is necessary to fully protect private user data.

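A minimal PyTorch sketch of the additive-noise idea in the abstract above: the client perturbs the cut-layer activations before they are sent to the (potentially malicious) computational server. The noise scale trades attack efficacy against accuracy; the module and its layer sizes are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

class NoisySplitClient(nn.Module):
    """Client half of a split network that adds Gaussian noise to the cut-layer output."""

    def __init__(self, noise_scale=0.1):
        super().__init__()
        self.features = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU())
        self.noise_scale = noise_scale

    def forward(self, x):
        h = self.features(x)
        # Perturb whatever leaves the device so the server cannot invert it exactly.
        return h + self.noise_scale * torch.randn_like(h)

# The server receives the noisy activations and runs the remaining layers of the model.
```
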
Poster | TenSEAL: A Library for Encrypted Tensor Operations Using Homomorphic Encryption | Ayoub Benaissa

Machine learning algorithms have achieved remarkable results and are widely applied in a variety of domains. These algorithms often rely on sensitive and private data such as medical and financial records. Therefore, it is vital to draw further attention to privacy threats and the corresponding defensive techniques applied to machine learning models. In this paper, we present TenSEAL, an open-source library for Privacy-Preserving Machine Learning using Homomorphic Encryption that can be easily integrated within popular machine learning frameworks. We benchmark our implementation using MNIST and show that an encrypted convolutional neural network can be evaluated in less than a second, using less than half a megabyte of communication.

Poster | Understanding Clipped FedAvg: Convergence and Client-Level Differential Privacy | Xinwei Zhang · Xiangyi Chen · Jinfeng Yi · Steven Wu · Mingyi Hong

Providing privacy guarantees has been one of the primary motivations of Federated Learning (FL). However, to guarantee client-level differential privacy (DP) in FL algorithms, the clients' transmitted model updates have to be clipped before adding privacy noise. This clipping operation is substantially different from its counterpart in centralized differentially private SGD and has not been well understood. In this paper, we first empirically demonstrate that clipped FedAvg can perform surprisingly well even with substantial data heterogeneity when training neural networks, which is partly because the clients' updates become similar for several popular deep architectures. Based on this key observation, we provide a convergence analysis of a DP FedAvg algorithm and highlight the relationship between clipping bias and the distribution of the clients' updates. Our result leads to a natural guarantee of client-level DP for FedAvg. To the best of our knowledge, this is the first work that rigorously investigates theoretical and empirical issues regarding the clipping operation in FL algorithms.

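A sketch of the clipping-plus-noise aggregation step analysed above: each client's update is clipped to norm C, the clipped updates are averaged, and Gaussian noise calibrated to the per-client contribution C/n is added. This is the generic DP-FedAvg aggregation pattern, written here under stated assumptions rather than as the paper's exact procedure.

```python
import numpy as np

def dp_fedavg_aggregate(deltas, clip_norm, noise_multiplier, rng=None):
    """Aggregate client model updates with clipping and Gaussian noise for client-level DP.

    deltas           : list of client updates (same-shape numpy arrays)
    clip_norm        : C, the maximum L2 norm allowed per client update
    noise_multiplier : z, giving noise standard deviation z * C / n
    """
    rng = rng or np.random.default_rng()
    n = len(deltas)
    clipped = []
    for d in deltas:
        norm = np.linalg.norm(d)
        clipped.append(d * min(1.0, clip_norm / (norm + 1e-12)))   # scale down, never up
    mean = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / n, size=mean.shape)
    return mean + noise
```
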
Poster | Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization | Mher Safaryan · Filip Hanzely · Peter Richtarik

Large scale distributed optimization has become the default tool for the training of supervised machine learning models with a large number of parameters and training data. Recent advancements in the field provide several mechanisms for speeding up the training, including compressed communication, variance reduction and acceleration. However, none of these methods is capable of exploiting the inherently rich data-dependent smoothness structure of the local losses beyond standard smoothness constants. In this paper, we argue that when training supervised models, smoothness matrices (information-rich generalizations of the ubiquitous smoothness constants) can and should be exploited for further dramatic gains, both in theory and practice. In order to further alleviate the communication burden inherent in distributed optimization, we propose a novel communication sparsification strategy that can take full advantage of the smoothness matrices associated with local losses. To showcase the power of this tool, we describe how our sparsification technique can be adapted to three distributed optimization algorithms, DCGD [KFJ], DIANA [MGTR] and ADIANA [AccCGD], yielding significant savings in terms of communication complexity. The new methods always outperform the baselines, often dramatically so.

Poster | Meta Federated Learning | Omid Aramoon · Gang Qu · Pin-Yu Chen · Yuan Tian

Due to its distributed methodology alongside its privacy-preserving features, Federated Learning (FL) is vulnerable to training-time backdoor attacks. Contemporary defenses against backdoor attacks in FL require direct access to each individual client's update, which is not feasible in recent FL settings where Secure Aggregation is deployed. In this study, we seek to answer the following question: "Is it possible to defend against backdoor attacks when secure aggregation is in place?" To this end, we propose Meta Federated Learning (Meta-FL), a novel variant of FL which not only is compatible with the secure aggregation protocol but also facilitates defense against backdoor attacks.

Poster | Talk Less, Smile More: Reducing Communication with Distributed Auto-Differentiation | Bradley Baker · Vince Calhoun · Barak Pearlmutter · Sergey Plis

The gradient has long been the most common shared statistic for distributed machine learning; however, distributed deep neural networks (DNNs) tend to be large, so transmitting gradients can consume considerable bandwidth. Methods such as sparsification and quantization have emerged in attempts to reduce this, but the focus remains on compressing gradients rather than sharing some other value. Here, we present an unexplored shift away from gradients towards a statistic which is more communication-friendly than the gradient, yet still grounded in mathematically correct optimization. The process, inspired by auto-differentiation, also provides unique insights into how gradients are composed via the outer product. This insight can be further exploited to obtain a low-rank approximation of the gradients, which further reduces communication while providing a better approximation of the gradient than other low-rank compression methods.

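To make the outer-product observation concrete: for a linear layer, the weight gradient is the product of the layer input and the backpropagated error, so a worker can transmit those two thin factors instead of the full gradient matrix. A hedged NumPy sketch of that arithmetic (no low-rank truncation shown, and not the paper's statistic):

```python
import numpy as np

# For a linear layer y = x @ W with loss L:
#   dL/dW = x.T @ delta,  where delta = dL/dy   (x is B x d_in, delta is B x d_out).
# Sending (x, delta) costs B * (d_in + d_out) numbers versus d_in * d_out for the
# gradient, a large saving whenever the batch size B is small relative to the widths.

def factored_gradient(x, delta):
    """Return the thin factors a worker could share instead of the full gradient."""
    return x, delta

def reconstruct_gradient(x, delta):
    """Exact weight gradient recovered from the factors (outer products summed over the batch)."""
    return x.T @ delta
```
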
Poster | Does Differential Privacy Defeat Data Poisoning? | Matthew Jagielski · Alina Oprea

Data poisoning attacks have attracted considerable interest from both the practical and theoretical machine learning communities. Recently, following widespread adoption for its privacy properties, differential privacy has been proposed as a defense against data poisoning attacks. In this paper, we show that the connection between poisoning and differential privacy is more complicated than it would appear. We argue that differential privacy itself does not serve as a defense, but that differential privacy benefits from robust machine learning algorithms, which explains much of differential privacy's success against poisoning.

Poster | PyVertical: A Vertical Federated Learning Framework for Multi-headed SplitNN | Daniele Romanini · Adam Hall · Pavlos Papadopoulos · Tom Titcombe · Abbas Ismail · Tudor Cebere · Robert Sandmann · Robin Roehm · Michael Hoeh

We introduce PyVertical, a framework supporting vertical federated learning using split neural networks. The proposed framework allows a data scientist to train neural networks on data features vertically partitioned across multiple owners while keeping raw data on an owner's device. To link entities shared across the different datasets' partitions, we use Private Set Intersection on IDs associated with data points. To demonstrate the validity of the proposed framework, we present the training of a simple dual-headed split neural network for an MNIST classification task, with data samples vertically distributed across two data owners and a data scientist.

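A minimal PyTorch sketch of the dual-headed split architecture described above: each data owner holds a small head over its vertical slice of the features, and the data scientist's trunk consumes the concatenated head outputs. PSI-based ID alignment is assumed to have happened already, and all module names and sizes are illustrative, not PyVertical's API.

```python
import torch
import torch.nn as nn

class OwnerHead(nn.Module):
    """Runs on a data owner's device over its vertical partition of the features."""
    def __init__(self, in_features, out_features=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_features, out_features), nn.ReLU())

    def forward(self, x):
        return self.net(x)

class ScientistTrunk(nn.Module):
    """Runs at the data scientist; sees only the heads' activations, never raw data."""
    def __init__(self, in_features=64, num_classes=10):
        super().__init__()
        self.net = nn.Linear(in_features, num_classes)

    def forward(self, h1, h2):
        return self.net(torch.cat([h1, h2], dim=1))

# Toy forward pass: MNIST pixels split vertically between two owners (392 features each).
head_a, head_b = OwnerHead(392), OwnerHead(392)
trunk = ScientistTrunk(in_features=64)
xa, xb = torch.randn(8, 392), torch.randn(8, 392)
logits = trunk(head_a(xa), head_b(xb))   # gradients flow back through both heads
```
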
Poster | On Privacy and Confidentiality of Communications in Organizational Graphs | Masoumeh Shafieinejad · Huseyin Inan · Marcello Hasegawa · Robert Sim

Machine-learned models trained on organizational communication data, such as emails in an enterprise, carry unique risks of breaching confidentiality, even if the model is intended only for internal use. This work shows how confidentiality is distinct from privacy in an enterprise context, and aims to formulate an approach to preserving confidentiality while leveraging principles from differential privacy (DP). Works that apply DP techniques to natural language processing tasks usually assume independently distributed data and overlook potential correlation among the records. Ignoring this correlation results in a fictional promise of privacy, while, conversely, extending DP techniques to include group privacy is over-cautious and severely impacts model utility. We introduce a middle-ground solution, proposing a model that captures the correlation in the social network graph and incorporates this correlation in the privacy calculations through Pufferfish privacy principles.

Poster | Towards Prior-Free Approximately Truthful One-Shot Auction Learning via Differential Privacy | Daniel Reusche · Nicolás Della Penna

Designing truthful, revenue-maximizing auctions is a core problem of auction design. Multi-item settings have long been elusive. Recent work of Dütting et al. (2020) introduces effective deep learning techniques to find such auctions for the prior-dependent setting, in which distributions about bidder preferences are known. One remaining problem is to obtain priors in a way that excludes the possibility of manipulating the resulting auctions. Using techniques from differential privacy for the construction of approximately truthful mechanisms, we modify the RegretNet approach to be applicable to the prior-free setting. In this more general setting, no distributional information is assumed, but we trade this property for worse performance.

Poster | FedGraphNN: A Federated Learning System and Benchmark for Graph Neural Networks | Chaoyang He · Keshav Balasubramanian · Emir Ceyani · Yu Rong · Junzhou Huang · Murali Annavaram · Salman Avestimehr

Graph Neural Network (GNN) research is rapidly growing thanks to the capacity of GNNs to learn representations from graph-structured data. However, centralizing a massive amount of real-world graph data for GNN training is prohibitive due to user-side privacy concerns, regulation restrictions, and commercial competition. Federated learning (FL), a trending distributed learning paradigm, aims to solve this challenge while preserving privacy. Despite recent advances in CV and NLP, there is no suitable platform for the federated training of GNNs. To this end, we introduce FedGraphNN, an open research federated learning system and benchmark to facilitate GNN-based FL research. FedGraphNN is built on a unified formulation of federated GNNs and supports commonly used datasets, GNN models, FL algorithms, and flexible APIs. We also include a new molecular dataset, hERG, to promote research exploration. Our experimental results highlight significant challenges in federated GNN training: federated GNNs perform worse on most datasets with a non-IID split than centralized GNNs, and the GNN model that performs best in centralized training may not hold its advantage in the federated setting. These results imply that more research effort is needed to unravel the mystery of federated GNN training. Moreover, our system performance analysis demonstrates that the FedGraphNN system is affordable for most research labs with a few GPUs. FedGraphNN will be regularly updated and welcomes input from the community.

Poster | Direct Federated Neural Architecture Search | Anubhav Garg · Amit Saha · Debojyoti Dutta

Neural Architecture Search (NAS) is a collection of methods to craft the way neural networks are built. We apply this idea to Federated Learning (FL), wherein neural networks with a predefined architecture are trained on the client/device data. This approach is not optimal because the model developers cannot observe the local data and, hence, are unable to build highly accurate and efficient models. NAS is promising for FL, as it can automatically search for global and personalized models on non-IID data. Most NAS methods are computationally expensive and require fine-tuning after the search, making it a two-stage, complex process with possible human intervention. Thus there is a need for end-to-end NAS which can run on the heterogeneous data and resource distribution typically seen in an FL scenario. In this paper, we present an effective approach for direct federated NAS which is hardware agnostic, computationally lightweight, and a one-stage method to search for ready-to-deploy neural network models. Our results show an order of magnitude reduction in resource consumption while edging out prior art in accuracy. This opens up a window of opportunity to create optimized and computationally efficient federated learning systems.

Poster | Differentially Private Multi-Task Learning | Shengyuan Hu · Steven Wu · Virginia Smith

Many problems in machine learning rely on multi-task learning (MTL), in which the goal is to solve multiple related machine learning tasks simultaneously. MTL is particularly relevant for privacy-sensitive applications in areas such as healthcare, finance, and IoT computing, where sensitive data from multiple, varied sources are shared for the purpose of learning. In this work, we formalize notions of multi-task privacy via joint differential privacy (JDP), a relaxation of differential privacy (DP) for mechanism design and distributed optimization. We then propose a differentially private algorithm for the commonly used mean-regularized MTL objective. We analyze our objective and solver, providing certifiable guarantees on both privacy and utility. Our initial work provides groundwork for privacy-preserving multi-task learning and highlights several interesting directions of future study.

Poster | A Graphical Model Perspective on Federated Learning | Christos Louizos · Matthias Reisser · Joseph Soriaga · Max Welling

Federated learning describes the distributed training of models across multiple clients while keeping the data private on-device. In this work, we formalize the server-orchestrated federated learning process as a hierarchical latent variable model where the server provides the parameters of a prior distribution over the client-specific model parameters. We then show that, with simple Gaussian priors and a hard version of the well-known Expectation-Maximization (EM) algorithm, learning in such a model corresponds to FedAvg, the most popular algorithm for this federated learning setting. This perspective on federated learning unifies several recent works in the field and opens up the possibility for extensions through different choices in the hierarchical model. Based on this view, we further propose a variant of the hierarchical model that employs prior distributions to promote sparsity. By using the hard-EM algorithm for learning, we obtain FedSparse, a procedure that can learn sparse neural networks in the federated learning setting. FedSparse reduces communication costs from client to server and vice versa, as well as the computational cost of inference with the sparsified network, both of which are of great practical importance in federated learning.

Poster | Syft: A Platform for Universally Deployable Structured Transparency | Adam Hall

We present Syft, a general-purpose framework which combines a core group of privacy-enhancing technologies that facilitate a universal set of structured transparency systems. This framework is demonstrated through the design and implementation of a novel privacy-preserving inference information flow, in which we pass homomorphically encrypted activation signals through a split neural network for inference. We show that splitting the model further up the computation chain significantly reduces the computation time of inference and the payload size of activation signals, at the cost of model secrecy. We evaluate our proposed flow with respect to its provision of the core structured transparency principles.

Poster | FedPandemic: A Cross-Device Federated Learning Approach Towards Elementary Prognosis of Diseases During a Pandemic | Aman Priyanshu · Rakshit Naidu

The amount of data, manpower, and capital required to understand, evaluate, and agree on a group of symptoms for the elementary prognosis of pandemic diseases is enormous. In this paper, we present FedPandemic, a novel noise-implementation algorithm integrated with cross-device federated learning for elementary symptom prognosis during a pandemic, taking COVID-19 as a case study. Our results display consistency and enhanced robustness in recovering the common symptoms displayed by the disease, paving a faster and cheaper path towards symptom retrieval while also preserving the privacy of patients' symptoms via federated learning.

Poster | Towards Causal Federated Learning - For Enhanced Robustness and Privacy | Sreya Francis

Federated learning enables participating entities to collaboratively learn a shared prediction model while keeping their training data local. As this technique prevents data collection and aggregation, it helps reduce the associated privacy risks to a great extent. However, it still remains vulnerable to numerous security attacks wherein a few malicious participating entities work towards inserting backdoors, degrading the generated aggregated model, or inferring the data owned by other participating entities. In this paper, we propose an approach to learn causal features common to all participating clients in a federated learning setup and analyse how it enhances out-of-distribution accuracy and privacy.

Poster | Personalized Federated Learning: A Unified Framework and Universal Optimization Techniques | Filip Hanzely · Boxin Zhao · Mladen Kolar

We study the optimization aspects of personalized Federated Learning (FL). We develop a universal optimization theory applicable to all convex personalized FL models in the literature. In particular, we propose a general personalized objective capable of recovering essentially any existing personalized FL objective as a special case. We design several optimization techniques to minimize the general objective, namely a tailored variant of Local SGD and variants of accelerated coordinate descent / accelerated SVRCD. We demonstrate the practicality and/or optimality of our methods both in terms of communication and local computation. Lastly, we argue about the implications of our general optimization theory when applied to solve specific personalized FL objectives.

Poster | Gradient-Masked Federated Optimization | Irene Tenison · Sreya Francis · Irina Rish

Federated Averaging (FedAvg) has become the most popular federated learning algorithm due to its simplicity and low communication overhead. We use simple examples to show that FedAvg has the tendency to sew together the optima across the participating clients. These sewed optima exhibit poor generalization when used on a new client with a new data distribution. Inspired by the invariance principles in Arjovsky et al. (2019) and Parascandolo et al. (2020), we focus on learning a model that is locally optimal across the different clients simultaneously. We propose an algorithm that masks gradients (the AND-mask from Parascandolo et al.) across the clients and uses them to carry out server model updates. We show that this algorithm achieves similar accuracy (in- and out-of-distribution) and requires fewer communication rounds to converge than FedAvg, especially when the data is non-identically distributed.

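A minimal sketch of the masking step described above: for each parameter, keep the averaged gradient only where the clients' gradient signs agree strongly enough (the AND-mask of Parascandolo et al.) and zero it elsewhere before the server update. The function name and the agreement threshold are illustrative, not the authors' implementation.

```python
import numpy as np

def and_masked_update(client_grads, agreement=1.0):
    """Server-side gradient masking across clients.

    client_grads : list of same-shape gradient arrays, one per client
    agreement    : threshold on the absolute mean sign per coordinate
                   (1.0 = strict AND-mask: all clients must share the sign)
    """
    grads = np.stack(client_grads)                     # shape: (num_clients, ...)
    mean_sign = np.mean(np.sign(grads), axis=0)        # in [-1, 1] per coordinate
    mask = (np.abs(mean_sign) >= agreement).astype(grads.dtype)
    return mask * np.mean(grads, axis=0)               # masked average update
```
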
Poster | Prior-Free Auctions for the Demand Side of Federated Learning | Andreas Haupt · Vaikkunth Mugunthan

Federated learning (FL) is a paradigm that allows distributed clients to learn a shared machine learning model without sharing their sensitive training data. However, for a successful and credible federated learning system, parties must be incentivized according to their contribution. Federated learning architectures require resources to fund a central server and, depending on the use-case, reimburse clients for their participation. While the problem of distributing resources to incentivize participation is important, a sustainable system first needs to obtain such resources. We propose a federated learning system (Federated Incentive Payments via Prior-Free Auctions, FIPFA) in the semi-honest trust model that can collect resources from self-interested clients using insights from prior-free auction design. Our system works even if clients are willing to make monetary contributions of differing amounts in exchange for high-quality models and the server has no prior knowledge of these preferences. We run experiments on the MNIST dataset to test the model quality and incentive properties of our system.

Poster | Heterogeneous Zero-Shot Federated Learning with New Classes for Audio Classification | Gautham Krishna Gudur · Satheesh Perepu

Federated learning is an effective way of extracting insights from different user devices while preserving the privacy of users. However, new classes with completely unseen data distributions can stream across any device in a federated learning setting, and their data cannot be accessed by the global server or other users. To this end, we propose a unified zero-shot framework to handle these challenges during federated learning. We simulate two scenarios: (1) when new class labels are not reported by the user, the traditional FL setting is used; (2) when new class labels are reported by the user, we synthesize Anonymized Data Impressions by calculating class similarity matrices corresponding to each device's new classes, followed by unsupervised clustering to distinguish between new classes across different users. Moreover, our proposed framework can also handle statistical heterogeneity in both labels and models across the participating users. We empirically evaluate our framework on-device across different communication rounds (FL iterations) with new classes in both local and global updates, along with heterogeneous labels and models, on two widely used audio classification applications, keyword spotting and urban sound classification, and observe average deterministic accuracy increases of about 4.041% and 4.258%, respectively.

Poster | SWIFT: Super-fast and Robust Privacy-Preserving Machine Learning | Nishat Koti · Mahak Pancholi · Arpita Patra · Ajith Suresh

Privacy-preserving machine learning (PPML) has seen a visible shift towards the adoption of the Secure Outsourced Computation (SOC) paradigm due to the heavy computation that it entails. In the SOC paradigm, computation is outsourced to a set of powerful and specially equipped servers that provide service on a pay-per-use basis. In this work, we propose SWIFT, a robust PPML framework for a range of machine learning (ML) algorithms in the SOC setting that guarantees output delivery to the users irrespective of any adversarial behaviour. Robustness, a highly desirable feature, encourages user participation without the fear of denial of service. At the heart of our framework lies a highly efficient, maliciously secure, three-party computation (3PC) protocol over rings that provides guaranteed output delivery (GOD) in the honest-majority setting. To the best of our knowledge, SWIFT is the first robust and efficient PPML framework in the 3PC setting and is as fast as (and in some cases strictly better than) the best-known 3PC framework BLAZE (Patra et al., NDSS'20), which only achieves fairness. We extend our 3PC framework to four parties (4PC). In this regime, SWIFT is as fast as the best-known fair 4PC framework Trident (Chaudhari et al., NDSS'20) and twice as fast as the best-known robust 4PC framework FLASH (Byali et al., PETS'20). We demonstrate our framework's practical relevance by benchmarking ML algorithms such as logistic regression and deep neural networks such as VGG16 and LeNet, both over a 64-bit ring in a WAN setting. For deep NNs, our results testify to our claims that we provide an improved security guarantee while incurring no additional overhead for 3PC and obtaining a 2x improvement for 4PC.
