

Session

Tue AM Talks

Kevin Swersky

Tue 1 May 9:00 - 9:45 PDT

Invited Talk
Augmenting Clinical Intelligence with Machine Intelligence

Suchi Saria

Healthcare is rapidly becoming a data-intensive discipline, driven by increasing digitization of health data, novel measurement technologies, and new policy-based incentives. Critical decisions about whom and how to treat can be made more precisely by layering an individual's data over that from a population. In this talk, I will begin by introducing the types of health data currently being collected and the challenges associated with learning models from these data. Next, I will describe new techniques that leverage probabilistic methods and counterfactual reasoning for tackling the aforementioned challenges. Finally, I will introduce areas where statistical machine-learning techniques are leading to new classes of computational diagnostic and treatment planning tools that tease out subtle information from "messy" observational datasets and provide reliable inferences given detailed context about the individual patient.

Tue 1 May 9:45 - 10:00 PDT

Oral
Learning to Represent Programs with Graphs

Miltiadis Allamanis · Marc Brockschmidt · Mahmoud Khademi

Learning tasks on source code (i.e., formal languages) have been considered recently, but most work has tried to transfer natural language methods and does not capitalize on the unique opportunities offered by code's known syntax. For example, long-range dependencies induced by using the same variable or function in distant locations are often not considered. We propose to use graphs to represent both the syntactic and semantic structure of code and use graph-based deep learning methods to learn to reason over program structures.

In this work, we present how to construct graphs from source code and how to scale Gated Graph Neural Networks training to such large graphs. We evaluate our method on two tasks: VarNaming, in which a network attempts to predict the name of a variable given its usage, and VarMisuse, in which the network learns to reason about selecting the correct variable that should be used at a given program location. Our comparison to methods that use less structured program representations shows the advantages of modeling known structure, and suggests that our models learn to infer meaningful names and to solve the VarMisuse task in many cases. Additionally, our testing showed that VarMisuse identifies a number of bugs in mature open-source projects.
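
To make the graph construction and message passing concrete, here is a minimal, hedged sketch: it builds a tiny graph over a two-statement snippet using NextToken and LastUse edges (two of the edge types the paper describes) and runs one gated message-passing step over randomly initialized node states. The toy snippet, hidden size, and parameters are illustrative assumptions, not the authors' implementation or the full Gated Graph Neural Network.

```python
# Illustrative sketch only: a tiny "program graph" with NextToken (syntactic)
# and LastUse (semantic) edges, plus one gated message-passing step.
import numpy as np

rng = np.random.default_rng(0)

# Toy token sequence for:  x = y;  z = x + y;
tokens = ["x", "=", "y", ";", "z", "=", "x", "+", "y", ";"]

edges = []  # (source, target, edge_type)
# Syntactic backbone: connect consecutive tokens.
for i in range(len(tokens) - 1):
    edges.append((i, i + 1, "NextToken"))
# Semantic edges: connect each variable use back to its previous use.
last_use = {}
for i, tok in enumerate(tokens):
    if tok.isidentifier():
        if tok in last_use:
            edges.append((i, last_use[tok], "LastUse"))
        last_use[tok] = i

D = 8                                    # hidden size (assumed)
H = rng.normal(size=(len(tokens), D))    # initial node states (stand-in for token embeddings)
W_msg = {t: rng.normal(size=(D, D)) / np.sqrt(D) for t in ("NextToken", "LastUse")}
W_z, U_z = rng.normal(size=(D, D)), rng.normal(size=(D, D))
W_h, U_h = rng.normal(size=(D, D)), rng.normal(size=(D, D))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def propagate(H):
    """One GGNN-style step: aggregate edge-type-specific messages, then a gated update."""
    M = np.zeros_like(H)
    for src, dst, etype in edges:
        M[dst] += H[src] @ W_msg[etype]       # message along a typed edge
    Z = sigmoid(M @ W_z + H @ U_z)            # update gate
    H_tilde = np.tanh(M @ W_h + H @ U_h)      # candidate state
    return (1 - Z) * H + Z * H_tilde

H = propagate(H)
print(H.shape)  # (10, 8): one updated state per token/node
```

In the paper's tasks, node states after several such propagation rounds feed the downstream predictions (a name for VarNaming, a choice among candidate variables for VarMisuse).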

Tue 1 May 10:00 - 10:15 PDT

Oral
Neural Sketch Learning for Conditional Program Generation

Vijayaraghavan Murali · Letao Qi · Swarat Chaudhuri · Chris Jermaine

We study the problem of generating source code in a strongly typed, Java-like programming language, given a label (for example a set of API calls or types) carrying a small amount of information about the code that is desired. The generated programs are expected to respect a "realistic" relationship between programs and labels, as exemplified by a corpus of labeled programs available during training.

Two challenges in such conditional program generation are that the generated programs must satisfy a rich set of syntactic and semantic constraints, and that source code contains many low-level features that impede learning. We address these problems by training a neural generator not on code but on program sketches, or models of program syntax that abstract out names and operations that do not generalize across programs. During generation, we infer a posterior distribution over sketches, then concretize samples from this distribution into type-safe programs using combinatorial techniques. We implement our ideas in a system for generating API-heavy Java code, and show that it can often predict the entire body of a method given just a few API calls or data types that appear in the method.
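
The following toy illustration, a hedged sketch rather than the paper's implementation, shows the idea of abstracting a program into a sketch: keep API calls and control flow, replace variable names and literals with typed holes. The mini-AST encoding, node kinds, and abstraction rules below are assumptions made for this example; the paper defines sketches over a Java-like language.

```python
# Toy illustration (not the paper's system): abstract a hand-rolled mini-AST
# into a "sketch" by keeping API names and control flow and replacing
# names/literals with typed holes.

def sketch(node):
    """Recursively abstract a mini-AST node into a sketch."""
    kind = node["kind"]
    if kind == "call":                    # keep the API name, abstract its arguments
        return {"kind": "call", "api": node["api"],
                "args": [sketch(a) for a in node["args"]]}
    if kind in ("var", "literal"):        # concrete names/constants do not generalize -> hole
        return {"kind": "hole", "type": node.get("type", "?")}
    if kind == "if":                      # keep control flow, abstract its parts
        return {"kind": "if", "cond": sketch(node["cond"]),
                "then": [sketch(s) for s in node["then"]],
                "else": [sketch(s) for s in node["else"]]}
    if kind == "seq":
        return {"kind": "seq", "body": [sketch(s) for s in node["body"]]}
    raise ValueError(f"unknown node kind: {kind}")

# Roughly:  line = reader.readLine();  if (line != null) sb.append(line);
program = {"kind": "seq", "body": [
    {"kind": "call", "api": "BufferedReader.readLine",
     "args": [{"kind": "var", "name": "reader", "type": "BufferedReader"}]},
    {"kind": "if",
     "cond": {"kind": "var", "name": "line", "type": "String"},
     "then": [{"kind": "call", "api": "StringBuilder.append",
               "args": [{"kind": "var", "name": "line", "type": "String"}]}],
     "else": []},
]}

print(sketch(program))
```

At generation time the pipeline runs in the opposite direction: a sketch is sampled from the learned posterior given the label and then concretized into a type-safe program, which is the combinatorial step described in the abstract.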

Tue 1 May 10:15 - 10:30 PDT

Oral
Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality

Xingjun Ma · Bo Li · Yisen Wang · Sarah Erfani · Sudanthi Wijewickrema · Grant Schoenebeck · Dawn Song · Michael E Houle · James Bailey

Deep Neural Networks (DNNs) have recently been shown to be vulnerable to adversarial examples, which are carefully crafted instances that can mislead DNNs into making errors during prediction. To better understand such attacks, a characterization is needed of the properties of regions (the so-called 'adversarial subspaces') in which adversarial examples lie. We tackle this challenge by characterizing the dimensional properties of adversarial regions using Local Intrinsic Dimensionality (LID). LID assesses the space-filling capability of the region surrounding a reference example, based on the distribution of that example's distances to its neighbors. We first explain how adversarial perturbation can affect the LID characteristic of adversarial regions, and then show empirically that LID characteristics help distinguish adversarial examples generated by state-of-the-art attacks. As a proof of concept, we use LID to detect adversarial examples, and our preliminary results show that it outperforms several state-of-the-art detection measures by large margins for the five attack strategies considered in this paper across three benchmark datasets. Our analysis of the LID characteristic of adversarial regions not only motivates new directions for effective adversarial defense, but also opens up further challenges for developing new attacks to better understand the vulnerabilities of DNNs.
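
As a rough illustration of the LID machinery, the snippet below implements a maximum-likelihood-style estimator of local intrinsic dimensionality from k-nearest-neighbor distances and shows how pushing a point off a low-dimensional data manifold tends to raise its estimate. The synthetic data, choice of k, and perturbation scale are assumptions for the demo; the paper computes LID characteristics from minibatches of DNN representations.

```python
# Hedged sketch of an MLE-style LID estimator from k-NN distances.
import numpy as np

def lid_mle(reference, batch, k=20):
    """Estimate the Local Intrinsic Dimensionality of `reference` w.r.t. `batch`."""
    dists = np.sort(np.linalg.norm(batch - reference, axis=1))
    dists = dists[dists > 0][:k]          # drop the zero self-distance if present
    r_max = dists[-1]
    return -1.0 / np.mean(np.log(dists / r_max))

rng = np.random.default_rng(0)
# Points lying on a 2-D plane embedded in 32 dimensions: LID estimates should be low.
batch = np.zeros((200, 32))
batch[:, :2] = rng.normal(size=(200, 2))
x_clean = batch[0]
# A point pushed off the data manifold by a full-dimensional perturbation:
# its neighborhood fills more dimensions, so the LID estimate grows.
x_perturbed = x_clean + 0.3 * rng.normal(size=32)

print("LID estimate, on-manifold point:", round(lid_mle(x_clean, batch), 2))
print("LID estimate, perturbed point:  ", round(lid_mle(x_perturbed, batch), 2))
```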

Tue 1 May 10:30 - 10:45 PDT

Oral
Certifying Some Distributional Robustness with Principled Adversarial Training

Aman Sinha · Hong Namkoong · John Duchi

Neural networks are vulnerable to adversarial examples, and researchers have proposed many heuristic attack and defense mechanisms. We take the principled view of distributionally robust optimization, which guarantees performance under adversarial input perturbations. By considering a Lagrangian penalty formulation of perturbing the underlying data distribution in a Wasserstein ball, we provide a training procedure that augments model parameter updates with worst-case perturbations of the training data. For smooth losses, our procedure provably achieves moderate levels of robustness with little computational or statistical cost relative to empirical risk minimization. Furthermore, our statistical guarantees allow us to efficiently certify robustness for the population loss. For imperceptible perturbations, our method matches or outperforms heuristic approaches.
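
A hedged sketch of the procedure's structure (not the authors' code): for each training point, a few gradient-ascent steps approximately solve the penalized inner problem max_z loss(theta; z, y) - (gamma/2)||z - x||^2, and the outer update then takes a gradient step on the loss at the perturbed point. The toy logistic-regression model, step sizes, and penalty value are assumptions; the paper's guarantees concern smooth losses with a sufficiently large penalty.

```python
# Illustrative sketch of Lagrangian-penalized adversarial training on a smooth loss.
import numpy as np

rng = np.random.default_rng(0)

def loss_and_grads(theta, x, y):
    """Smooth logistic loss; returns (loss, grad wrt theta, grad wrt x)."""
    margin = y * (x @ theta)
    s = 1.0 / (1.0 + np.exp(margin))       # sigmoid(-margin)
    return np.log1p(np.exp(-margin)), -s * y * x, -s * y * theta

def robust_step(theta, x, y, gamma=2.0, lr=0.1, ascent_lr=0.1, ascent_steps=15):
    # Inner maximization: worst-case perturbation of x under the
    # squared-Euclidean Wasserstein-cost Lagrangian penalty.
    z = x.copy()
    for _ in range(ascent_steps):
        _, _, g_z = loss_and_grads(theta, z, y)
        z += ascent_lr * (g_z - gamma * (z - x))
    # Outer minimization: ordinary gradient step at the perturbed point.
    _, g_theta, _ = loss_and_grads(theta, z, y)
    return theta - lr * g_theta

# Toy linearly separable data (assumed for the demo).
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
Y = np.sign(X @ w_true)

theta = np.zeros(5)
for epoch in range(5):
    for x, y in zip(X, Y):
        theta = robust_step(theta, x, y)
print("train accuracy:", np.mean(np.sign(X @ theta) == Y))
```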

Tue 1 May 10:45 - 11:00 PDT

Break
Coffee Break