Data are the most valuable ingredient of machine learning models that help researchers and companies make informed decisions. However, access to rich, diverse, and clean datasets is not always possible. One reason for the lack of rich datasets is the substantial time needed for data collection, especially when manual annotation is required. Another is the need to protect privacy whenever raw data contain sensitive information about individuals and hence cannot be shared directly. Generating synthetic data is a powerful solution that can address both of these scenarios. Thanks to recent advances in generative models, it is possible to create realistic synthetic samples that closely match the distribution of complex, real data. When labeled data are limited, synthetic data can be used to augment the training data and mitigate overfitting. When privacy must be protected, data curators can share synthetic data instead of the original data, preserving the utility of the original data while protecting privacy. Despite the substantial benefits of synthetic data, generating it remains an ongoing technical challenge. Although the two scenarios of limited data and privacy concerns share similar technical challenges, such as quality and fairness, they are often studied separately. We will bring together researchers from both fields to discuss challenges and advances in synthetic data generation.
Fri 7:00 a.m. - 7:10 a.m.
|
Opening Remarks
(
Remark
)
Opening Remarks |
Sergul Aydore 🔗 |
Fri 7:10 a.m. - 7:35 a.m.
|
"Can Machine Learning Revolutionize Healthcare? Synthetic Data may be the Answer" by Mihaela van der Schaar, UCLA
(
Invited Talk
)
Mihaela van der Schaar is John Humphrey Plummer Professor of Machine Learning, Artificial Intelligence and Medicine at the University of Cambridge, a Turing Faculty Fellow at The Alan Turing Institute in London, and a Chancellor’s Professor at UCLA. Mihaela was elected IEEE Fellow in 2009. She has received numerous awards, including the Oon Prize on Preventative Medicine from the University of Cambridge (2018), an NSF Career Award (2004), 3 IBM Faculty Awards, the IBM Exploratory Stream Analytics Innovation Award, the Philips Make a Difference Award and several best paper awards, including the IEEE Darlington Award. Mihaela’s work has also led to 35 USA patents (many widely cited and adopted in standards) and 45+ contributions to international standards for which she received 3 International ISO (International Organization for Standardization) Awards. |
Mihaela van der Schaar 🔗 |
Fri 7:35 a.m. - 7:40 a.m.
|
Q&A with Mihaela van der Schaar
(
Q&A
)
|
🔗 |
Fri 7:40 a.m. - 7:42 a.m.
|
Introducing contributed talks 1-2
(
Intro
)
|
Jamie Hayes 🔗 |
Fri 7:42 a.m. - 7:51 a.m.
|
Contributed Talk: Synthetic Data for Model selection
(
Contributed Talk
)
|
Nadav Bhonker · Alon Shoshan 🔗 |
Fri 7:51 a.m. - 8:00 a.m.
|
Contributed Talk: Ensembles of GANs for synthetic training data generation
(
Contributed Talk
)
|
Gabriel Eilertsen 🔗 |
Fri 8:00 a.m. - 8:01 a.m.
|
Introducing Jan Kautz
(
Intro
)
|
Jamie Hayes 🔗 |
Fri 8:01 a.m. - 8:25 a.m.
|
"Generative Models for Image Synthesis" by Jan Kautz, NVIDIA
(
Invited Talk
)
Jan Kautz is VP of Learning and Perception Research at NVIDIA. Jan and his team pursue fundamental research in the areas of computer vision and deep learning, including visual perception, geometric vision, generative models, and efficient deep learning. His team's work has been recognized with various awards and has been regularly featured in the media. Before joining NVIDIA in 2013, Jan was a tenured faculty member at University College London. He holds a BSc in Computer Science from the University of Erlangen-Nürnberg (1999), an MMath from the University of Waterloo (1999), received his PhD from the Max-Planck-Institut für Informatik (2003), and worked as a post-doctoral researcher at the Massachusetts Institute of Technology (2003-2006). |
Jan Kautz 🔗 |
Fri 8:25 a.m. - 8:30 a.m.
|
Q&A with Jan Kautz
(
Q&A
)
|
🔗 |
Fri 8:30 a.m. - 9:00 a.m.
|
Break + Posters
(
GatherTown
)
Please join us in GatherTown (using Firefox or Chrome) for our first poster session:
|
🔗 |
Fri 9:00 a.m. - 9:01 a.m.
|
Introducing Jinsung Yoon
(
Intro
)
|
Edward Choi 🔗 |
Fri 9:01 a.m. - 9:25 a.m.
|
"Differentially Private Synthetic Data Generations Using Generative Adversarial Networks" by Jinsung Yoon, Google Cloud AI
(
Invited Talk
)
Jinsung Yoon is a research scientist at Google Cloud AI. Prior to Google Cloud, Jinsung was a PhD student in the Electrical and Computer Engineering Department at UCLA. He received his PhD from UCLA in 2020 with a thesis on machine learning for medicine, titled "End-to-End Machine Learning Frameworks for Medicine: Data Imputation, Model Interpretation and Synthetic Data Generation". His main research interests are synthetic data generation with privacy guarantees, data imputation, model interpretation, and transfer learning using adversarial learning and reinforcement learning frameworks. He has published various papers and served as a reviewer for top-tier machine learning conferences (NeurIPS, ICML, ICLR, AAAI). |
Jinsung Yoon 🔗 |
Fri 9:25 a.m. - 9:30 a.m.
|
Q&A with Jinsung Yoon
(
Q&A
)
|
🔗 |
Fri 9:30 a.m. - 9:32 a.m.
|
Introducing contributed talks 3-4
(
Intro
)
|
Jamie Hayes 🔗 |
Fri 9:32 a.m. - 9:41 a.m.
|
Contributed Talk: Few-shot learning via tensor hallucination
(
Contributed Talk
)
|
Michalis Lazarou 🔗 |
Fri 9:41 a.m. - 9:50 a.m.
|
Contributed Talk: Leveraging Public Data for Practical Private Query Release
(
Contributed Talk
)
|
Terrance Liu 🔗 |
Fri 9:50 a.m. - 9:51 a.m.
|
Introducing Manuela M. Veloso
(
Intro
)
|
Sergul Aydore 🔗 |
Fri 9:51 a.m. - 10:15 a.m.
|
"Towards Financial Synthetic Data" by Manuela M. Veloso, J.P.Morgan, CMU
(
Invited Talk
)
Manuela M. Veloso joined J.P.Morgan as Managing Director to create and head the Artificial Intelligence Research Lab. With her group, she investigates opportunities for automated, optimized, and novel approaches to AI in Finance. Veloso is on leave from Carnegie Mellon University (CMU), where she is Herbert A. Simon University Professor in the School of Computer Science and was the Head of the Machine Learning Department. Her research is in AI, Robotics, and Machine Learning. At CMU, she founded and directs the CORAL research laboratory, for the study of autonomous agents that Collaborate, Observe, Reason, Act, and Learn. Veloso and her students research a variety of autonomous robots, including mobile service robots and soccer robots. Veloso is a Fellow of the AAAI, AAAS, ACM, and IEEE. She is Einstein Chair Professor of the Chinese National Academy of Science, the co-founder and past President of RoboCup, and past President of AAAI. As of now, Professor Veloso has graduated 40 PhD students and co-authored more than 300 journal and conference publications. |
Manuela Veloso 🔗 |
Fri 10:15 a.m. - 10:20 a.m.
|
Q&A with Manuela M. Veloso
(
Q&A
)
|
🔗 |
Fri 10:20 a.m. - 10:50 a.m.
|
Break + Posters
(
GatherTown
)
Please join us in GatherTown (using Firefox or Chrome) for our second poster session:
|
🔗 |
Fri 10:50 a.m. - 10:51 a.m.
|
Introducing Stefano Ermon
(
Intro
)
|
Krishnaram Kenthapadi 🔗 |
Fri 10:51 a.m. - 11:15 a.m.
|
"Bias and Generalization of Deep Generative Models" by Stefano Ermon, Stanford University
(
Invited Talk
)
Stefano Ermon is an Assistant Professor of Computer Science in the CS Department at Stanford University, where he is affiliated with the Artificial Intelligence Laboratory, and a fellow of the Woods Institute for the Environment. His research is centered on techniques for probabilistic modeling of data and is motivated by applications in the emerging field of computational sustainability. He has won several awards, including four Best Paper Awards (AAAI, UAI and CP), an NSF Career Award, ONR and AFOSR Young Investigator Awards, a Sony Faculty Innovation Award, a Hellman Faculty Fellowship, a Microsoft Research Fellowship, a Sloan Fellowship, and the IJCAI Computers and Thought Award. Stefano earned his Ph.D. in Computer Science at Cornell University in 2015. |
Stefano Ermon 🔗 |
Fri 11:15 a.m. - 11:20 a.m.
|
Q&A with Stefano Ermon
(
Q&A
)
|
🔗 |
Fri 11:20 a.m. - 11:23 a.m.
|
Introducing contributed talks 5-6-7
(
Intro
)
|
Haipeng Chen 🔗 |
Fri 11:23 a.m. - 11:32 a.m.
|
Contributed Talk: FFPDG: Fast, Fair and Private Data Generation
(
Contributed Talk
)
|
Weijie Xu 🔗 |
Fri 11:32 a.m. - 11:41 a.m.
|
Contributed Talk: Overcoming Barriers to Data Sharing with Medical Image Generation: A Comprehensive Evaluation
(
Contributed Talk
)
|
Stefan Bauer · August DuMont Schütte 🔗 |
Fri 11:41 a.m. - 11:50 a.m.
|
Contributed Talk: Imperfect ImaGANation: Implications of GANs Exacerbating Biases on Facial Data
(
Contributed Talk
)
|
Alberto Olmo · Niharika Jain 🔗 |
Fri 11:50 a.m. - 11:51 a.m.
|
Introducing Sander Dieleman
(
Intro
)
|
Haipeng Chen 🔗 |
Fri 11:51 a.m. - 12:15 p.m.
|
"Generative Modeling for Music Generation" by Sander Dieleman, DeepMind
(
Invited Talk
)
Sander Dieleman is a research scientist at DeepMind in London, UK, where he has worked on the AlphaGo and WaveNet projects. His research interests include generative modelling and representation learning in the audio and visual domains, with a particular focus on music, as well as recommender systems and equivariance in neural networks. He obtained his Ph.D. in Computer Science from Ghent University in 2016, working on feature learning and deep learning techniques for learning hierarchical representations of musical audio signals. |
Sander Dieleman 🔗 |
Fri 12:15 p.m. - 12:20 p.m.
|
Q&A with Sander Dieleman
(
Q&A
)
|
🔗 |
Fri 12:20 p.m. - 12:50 p.m.
|
Break + Posters
(
GatherTown
)
Please join us in GatherTown (using Firefox or Chrome) for our third poster session:
|
🔗 |
Fri 12:50 p.m. - 12:51 p.m.
|
Introducing Emily Denton
(
Intro
)
|
Krishnaram Kenthapadi 🔗 |
Fri 12:51 p.m. - 1:15 p.m.
|
"Ethical Considerations of Generative AI" by Emily Denton, Google’s Ethical AI team
(
Invited Talk
)
Emily Denton is a Research Scientist on Google’s Ethical AI team where they examine the societal impacts of AI technology. Their recent research centers on critically examining the norms, values, and work practices that structure the development and use of machine learning datasets. Prior to joining Google, Emily received their PhD in machine learning from the Courant Institute of Mathematical Sciences at New York University, where they focused on unsupervised learning and generative modeling of images and video. |
Emily Denton 🔗 |
Fri 1:15 p.m. - 1:20 p.m.
|
Q&A with Emily Denton
(
Q&A
)
|
🔗 |
Fri 1:20 p.m. - 2:20 p.m.
|
Discussion Panel with All Invited Speakers
(
Discussion Panel
)
|
Mario Fritz 🔗 |
Fri 2:20 p.m. - 2:30 p.m.
|
Closing Remarks and Award Ceremony
(
Remark
)
|
Jamie Hayes 🔗 |