Deep learning relies on massive training sets of labeled examples, often tens of thousands to millions, to reach peak predictive performance. However, large amounts of training data are available for only a few standardized learning problems. Even small variations in the problem specification or changes in the data distribution necessitate re-annotating large amounts of data.
Domain knowledge, however, can often be expressed as sets of prototypical descriptions. These knowledge-based descriptions can be used either as rule-based predictors or as labeling functions that provide partial data annotations. The growing field of weak supervision provides methods for refining and generalizing such heuristic-based annotations in interaction with deep neural networks and large amounts of unannotated data.
In this workshop, we want to advance theory, methods and tools that allow experts to encode prior knowledge as automatic data annotations, which can then be used to train arbitrary deep neural networks for prediction. Learning with weak supervision is both studied from a theoretical perspective and applied to a variety of tasks in areas such as natural language processing and computer vision. This workshop aims to bring together researchers from this wide range of fields to facilitate discussions across research areas that share the common ground of using weak supervision. A further goal of the workshop is to inspire applications of weak supervision to new scenarios and to enable researchers to work on tasks that have so far been considered too low-resource.
Because weak supervision addresses one of the major issues of current machine learning techniques, the lack of labeled data, it has also started to attract commercial interest. This workshop is an opportunity to bridge innovations from academia and the requirements of industry settings.
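As a rough illustration of the labeling-function idea described above, the following minimal sketch expresses a few heuristics as functions that either vote for a label or abstain, and combines their votes by simple majority. The rules, label names and the `weak_label` helper are hypothetical and chosen only for this example; majority voting is just one simple way to aggregate, and the methods discussed at the workshop typically use more refined models.

```python
from collections import Counter

ABSTAIN = None  # returned by a labeling function when its rule does not apply


# Hypothetical heuristics; each encodes one piece of domain knowledge.
def lf_mentions_refund(text):
    return "complaint" if "refund" in text.lower() else ABSTAIN


def lf_mentions_thanks(text):
    return "praise" if "thank" in text.lower() else ABSTAIN


def lf_many_exclamations(text):
    return "complaint" if text.count("!") >= 2 else ABSTAIN


LABELING_FUNCTIONS = [lf_mentions_refund, lf_mentions_thanks, lf_many_exclamations]


def weak_label(text, lfs=LABELING_FUNCTIONS):
    """Aggregate labeling-function votes by majority; abstain on ties or no votes."""
    votes = [vote for vote in (lf(text) for lf in lfs) if vote is not ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting heuristics with equal support
    return ranked[0][0]
```

Examples that receive a label in this way form a partially annotated training set for a downstream model; examples on which all functions abstain or disagree stay unlabeled.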
Fri 7:00 a.m. - 7:10 a.m.
|
Introduction and Opening Remarks
(
Introduction
)
|
|
Fri 7:10 a.m. - 8:10 a.m.
|
Invited Speaker Dan Roth - Natural Language Understanding with Incidental Supervision
(
Keynote Talk
)
Dan Roth is the Eduardo D. Glandt Distinguished Professor at the Department of Computer and Information Science, University of Pennsylvania, and a Fellow of the AAAS, the ACM, AAAI, and the ACL. In 2017 Roth was awarded the John McCarthy Award, the highest award the AI community gives to mid-career AI researchers. Roth was recognized “for major conceptual and theoretical advances in the modeling of natural language understanding, machine learning, and reasoning.” Roth has published broadly in machine learning, natural language processing, knowledge representation and reasoning, and learning theory. Until February 2017 Roth was the Editor-in-Chief of the Journal of Artificial Intelligence Research (JAIR). His recent work has emphasized, among other topics, the notion of incidental supervision as a way to get around the inherent difficulty in supervising complex problems in various areas in natural language understanding. |
Dan Roth |
Fri 8:10 a.m. - 8:25 a.m.
|
Invited Speaker Dan Roth - Q&A
(
Q&A
)
|
|
Fri 8:25 a.m. - 9:10 a.m.
|
Invited Speaker Marine Carpuat - Weak Supervision for Cross-Lingual Semantic Analysis
(
Keynote Talk
)
Marine Carpuat is an Assistant Professor of Computer Science at the University of Maryland. She received a PhD in Computer Science from Hong Kong University. Her research interests are in Natural Language Processing and Machine Translation. Recent work of hers includes the use of weak supervision for cross-lingual classification. Marine is the recipient of an NSF CAREER award, research awards from Google and Amazon and best paper awards at the *SEM and TALN conferences. |
Marine Carpuat |
Fri 9:10 a.m. - 9:25 a.m.
|
Invited Speaker Marine Carpuat - Q&A
(
Q&A
)
|
|
Fri 9:25 a.m. - 9:40 a.m.
|
Dependency Structure Misspecification in Multi-Source Weak Supervision Models
(
Contributed Talk
)
|
Salva Rühling Cachay |
Fri 9:40 a.m. - 9:50 a.m.
|
Dependency Structure Misspecification in Multi-Source Weak Supervision Models - Q&A
(
Q&A
)
|
|
Fri 9:50 a.m. - 9:53 a.m.
|
AutoTriggER: Named Entity Recognition with Auxiliary Trigger Extraction
(
Poster Spotlight
)
|
Dong-Ho Lee |
Fri 9:53 a.m. - 9:56 a.m.
|
Handling Long-Tail Queries with Slice-Aware Conversational Systems
(
Poster Spotlight
)
|
Cheng Wang |
Fri 9:56 a.m. - 9:59 a.m.
|
Tabular Data Modeling via Contextual Embeddings
(
Poster Spotlight
)
|
Xin Huang |
Fri 9:59 a.m. - 10:02 a.m.
|
TADPOLE: Task ADapted Pre-training via anOmaLy dEtection
(
Poster Spotlight
)
|
Vivek Madan |
Fri 10:02 a.m. - 10:05 a.m.
|
Active WeaSuL: Improving Weak Supervision with Active Learning
(
Poster Spotlight
)
|
Samantha Biegel |
Fri 10:05 a.m. - 10:08 a.m.
|
Transformer Language Models as Universal Computation Engines
(
Poster Spotlight
)
|
Kevin Lu |
Fri 10:15 a.m. - 11:15 a.m.
|
Poster Session 1
(
Poster Session
)
This poster session is running in Gather.Town. Please join the session via this link: [ protected link dropped ] |
|
Fri 11:15 a.m. - 11:20 a.m.
|
Welcome Back
(
Introduction
)
|
|
Fri 11:20 a.m. - 11:50 a.m.
|
Invited Speaker Heng Ji - InfoSurgeon: Cross-media Weak Supervision for Knowledge-Element Level Fake News Detection
(
Keynote Talk
)
Bio: Heng Ji is a professor of Computer Science at the University of Illinois at Urbana-Champaign. Her research interests focus on Natural Language Processing and its connections with Data Mining, Social Science and Vision. Recent work focuses on weak supervision methods for schema-guided event understanding. Heng Ji was selected as a "Young Scientist" and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017. The awards she has received include the "AI's 10 to Watch" Award by IEEE Intelligent Systems in 2013, an NSF CAREER award in 2009, PACLIC2012 Best paper runner-up, a "Best of ICDM2013" paper award, a "Best of SDM2013" paper award, an ACL2018 Best Demo paper nomination, Google Research Awards in 2009 and 2014, IBM Watson Faculty Awards in 2012 and 2014 and Bosch Research Awards in 2014-2018. She has coordinated the NIST TAC Knowledge Base Population task since 2010 and led several multi-institute research efforts including the DARPA DEFT Tinker Bell team of seven universities and the DARPA KAIROS RESIN team of six universities. |
Heng Ji |
Fri 11:50 a.m. - 12:05 p.m.
|
Invited Speaker Heng Ji - Q&A
(
Q&A
)
|
|
Fri 12:05 p.m. - 12:20 p.m.
|
Weakly Supervised Multi-task Learning for Concept-based Explainability
(
Contributed Talk
)
|
Vladimir Balayan |
Fri 12:20 p.m. - 12:30 p.m.
|
Weakly Supervised Multi-task Learning for Concept-based Explainability - Q&A
(
Q&A
)
|
|
Fri 12:30 p.m. - 12:45 p.m.
|
Better Adaptation to Distribution Shifts with Robust Pseudo-Labeling
(
Contributed Talk
)
|
Evgenia Rusak |
Fri 12:45 p.m. - 12:55 p.m.
|
Better Adaptation to Distribution Shifts with Robust Pseudo-Labeling - Q&A
(
Q&A
)
|
|
Fri 12:55 p.m. - 12:58 p.m.
|
Using system context information to complement weakly labeled data
(
Poster Spotlight
)
|
Matthias Meyer |
Fri 12:58 p.m. - 1:01 p.m.
|
CIGMO: Learning categorical invariant deep generative models from grouped data
(
Poster Spotlight
)
|
Haruo Hosoya |
Fri 1:01 p.m. - 1:04 p.m.
|
Pre-Training by Completing Point Clouds
(
Poster Spotlight
)
|
Hanchen Wang |
Fri 1:04 p.m. - 1:07 p.m.
|
Weakly-Supervised Group Disentanglement using Total Correlation
(
Poster Spotlight
)
|
Linh Tran |
Fri 1:07 p.m. - 1:10 p.m.
|
Is Disentanglement all you need? Comparing Concept-based & Disentanglement Approaches
(
Poster Spotlight
)
|
Dmitry Kazhdan |
Fri 1:07 p.m. - 1:10 p.m.
|
Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks
(
Poster Spotlight
)
|
Curtis G Northcutt |
Fri 1:20 p.m. - 2:20 p.m.
|
Poster Session 2
(
Poster Session
)
This poster session is running in Gather.Town. Please join the session via this link: [ protected link dropped ] |
|
Fri 2:20 p.m. - 2:25 p.m.
|
Welcome Back
(
Introduction
)
|
|
Fri 2:25 p.m. - 3:10 p.m.
|
Invited Speaker Lu Jiang - Robust Deep Learning and Applications
(
Keynote Talk
)
Bio: Lu Jiang is a senior research scientist at Google Research. He obtained his Ph.D. at Carnegie Mellon University. His research goal is to solve realistic problems on big multimodal data. His recent work on weak and unreliable supervision includes approaches like MentorNet and a recent dataset for noisy web images. His work on robust machine translation was nominated for best paper at ACL’19. |
Lu Jiang |
Fri 3:10 p.m. - 3:25 p.m.
|
Invited Speaker Lu Jiang - Q&A
(
Q&A
)
|
|
Fri 3:25 p.m. - 3:55 p.m.
|
Invited Speaker Paroma Varma - Snorkel: Programmatically Labeling Training Data
(
Keynote Talk
)
Bio: Paroma Varma is a co-founder of Snorkel AI, an AI start-up based on the influential Snorkel project. Snorkel AI provides a data-first platform for building, managing, and monitoring end-to-end AI applications. Paroma received her Ph.D. from Stanford University and she was supported by the Stanford Graduate Fellowship and the National Science Foundation Graduate Research Fellowship. Her research interests revolve around weak supervision, or using high-level knowledge in the form of noisy labeling sources to efficiently label massive datasets required to train machine learning models. In this context, she is also interested in using developer exhaust, byproducts of the data analytics pipeline, to simplify complex statistical and search-based problems. |
Paroma Varma |
Fri 3:55 p.m. - 4:10 p.m.
|
Invited Speaker Paroma Varma - Q&A
(
Q&A
)
|
|
Fri 4:10 p.m. - 5:10 p.m.
|
Panel Discussion
(
Panel
)
|
|
Fri 5:10 p.m. - 5:25 p.m.
|
Concluding Remarks
(
Conclusion
)
|
|
Fri 5:25 p.m. - 6:00 p.m.
|
Post Workshop Hangout
(
Gather.Town
)
Let's hang out after the official part of the workshop to meet and discuss. The hangout takes place in Gather.Town. Just follow this link: [ protected link dropped ] |
|