Poster

Weighted Training for Cross-Task Learning

Shuxiao Chen · Koby Crammer · Hangfeng He · Dan Roth · Weijie J Su

Keywords: [ representation learning ] [ natural language processing ]

Poster session: Mon 25 Apr 6:30 p.m. – 8:30 p.m. PDT (Spot G2 in Virtual World)
Oral presentation: Oral 3: Meta-learning and adaptation
Wed 27 Apr 9 a.m. PDT — 10:30 a.m. PDT

Abstract:

In this paper, we introduce Target-Aware Weighted Training (TAWT), a weighted training algorithm for cross-task learning based on minimizing a representation-based task distance between the source and target tasks. We show that TAWT is easy to implement, is computationally efficient, requires little hyperparameter tuning, and enjoys non-asymptotic learning-theoretic guarantees. The effectiveness of TAWT is corroborated through extensive experiments with BERT on four sequence tagging tasks in natural language processing (NLP), including part-of-speech (PoS) tagging, chunking, predicate detection, and named entity recognition (NER). As a byproduct, the proposed representation-based task distance allows one to reason in a theoretically principled way about several critical aspects of cross-task learning, such as the choice of the source data and the impact of fine-tuning.
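The weighted-training idea in the abstract can be illustrated with a toy sketch: each source example receives a weight, and the weight is tuned to benefit the target task. Everything below is an illustrative assumption, not the paper's algorithm: the data are synthetic linear-regression tasks, a single scalar weight `alpha` stands in for per-sample weights, and target-task loss stands in for TAWT's representation-based task distance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy cross-task setup (illustrative only): the target task has few
# labeled points; the source task is a shifted variant with many points.
w_true = np.array([2.0, -1.0])
Xt = rng.normal(size=(10, 2))
yt = Xt @ w_true + 0.1 * rng.normal(size=10)
Xs = rng.normal(size=(200, 2))
ys = Xs @ (w_true + np.array([0.0, 0.5])) + 0.1 * rng.normal(size=200)

def weighted_fit(alpha):
    """Least squares on target data plus alpha-weighted source data."""
    X = np.vstack([Xt, Xs])
    y = np.concatenate([yt, ys])
    sw = np.concatenate([np.ones(len(yt)), alpha * np.ones(len(ys))])
    root = np.sqrt(sw)
    w, *_ = np.linalg.lstsq(X * root[:, None], y * root, rcond=None)
    return w

def target_loss(w):
    """Mean squared error on the target task."""
    return float(np.mean((Xt @ w - yt) ** 2))

# Crude stand-in for the weight update: grid-search the source weight
# that minimizes target loss (a proxy for the paper's task distance).
alphas = np.linspace(0.0, 1.0, 21)
best = min(alphas, key=lambda a: target_loss(weighted_fit(a)))
```

By construction, the selected weight does at least as well on the target task as either extreme of ignoring the source data (`alpha = 0`) or weighting it fully (`alpha = 1`), which mirrors the motivation for learned task weights.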
