Abstract: In this paper, we explore meta-learning for few-shot text classification. Meta-learning has shown strong performance in computer vision, where low-level patterns are transferable across learning tasks. However, directly applying this approach to text is challenging--lexical features highly informative for one task may be insignificant for another. Thus, rather than learning solely from words, our model also leverages their distributional signatures, which encode pertinent word occurrence patterns. Our model is trained within a meta-learning framework to map these signatures into attention scores, which are then used to weight the lexical representations of words. We demonstrate that our model consistently outperforms prototypical networks learned on lexical knowledge (Snell et al., 2017) in both few-shot text classification and relation classification by a significant margin across six benchmark datasets (20.0% on average in 1-shot classification).

Similar Papers

A Baseline for Few-Shot Image Classification
Guneet Singh Dhillon, Pratik Chaudhari, Avinash Ravichandran, Stefano Soatto,
Meta-Learning without Memorization
Mingzhang Yin, George Tucker, Mingyuan Zhou, Sergey Levine, Chelsea Finn,
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks
Hae Beom Lee, Hayeon Lee, Donghyun Na, Saehoon Kim, Minseop Park, Eunho Yang, Sung Ju Hwang,