Virtual oral
Affinity Workshop: Tiny Papers Showcase Day (a DEI initiative)

Tiny Attention: A Simple yet Effective Method for Learning Contextual Word Embeddings

Renjith P Ravindran


Contextual Word Embeddings (CWEs) obtained via the Attention Mechanism in Transformer (AMT) models are one of the key drivers of the current revolution in Natural Language Processing. Previous techniques for learning CWEs are not only inferior to AMT but also fall largely short of even the simple bag-of-words baseline. Though there have been many variants of the Transformer model, the attention mechanism itself remains unchanged and is largely opaque. We introduce a new method for learning CWEs that uses a simple and transparent attention mechanism. Our method is derived from SVD-based Syntagmatic Word Embeddings, which capture word associations. We test our model on the Word-in-Context dataset and show that it outperforms the simple but tough-to-beat bag-of-words baseline by a substantial margin.
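To illustrate the general idea of contextualizing a static embedding with a simple, transparent attention mechanism, here is a minimal sketch. This is not the authors' method: the vectors below are random placeholders (the paper derives its embeddings from SVD-based syntagmatic embeddings), and the dot-product attention shown here is only an assumed, illustrative form.

```python
# Hypothetical sketch: a static word vector is shifted by an
# attention-weighted average of its sentence context. Placeholder
# random vectors stand in for learned embeddings.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "bank", "river", "money", "flows"]
E = {w: rng.standard_normal(8) for w in vocab}  # toy static embeddings

def softmax(x):
    x = x - x.max()  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum()

def contextualize(target, sentence):
    """Return the target vector plus an attention-weighted context average."""
    q = E[target]
    ctx = [w for w in sentence if w != target]
    K = np.stack([E[w] for w in ctx])
    weights = softmax(K @ q)  # similarity-based attention weights
    return q + weights @ K    # context shifts the static vector

# The same word gets different vectors in different sentences:
v1 = contextualize("bank", ["the", "river", "bank", "flows"])
v2 = contextualize("bank", ["the", "money", "bank"])
```

In a setup like this, a Word-in-Context style evaluation would compare the two contextual vectors (e.g. by cosine similarity) to decide whether the target word is used in the same sense in both sentences.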