Not All Contexts Are Created Equal: Better Word Representations with Variable Attention
Ling, Wang,
Tsvetkov, Yulia,
Amir, Silvio,
Fermandez, Ramon,
Dyer, Chris,
Black, Alan W,
Trancoso, Isabel,
and Lin, Chu-Cheng
In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
2015
We introduce an extension to the bag-of-words model for learning word representations that take into account both syntactic and semantic properties within language. This is done by employing an attention model that finds, within the contextual words, the words that are relevant for each prediction. The general intuition of our model is that some words are only relevant for predicting local context (e.g. function words), while other words are more suited for determining global context, such as the topic of the document. Experiments performed on both semantically and syntactically oriented tasks show gains using our model over the existing bag-of-words model. Furthermore, compared to other more sophisticated models, our model scales better as we increase the size of the context of the model.
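The core idea, an attention-weighted context average in place of the uniform average used by plain CBOW, can be illustrated with a minimal sketch. This is not the authors' implementation; the parameterization (attention logits indexed by word and relative position), dimensions, and all names below are illustrative assumptions.

    # Minimal sketch of an attention-weighted CBOW context (illustrative only).
    import numpy as np

    rng = np.random.default_rng(0)
    vocab_size, embed_dim, window = 1000, 50, 4

    # Input word embeddings, plus an attention logit per (word, position) pair,
    # reflecting the intuition that a context word's relevance depends on both
    # the word itself and its distance from the target word.
    W_in = rng.normal(scale=0.1, size=(vocab_size, embed_dim))
    att = rng.normal(scale=0.1, size=(vocab_size, 2 * window))

    def attended_context(context_ids, positions):
        """Combine context embeddings with softmax attention weights
        instead of the uniform average of plain CBOW."""
        logits = np.array([att[w, p] for w, p in zip(context_ids, positions)])
        weights = np.exp(logits - logits.max())
        weights /= weights.sum()
        vecs = W_in[context_ids]          # (n_context, embed_dim)
        return weights @ vecs             # attention-weighted context vector

    # Example: positions index the 2*window slots around the target word.
    ctx = attended_context(context_ids=[5, 17, 901, 3], positions=[0, 1, 6, 7])
    print(ctx.shape)  # (50,)

The resulting context vector would then feed the usual prediction of the target word, so function words can receive high weight only in nearby positions while topical words contribute across the whole window.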