[Liu Zhiyuan NLP Course Notes] Phrase & Sentence & Document Representation
Natural languages contain semantic units at several granularities: words, phrases, sentences, and documents. An earlier post covered how to learn word representations. This post focuses on representation learning for phrases, sentences, and documents.
Phrase Representation
Suppose a phrase \(p\) is composed of two words \(u\) and \(v\). The phrase embedding \(\boldsymbol{p}\) is learned from the embeddings of its words, \(\boldsymbol{u}\) and \(\boldsymbol{v}\). Phrase representation methods fall into three categories: additive models, multiplicative models, and others.
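The setup above can be read as a composition function \(\boldsymbol{p}=f(\boldsymbol{u},\boldsymbol{v})\) that maps two word vectors to one phrase vector. Here is a minimal sketch of that idea in plain Python. The toy embeddings, the names `embeddings`, `add`, and `compose`, and the choice of element-wise addition as the default `f` are all assumptions made for illustration; they are not taken from the course.

```python
from typing import Callable, Dict, List

Vector = List[float]

# Toy 3-dimensional word embeddings (made-up values, for illustration only).
embeddings: Dict[str, Vector] = {
    "machine": [0.2, -0.1, 0.5],
    "learning": [0.4, 0.3, -0.2],
}


def add(u: Vector, v: Vector) -> Vector:
    """Element-wise sum, used here as a placeholder composition function."""
    return [a + b for a, b in zip(u, v)]


def compose(
    w1: str,
    w2: str,
    f: Callable[[Vector, Vector], Vector] = add,
) -> Vector:
    """Build a phrase vector p = f(u, v) from the vectors of its two words."""
    return f(embeddings[w1], embeddings[w2])


# Phrase vector for "machine learning" under the placeholder composition.
p = compose("machine", "learning")
print(p)
```

Passing a different function as `f` swaps the composition rule without changing the rest of the code, which mirrors how the categories of methods differ only in the choice of \(f\).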
Additive Models
- Vector addition: \(\boldsymbol{p}=\boldsymbol{u}+\boldsymbol{v}\)
- Weight the constituents differentially in the addition: \(\boldsymbol{p}=\alpha \boldsymbol{u}+\beta \boldsymbol{v}\) (where