Deep Short Text Classification with Knowledge Powered Attention
Research Problem
- short texts are more ambiguous since they lack sufficient contextual information
- retrieve knowledge from an external knowledge source to enhance the semantic representation
- attention mechanisms; the proposed model is STCKA
- combines the text information with a concept set retrieved from a KB
explicit representation and implicit representation (still unclear to me) for "understanding short text"
incorporate the conceptual information as prior knowledge into deep neural networks
combining the KB with short text raises the following problems
- First, some improper concepts are easily introduced due to entity ambiguity or noise in the KBs
- Second, the granularity and the relative importance of the concepts must be taken into account
to solve these problems, the paper proposes
- Concept towards Short Text (C-ST) attention, which measures the semantic similarity between each concept and the short text
- Concept towards Concept Set (C-CS) attention, which measures the importance of each concept with respect to the whole concept set
- a soft switch to combine the two attention weights (sketched in code at the end of section 4)
Model
1 knowledge retrieval
- the goal is to retrieve relevant knowledge from KBs
- use isA relations (entity -> concept) rather than entity properties
- entity linking is used to identify entities in the short text; conceptualization then retrieves each entity's concepts from the KB (toy sketch below)
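A minimal sketch of this step, assuming a toy dict-based isA KB and a naive token-match entity linker (both hypothetical; in practice a real entity-linking tool and a large taxonomy KB would be used). The noisy "fruit" concept for "Apple" also illustrates the ambiguity problem from the research-problem section.

```python
# Toy isA KB: entity -> list of concepts. All entries are hypothetical.
TOY_ISA_KB = {
    "apple": ["fruit", "company"],       # ambiguous entity: two senses
    "iphone": ["smartphone", "device"],
    "jobs": ["person", "founder"],
}

def retrieve_concepts(short_text: str) -> list[str]:
    """Link entities by simple token match, then collect their isA concepts."""
    concepts = []
    for token in short_text.lower().split():
        concepts.extend(TOY_ISA_KB.get(token, []))
    return concepts

print(retrieve_concepts("Apple released a new iPhone"))
# -> ['fruit', 'company', 'smartphone', 'device']  (note the noisy 'fruit')
```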
2 input embedding
both char-level and word/concept-level features: char embeddings come from a char-CNN; word/concept embeddings come from pre-trained word vectors (word2vec, ELMo); the final embedding concatenates the char-level and word-level vectors
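A minimal PyTorch sketch of this module, assuming hypothetical vocabulary sizes and dimensions (char dim 16, word dim 300, 50 CNN filters); in practice the pre-trained vectors would be loaded into `word_emb`.

```python
import torch
import torch.nn as nn

class InputEmbedding(nn.Module):
    def __init__(self, char_vocab=100, word_vocab=10_000,
                 char_dim=16, word_dim=300, char_filters=50, kernel=3):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        # char-CNN: Conv1d over character positions, max-pooled per word
        self.char_cnn = nn.Conv1d(char_dim, char_filters, kernel, padding=1)
        self.word_emb = nn.Embedding(word_vocab, word_dim)  # load pre-trained vectors here

    def forward(self, char_ids, word_ids):
        # char_ids: (batch, seq_len, max_chars), word_ids: (batch, seq_len)
        b, n, c = char_ids.shape
        ch = self.char_emb(char_ids).view(b * n, c, -1).transpose(1, 2)
        ch = torch.relu(self.char_cnn(ch)).max(dim=2).values  # pool over chars
        ch = ch.view(b, n, -1)
        # final embedding = char-level vector concatenated with word vector
        return torch.cat([ch, self.word_emb(word_ids)], dim=-1)

emb = InputEmbedding()
x = emb(torch.randint(0, 100, (2, 7, 10)), torch.randint(0, 10_000, (2, 7)))
print(x.shape)  # torch.Size([2, 7, 350])
```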
3 short text embedding
the goal of this module is to produce the short text representation q: the sequence of d-dimensional word vectors (x1, x2, ..., xn) is converted into the representation q
(this captures high-level semantic and syntactic features of the text; the BiLSTM could be swapped for a stronger encoder)
method: BiLSTM -> scaled dot-product attention -> max-pooling (over each dimension of the vectors, to capture the most important feature)
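A minimal sketch of this pipeline, assuming hypothetical dimensions (input 350 to match the embedding sketch above, hidden size 100); the attention here is plain scaled dot-product self-attention over the BiLSTM states, softmax(HH^T / sqrt(d)) H.

```python
import math
import torch
import torch.nn as nn

class ShortTextEncoder(nn.Module):
    def __init__(self, in_dim=350, hidden=100):
        super().__init__()
        self.bilstm = nn.LSTM(in_dim, hidden, batch_first=True,
                              bidirectional=True)

    def forward(self, x):                      # x: (batch, n, in_dim)
        h, _ = self.bilstm(x)                  # (batch, n, 2*hidden)
        # scaled dot-product self-attention over the BiLSTM states
        scores = h @ h.transpose(1, 2) / math.sqrt(h.size(-1))
        a = torch.softmax(scores, dim=-1)
        h = a @ h                              # re-weighted states
        # max-pool each dimension over the sequence -> fixed-size q
        return h.max(dim=1).values             # q: (batch, 2*hidden)

enc = ShortTextEncoder()
q = enc(torch.randn(2, 7, 350))
print(q.shape)  # torch.Size([2, 200])
```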
4 knowledge encoding
the prior knowledge is obtained from the knowledge base. Given a concept set C of size m, (c1, c2, ..., cm), where ci is the i-th concept vector, the aim is to produce its vector representation p
two attention mechanisms are used to pay more attention to the important concepts
and to reduce the bad influence of improper concepts introduced by ambiguity and noise
C-ST and C-CS
- C-ST: αi denotes the attention weight from the i-th concept towards the short text
- C-CS: to also take the relative importance of the concepts into account, Concept towards Concept Set attention is built on source2token self-attention; βi denotes the attention weight from the i-th concept towards the whole concept set
- a soft switch combines αi and βi into the final weight used to aggregate the concept vectors into p
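The exact equations aren't copied in these notes, so below is a minimal PyTorch sketch of both attentions plus the soft switch, assuming an additive (tanh-layer) scoring function and a learned scalar gate gamma; the dimensions and the name `KnowledgeEncoder` are hypothetical, not the authors' verbatim formulation.

```python
import torch
import torch.nn as nn

class KnowledgeEncoder(nn.Module):
    def __init__(self, c_dim=200, q_dim=200, hidden=100):
        super().__init__()
        # C-ST: scores each concept c_i against the short text q
        self.w_st = nn.Sequential(nn.Linear(c_dim + q_dim, hidden),
                                  nn.Tanh(), nn.Linear(hidden, 1))
        # C-CS: source2token self-attention, scores c_i on its own
        self.w_cs = nn.Sequential(nn.Linear(c_dim, hidden),
                                  nn.Tanh(), nn.Linear(hidden, 1))
        self.gamma = nn.Parameter(torch.tensor(0.5))  # soft switch (assumed learned scalar)

    def forward(self, concepts, q):
        # concepts: (batch, m, c_dim), q: (batch, q_dim)
        q_exp = q.unsqueeze(1).expand(-1, concepts.size(1), -1)
        alpha = torch.softmax(self.w_st(torch.cat([concepts, q_exp], -1))
                              .squeeze(-1), dim=-1)        # C-ST weights
        beta = torch.softmax(self.w_cs(concepts).squeeze(-1), dim=-1)  # C-CS weights
        g = torch.sigmoid(self.gamma)                      # keep gate in (0, 1)
        a = g * alpha + (1 - g) * beta                     # soft switch
        return (a.unsqueeze(-1) * concepts).sum(dim=1)     # p: (batch, c_dim)

ke = KnowledgeEncoder()
p = ke(torch.randn(2, 5, 200), torch.randn(2, 200))
print(p.shape)  # torch.Size([2, 200])
```

In the paper, q and p are then concatenated and fed to a fully connected layer with softmax for the final classification.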