References:
https://blog.csdn.net/qq_41664845/article/details/84245520#t5
https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/
Paper: Neural Machine Translation by Jointly Learning to Align and Translate
Paper link: http://pdfs.semanticscholar.org/071b/16f25117fb6133480c6259227d54fc2a5ea0.pdf
This paper adds an attention mechanism on top of the conventional encoder-decoder architecture for NMT.
encoder-decoder for NMT:
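As a reminder of the baseline, here is a minimal sketch (not the paper's exact model) of a plain encoder-decoder for NMT: the encoder compresses the whole source sentence into hidden states, and the decoder generates the target sequence conditioned only on the final hidden vector. Layer sizes and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, src_vocab, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(src_vocab, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):                       # src: (batch, src_len)
        outputs, hidden = self.rnn(self.embed(src))
        return outputs, hidden                    # hidden: (1, batch, hid_dim)

class Decoder(nn.Module):
    def __init__(self, tgt_vocab, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(tgt_vocab, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, tgt, hidden):               # tgt: (batch, tgt_len)
        outputs, hidden = self.rnn(self.embed(tgt), hidden)
        return self.out(outputs), hidden          # logits over the target vocab
```

The bottleneck here is that the decoder only sees a single fixed-length vector, which is exactly what the attention mechanism below relaxes.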
Here we will focus mainly on the attention mechanism that the paper adds.
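To make the idea concrete, below is a minimal sketch of additive (Bahdanau-style) attention. It follows the paper's alignment model, e_ij = v^T tanh(W s_{i-1} + U h_j) with weights alpha_ij = softmax(e_ij), but the class and layer names, and the hidden sizes matching the encoder-decoder sketch above, are my own assumptions rather than the paper's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveAttention(nn.Module):
    def __init__(self, dec_hid=128, enc_hid=128, attn_dim=128):
        super().__init__()
        self.W = nn.Linear(dec_hid, attn_dim, bias=False)  # transforms decoder state s_{i-1}
        self.U = nn.Linear(enc_hid, attn_dim, bias=False)  # transforms encoder annotations h_j
        self.v = nn.Linear(attn_dim, 1, bias=False)        # projects to a scalar score e_ij

    def forward(self, dec_state, enc_outputs):
        # dec_state: (batch, dec_hid); enc_outputs: (batch, src_len, enc_hid)
        scores = self.v(torch.tanh(
            self.W(dec_state).unsqueeze(1) + self.U(enc_outputs)))  # (batch, src_len, 1)
        alpha = F.softmax(scores.squeeze(-1), dim=1)                 # attention weights over source positions
        context = torch.bmm(alpha.unsqueeze(1), enc_outputs)         # weighted sum of annotations
        return context.squeeze(1), alpha                             # context: (batch, enc_hid)
```

At each decoding step, the context vector returned here would be fed into the decoder together with the previous target embedding, so the decoder can "look back" at different source positions instead of relying on one fixed vector.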