Coursera, Deep Learning 5, Sequence Models, week1 Recurrent Neural Networks

What kinds of sequence model problems are there?

Examples from the lecture include speech recognition, music generation, sentiment classification, DNA sequence analysis, machine translation, video activity recognition, and name entity recognition.


Notation:

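The slides are gone here, but the notation used throughout the week is worth writing down (reconstructed from the lecture, so double-check against the slides):

    x^{<t>}    : the t-th element of the input sequence
    y^{<t>}    : the t-th element of the output sequence
    T_x, T_y   : the lengths of the input and output sequences
    x^{(i)<t>} : the t-th input element of the i-th training example
    T_x^{(i)}  : the input sequence length of the i-th training example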

RNN - Recurrent Neural Network

What problems does a traditional NN run into with sequence input?

Two problems: different examples can have inputs and outputs of different lengths, and a plain fully connected network does not share features it learns across different positions in the sequence.


An RNN does not have the problems above. Note that the concept of a bidirectional RNN (BRNN) is also mentioned here.


The activation function g1 is most often tanh; ReLU is sometimes used, but less commonly.

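The slides with the forward-propagation formulas are missing above; in the lecture's notation they are a^{<t>} = g1(W_aa a^{<t-1>} + W_ax x^{<t>} + b_a) and ŷ^{<t>} = g2(W_ya a^{<t>} + b_y). A minimal numpy sketch of one time step, assuming g1 = tanh and g2 = softmax (my own illustration, not the course's starter code):

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max(axis=0, keepdims=True))  # shift for numerical stability
        return e / e.sum(axis=0, keepdims=True)

    def rnn_cell_forward(x_t, a_prev, Waa, Wax, Wya, ba, by):
        # a<t> = tanh(Waa @ a<t-1> + Wax @ x<t> + ba)
        a_t = np.tanh(Waa @ a_prev + Wax @ x_t + ba)
        # y_hat<t> = softmax(Wya @ a<t> + by)
        y_hat_t = softmax(Wya @ a_t + by)
        return a_t, y_hat_t

The full forward pass just calls this cell once per time step, carrying a_t forward.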

Backpropagation through time

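The slide is gone; the loss that gets backpropagated is the per-step cross-entropy summed over all time steps (standard formulation, as in the lecture):

    L^{<t>}(\hat{y}^{<t>}, y^{<t>}) = -\sum_i y_i^{<t>} \log \hat{y}_i^{<t>}
    L(\hat{y}, y) = \sum_{t=1}^{T_y} L^{<t>}(\hat{y}^{<t>}, y^{<t>})

Backpropagation then flows right-to-left across the time steps, hence "through time".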

Different types of RNNs

The lecture walks through the basic architectures: one-to-one, one-to-many (e.g. music generation), many-to-one (e.g. sentiment classification), and many-to-many, both with equal input/output lengths (e.g. name entity recognition) and with different lengths (e.g. machine translation).

Language model and sequence generation

Language modelling is used to find the most likely sentence, i.e., it estimates the probability of any given sentence.

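Concretely, the RNN outputs at each step a softmax over the vocabulary, and the probability of a whole sentence factorizes by the chain rule of probability:

    P(y^{<1>}, ..., y^{<T_y>}) = \prod_{t=1}^{T_y} P(y^{<t>} \mid y^{<1>}, ..., y^{<t-1>})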

Once a language model is trained, one fun application is to have it make up sentences by itself, i.e., to sample novel sequences.

Sample novel sequences

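A minimal sketch of the sampling loop, assuming weights from a trained basic RNN like the cell above and an <EOS> token in the vocabulary (my own illustration; the course notebook differs in its details):

    import numpy as np

    def sample(Waa, Wax, Wya, ba, by, vocab_size, eos_idx, max_len=50, rng=None):
        rng = rng or np.random.default_rng()
        a = np.zeros((Waa.shape[0], 1))    # a<0> = zero vector
        x = np.zeros((vocab_size, 1))      # x<1> = zero vector
        indices = []
        for _ in range(max_len):
            a = np.tanh(Waa @ a + Wax @ x + ba)
            z = Wya @ a + by
            p = np.exp(z - z.max()) / np.exp(z - z.max()).sum()  # softmax over vocab
            idx = rng.choice(vocab_size, p=p.ravel())  # sample according to p, don't argmax
            indices.append(int(idx))
            if idx == eos_idx:             # stop once the model emits <EOS>
                break
            x = np.zeros((vocab_size, 1))  # feed the sampled word back in as the next input
            x[idx] = 1.0
        return indices

Sampling (rather than taking the argmax) is what makes each generated sentence different.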

Besides the common word-level language model, there is also a much less commonly used character-level language model. It never has to deal with unknown words, but its sequences are much longer, so long-range dependencies are harder to capture and training is more expensive.


Vanishing gradient problem

Because in an RNN each word is influenced mostly by the words near it, a sentence like "The cat, which already ate ..., was full." is handled poorly: after seeing the noun, the network has to remember for a long time whether that noun (cat) was singular or plural, until the verb (was/were) finally appears, and that is not something a basic RNN is good at.

Besides the vanishing gradient problem there is also an exploding gradient problem, but that one is relatively easy to fix: the solution is gradient clipping, i.e., when the gradient values get too large, clip them according to a maximum value (threshold).
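A minimal sketch of element-wise clipping (my own illustration; clipping by the global norm is another common variant):

    import numpy as np

    def clip_gradients(gradients, max_value):
        # Clip every gradient array element-wise into [-max_value, +max_value].
        return {name: np.clip(g, -max_value, max_value) for name, g in gradients.items()}

    # Usage: grads = clip_gradients(grads, 10.0) right before the parameter update.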


GRU - Gated Recurrent Unit

Next comes how to address the vanishing gradient problem.

First, look at the basic RNN again.


Then compare it with the GRU.

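The slide is gone, so writing out the simplified GRU from the lecture: the memory cell c^{<t>} (with a^{<t>} = c^{<t>} in a GRU) is updated through an update gate Γ_u:

    \tilde{c}^{<t>} = \tanh(W_c [c^{<t-1>}, x^{<t>}] + b_c)
    \Gamma_u = \sigma(W_u [c^{<t-1>}, x^{<t>}] + b_u)
    c^{<t>} = \Gamma_u * \tilde{c}^{<t>} + (1 - \Gamma_u) * c^{<t-1>}

This is also the intuition behind a "gate": Γ_u is a sigmoid output between 0 and 1, and when it is close to 0 the cell simply copies c^{<t-1>} forward unchanged, which is what lets the network remember "cat" until "was" and keeps the gradient from vanishing.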

The above is a simplified version of the GRU to make it easier to understand; the full GRU looks like this:

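Again written out since the image is gone: the full GRU adds a relevance gate Γ_r that controls how much of c^{<t-1>} feeds into the candidate value:

    \Gamma_r = \sigma(W_r [c^{<t-1>}, x^{<t>}] + b_r)
    \tilde{c}^{<t>} = \tanh(W_c [\Gamma_r * c^{<t-1>}, x^{<t>}] + b_c)
    \Gamma_u = \sigma(W_u [c^{<t-1>}, x^{<t>}] + b_u)
    c^{<t>} = \Gamma_u * \tilde{c}^{<t>} + (1 - \Gamma_u) * c^{<t-1>}
    a^{<t>} = c^{<t>}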

How do you choose between the LSTM and the GRU? Neither is strictly better; different problems may suit different algorithms.

The LSTM is more complicated than the GRU; the GRU is simpler and therefore faster to compute. The GRU has two gates (update and relevance), while the LSTM has three (update, forget, and output). If you have to pick one, the LSTM is the usual default choice.

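For comparison, the LSTM equations from the lecture (written out since the slides are gone); here a^{<t>} and c^{<t>} are kept separate, and the three gates each get their own parameters:

    \tilde{c}^{<t>} = \tanh(W_c [a^{<t-1>}, x^{<t>}] + b_c)
    \Gamma_u = \sigma(W_u [a^{<t-1>}, x^{<t>}] + b_u)    (update gate)
    \Gamma_f = \sigma(W_f [a^{<t-1>}, x^{<t>}] + b_f)    (forget gate)
    \Gamma_o = \sigma(W_o [a^{<t-1>}, x^{<t>}] + b_o)    (output gate)
    c^{<t>} = \Gamma_u * \tilde{c}^{<t>} + \Gamma_f * c^{<t-1>}
    a^{<t>} = \Gamma_o * \tanh(c^{<t>})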

BRNN - Bidirectional RNN

Some problems need a BRNN to handle. For example, in name entity recognition you cannot tell from "He said, 'Teddy ...'" alone whether Teddy is a person's name; "Teddy bears are on sale" and "Teddy Roosevelt was a great President" only differ in the words that come after "Teddy", so information has to flow backwards through the sequence too.


In practice, the combination BRNN + LSTM is the most commonly used. One drawback: a BRNN needs the entire input sequence before it can make any prediction, which is a problem for real-time applications such as speech recognition.

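In lecture notation (written out since the slide is gone), the prediction at each step combines an activation from the left-to-right pass and one from the right-to-left pass:

    \hat{y}^{<t>} = g(W_y [\overrightarrow{a}^{<t>}, \overleftarrow{a}^{<t>}] + b_y)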

Deep RNNs

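Nothing of the slide survived; for the record, stacking recurrent layers means the activation of layer l at time t takes input from the same layer at the previous time step and from the layer below at the current step:

    a^{[l]<t>} = g(W_a^{[l]} [a^{[l]<t-1>}, a^{[l-1]<t>}] + b_a^{[l]})

Because of the extra temporal dimension, deep RNNs rarely stack more than about three recurrent layers.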

Questions:

1. I haven't understood the concept of a gate.

2. I haven't understood the LSTM.

3. One-hot vector: a vector with a single 1 and all other entries 0.
