nn.Dropout

Dropout

torch.nn.Dropout(p=0.5, inplace=False)

  • p – probability of an element to be zeroed. Default: 0.5
  • inplace – If set to True, will do this operation in-place. Default: False

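During training, each element of the input is randomly set to 0 with probability p. For example, p=1 zeroes every element. Note that dropout acts on the input tensor passed to the layer (the activations), not on the network's weights.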

During training, randomly zeroes some of the elements of the input tensor with probability p using samples from a Bernoulli distribution. Each channel will be zeroed out independently on every forward call.
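As a quick check of the behaviour described above, the following sketch looks at the fraction of zeroed elements, at whether two forward calls sample different masks, and at the edge cases p=1 and p=0. The tensor size and the value p=0.3 are arbitrary choices made for illustration only.

```python
import torch
import torch.nn as nn

x = torch.ones(100_000)

# Roughly 30% of the elements should be zeroed when p=0.3.
drop = nn.Dropout(p=0.3)
out1 = drop(x)
out2 = drop(x)
zero_fraction = (out1 == 0).double().mean().item()
print("zeroed fraction:", zero_fraction)

# A fresh mask is sampled on every forward call.
masks_differ = not torch.equal(out1, out2)
print("masks differ between calls:", masks_differ)

# Edge cases: p=1 zeroes everything, p=0 leaves the input unchanged.
all_dropped = nn.Dropout(p=1.0)(x)
none_dropped = nn.Dropout(p=0.0)(x)
print("p=1 nonzero count:", int((all_dropped != 0).sum()))
print("p=0 unchanged:", torch.equal(none_dropped, x))
```

The zeroed fraction is only approximately p, because every element is dropped independently.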

**Note:** The PyTorch documentation also points out that the output is scaled by $\frac{1}{1-p}$:

Furthermore, the outputs are scaled by a factor of $\frac{1}{1-p}$ during training. This means that during evaluation the module simply computes an identity function.
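The reason for the scaling is that each element is kept with probability $1-p$ and multiplied by $\frac{1}{1-p}$, so its expected value stays equal to the input. The sketch below re-implements this idea by hand to make it concrete. `manual_dropout` is a hypothetical helper written for this illustration, not PyTorch's actual implementation.

```python
import torch

def manual_dropout(x, p=0.5, training=True):
    """Illustrative re-implementation; not PyTorch's actual code."""
    if not training or p == 0.0:
        return x                       # evaluation mode: identity
    if p == 1.0:
        return torch.zeros_like(x)     # every element is dropped
    # keep_mask is 1 with probability 1 - p and 0 with probability p
    keep_mask = torch.bernoulli(torch.full_like(x, 1.0 - p))
    return x * keep_mask / (1.0 - p)   # rescale the surviving elements

x = torch.full((200_000,), 3.0, dtype=torch.float64)
y_train = manual_dropout(x, p=0.5, training=True)
y_eval = manual_dropout(x, p=0.5, training=False)

print("distinct values after dropout:", torch.unique(y_train).tolist())
print("mean before:", x.mean().item(), "mean after:", y_train.mean().item())
print("eval is identity:", torch.equal(y_eval, x))
```

With p=0.5 every surviving element is doubled, while the mean of the output stays close to the mean of the input.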

The example below shows that with p = 0.5, the elements that survive dropout become $\frac{1}{1-p} = 2$ times their original values:

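import torch
import torch.nn as nn

input = torch.tensor([[1, 2, 3],
                      [4, 5, 6],
                      [7, 8, 9]], dtype=torch.float64)
input = torch.unsqueeze(input, 0)  # add a leading dimension: shape (1, 3, 3)
m = nn.Dropout(p=0.5)  # a newly created module is in training mode
output = m(input)  # the mask is random, so the zeroed positions vary between runs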

print("input: ", input)
print("output: ", output)
print("input: ", input)
'''
input:  
tensor([[[1., 2., 3.],
         [4., 5., 6.],
         [7., 8., 9.]]], dtype=torch.float64)
output:  
tensor([[[ 2.,  4.,  0.],
         [ 0., 10., 12.],
         [ 0., 16.,  0.]]], dtype=torch.float64)
input:  
tensor([[[1., 2., 3.],
         [4., 5., 6.],
         [7., 8., 9.]]], dtype=torch.float64)
'''
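The output above comes from a module in training mode, which is the default for a newly constructed module. As a small sketch of the evaluation behaviour mentioned earlier, switching the module with `eval()` should turn dropout into an identity function, and `train()` switches it back. The tensor used here is an arbitrary example.

```python
import torch
import torch.nn as nn

x = torch.arange(1.0, 10.0).reshape(1, 3, 3)
m = nn.Dropout(p=0.5)

m.eval()                       # evaluation mode: dropout is disabled
eval_out = m(x)
print("eval output equals input:", torch.equal(eval_out, x))

m.train()                      # back to training mode
train_out = m(x)
# In training mode with p=0.5, every element is either 0 or twice the input.
valid = ((train_out == 0) | (train_out == 2 * x)).all().item()
print("train output is 0 or 2x input:", valid)
```

In a full model, calling `model.eval()` on the parent module sets every dropout layer inside it to evaluation mode as well.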

When nn.Dropout is created with inplace=True, the result overwrites the original input, as follows:

input = torch.tensor([[1, 2, 3],
                      [4, 5, 6],
                      [7, 8, 9]], dtype=torch.float64)
input = torch.unsqueeze(input, 0)
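m = nn.Dropout(p=0.5, inplace=True)

print("input: ", input)  # print before the call, because dropout will overwrite input
output = m(input)
print("output: ", output)
print("input: ", input)  # input now holds the same values as output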
'''
input:  
tensor([[[1., 2., 3.],
         [4., 5., 6.],
         [7., 8., 9.]]], dtype=torch.float64)
output:  
tensor([[[ 2.,  4.,  0.],
         [ 0., 10., 12.],
         [ 0., 16.,  0.]]], dtype=torch.float64)
input:  
tensor([[[ 2.,  4.,  0.],
         [ 0., 10., 12.],
         [ 0., 16.,  0.]]], dtype=torch.float64)
'''
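Because the in-place version overwrites its input, the original values are lost unless a copy is made first. The following sketch illustrates this under the assumption that the original tensor is still needed afterwards. The names `original` and `backup` are made up for this example.

```python
import torch
import torch.nn as nn

original = torch.arange(1.0, 10.0).reshape(1, 3, 3)
backup = original.clone()             # independent copy made before dropout

m = nn.Dropout(p=0.5, inplace=True)
output = m(original)

# With inplace=True the result lives in the same memory as the input.
shares_memory = output.data_ptr() == original.data_ptr()
print("output shares memory with input:", shares_memory)
print("input now equals output:", torch.equal(original, output))

# The clone is unaffected and still holds the values 1..9.
expected = torch.arange(1.0, 10.0).reshape(1, 3, 3)
print("backup unchanged:", torch.equal(backup, expected))
```

In-place dropout avoids allocating a second tensor, at the cost of destroying the input, so it is only appropriate when the input is not used again.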