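Reference link: Usage examples of torch.Tensor.register_hook()
Summary: The code experiment shows that, during backpropagation, gradients are computed in the reverse order of the forward pass. This can be observed directly from the order in which the hook functions run, and it can also be inferred from the order of the entries in the list that stores the gradients.
Code experiment: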
import torch
print(torch.__version__)  # 1.2.0+cu92
torch.manual_seed(seed=20200910)
gradients = list()  # gradients are appended here in the order the hooks fire
# ------------------------------------------ #
def grad_hook_x0(grad):
    print("\nRunning the custom hook function for x0...")
    print("Saving the gradient of x0...")
    gradients.append(grad)
    print("Hook function for x0 finished...\n")
    # return grad
x0 = torch.randn(2,3,4,5,6,7,requires_grad=True)
print('x0.shape:', x0.shape)  # x0.shape: torch.Size([2, 3, 4, 5, 6, 7])
# print('x0:\n',x0)
x0.register_hook(grad_hook_x0)
# ------------------------------------------ #
def grad_hook_x1(grad):
    print("\nRunning the custom hook function for x1...")
    print("Saving the gradient of x1...")
    gradients.append(grad)
    print("Hook function for x1 finished...\n")
    # return grad
x1 = torch.sum((4 * x0 + 18.0), dim=(0,1))
x1.retain_grad()  # keep .grad on this non-leaf tensor
print('x1.shape:', x1.shape)  # x1.shape: torch.Size([4, 5, 6, 7])
# print('x1:\n',x1)
x1.register_hook(grad_hook_x1)
# ------------------------------------------ #
def grad_hook_x2(grad):
    print("\nRunning the custom hook function for x2...")
    print("Saving the gradient of x2...")
    gradients.append(grad)
    print("Hook function for x2 finished...\n")
    # return grad
x2 = torch.sum(x1, dim=(1,2)) * 10.0
x2.retain_grad()
print('x2.shape:', x2.shape)  # x2.shape: torch.Size([4, 7])
# print('x2:\n',x2)
x2.register_hook(grad_hook_x2)
# ------------------------------------------ #
def grad_hook_loss(grad):
    print("\nRunning the custom hook function for loss...")
    print("Saving the gradient of loss...")
    gradients.append(grad)
    print("Hook function for loss finished...\n")
    # return grad
loss = torch.mean(x2)
loss.retain_grad()
print('loss.shape:', loss.shape)  # loss.shape: torch.Size([])
print('loss:',loss)  # loss: tensor(32403.7344, grad_fn=<MeanBackward0>)
loss.register_hook(grad_hook_loss)
# ------------------------------------------ #
loss.backward()  # this line triggers the registered hook functions
# Listed in the expected backward order: last forward result first
tensors_list = [loss, x2, x1, x0]
print('Info: length of the gradients list:', len(gradients))
print('Info: length of tensors_list:', len(tensors_list))
for g, t in zip(gradients, tensors_list):
    print(torch.equal(g, t.grad), g.shape==t.grad.shape==t.shape, g.shape, t.grad.shape, t.shape)
Console output:
1.2.0+cu92
x0.shape: torch.Size([2, 3, 4, 5, 6, 7])
x1.shape: torch.Size([4, 5, 6, 7])
x2.shape: torch.Size([4, 7])
loss.shape: torch.Size([])
loss: tensor(32403.7344, grad_fn=<MeanBackward0>)
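Running the custom hook function for loss...
Saving the gradient of loss...
Hook function for loss finished...
Running the custom hook function for x2...
Saving the gradient of x2...
Hook function for x2 finished...
Running the custom hook function for x1...
Saving the gradient of x1...
Hook function for x1 finished...
Running the custom hook function for x0...
Saving the gradient of x0...
Hook function for x0 finished...
Info: length of the gradients list: 4
Info: length of tensors_list: 4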
True True torch.Size([]) torch.Size([]) torch.Size([])
True True torch.Size([4, 7]) torch.Size([4, 7]) torch.Size([4, 7])
True True torch.Size([4, 5, 6, 7]) torch.Size([4, 5, 6, 7]) torch.Size([4, 5, 6, 7])
True True torch.Size([2, 3, 4, 5, 6, 7]) torch.Size([2, 3, 4, 5, 6, 7]) torch.Size([2, 3, 4, 5, 6, 7])
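The four comparison lines only confirm that each gradient captured by a hook equals the corresponding `.grad`, and that the capture order is loss, x2, x1, x0. The gradient values themselves are not printed. As a rough sketch, they can be worked out by hand with the chain rule, because in this experiment every element feeds exactly one element of the next tensor with a constant coefficient. The variable names below are illustrative and do not appear in the original script.

```python
from fractions import Fraction

# Shape taken from the experiment: x2.shape == [4, 7]
x2_numel = 4 * 7

# Walk the chain rule in the same order the hooks fired: loss -> x2 -> x1 -> x0
grad_loss = Fraction(1)          # d(loss)/d(loss)
grad_x2 = grad_loss / x2_numel   # loss = mean(x2) averages over 28 elements
grad_x1 = grad_x2 * 10           # x2 = sum(x1, dim=(1, 2)) * 10.0
grad_x0 = grad_x1 * 4            # x1 = sum(4 * x0 + 18.0, dim=(0, 1))

for name, g in [("loss", grad_loss), ("x2", grad_x2), ("x1", grad_x1), ("x0", grad_x0)]:
    print(name, g, float(g))
```

Under this reasoning, every entry of a given gradient tensor holds the same constant, so the tensors stored in `gradients` should differ only in shape and in that constant.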
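Each hook in the experiment contains a commented-out `# return grad` line. A hook registered with `register_hook()` may return a tensor, and that tensor is then used in place of the original gradient for the rest of the backward pass. The following is a minimal sketch of that behavior, not part of the original experiment; the names `double_grad`, `x`, and `y` are made up for illustration.

```python
import torch

def double_grad(grad):
    # Returning a tensor replaces the gradient that keeps flowing backward
    return grad * 2.0

x = torch.ones(3, requires_grad=True)
y = x * 3.0                   # dy/dx = 3
y.register_hook(double_grad)  # runs when the gradient w.r.t. y is computed
loss = y.sum()                # dloss/dy = 1
loss.backward()

# Without the hook x.grad would be 3 everywhere; the hook doubles the
# gradient at y, so x receives 3 * 2 = 6
print(x.grad)  # tensor([6., 6., 6.])
```

Because the hooks in the experiment return nothing, they only observe the gradients and leave them unchanged, which is consistent with the `True` results printed by `torch.equal`.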