The backward function in PyTorch

backward computes gradients by reverse-mode differentiation using the chain rule. When the output y is not a scalar, backward needs an extra grad_tensors argument whose shape must match the shape of y.

import torch
from torch.autograd import Variable  # Variable is a legacy wrapper; plain tensors with requires_grad=True behave the same in recent PyTorch

x = Variable(torch.Tensor([16]), requires_grad=True)  # x needs gradients
y = x * x
y.backward()       # y is a single-element tensor, so no gradient argument is needed
print(x.grad)

Output: tensor([32.])

Since y is a scalar (a single-element tensor), backward needs no extra argument.
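
For a single-element y, backward() implicitly uses a gradient of ones, so the call above is equivalent to passing that gradient explicitly. A minimal sketch of the equivalence:

import torch
x = torch.tensor([16.], requires_grad=True)
y = x * x
y.backward(torch.ones_like(y))  # same as y.backward() when y has a single element
print(x.grad)                   # tensor([32.])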

import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([1, 16]), requires_grad=True)  # x needs gradients
y = x * x
y.backward()       # y now has two elements, so this raises an error
print(x.grad)

    raise RuntimeError("grad can be implicitly created only for scalar outputs")
RuntimeError: grad can be implicitly created only for scalar outputs

Here y = [1, 256], which is not a scalar, so backward raises RuntimeError: grad can be implicitly created only for scalar outputs.
We need to pass a gradient argument, y.backward(torch.Tensor([value, value])); a quick fix of the example above is sketched below. After that, let's look at what this argument actually means.
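
A minimal sketch of the failing two-element example with a gradient of ones passed in:

import torch
x = torch.tensor([1., 16.], requires_grad=True)
y = x * x
y.backward(torch.ones_like(y))  # the gradient argument has the same shape as y
print(x.grad)                   # tensor([ 2., 32.])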

import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([1, 5, 6, 10, 16]), requires_grad=True)  # x needs gradients
y = x * x

weights1 = torch.ones(5)   # gradient argument, same shape as y
y.backward(weights1, retain_graph=True)
print(x.grad)

Output: tensor([ 2., 10., 12., 20., 32.])

Here we differentiate y = x*x, so dy/dx = 2x. weights1 has 5 elements, the same shape as y, and because weights1 is applied we get
x.grad = 2 * x * weights1 = [1*2*1, 5*2*1, 6*2*1, 10*2*1, 16*2*1] = [2., 10., 12., 20., 32.]
In other words, weights1 assigns a weight to each component of the gradient. A short equivalence sketch follows, and then another example with non-uniform weights makes the weighting clearer.
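
One way to read the weight tensor: y.backward(w) computes the same gradient as reducing y to the scalar sum of w*y and differentiating that. A minimal sketch of the equivalence:

import torch
x = torch.tensor([1., 5., 6., 10., 16.], requires_grad=True)
y = x * x
w = torch.ones(5)
(w * y).sum().backward()  # scalar loss: sum_i w_i * y_i
print(x.grad)             # tensor([ 2., 10., 12., 20., 32.]), same as y.backward(w)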

import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([1, 5, 6, 10, 16]), requires_grad=True)  # x needs gradients
y = x * x

weights1 = torch.Tensor([1, 0.1, 0.1, 0.5, 0.1])  # a different weight per element
y.backward(weights1, retain_graph=True)
print(x.grad)

Output: tensor([ 2.0000,  1.0000,  1.2000, 10.0000,  3.2000])

Here again dy/dx = 2x, but now weights1 puts a different weight on each component of x:
x.grad = 2 * x * weights1 = [1*2*1, 5*2*0.1, 6*2*0.1, 10*2*0.5, 16*2*0.1] = [2*1, 10*0.1, 12*0.1, 20*0.5, 32*0.1]
       = [2.0000, 1.0000, 1.2000, 10.0000, 3.2000]
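
If you only want this weighted gradient and do not want it written into x.grad, torch.autograd.grad computes the same vector-Jacobian product and returns it instead:

import torch
x = torch.tensor([1., 5., 6., 10., 16.], requires_grad=True)
y = x * x
weights1 = torch.tensor([1., 0.1, 0.1, 0.5, 0.1])
grads = torch.autograd.grad(y, x, grad_outputs=weights1)  # returns a tuple, x.grad stays None
print(grads[0])  # tensor([ 2.0000,  1.0000,  1.2000, 10.0000,  3.2000])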

backward also has a retain_graph parameter. With retain_graph=True the computation graph is kept after the backward pass, so backward can be called again; each additional call adds its result to the gradients already stored in x.grad.

import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([1, 5, 6, 10, 16]), requires_grad=True)  # x needs gradients
y = x * x

weights0 = torch.ones(5)
y.backward(weights0, retain_graph=True)   # first call
print(x.grad)

weights1 = torch.FloatTensor([0.1, 0.1, 0.1, 0.1, 0.1])
y.backward(weights1, retain_graph=True)   # second call, accumulates into x.grad
print(x.grad)

weights2 = torch.FloatTensor([0.5, 0.1, 0.1, 0.1, 0.2])
y.backward(weights2, retain_graph=True)   # third call, accumulates again
print(x.grad)

Output:
tensor([ 2., 10., 12., 20., 32.])
tensor([ 2.2000, 11.0000, 13.2000, 22.0000, 35.2000])
tensor([ 3.2000, 12.0000, 14.4000, 24.0000, 41.6000])

First call with weights0: x.grad = 2 * x * weights0 = [2., 10., 12., 20., 32.]
Because retain_graph=True keeps the computation graph, backward can be called again, and the new gradient is added to the stored one.
Second call with weights1: x.grad = [2., 10., 12., 20., 32.] + 2 * x * weights1 = [2., 10., 12., 20., 32.] + [0.2, 1., 1.2, 2., 3.2] = [2.2, 11., 13.2, 22., 35.2]
Third call with weights2: x.grad = [2.2, 11., 13.2, 22., 35.2] + 2 * x * weights2 = [2.2, 11., 13.2, 22., 35.2] + [1., 1., 1.2, 2., 6.4] = [3.2, 12., 14.4, 24., 41.6]
So with retain_graph the graph survives, and each subsequent backward call accumulates into x.grad.
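Note that the accumulation comes from x.grad itself, not from retain_graph: retain_graph only keeps the computation graph alive so backward can be called again. A minimal sketch that resets the gradient between calls:

import torch
x = torch.tensor([1., 5., 6., 10., 16.], requires_grad=True)
y = x * x

y.backward(torch.ones(5), retain_graph=True)
print(x.grad)              # tensor([ 2., 10., 12., 20., 32.])

x.grad.zero_()             # clear the accumulated gradient
y.backward(torch.ones(5))  # last call, so the graph may be freed now
print(x.grad)              # tensor([ 2., 10., 12., 20., 32.]) again, no accumulation
# a further y.backward(...) here would raise a RuntimeError, because the graph
# was freed when retain_graph was not set on the previous call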
Reference: https://pytorch.org/docs/stable/autograd.html?highlight=backward#torch.autograd.backward


Reposted from blog.csdn.net/shiheyingzhe/article/details/83054238