Reference link: torch.Tensor.requires_grad
Original documentation:
requires_grad
Attribute: requires_grad (note: this is an attribute, not a method; calling it as requires_grad() raises a TypeError, as the transcript below shows)
Is True if gradients need to be computed for this Tensor, False otherwise.
Note:
The fact that gradients need to be computed for a Tensor does not mean that the grad attribute will be populated; see is_leaf for more details.
(Translator's note: if a tensor has requires_grad=True, its gradient is computed during the backward pass triggered by backward(), but the result is not necessarily kept in the tensor's grad attribute. Only for leaf tensors with requires_grad=True is the gradient kept in grad; for non-leaf tensors, i.e. intermediate results in the graph, the gradient buffer is freed after the backward pass to use memory more efficiently.)
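This behavior can be demonstrated compactly. The following is a minimal sketch (the tensor names are illustrative, not from the original): a leaf tensor with requires_grad=True keeps its gradient in grad after backward(), while an intermediate tensor's grad stays None unless retain_grad() is called on it beforehand.

import torch

x = torch.randn(3, 5, requires_grad=True)  # leaf tensor
y = x.mean()                               # non-leaf (intermediate) tensor
y.retain_grad()                            # opt in to keeping y's gradient
z = y * 2.0                                # another non-leaf tensor
z.backward()

print(x.is_leaf, y.is_leaf, z.is_leaf)     # True False False
print(x.grad.shape)                        # torch.Size([3, 5]): leaf gradient is kept
print(y.grad)                              # tensor(2.): kept only because of retain_grad()
print(z.grad)                              # None: intermediate gradient was freed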
Experiment transcript:
Microsoft Windows [Version 10.0.18363.1316]
(c) 2019 Microsoft Corporation. All rights reserved.
C:\Users\chenxuqi>conda activate ssd4pytorch1_2_0
(ssd4pytorch1_2_0) C:\Users\chenxuqi>python
Python 3.7.7 (default, May 6 2020, 11:45:54) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.manual_seed(seed=20200910)
<torch._C.Generator object at 0x00000151BC7DD330>
>>>
>>> a = torch.randn(3,5)
>>> a
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]])
>>> a.requires_grad
False
>>> a.requires_grad()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'bool' object is not callable
>>>
>>> a.requires_grad_()
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]], requires_grad=True)
>>> a
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]], requires_grad=True)
>>> a.requires_grad
True
>>> a.requires_grad_()
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]], requires_grad=True)
>>> a.requires_grad_(False)
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]])
>>> a.requires_grad_()
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]], requires_grad=True)
>>> a.requires_grad_(requires_grad=False)
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]])
>>> a
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]])
>>> a.requires_grad_(requires_grad=False)
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]])
>>> a
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]])
>>> a.requires_grad
False
>>>
>>>
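As the transcript above shows, requires_grad is a plain attribute (calling it as a.requires_grad() raises a TypeError), while requires_grad_() is its in-place setter, with requires_grad=True as the default argument. For completeness, a short sketch of the equivalent ways to enable gradient tracking (variable names are illustrative):

import torch

a = torch.randn(3, 5, requires_grad=True)   # set at construction time
b = torch.randn(3, 5)
b.requires_grad_()                           # in-place setter; defaults to True
c = torch.randn(3, 5)
c.requires_grad = True                       # direct attribute assignment works on leaf tensors
print(a.requires_grad, b.requires_grad, c.requires_grad)  # True True True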
Experiment transcript (fresh Python session): retaining a non-leaf tensor's gradient so its memory is not freed.
>>> import torch
>>> torch.manual_seed(seed=20200910)
<torch._C.Generator object at 0x000001CDB4A5D330>
>>> data_in = torch.randn(3,5,requires_grad=True)
>>> data_in
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]], requires_grad=True)
>>> data_mean = data_in.mean()
>>> data_mean
tensor(0.0585, grad_fn=<MeanBackward0>)
>>> data_in.requires_grad
True
>>> data_mean.requires_grad
True
>>> data_1 = data_mean * 20200910.0
>>> data_1
tensor(1182591., grad_fn=<MulBackward0>)
>>> data_2 = data_1 * 15.0
>>> data_2
tensor(17738864., grad_fn=<MulBackward0>)
>>> data_2.retain_grad()
>>> data_3 = 2 * (data_2 + 55.0)
>>> loss = data_3 / 2.0 +89.2
>>> loss
tensor(17739010., grad_fn=<AddBackward0>)
>>>
>>> data_in.grad
>>> data_mean.grad
>>> data_1.grad
>>> data_2.grad
>>> data_3.grad
>>> loss.grad
>>> print(data_in.grad, data_mean.grad, data_1.grad, data_2.grad, data_3.grad, loss.grad)
None None None None None None
>>>
>>> loss.backward()
>>> data_in.grad
tensor([[20200910., 20200910., 20200910., 20200910., 20200910.],
[20200910., 20200910., 20200910., 20200910., 20200910.],
[20200910., 20200910., 20200910., 20200910., 20200910.]])
>>> data_mean.grad
>>> data_1.grad
>>> data_2.grad
tensor(1.)
>>> data_3.grad
>>> loss.grad
>>>
>>>
>>> print(data_in.grad, data_mean.grad, data_1.grad, data_2.grad, data_3.grad, loss.grad)
tensor([[20200910., 20200910., 20200910., 20200910., 20200910.],
[20200910., 20200910., 20200910., 20200910., 20200910.],
[20200910., 20200910., 20200910., 20200910., 20200910.]]) None None tensor(1.) None None
>>>
>>>
>>> print(data_in.is_leaf, data_mean.is_leaf, data_1.is_leaf, data_2.is_leaf, data_3.is_leaf, loss.is_leaf)
True False False False False False
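Why every element of data_in.grad is 20200910: by the chain rule, dloss/d(data_3) = 1/2 and d(data_3)/d(data_2) = 2, so dloss/d(data_2) = 1, which is exactly the tensor(1.) retained on data_2 via retain_grad(). From there, d(data_2)/d(data_1) = 15, d(data_1)/d(data_mean) = 20200910, and mean() over 15 elements contributes a factor of 1/15 per element, giving 1 * 15 * 20200910 / 15 = 20200910 for every entry. A quick sketch that verifies this (reusing the seed and expression from the transcript):

import torch

torch.manual_seed(seed=20200910)
data_in = torch.randn(3, 5, requires_grad=True)
loss = 2 * (data_in.mean() * 20200910.0 * 15.0 + 55.0) / 2.0 + 89.2
loss.backward()
# each element: 1 * 15 * 20200910 / 15 = 20200910
print(torch.allclose(data_in.grad, torch.full((3, 5), 20200910.0)))  # True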