Reference link: torch.Tensor.requires_grad
Original documentation:
requires_grad
Attribute: requires_grad (note: this is an attribute, not a method; calling it as requires_grad() raises a TypeError, as the transcript below shows)
Is True if gradients need to be computed for this Tensor, False otherwise.
Note:
The fact that gradients need to be computed for a Tensor does not mean that the grad attribute will be populated; see is_leaf for more details.
(Translator's note: if a tensor has requires_grad=True, its gradient is computed during the backward pass triggered by backward(), but the result is not necessarily kept in the tensor's grad attribute. Only for leaf tensors with requires_grad=True is the gradient kept in grad; for non-leaf tensors, i.e. intermediate results in the graph, the gradient buffer is freed after the backward pass to use memory more efficiently.)
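This behavior can be demonstrated compactly. The following is a minimal sketch (the tensor names are illustrative, not from the original): a leaf tensor with requires_grad=True keeps its gradient in grad after backward(), while an intermediate tensor's grad stays None unless retain_grad() is called on it beforehand.

import torch

x = torch.randn(3, 5, requires_grad=True)  # leaf tensor
y = x.mean()                               # non-leaf (intermediate) tensor
y.retain_grad()                            # opt in to keeping y's gradient
z = y * 2.0                                # another non-leaf tensor
z.backward()

print(x.is_leaf, y.is_leaf, z.is_leaf)     # True False False
print(x.grad.shape)                        # torch.Size([3, 5]): leaf gradient is kept
print(y.grad)                              # tensor(2.): kept only because of retain_grad()
print(z.grad)                              # None: intermediate gradient was freed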
Experiment transcript:
Microsoft Windows [Version 10.0.18363.1316]
(c) 2019 Microsoft Corporation. All rights reserved.
C:\Users\chenxuqi>conda activate ssd4pytorch1_2_0
(ssd4pytorch1_2_0) C:\Users\chenxuqi>python
Python 3.7.7 (default, May 6 2020, 11:45:54) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.manual_seed(seed=20200910)
<torch._C.Generator object at 0x00000151BC7DD330>
>>>
>>> a = torch.randn(3,5)
>>> a
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]])
>>> a.requires_grad
False
>>> a.requires_grad()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'bool' object is not callable
>>>
>>> a.requires_grad_()
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]], requires_grad=True)
>>> a
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]], requires_grad=True)
>>> a.requires_grad
True
>>> a.requires_grad_()
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]], requires_grad=True)
>>> a.requires_grad_(False)
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]])
>>> a.requires_grad_()
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]], requires_grad=True)
>>> a.requires_grad_(requires_grad=False)
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]])
>>> a
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]])
>>> a.requires_grad_(requires_grad=False)
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]])
>>> a
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]])
>>> a.requires_grad
False
>>>
>>>
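As the transcript above shows, requires_grad is a plain attribute (calling it as a.requires_grad() raises a TypeError), while requires_grad_() is its in-place setter, with requires_grad=True as the default argument. For completeness, a short sketch of the equivalent ways to enable gradient tracking (variable names are illustrative):

import torch

a = torch.randn(3, 5, requires_grad=True)   # set at construction time
b = torch.randn(3, 5)
b.requires_grad_()                           # in-place setter; defaults to True
c = torch.randn(3, 5)
c.requires_grad = True                       # direct attribute assignment works on leaf tensors
print(a.requires_grad, b.requires_grad, c.requires_grad)  # True True True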
Experiment transcript (fresh Python session): retaining a non-leaf tensor's gradient so its memory is not freed.
>>> import torch
>>> torch.manual_seed(seed=20200910)
<torch._C.Generator object at 0x000001CDB4A5D330>
>>> data_in = torch.randn(3,5,requires_grad=True)
>>> data_in
tensor([[ 0.2824, -0.3715, 0.9088, -1.7601, -0.1806],
[ 2.0937, 1.0406, -1.7651, 1.1216, 0.8440],
[ 0.1783, 0.6859, -1.5942, -0.2006, -0.4050]], requires_grad=True)
>>> data_mean = data_in.mean()
>>> data_mean
tensor(0.0585, grad_fn=<MeanBackward0>)
>>> data_in.requires_grad
True
>>> data_mean.requires_grad
True
>>> data_1 = data_mean * 20200910.0
>>> data_1
tensor(1182591., grad_fn=<MulBackward0>)
>>> data_2 = data_1 * 15.0
>>> data_2
tensor(17738864., grad_fn=<MulBackward0>)
>>> data_2.retain_grad()
>>> data_3 = 2 * (data_2 + 55.0)
>>> loss = data_3 / 2.0 +89.2
>>> loss
tensor(17739010., grad_fn=<AddBackward0>)
>>>
>>> data_in.grad
>>> data_mean.grad
>>> data_1.grad
>>> data_2.grad
>>> data_3.grad
>>> loss.grad
>>> print(data_in.grad, data_mean.grad, data_1.grad, data_2.grad, data_3.grad, loss.grad)
None None None None None None
>>>
>>> loss.backward()
>>> data_in.grad
tensor([[20200910., 20200910., 20200910., 20200910., 20200910.],
[20200910., 20200910., 20200910., 20200910., 20200910.],
[20200910., 20200910., 20200910., 20200910., 20200910.]])
>>> data_mean.grad
>>> data_1.grad
>>> data_2.grad
tensor(1.)
>>> data_3.grad
>>> loss.grad
>>>
>>>
>>> print(data_in.grad, data_mean.grad, data_1.grad, data_2.grad, data_3.grad, loss.grad)
tensor([[20200910., 20200910., 20200910., 20200910., 20200910.],
[20200910., 20200910., 20200910., 20200910., 20200910.],
[20200910., 20200910., 20200910., 20200910., 20200910.]]) None None tensor(1.) None None
>>>
>>>
>>> print(data_in.is_leaf, data_mean.is_leaf, data_1.is_leaf, data_2.is_leaf, data_3.is_leaf, loss.is_leaf)
True False False False False False
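Why every element of data_in.grad is 20200910: by the chain rule, dloss/d(data_3) = 1/2 and d(data_3)/d(data_2) = 2, so dloss/d(data_2) = 1, which is exactly the tensor(1.) retained on data_2 via retain_grad(). From there, d(data_2)/d(data_1) = 15, d(data_1)/d(data_mean) = 20200910, and mean() over 15 elements contributes a factor of 1/15 per element, giving 1 * 15 * 20200910 / 15 = 20200910 for every entry. A quick sketch that verifies this (reusing the seed and expression from the transcript):

import torch

torch.manual_seed(seed=20200910)
data_in = torch.randn(3, 5, requires_grad=True)
loss = 2 * (data_in.mean() * 20200910.0 * 15.0 + 55.0) / 2.0 + 89.2
loss.backward()
# each element: 1 * 15 * 20200910 / 15 = 20200910
print(torch.allclose(data_in.grad, torch.full((3, 5), 20200910.0)))  # True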