import tensorflow as tf

w = tf.Variable([[1.0]])
with tf.GradientTape() as t:
    loss = w * w
dw = t.gradient(loss, w)  # d(w*w)/dw = 2w
dw
This computes the derivative of w * w at w = 1.0.
Output:
<tf.Tensor: id=39, shape=(1, 1), dtype=float32, numpy=array([[2.]], dtype=float32)>
Calling gradient a second time:
dw = t.gradient(loss, w)
raises: RuntimeError: GradientTape.gradient can only be called once on non-persistent tapes.
Note two points:
1. To differentiate with respect to w, w must be a floating-point type (see the sketch after this list).
2. By default, a GradientTape releases its resources as soon as dw = t.gradient(loss, w) returns, which is why the second call above fails.
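A minimal sketch of point 1, assuming TensorFlow 2.x eager execution (the names w_int, t_int, and loss_int are mine): with an integer variable the tape has nothing differentiable to record, so gradient comes back as None and TensorFlow prints a warning that the dtype must be floating.

w_int = tf.Variable(2)               # int32 -- not a differentiable dtype
with tf.GradientTape() as t_int:
    loss_int = w_int * w_int
print(t_int.gradient(loss_int, w_int))  # None, with a dtype warning, instead of a tensor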
To compute several gradients from the same tape, set GradientTape's persistent parameter:
w = tf.constant(3.0)
with tf.GradientTape(persistent=True) as t:
    t.watch(w)  # w is a constant, not a Variable, so it must be watched explicitly
    y = w * w
    z = y * y
dy_dw = t.gradient(y, w)  # 2w = 6.0
print(dy_dw)
dz_dw = t.gradient(z, w)  # 4w**3 = 108.0
print(dz_dw)
tf.Tensor(6.0, shape=(), dtype=float32)
tf.Tensor(108.0, shape=(), dtype=float32)
Because the tape is persistent, its resources are not released after dy_dw = t.gradient(y, w) and dz_dw = t.gradient(z, w), so the tape can be queried again.
Computing it again:
dz_dw = t.gradient(z, w)
print(dz_dw)
Output: tf.Tensor(108.0, shape=(), dtype=float32)
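Since a persistent tape keeps its resources alive, it is good practice to drop the reference once you are done so they can be garbage-collected. A short sketch (the intermediate gradient dz_dy is my own addition): the tape can also differentiate with respect to values computed inside it.

dz_dy = t.gradient(z, y)  # 2y = 18.0; works because y was recorded inside the tape
print(dz_dy)              # tf.Tensor(18.0, shape=(), dtype=float32)
del t                     # drop the reference so the persistent tape's resources can be freed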