Bilibili uploader 刘二大人, "PyTorch深度学习实践" (PyTorch Deep Learning Practice) complete playlist
Gradient Descent — notes and code
In practice, the loss surface of a deep neural network does not actually contain many local minima. What it does contain are saddle points, where the derivative is 0, so the weights stop updating.
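A minimal standalone sketch (not from the course) of why a zero gradient stalls the update: at the saddle point of f(x, y) = x² − y², the gradient vanishes, so a descent step that starts there never moves:

```python
# gradient of f(x, y) = x**2 - y**2; it is (0, 0) at the saddle point (0, 0)
def grad(x, y):
    return 2 * x, -2 * y

x, y = 0.0, 0.0          # start exactly at the saddle point
for _ in range(100):
    gx, gy = grad(x, y)
    x -= 0.01 * gx       # w -= lr * 0 leaves w unchanged
    y -= 0.01 * gy

print(x, y)              # still (0.0, 0.0): no progress is ever made
```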
After computing the gradient, update w in the direction opposite to the derivative; this decreases cost(w) (illustrated by a figure in the original lecture, omitted here):
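Written out for the linear model ŷ = x·w used below (same symbols as the code, with α the learning rate):

```latex
\text{cost}(w) = \frac{1}{N}\sum_{n=1}^{N} (x_n w - y_n)^2,
\qquad
\frac{\partial\,\text{cost}}{\partial w} = \frac{1}{N}\sum_{n=1}^{N} 2\,x_n\,(x_n w - y_n),
\qquad
w \leftarrow w - \alpha\,\frac{\partial\,\text{cost}}{\partial w}
```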
Gradient Descent
import matplotlib.pyplot as plt

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = 1.0  # initial weight

def forward(x):
    return x * w

# cost(w): for the current w, compute y_pred for each sample, accumulate the
# per-sample losses, then divide the sum by len(x_data)
def costfunction(xs, ys):
    cost = 0
    for x, y in zip(xs, ys):
        y_pred = forward(x)
        loss = (y_pred - y) ** 2
        cost += loss
    return cost / len(xs)

# for the current w, apply the derivative formula of cost(w) to each sample,
# accumulate, and divide by len(x_data) to get the gradient
def gradient(xs, ys):
    grad = 0
    for x, y in zip(xs, ys):
        grad += 2 * x * (x * w - y)
    return grad / len(xs)

# lists for the final visualization
epoch_list = []
cost_list = []

# iterate 100 times
for epoch in range(100):
    cost_val = costfunction(x_data, y_data)  # compute cost
    grad_val = gradient(x_data, y_data)      # compute the current gradient
    w -= 0.01 * grad_val                     # gradient descent step
    print("epoch", epoch, "w=", w, "cost=", cost_val)
    epoch_list.append(epoch)                 # record one data point per iteration
    cost_list.append(cost_val)

plt.plot(epoch_list, cost_list)
plt.ylabel('cost_val')
plt.xlabel('epoch')
plt.show()
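As a quick standalone sanity check (a condensed repeat of the loop above, without plotting), the learned weight converges toward the true slope of the data, w = 2:

```python
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]  # y = 2x, so the optimal weight is w = 2
w = 1.0

for epoch in range(100):
    # batch gradient: average of 2x(xw - y) over all samples
    grad = sum(2 * x * (x * w - y) for x, y in zip(x_data, y_data)) / len(x_data)
    w -= 0.01 * grad      # gradient descent step with learning rate 0.01

print(round(w, 4))        # very close to 2.0
```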
Stochastic Gradient Descent
cost(w) averages over all samples; here loss(w) uses a single sample instead. When training gets stuck at a saddle point, the random noise contributed by individual samples can push the weight forward and past the saddle point.
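Per sample, the update becomes (α is the learning rate, 0.001 in the code below):

```latex
\text{loss}(w) = (x_n w - y_n)^2,
\qquad
w \leftarrow w - \alpha \cdot 2\,x_n\,(x_n w - y_n)
```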
import matplotlib.pyplot as plt

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = 1.0  # initial weight

def forward(x):
    return x * w

# loss for a single sample
def lossfunction(x, y):
    y_pred = forward(x)
    loss = (y_pred - y) ** 2
    return loss

# gradient of the single-sample loss with respect to w
def gradient(x, y):
    grad = 2 * x * (x * w - y)
    return grad

epoch_list = []
loss_list = []

for epoch in range(200):
    for x, y in zip(x_data, y_data):
        loss_val = lossfunction(x, y)
        grad_val = gradient(x, y)
        w -= 0.001 * grad_val  # update w after every single sample
    epoch_list.append(epoch)   # record the last sample's loss of each epoch
    loss_list.append(loss_val)
    print('epoch=', epoch, 'loss=', loss_val, 'w=', w)

plt.plot(epoch_list, loss_list)
plt.ylabel('loss_val')
plt.xlabel('epoch')
plt.show()
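The same sanity check for the stochastic version (a condensed standalone repeat of the loop above, without plotting): updating after every sample also drives w toward 2:

```python
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]  # y = 2x, so the optimal weight is w = 2
w = 1.0

for epoch in range(200):
    for x, y in zip(x_data, y_data):
        w -= 0.001 * 2 * x * (x * w - y)  # per-sample SGD step

print(round(w, 3))  # close to 2.0
```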