Dataset:

$$T = \{(x_1, y_1), (x_2, y_2), \cdots, (x_N, y_N)\}, \quad x_i \in \mathcal{X} \subseteq \mathbb{R}^n, \quad y_i \in \mathcal{Y} = \{0, 1\}$$

Goal: for a given $x$, estimate the corresponding $y$, i.e. $P(y \mid x)$, and in particular $P(y=1 \mid x)$.
Log-odds (logistic) function:

$$P(y=1 \mid x) = \frac{1}{1 + e^{-(w^T x + b)}}$$

Write:

$$\tilde{w} = (w, b)^T, \qquad \tilde{x} = (x, 1)^T$$

Then:

$$D(\tilde{x}) = P(y=1 \mid \tilde{x}) = \frac{1}{1 + e^{-\tilde{w}^T \tilde{x}}} = \frac{e^{\tilde{w}^T \tilde{x}}}{1 + e^{\tilde{w}^T \tilde{x}}}$$
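The two closed forms of $D(\tilde{x})$ above are the same sigmoid function; a quick numpy sanity check (variable names `z`, `form1`, `form2` are mine, standing in for the score $\tilde{w}^T\tilde{x}$):

```python
import numpy as np

z = np.linspace(-5, 5, 11)             # a few test scores w~^T x~
form1 = 1.0 / (1.0 + np.exp(-z))       # 1 / (1 + e^{-z})
form2 = np.exp(z) / (1.0 + np.exp(z))  # e^{z} / (1 + e^{z})
print(np.allclose(form1, form2))       # the two expressions agree
```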
Similarly:

$$1 - D(\tilde{x}) = P(y=0 \mid \tilde{x}) = \frac{1}{1 + e^{\tilde{w}^T \tilde{x}}}$$

Likelihood function:

$$S(\tilde{w} \mid X) = \prod_{i=1}^{N} \left[D(\tilde{x}_i)\right]^{y_i} \left[1 - D(\tilde{x}_i)\right]^{1 - y_i}, \quad y_i \in \{0, 1\}$$

Log-likelihood function:

$$L(\tilde{w}) = \log S(\tilde{w} \mid X)$$

$$\Rightarrow L(\tilde{w}) = \sum_{i=1}^{N} \left[ y_i \log D(\tilde{x}_i) + (1 - y_i) \log(1 - D(\tilde{x}_i)) \right]$$

$$\Rightarrow L(\tilde{w}) = \sum_{i=1}^{N} \left[ y_i \log \frac{D(\tilde{x}_i)}{1 - D(\tilde{x}_i)} + \log(1 - D(\tilde{x}_i)) \right]$$

$$\Rightarrow L(\tilde{w}) = \sum_{i=1}^{N} \left[ y_i (\tilde{w}^T \tilde{x}_i) - \log\left(1 + e^{\tilde{w}^T \tilde{x}_i}\right) \right]$$
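The last simplification relies on $\log D = \tilde{w}^T\tilde{x} - \log(1+e^{\tilde{w}^T\tilde{x}})$ and $\log(1-D) = -\log(1+e^{\tilde{w}^T\tilde{x}})$. A numerical check of the per-sample identity on random scores (a sketch; the names `z`, `y`, `D`, `lhs`, `rhs` are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=20)               # scores w~^T x~_i
y = rng.integers(0, 2, size=20)       # labels in {0, 1}
D = 1.0 / (1.0 + np.exp(-z))          # sigmoid

# per-sample log-likelihood, two equivalent forms
lhs = y * np.log(D) + (1 - y) * np.log(1 - D)
rhs = y * z - np.log(1.0 + np.exp(z))
print(np.allclose(lhs, rhs))
```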
Maximum likelihood estimation:

$$\tilde{w}^* = \arg\max_{\tilde{w}} \sum_{i=1}^{N} \left[ y_i (\tilde{w}^T \tilde{x}_i) - \log\left(1 + e^{\tilde{w}^T \tilde{x}_i}\right) \right]$$
Gradient descent:

Instead of maximizing

$$L(\tilde{w}) = \sum_{i=1}^{N} \left[ y_i \log D(\tilde{x}_i) + (1 - y_i) \log(1 - D(\tilde{x}_i)) \right]$$

let

$$J(\tilde{w}) = -\frac{1}{N} L(\tilde{w})$$

and minimize

$$J(\tilde{w}) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log D(\tilde{x}_i) + (1 - y_i) \log(1 - D(\tilde{x}_i)) \right]$$
$$\frac{\partial J(\tilde{w})}{\partial \tilde{w}_j} = -\frac{1}{N} \sum_{i=1}^{N} \left[ \frac{y_i}{D(\tilde{x}_i)} \frac{\partial D(\tilde{x}_i)}{\partial \tilde{w}_j} - \frac{1 - y_i}{1 - D(\tilde{x}_i)} \frac{\partial D(\tilde{x}_i)}{\partial \tilde{w}_j} \right]$$

$$= -\frac{1}{N} \sum_{i=1}^{N} \left( \frac{y_i}{D(\tilde{x}_i)} - \frac{1 - y_i}{1 - D(\tilde{x}_i)} \right) \frac{\partial D(\tilde{x}_i)}{\partial \tilde{w}_j}$$

Using $\dfrac{\partial D(\tilde{x}_i)}{\partial \tilde{w}_j} = D(\tilde{x}_i)\left(1 - D(\tilde{x}_i)\right) \dfrac{\partial (\tilde{w}^T \tilde{x}_i)}{\partial \tilde{w}_j}$:

$$= -\frac{1}{N} \sum_{i=1}^{N} \left( \frac{y_i}{D(\tilde{x}_i)} - \frac{1 - y_i}{1 - D(\tilde{x}_i)} \right) D(\tilde{x}_i)\left(1 - D(\tilde{x}_i)\right) \frac{\partial (\tilde{w}^T \tilde{x}_i)}{\partial \tilde{w}_j}$$

$$= -\frac{1}{N} \sum_{i=1}^{N} \left( y_i \left(1 - D(\tilde{x}_i)\right) - (1 - y_i) D(\tilde{x}_i) \right) \frac{\partial (\tilde{w}^T \tilde{x}_i)}{\partial \tilde{w}_j}$$

$$= -\frac{1}{N} \sum_{i=1}^{N} \left( y_i - D(\tilde{x}_i) \right) \tilde{x}_{ij}$$

$$= \frac{1}{N} \sum_{i=1}^{N} \left( D(\tilde{x}_i) - y_i \right) \tilde{x}_{ij}$$
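The closed-form gradient $\frac{1}{N}\sum_i \left(D(\tilde{x}_i) - y_i\right)\tilde{x}_{ij}$ can be cross-checked against central finite differences of $J$; a minimal sketch (helper names `sigmoid` and `J` are mine):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def J(w, X, y):
    # J(w~) = -(1/N) sum[ y log D + (1-y) log(1-D) ]
    p = sigmoid(X @ w)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
y = rng.integers(0, 2, size=50).astype(float)
w = rng.normal(size=4)

# analytic gradient: (1/N) X^T (D - y)
analytic = X.T @ (sigmoid(X @ w) - y) / len(y)

# central finite differences, one coordinate at a time
eps = 1e-6
numeric = np.array([
    (J(w + eps * np.eye(4)[j], X, y) - J(w - eps * np.eye(4)[j], X, y)) / (2 * eps)
    for j in range(4)
])
print(np.allclose(analytic, numeric, atol=1e-6))
```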
In matrix form:

$$\Rightarrow \frac{\partial J(\tilde{w})}{\partial \tilde{w}_j} = \frac{1}{N} \left(D(\tilde{X}) - y\right)^T \tilde{X}_{\cdot j}$$

Finally, the update rule:

$$\tilde{w}_j \leftarrow \tilde{w}_j - \frac{\eta}{N} \left(D(\tilde{X}) - y\right)^T \tilde{X}_{\cdot j}$$

where

$$D(\tilde{X}) = \frac{e^{\tilde{X} \tilde{w}}}{1 + e^{\tilde{X} \tilde{w}}}$$

applied element-wise to the score vector $\tilde{X}\tilde{w}$.
Python example 1:
```python
import numpy as np
import matplotlib.pyplot as plt

def D(X, w_e):
    # Sigmoid D(x~) = 1 / (1 + exp(-w~^T x~)), applied row-wise
    return 1.0 / (1 + np.exp(-np.dot(X, w_e)))

def computeCost(X, y, w_e):
    # Cross-entropy cost J(w~) = -(1/N) sum[ y log D + (1-y) log(1-D) ]
    p = D(X, w_e)
    return -1.0 / len(y) * np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradientDescent(X, y, w_e, alpha):
    # Coordinate-wise update: w~_j <- w~_j - (eta/N) (D - y)^T X[:, j]
    for j in range(len(w_e)):
        w_e[j] = w_e[j] - alpha * 1.0 / len(y) * np.dot((D(X, w_e) - y).T, X[:, j])
    return w_e

X = np.random.rand(10, 5)
m = np.ones((10, 1))
X = np.concatenate((X, m), axis=1)  # append a ones column so the last weight is the bias b
w = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [10.0]])
y = np.dot(X, w)
a = np.mean(y)
for i in range(10):  # binarize: label 1 if the score is above the mean
    if y[i][0] > a:
        y[i][0] = 1
    else:
        y[i][0] = 0
w_e = np.zeros_like(w)
cost = []
for i in range(10000):
    w_e = gradientDescent(X, y, w_e, 0.001)
    cost.append(computeCost(X, y, w_e))
print(y)
print(D(X, w_e))
fig = plt.figure()
ax1 = fig.add_subplot(2, 1, 1)
ax1.plot(range(10000), cost, label='loss')
ax1.set_yscale('log')
ax1.set_xlabel("Iteration")
ax1.set_ylabel("Loss")
ax1.legend(loc='best')
plt.show()
```
Printed labels `y` and fitted probabilities `D(X, w_e)` from one run (no random seed is set, so values differ between runs):

```
[[1.]
 [0.]
 [1.]
 [0.]
 [0.]
 [1.]
 [0.]
 [1.]
 [1.]
 [1.]]
[[0.63103198]
 [0.54273564]
 [0.72331236]
 [0.55213718]
 [0.41564281]
 [0.78018675]
 [0.49371131]
 [0.59834743]
 [0.67121118]
 [0.86004243]]
```
Loss curve:
![在这里插入图片描述](https://img-blog.csdnimg.cn/20181207221307618.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L1dfTEFJTEFJ,size_16,color_FFFFFF,t_70)
Increasing to 30,000 iterations:
```python
import numpy as np
import matplotlib.pyplot as plt

def D(X, w_e):
    # Sigmoid D(x~) = 1 / (1 + exp(-w~^T x~)), applied row-wise
    return 1.0 / (1 + np.exp(-np.dot(X, w_e)))

def computeCost(X, y, w_e):
    # Cross-entropy cost J(w~) = -(1/N) sum[ y log D + (1-y) log(1-D) ]
    p = D(X, w_e)
    return -1.0 / len(y) * np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradientDescent(X, y, w_e, alpha):
    # Coordinate-wise update: w~_j <- w~_j - (eta/N) (D - y)^T X[:, j]
    for j in range(len(w_e)):
        w_e[j] = w_e[j] - alpha * 1.0 / len(y) * np.dot((D(X, w_e) - y).T, X[:, j])
    return w_e

X = np.random.rand(10, 5)
m = np.ones((10, 1))
X = np.concatenate((X, m), axis=1)  # append a ones column so the last weight is the bias b
w = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [10.0]])
y = np.dot(X, w)
a = np.mean(y)
for i in range(10):  # binarize: label 1 if the score is above the mean
    if y[i][0] > a:
        y[i][0] = 1
    else:
        y[i][0] = 0
w_e = np.zeros_like(w)
cost = []
for i in range(30000):  # only the iteration count changes
    w_e = gradientDescent(X, y, w_e, 0.001)
    cost.append(computeCost(X, y, w_e))
print(y)
print(D(X, w_e))
fig = plt.figure()
ax1 = fig.add_subplot(2, 1, 1)
ax1.plot(range(30000), cost, label='loss')
ax1.set_yscale('log')
ax1.set_xlabel("Iteration")
ax1.set_ylabel("Loss")
ax1.legend(loc='best')
plt.show()
```
![在这里插入图片描述](https://img-blog.csdnimg.cn/2018120722295968.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L1dfTEFJTEFJ,size_16,color_FFFFFF,t_70)
The loss rises slightly toward the end! We can keep reducing the learning rate:
```python
import numpy as np
import matplotlib.pyplot as plt

def D(X, w_e):
    # Sigmoid D(x~) = 1 / (1 + exp(-w~^T x~)), applied row-wise
    return 1.0 / (1 + np.exp(-np.dot(X, w_e)))

def computeCost(X, y, w_e):
    # Cross-entropy cost J(w~) = -(1/N) sum[ y log D + (1-y) log(1-D) ]
    p = D(X, w_e)
    return -1.0 / len(y) * np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradientDescent(X, y, w_e, alpha):
    # Coordinate-wise update: w~_j <- w~_j - (eta/N) (D - y)^T X[:, j]
    for j in range(len(w_e)):
        w_e[j] = w_e[j] - alpha * 1.0 / len(y) * np.dot((D(X, w_e) - y).T, X[:, j])
    return w_e

X = np.random.rand(10, 5)
m = np.ones((10, 1))
X = np.concatenate((X, m), axis=1)  # append a ones column so the last weight is the bias b
w = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [10.0]])
y = np.dot(X, w)
a = np.mean(y)
for i in range(10):  # binarize: label 1 if the score is above the mean
    if y[i][0] > a:
        y[i][0] = 1
    else:
        y[i][0] = 0
w_e = np.zeros_like(w)
cost = []
for i in range(30000):
    w_e = gradientDescent(X, y, w_e, 0.0001)  # learning rate reduced to 0.0001
    cost.append(computeCost(X, y, w_e))
print(y)
print(D(X, w_e))
fig = plt.figure()
ax1 = fig.add_subplot(2, 1, 1)
ax1.plot(range(30000), cost, label='loss')
ax1.set_yscale('log')
ax1.set_xlabel("Iteration")
ax1.set_ylabel("Loss")
ax1.legend(loc='best')
plt.show()
```
![在这里插入图片描述](https://img-blog.csdnimg.cn/20181207223746326.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L1dfTEFJTEFJ,size_16,color_FFFFFF,t_70)