Logistic Regression Classifier
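Logistic regression is a generalized linear model whose form is almost the same as linear regression. For an input vector $\pmb{x}$ it first computes the linear combination $\pmb{w}^T \pmb{x} + b$, then uses the logistic function to map that value into $(0, 1)$. In other words, the model tries to learn a function that predicts through a linear combination of the attributes:
$$f(\pmb{x}) = \pmb{w}^T \pmb{x} + b$$
Logistic regression tries to learn a suitable weight vector $\pmb{w}$ and a real number $b$ so that, for the label vector $\pmb{y}$, the mapped output $S(f(\pmb{x}))$ approximates $\pmb{y}$, where $S$ is the sigmoid function introduced below.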
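As a minimal sketch of the model above, the snippet below computes the linear combination $\pmb{w}^T \pmb{x} + b$ for one sample and squashes it into $(0, 1)$. The weights `w`, bias `b`, and sample `x` are made-up values for illustration only, not parameters learned in this post.

```python
import numpy as np

def linear_score(x, w, b):
    """Linear combination w^T x + b of the attributes."""
    return float(np.dot(w, x) + b)

def logistic(z):
    """Map a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters and sample, chosen only for illustration.
w = np.array([0.5, -0.25])
b = 0.1
x = np.array([2.0, 4.0])

z = linear_score(x, w, b)   # 0.5*2 - 0.25*4 + 0.1
p = logistic(z)             # a value strictly between 0 and 1
print(z, p)
```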
import numpy as np
from numpy import shape
import matplotlib.pyplot as plt
Loading the training data set
def load_data_set():
    """Read testSet.txt: two features and one integer label per line."""
    data_matrix = []
    label_matrix = []
    with open('testSet.txt', "r") as file:
        for line in file:
            data = line.strip().split()
            # Prepend the constant feature x0 = 1.0 so that w0 acts as the bias
            data_matrix.append([1.0, float(data[0]), float(data[1])])
            label_matrix.append(int(data[2]))
    return data_matrix, label_matrix
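The loader above expects each line of `testSet.txt` to hold two feature values and an integer label, separated by whitespace. The sketch below applies the same parsing to an in-memory sample so it can run without the file; `parse_samples` and the three sample rows are hypothetical stand-ins, not the real data set.

```python
import io

def parse_samples(stream):
    """Parse 'x1 x2 label' lines; prepend 1.0 so w0 acts as the bias term."""
    features, labels = [], []
    for line in stream:
        fields = line.strip().split()
        if len(fields) != 3:      # skip blank or malformed lines
            continue
        features.append([1.0, float(fields[0]), float(fields[1])])
        labels.append(int(fields[2]))
    return features, labels

# Made-up rows in the assumed format (not taken from testSet.txt).
sample = io.StringIO("0.5 1.5 1\n-1.0 2.0 0\n\n3.0 -0.5 1\n")
features, labels = parse_samples(sample)
print(features)
print(labels)
```

Skipping malformed lines is a design choice of this sketch; the original loader would raise an exception on a blank line instead.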
The sigmoid function
The sigmoid function, also called the logistic function, maps any real number into the interval $(0, 1)$, which makes it suitable for binary classification:
$$S(x) = \frac{1}{1 + e^{-x}}$$
The derivative of the sigmoid function is
$$S'(x) = S(x) \left( 1 - S(x) \right)$$
def sigmoid(X):
    # Element-wise logistic function; X may be a scalar, array, or matrix
    return 1.0 / (1 + np.exp(-X))
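One way to sanity-check the derivative identity $S'(x) = S(x)(1 - S(x))$ is to compare it with a central finite difference. This is a small sketch; the test points `xs` and the step size `eps` are arbitrary choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Analytic derivative: S'(x) = S(x) * (1 - S(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

xs = np.array([-2.0, 0.0, 0.5, 3.0])   # arbitrary test points
eps = 1e-6                              # arbitrary small step
numeric = (sigmoid(xs + eps) - sigmoid(xs - eps)) / (2 * eps)

# Largest gap between the numeric and analytic derivative; should be tiny
print(np.max(np.abs(numeric - sigmoid_grad(xs))))
```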
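# Convert the Python lists to NumPy matrices so that * means matrix multiplication
data_matrix, label_matrix = load_data_set()
data_matrix = np.asmatrix(data_matrix)
label_matrix = np.asmatrix(label_matrix).transpose()  # column vector of labels, shape (m, 1)
m, n = shape(data_matrix)  # m samples, n features (including the constant x0)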
m, n
(100, 3)
alpha = 0.001  # learning rate (step size)
max_cycles = 500  # number of gradient-ascent iterations
weights = np.ones((n, 1))  # initial weights, all ones
weights
array([[1.],
[1.],
[1.]])
h = sigmoid(data_matrix * weights)
h
matrix([[0.9999997 ],
[0.98616889],
[0.99887232],
[0.99892083],
[0.99999619],
[0.99979122],
[0.99999945],
[0.99553342],
[0.99998516],
[0.99998882],
[0.99984482],
[0.99999982],
[0.99524519],
[0.99975551],
[0.99793879],
[0.97128332],
[0.99919801],
[0.97477903],
[0.77681757],
[0.99957748],
[0.9980066 ],
[0.22252829],
[0.99999498],
[0.26394949],
[0.8246228 ],
[0.99999261],
[0.99991432],
[0.01392443],
[0.99215449],
[0.99999407],
[0.99007735],
[0.99994736],
[0.999999 ],
[0.05986936],
[0.99921454],
[0.99997998],
[0.99997966],
[0.99982544],
[0.99999104],
[0.99998525],
[0.97919678],
[0.99971059],
[0.99997751],
[0.93705909],
[0.9890627 ],
[0.99996675],
[0.1359093 ],
[0.99921684],
[0.99999079],
[0.99999622],
[0.99995015],
[0.99998279],
[0.99982675],
[0.99999982],
[0.9994663 ],
[0.99964232],
[0.9999885 ],
[0.99997259],
[0.99999121],
[0.99542831],
[0.98631076],
[0.96925991],
[0.99995761],
[0.99999899],
[0.99999879],
[0.21808844],
[0.99995494],
[0.99999908],
[0.99999865],
[0.99998668],
[0.99999443],
[0.53267014],
[0.99999957],
[0.97651256],
[0.99998887],
[0.99993141],
[0.33507029],
[0.98891672],
[0.99968925],
[0.98927143],
[0.99613509],
[0.03702176],
[0.99999797],
[0.99999593],
[0.83044946],
[0.17239595],
[0.9820568 ],
[0.9999997 ],
[0.99973113],
[0.75736609],
[0.59244738],
[0.99999982],
[0.9999823 ],
[0.88578868],
[0.82357126],
[0.98572192],
[0.9999961 ],
[0.26402371],
[0.99999196],
[0.99999989]])
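Gradient ascent
Gradient ascent is used to find the maximum of a function: the function increases fastest along the direction of its gradient.
For a function $y = f(x)$, the derivative is written $f'(x)$ or $\frac{dy}{dx}$. The derivative gives the slope of $f(x)$ at the point $x$; it describes how to scale a small change in the input to obtain the corresponding change in the output:
$$f(x + \epsilon) \approx f(x) + \epsilon \frac{dy}{dx}$$
For functions with multiple inputs we need partial derivatives. The partial derivative $\frac{\partial f(\pmb{x})}{\partial x_i}$ measures how $f$ changes at the point $\pmb{x}$ when only $x_i$ increases. The gradient generalizes the derivative to differentiation with respect to a vector: the gradient of $f$, written $\nabla_{\pmb{x}} f(\pmb{x})$, is the vector containing all of its partial derivatives.
The gradient vector points uphill, so moving in the direction of the gradient increases $f$. This is known as the method of steepest ascent, or gradient ascent.
Gradient ascent proposes the new point
$$\pmb{x}' = \pmb{x} + \epsilon \nabla_{\pmb{x}} f(\pmb{x})$$
where $\epsilon$ is the learning rate, a positive scalar that determines the step size; it is usually chosen to be a small constant.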
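To make the update rule concrete, here is a minimal sketch of gradient ascent on a one-dimensional toy function. The objective $f(x) = -(x - 3)^2$, the starting point, and the default step size are assumptions picked for illustration; they do not come from the logistic regression problem in this post.

```python
def gradient_ascent(grad, x0, epsilon=0.1, steps=100):
    """Repeat x <- x + epsilon * grad(x), the update rule described above."""
    x = x0
    for _ in range(steps):
        x = x + epsilon * grad(x)
    return x

# Toy objective (an assumption for illustration): f(x) = -(x - 3)^2,
# whose maximum is at x = 3 and whose derivative is f'(x) = -2 (x - 3).
grad_f = lambda x: -2.0 * (x - 3.0)

x_max = gradient_ascent(grad_f, x0=0.0)
print(x_max)  # close to 3
```

With this step size each iteration shrinks the distance to the maximum by a constant factor; a much larger `epsilon` could overshoot and diverge, which is why a small constant is usually preferred.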
The input to the sigmoid function is $z = w_0 x_0 + w_1 x_1 + \dots + w_n x_n$, where $x_0 = 1$ so that $w_0$ plays the role of the bias $b$.
The weight vector $\pmb{w}$ (including $b$) is updated over many iterations so that the model output $S(z)$ approaches $\pmb{y}$.
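For readers wondering where the update in the training loop comes from, here is a sketch of the standard derivation, assuming the objective being maximized is the log-likelihood of the labels (the post does not state the objective explicitly). Here $h_i = S(\pmb{w}^T \pmb{x}_i)$, $X$ is the matrix whose rows are the samples, $x_{ij}$ is feature $j$ of sample $i$, and the bias is folded into $w_0$ through the constant feature $x_0 = 1$.

```latex
\begin{aligned}
\ell(\pmb{w}) &= \sum_{i=1}^{m} \left[ y_i \log h_i + (1 - y_i) \log (1 - h_i) \right] \\
\frac{\partial \ell}{\partial w_j}
  &= \sum_{i=1}^{m} \left( \frac{y_i}{h_i} - \frac{1 - y_i}{1 - h_i} \right) h_i (1 - h_i)\, x_{ij}
   = \sum_{i=1}^{m} (y_i - h_i)\, x_{ij} \\
\nabla_{\pmb{w}}\, \ell &= X^T (\pmb{y} - \pmb{h}) \\
\pmb{w} &\leftarrow \pmb{w} + \alpha\, X^T (\pmb{y} - \pmb{h})
\end{aligned}
```

The second line uses $S'(z) = S(z)(1 - S(z))$. The last line has the same form as `weights += alpha * data_matrix.transpose() * error` with `error = label_matrix - h`.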
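for i in range(max_cycles):
    h = sigmoid(data_matrix * weights)  # predicted probabilities, shape (m, 1)
    error = (label_matrix - h)  # difference between labels and predictions
    weights += alpha * data_matrix.transpose() * error  # gradient-ascent step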
weights
array([[ 4.12414349],
[ 0.48007329],
[-0.6168482 ]])
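# Note: a (3, 1) array times a length-3 list broadcasts to a 3x3 array;
# the diagonal holds the per-feature products w_i * x_i for the sample [1.0, -0.017612, 14.053064]
weights * [1.0, -0.017612, 14.053064]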
array([[ 4.12414349e+00, -7.26344151e-02, 5.79568524e+01],
[ 4.80073293e-01, -8.45505083e-03, 6.74650071e+00],
[-6.16848197e-01, 1.08639304e-02, -8.66860719e+00]])
def test(x):
    # Predict class 1 when sigmoid(w^T x) exceeds 0.5
    z = np.dot(np.asarray(x).ravel(), np.asarray(weights).ravel())
    return sigmoid(z) > 0.5
Measuring model accuracy
# Count the training samples whose predicted class matches the label
accuracy = 0.0
for x, y in zip(data_matrix, label_matrix):
    # y is a 1x1 matrix; item() extracts the integer label (0 or 1)
    if bool(test(x)) == bool(y.item()):
        accuracy += 1
accuracy / len(label_matrix)
0.96
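The same training loop and accuracy check can also be written with plain NumPy arrays and the `@` operator, which avoids the `np.matrix` type. The sketch below is a variant under stated assumptions, not the code that produced the 0.96 above: it uses synthetic, randomly generated data in place of `testSet.txt`, and the names `train_logistic` and `predict` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, alpha=0.001, max_cycles=500):
    """Batch gradient ascent: w <- w + alpha * X^T (y - sigmoid(X w))."""
    w = np.ones(X.shape[1])
    for _ in range(max_cycles):
        h = sigmoid(X @ w)
        w += alpha * (X.T @ (y - h))
    return w

def predict(X, w):
    """Class 1 where the predicted probability exceeds 0.5."""
    return (sigmoid(X @ w) > 0.5).astype(int)

# Synthetic two-class data (made up for illustration, fixed seed).
rng = np.random.default_rng(0)
m = 100
pos = rng.normal(loc=[2.0, 2.0], scale=1.0, size=(m // 2, 2))
neg = rng.normal(loc=[-2.0, -2.0], scale=1.0, size=(m // 2, 2))
X = np.hstack([np.ones((m, 1)), np.vstack([pos, neg])])  # x0 = 1 for the bias
y = np.concatenate([np.ones(m // 2), np.zeros(m // 2)])

w = train_logistic(X, y)
accuracy = np.mean(predict(X, w) == y)  # accuracy on the training data
print(w, accuracy)
```

As in the post, this accuracy is measured on the same data used for training, so it is likely to be optimistic compared with accuracy on held-out samples.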
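Finally
- My knowledge is limited, so there are bound to be oversights. Corrections from readers are always welcome, to avoid causing unnecessary misunderstanding!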