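1. Principle
Although Logistic Regression is called a regression, it is actually a classification model, and it is most often used for binary classification. In essence, logistic regression assumes that the labels follow a Bernoulli distribution whose parameter is given by the model, and then estimates the model parameters by maximum likelihood.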
The output of a logistic regression model always lies between 0 and 1. The hypothesis of the model is $h_{\theta}(x)=g(\theta^{T}X)$, where $X$ is the feature vector and $g$ is the logistic function. A commonly used logistic function is the S-shaped sigmoid function:
$$g(z) = \frac{1}{1+e^{-z}}$$
For a given input and the chosen parameters, $h_{\theta}(x)$ gives the estimated probability that the output variable equals 1, i.e. $h_{\theta}(x) = P(y=1|x;\theta)$.
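The hypothesis above can be sketched in a few lines of plain Python. This is only an illustration: the parameter vector `theta` and the sample `x` are made-up values (with a leading 1 in `x` so that `theta[0]` acts as the bias), and the helper names `sigmoid` and `hypothesis` are not from the original text.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), arranged to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0, so exp(z) cannot overflow
    return ez / (1.0 + ez)


def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x) for plain Python lists theta and x."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)


# Hypothetical values chosen so that theta^T x = 0.
theta = [0.5, -1.0, 2.0]
x = [1.0, 2.0, 0.75]
p = hypothesis(theta, x)
print("P(y=1 | x; theta) =", p)
```

Because $\theta^{T}x = 0$ for these values, the estimated probability sits exactly on the 0.5 threshold; larger positive values of $\theta^{T}x$ push it towards 1 and larger negative values towards 0.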
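How do we fit the parameters $\theta$ of the logistic regression model? Specifically, we need to define the optimization objective, also called the cost function, that is used to fit the parameters. This is the fitting problem for logistic regression in supervised learning.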
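To connect the fitting problem with the maximum likelihood estimation mentioned in section 1, here is a sketch of the standard derivation. It assumes that the $m$ training samples are independent and that each label $y^{(i)} \in \{0, 1\}$ is Bernoulli distributed with parameter $h_{\theta}(x^{(i)})$; the symbols $L$ and $\ell$ are introduced here for the likelihood and the log-likelihood.

```latex
\begin{aligned}
P(y \mid x;\theta) &= h_{\theta}(x)^{y}\,\bigl(1-h_{\theta}(x)\bigr)^{1-y} \\
L(\theta) &= \prod_{i=1}^{m} h_{\theta}(x^{(i)})^{y^{(i)}}\,\bigl(1-h_{\theta}(x^{(i)})\bigr)^{1-y^{(i)}} \\
\ell(\theta) = \log L(\theta) &= \sum_{i=1}^{m}\Bigl[\, y^{(i)}\log h_{\theta}(x^{(i)}) + (1-y^{(i)})\log\bigl(1-h_{\theta}(x^{(i)})\bigr) \Bigr]
\end{aligned}
```

Maximizing $\ell(\theta)$ is equivalent to minimizing $-\frac{1}{m}\ell(\theta)$, which is why the cost function for logistic regression ends up being a cross-entropy.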
2. Cost function for logistic regression: cross-entropy
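A cost function must meet two basic requirements: it must measure how accurate the model is, and it must be differentiable with respect to the parameters.
In linear regression, the most common choice is the mean squared error, i.e.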
$$J(\theta) = \frac{1}{m}\sum_{i=1}^m \frac{1}{2}(h_{\theta}(x^{(i)})-y^{(i)})^{2}$$
where:
m: the number of training samples;
$h_{\theta}(x)$: the value of y predicted from the parameters $\theta$ and x;
y: the y value of the original training sample, i.e. the ground truth.
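In logistic regression, however, the commonly used cost function is the cross-entropy. Combined with the sigmoid, the squared error is no longer convex in $\theta$, whereas the cross-entropy is.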
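Written with the same symbols as the mean squared error above, the cross-entropy cost usually takes the form $J(\theta) = -\frac{1}{m}\sum_{i=1}^m \left[ y^{(i)}\log h_{\theta}(x^{(i)}) + (1-y^{(i)})\log(1-h_{\theta}(x^{(i)})) \right]$. The following sketch evaluates both costs on made-up predictions; the function names and the sample values are illustrative assumptions, not part of the original text.

```python
import math


def cross_entropy_cost(probs, labels):
    """Mean cross-entropy; probs are predicted P(y=1), strictly between 0 and 1."""
    m = len(labels)
    total = 0.0
    for p, y in zip(probs, labels):
        total += -y * math.log(p) - (1 - y) * math.log(1 - p)
    return total / m


def mse_cost(probs, labels):
    """Mean squared error with the 1/2 factor used in the formula above."""
    m = len(labels)
    return sum(0.5 * (p - y) ** 2 for p, y in zip(probs, labels)) / m


# Hypothetical labels and two sets of predicted probabilities.
labels = [1, 0, 1, 0]
good = [0.9, 0.1, 0.8, 0.2]   # mostly right
bad = [0.1, 0.9, 0.2, 0.8]    # confidently wrong

for name, probs in (("good", good), ("bad", bad)):
    print(name, cross_entropy_cost(probs, labels), mse_cost(probs, labels))
```

Both costs rank the two prediction sets the same way, but the cross-entropy penalizes confident mistakes much more strongly, since $-\log(p)$ grows without bound as $p$ approaches 0.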
3. TensorFlow example: binary classification with logistic regression
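import os
import tensorflow as tf  # this script uses the TensorFlow 1.x graph API
from numpy.random import RandomState
import matplotlib.pyplot as plt
import numpy as np

# Reduce TensorFlow's C++ log output.
# 0 shows everything, 1 hides INFO, 2 also hides WARNING, 3 also hides ERROR.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'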
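
batch_size = 8

# Model parameters: a 2x1 weight vector and a 1x1 bias.
w1 = tf.Variable(tf.random_normal([2, 1], stddev=1, seed=1))
b = tf.Variable(tf.random_normal([1, 1], stddev=1, seed=1))

# Inputs: two features per sample and one 0/1 label per sample.
x = tf.placeholder(tf.float32, shape=(None, 2), name='x-input')
y_ = tf.placeholder(tf.float32, shape=(None, 1), name='y-input')

# Predicted probability P(y=1 | x).
y = tf.sigmoid(tf.matmul(x, w1) + b)
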
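def cross_entropy_func(which, y_, y):
    """Per-sample cross entropy between labels y_ and predictions y.

    which: 2 and 4 give the same result; 3 is variant 2 with clipping.
    """
    if which == 2:
        # Direct formula; breaks down if y reaches exactly 0 or 1.
        return -1 * y_ * tf.log(y) - (1 - y_) * tf.log(1 - y)
    if which == 3:
        # Same formula with clipping to keep log() away from 0.
        # The leading minus sign is required so that this is a loss to minimize.
        return -(y_ * tf.log(tf.clip_by_value(y, 1e-10, 1.0))
                 + (1 - y_) * tf.log(tf.clip_by_value(1 - y, 1e-10, 1.0)))
    if which == 4:
        # Built-in op; it expects the raw logits, not the sigmoid output.
        logits = tf.matmul(x, w1) + b
        return tf.nn.sigmoid_cross_entropy_with_logits(labels=y_, logits=logits)
    raise ValueError("which must be 2, 3 or 4")
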

cross_entropy = tf.reduce_mean(cross_entropy_func(4, y_, y))
train_step = tf.train.AdamOptimizer(0.001).minimize(cross_entropy)

# Synthetic data: 100 points in the unit square, labelled 1 when x1 + x2 < 1.
rdm = RandomState(1)
dataset_size = 100
X = rdm.rand(dataset_size, 2)
# print('X: ', X[:,1])
Y = [[int(x1 + x2 < 1)] for (x1, x2) in X]
# print('Y: ', Y)

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)

    STEPS = 20000
    for i in range(STEPS):
        # Mini-batch bounds; only used by the commented-out line below.
        start = (i * batch_size) % dataset_size
        end = min(start + batch_size, dataset_size)
        # sess.run(train_step, feed_dict={x: X[start:end], y_: Y[start:end]})
        # Full-batch update: every step uses the whole data set.
        sess.run(train_step, feed_dict={x: X, y_: Y})
        if i % 1000 == 0:
            total_cross_entropy = sess.run(cross_entropy, feed_dict={x: X, y_: Y})
            print("After %d training step(s), cross entropy on all data is %g" % (i, total_cross_entropy))

    # Read the trained parameters while the session is still open.
    w1_value, b_value = sess.run([w1, b])
    print(w1_value, b_value)

# Plot the samples: class 0 in blue, class 1 in red.
for i in range(dataset_size):
    if Y[i][0] == 0:
        plt.plot(X[i, 0], X[i, 1], 'bo')
    elif Y[i][0] == 1:
        plt.plot(X[i, 0], X[i, 1], 'ro')
    else:
        plt.plot(X[i, 0], X[i, 1], 'go')

# Decision boundary: w1[0]*x1 + w1[1]*x2 + b = 0, solved for x2.
x1 = np.linspace(X[:, 0].min(), X[:, 0].max())
y1 = -1 * x1 * w1_value[0][0] / w1_value[1][0] - b_value[0][0] / w1_value[1][0]
plt.plot(x1, y1, 'g-')
plt.show()
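The script's `cross_entropy_func` offers a hand-written formula and the built-in `tf.nn.sigmoid_cross_entropy_with_logits`. The sketch below, in plain Python without TensorFlow, illustrates why working from the logits is the more robust choice. It uses the formulation `max(z, 0) - z * label + log(1 + exp(-|z|))`, which is the one described in TensorFlow's documentation for that op; the function names and test values here are illustrative assumptions.

```python
import math


def sigmoid(z):
    """Logistic function, arranged to avoid overflow in exp()."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def naive_cross_entropy(z, label):
    """Cross entropy computed from the sigmoid output (like variant 2)."""
    p = sigmoid(z)
    return -label * math.log(p) - (1 - label) * math.log(1 - p)


def stable_cross_entropy(z, label):
    """Cross entropy computed directly from the logit z."""
    return max(z, 0.0) - z * label + math.log1p(math.exp(-abs(z)))


# For moderate logits the two formulations agree.
for z, label in [(-2.0, 0), (0.5, 1), (3.0, 0)]:
    print(z, label, naive_cross_entropy(z, label), stable_cross_entropy(z, label))

# For a large logit the sigmoid rounds to exactly 1.0 and log(1 - p) fails,
# while the logit-based formula still returns a finite loss.
try:
    naive_cross_entropy(40.0, 0)
except ValueError as err:
    print("naive formula failed:", err)
print("stable formula:", stable_cross_entropy(40.0, 0))
```

Clipping the probabilities, as variant 3 does, avoids the failure as well, but it caps the loss at about $-\log(10^{-10})$ instead of letting it grow with the logit.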
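Result: the training script prints the cross entropy on all data every 1000 steps, then plots the samples together with the learned decision boundary.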
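The green line in the plot is the set of points where the model outputs exactly 0.5, that is, where $w_1 x_1 + w_2 x_2 + b = 0$. The sketch below solves that equation for $x_2$ in the same way as the plotting code and checks that the points found lie on the 0.5 contour. The weights and bias are made-up values chosen to match the labelling rule $x_1 + x_2 < 1$; they are not the values the training run produces.

```python
import math


def boundary_x2(x1, w, b):
    """Solve w[0]*x1 + w[1]*x2 + b = 0 for x2 (requires w[1] != 0)."""
    return -x1 * w[0] / w[1] - b / w[1]


def predict(x1, x2, w, b):
    """Sigmoid of the linear score, i.e. the model's P(y=1 | x)."""
    z = w[0] * x1 + w[1] * x2 + b
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical parameters: the boundary is the line x1 + x2 = 1.
w = [-4.0, -4.0]
b = 4.0

for x1 in (0.0, 0.25, 1.0):
    x2 = boundary_x2(x1, w, b)
    print(x1, x2, predict(x1, x2, w, b))

# A point below the line (x1 + x2 < 1) is classified as 1, one above it as 0.
print(predict(0.2, 0.2, w, b) > 0.5, predict(0.9, 0.9, w, b) > 0.5)
```

If the second weight were close to zero, the division would blow up and the boundary would be nearly vertical; in that case it is simpler to solve for $x_1$ instead.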