Machine Learning and TensorFlow Programming (2): Logistic Regression Model

0. References

1. Overview

1.1. Classification

When the predicted value is discrete (for example, whether a bank approves a loan), the learning task is called classification.

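1.2. The idea behind binary logistic regression

Given an input feature vector $x$ and a parameter vector $\theta$, compute the linear combination $z = \theta^Tx$.
The value of $z$ can lie anywhere in $(-\infty, +\infty)$. To make classification convenient, $z$ is mapped into $(0,1)$ by a function $g(z)$:
- $g(z) \ge 0.5$: positive class
- $g(z) < 0.5$: negative class
Full formula: $h_\theta(x) = g(z) = g(\theta^Tx)$, where $h_\theta(x)$ is the predicted probability that the sample belongs to the positive class.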
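The decision rule above can be sketched in a few lines of plain Python. The parameter values in `theta` and the sample `x` are made up for illustration, and the helper names `sigmoid`, `hypothesis`, and `classify` are assumptions of this sketch, not part of the original post.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x); x already includes the leading 1 for theta_0."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)


def classify(theta, x, threshold=0.5):
    """Return 1 (positive class) if h_theta(x) >= threshold, else 0."""
    return 1 if hypothesis(theta, x) >= threshold else 0


# Hypothetical parameters and sample, chosen only for illustration
theta = [-1.0, 2.0, 0.5]
x = [1.0, 0.3, 0.4]  # the leading 1.0 is the intercept term

p = hypothesis(theta, x)    # z = -1.0 + 0.6 + 0.2 = -0.2, so p is below 0.5
label = classify(theta, x)  # negative class
```

Because $z$ is slightly negative here, the predicted probability falls just under 0.5 and the sample is assigned to the negative class.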

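2. Logistic Function

As a beginner, I did not see what makes this function so special, so I looked it up on Zhihu and Wikipedia. The explanations there are fairly advanced, and I still do not fully understand them.
The function (the sigmoid function) has the form:
$g(z) = \frac{1}{1+e^{-z}}$
Its graph is an S-shaped curve; the input ranges over $(-\infty, +\infty)$ and the output over $(0,1)$.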
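A few properties that make the function convenient for classification can be checked numerically: it equals 0.5 at $z = 0$, and it is symmetric in the sense that the values at $z$ and $-z$ sum to 1. The sketch below is one possible way to evaluate it without overflowing `exp()` for large $|z|$; the name `stable_sigmoid` is hypothetical.

```python
import math


def stable_sigmoid(z):
    """Logistic function that avoids overflow in exp() for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0, so exp(z) lies in [0, 1) and cannot overflow
    return ez / (1.0 + ez)


# Evaluate at a few points, including extreme inputs
values = {z: stable_sigmoid(z) for z in (-1000.0, -2.0, 0.0, 2.0, 1000.0)}
```

Mathematically the output never reaches 0 or 1, but in floating point the extreme inputs above round to exactly `0.0` and `1.0`.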

3. Loss Function

For a single sample, the cost has the following form:
$\mathrm{Cost}(h_\theta(x), y)= \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1-h_\theta(x)) & \text{if } y = 0 \end{cases}$
Over $m$ training samples, it becomes:
$\begin{align*} J(\theta) &= \frac{1}{m}\sum_{i=1}^m \mathrm{Cost}(h_\theta(x^{(i)}),y^{(i)}) \\ &=-\frac{1}{m}\sum_{i=1}^m\left[y^{(i)}\log h_\theta(x^{(i)}) + (1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right] \end{align*}$
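To see that the piecewise cost and the single-line sum agree, here is a small sketch with made-up predicted probabilities `hs` and labels `ys`. The function names are illustrative only.

```python
import math


def sample_cost(h, y):
    """Cost(h, y): -log(h) if y == 1, -log(1 - h) if y == 0."""
    return -math.log(h) if y == 1 else -math.log(1.0 - h)


def average_cost(hs, ys):
    """J = -(1/m) * sum(y*log(h) + (1-y)*log(1-h)) over all samples."""
    m = len(ys)
    total = sum(y * math.log(h) + (1 - y) * math.log(1.0 - h)
                for h, y in zip(hs, ys))
    return -total / m


# Hypothetical predicted probabilities and true labels
hs = [0.9, 0.2, 0.6]
ys = [1, 0, 0]

per_sample = [sample_cost(h, y) for h, y in zip(hs, ys)]
j = average_cost(hs, ys)  # equals the mean of per_sample
```

The third sample is the most expensive: it predicts 0.6 for a sample whose label is 0, so confident wrong answers are penalised more heavily than hesitant ones.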
The gradient used by gradient descent, derived in detail in the Coursera Week 3 Lecture Notes, is:
$\frac{\partial}{\partial\theta_j}J(\theta) = \frac{1}{m}\sum_{i=1}^m\left[h_\theta(x^{(i)})-y^{(i)}\right]x_j^{(i)}$
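For readers who want the intermediate steps, here is a condensed sketch of where this gradient comes from. It relies on the standard sigmoid identity $g'(z) = g(z)(1-g(z))$ and, for a single sample, abbreviates $h = h_\theta(x^{(i)})$, $y = y^{(i)}$, and $x_j = x_j^{(i)}$. It is an outline, not a replacement for the lecture notes.

$\begin{align*} \frac{\partial h}{\partial\theta_j} &= g'(\theta^Tx)\,x_j = h(1-h)\,x_j \\ \frac{\partial}{\partial\theta_j}\left[y\log h + (1-y)\log(1-h)\right] &= \left[\frac{y}{h} - \frac{1-y}{1-h}\right]h(1-h)\,x_j \\ &= \left[y(1-h) - (1-y)h\right]x_j \\ &= (y - h)\,x_j \end{align*}$

Averaging over the $m$ samples and applying the leading minus sign of $J(\theta)$ turns $(y - h)\,x_j$ into $\frac{1}{m}\sum_{i=1}^m (h - y)\,x_j$, which is the formula above.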

4. Matrix Form

4.1. The raw data

Suppose the input has $n$ features and $m$ training samples. With a leading column of ones for the intercept term, the input is an $m \times (n+1)$ matrix:
$X = \left[ \begin{matrix} 1 & x_1^{(1)} & x_2^{(1)} & \cdots & x_n^{(1)}\\ 1 & x_1^{(2)} & x_2^{(2)} & \cdots & x_n^{(2)}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & x_1^{(m)} & x_2^{(m)} & \cdots & x_n^{(m)} \\ \end{matrix} \right]$
The sample labels are:

$y = \left[ \begin{matrix} y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(m)} \end{matrix} \right]$
The parameters to solve for are:

$\theta = \left[ \begin{matrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \\ \end{matrix} \right]$
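As a concrete illustration of these shapes, the sketch below builds $X$, $y$, and $\theta$ from a tiny made-up data set stored as plain Python lists. The rows in `raw_rows` are invented, and the "features first, label last" layout is an assumption of this sketch.

```python
# Hypothetical raw data: each row is [feature_1, feature_2, label]
raw_rows = [
    [2.0, 3.0, 0.0],
    [4.0, 1.0, 1.0],
    [5.0, 6.0, 1.0],
]

m = len(raw_rows)         # number of training samples
n = len(raw_rows[0]) - 1  # number of features (the last column is the label)

# X is m x (n+1): a leading 1 followed by the n features of each sample
X = [[1.0] + row[:-1] for row in raw_rows]
# y holds the m labels
y = [row[-1] for row in raw_rows]
# theta has n+1 entries (theta_0 ... theta_n), initialised to zero here
theta = [0.0] * (n + 1)
```

The leading 1 in every row is what lets $\theta_0$ act as the intercept when $X\theta$ is computed.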

4.2. Matrix computation

Step 1: compute the $m$-dimensional vector $z$:
$z = X\theta$
Step 2: apply the sigmoid function element-wise to obtain the $m$-dimensional vector $h_\theta(x)$:
$h_\theta(x) = \mathrm{sigmoid}(z) = \mathrm{sigmoid}(X\theta)$
Step 3: compute the loss $J(\theta)$:
$J(\theta) = -\frac{1}{m}\left[y^T\log(h_\theta(x))+(1-y)^T\log(1-h_\theta(x))\right]$
Step 4: compute the gradient that drives each gradient descent update, the $(n+1)$-dimensional vector $\mathrm{grad}$:
$\mathrm{grad} = \frac{1}{m}X^T(h_\theta(x)-y)$
Step 5: use an optimization algorithm to find $\theta$, then compute $h_\theta(x)$ for the test samples. A result $\ge 0.5$ is classified as positive, and a result $< 0.5$ as negative.
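The five steps can be sketched end to end without any library. This is a minimal illustration: matrices are plain lists, the toy data is invented, and plain batch gradient descent stands in for the optimization algorithm, with a learning rate and iteration count picked arbitrarily.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(X, theta):
    """Steps 1-2: z = X theta, then h = sigmoid(z), one value per sample."""
    return [sigmoid(sum(xj * tj for xj, tj in zip(row, theta))) for row in X]


def cost(X, y, theta):
    """Step 3: J(theta) = -(1/m) [y^T log(h) + (1-y)^T log(1-h)]."""
    h = predict_proba(X, theta)
    m = len(y)
    return -sum(yi * math.log(hi) + (1 - yi) * math.log(1 - hi)
                for yi, hi in zip(y, h)) / m


def gradient(X, y, theta):
    """Step 4: grad = (1/m) X^T (h - y), a vector with n+1 entries."""
    h = predict_proba(X, theta)
    m = len(y)
    return [sum((hi - yi) * row[j] for hi, yi, row in zip(h, y, X)) / m
            for j in range(len(theta))]


def gradient_descent(X, y, theta, learning_rate, iterations):
    """Step 5 with the simplest optimizer: theta := theta - alpha * grad."""
    for _ in range(iterations):
        grad = gradient(X, y, theta)
        theta = [t - learning_rate * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical, non-separable toy data: intercept column plus one feature
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.0], [1.0, 2.5], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 0, 1, 1]

theta0 = [0.0, 0.0]
theta = gradient_descent(X, y, theta0, learning_rate=0.1, iterations=2000)
labels = [1 if p >= 0.5 else 0 for p in predict_proba(X, theta)]
```

With $\theta = 0$ every prediction is 0.5, so the starting cost is $\log 2$; training lowers it from there. The toy labels overlap on purpose, because on perfectly separable data the parameters would keep growing and the logarithms could eventually receive 0.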

5. TensorFlow Implementation

Data source: Coursera ML Week 2 exercise data
Python source code

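# The code below uses the TensorFlow 1.x graph API (placeholders and sessions).
# Under TensorFlow 2.x it runs through the compat.v1 module with v2 behavior disabled.
import tensorflow.compat.v1 as tf
import numpy as np

tf.disable_v2_behavior()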


def read_data(file_name, delimiter=','):
    return np.loadtxt(file_name, delimiter=delimiter)


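def init_data(input_data):
    # Take the sample count from the data itself instead of a global variable
    num_samples = input_data.shape[0]
    # Features are all columns except the last; labels are the last column
    input_x = input_data[:, 0:-1]
    input_y = input_data[:, -1].reshape(num_samples, 1)
    # Prepend a column of ones so that theta_0 acts as the intercept
    input_x = np.concatenate((np.ones([num_samples, 1]), input_x), 1)
    return input_x, input_y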


def tensor_flow_run(input_x, input_y, init_w):
    # Placeholders for the inputs and the trainable parameter vector
    x = tf.placeholder("float32", [None, n])  # m x n (intercept column included)
    y = tf.placeholder("float32", [None, 1])  # m x 1
    W = tf.Variable(init_w)  # n x 1

    # Build the model
    z = tf.matmul(x, W)  # m x 1
    h = tf.sigmoid(z)  # m x 1

    # Cost function J(theta)
    cost = (tf.reduce_sum(y*tf.log(h)) + tf.reduce_sum((1-y) * (tf.log(1-h)))) / (-m)

    # Configure the gradient descent optimizer
    train_op = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cost)

    # Accuracy on the training set
    predict = tf.greater_equal(tf.sigmoid(tf.matmul(x, W)), 0.5)
    y_ = tf.equal(y, 1)
    correct_prediction = tf.equal(predict, y_)
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    with tf.Session() as sess:
        # Initialize the variables
        init = tf.global_variables_initializer()
        sess.run(init)

        # Run gradient descent
        for _ in range(ITERATIONS):
            sess.run(train_op, feed_dict={x: input_x, y: input_y})

        # Learned parameters, next to the Octave reference values
        print('Octave theta: -25.16127 0.20623 0.20147')
        print('Actual theta:', sess.run(W))

        # Cost
        print('\nOctave Cost: 0.20350')
        print('Actual Cost:', sess.run(cost, feed_dict={x: input_x, y: input_y}))

        # Accuracy
        print('\nOctave Accuracy: 0.89')
        print('Actual Accuracy:', sess.run(accuracy, feed_dict={x: input_x, y: input_y}))


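# Hyperparameters
LEARNING_RATE = 0.001
ITERATIONS = 500

# Load the data
raw_data = read_data('ex2data1.txt')

# Number of samples (m) and number of columns (n).
# The file holds the features plus one label column, so n also equals
# the width of X once the intercept column takes the place of the label column.
m = raw_data.shape[0]
n = raw_data.shape[1]

# Build the input matrices
data = init_data(raw_data)

# Run training, starting from values close to the Octave solution
tensor_flow_run(data[0], data[1], [[-25], [0.2], [.2]])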
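One practical caveat about the cost expression in the code above: when the sigmoid output rounds to exactly 0 or 1, `log(h)` or `log(1 - h)` is no longer finite. The sketch below shows the issue in plain Python, together with an algebraically equivalent form that works directly on $z$. The function names are hypothetical; the rewritten formula is the one described in TensorFlow's documentation for `tf.nn.sigmoid_cross_entropy_with_logits`.

```python
import math


def naive_loss(z, y):
    """-[y*log(h) + (1-y)*log(1-h)] with h = sigmoid(z); fails when h rounds to 0 or 1."""
    h = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(h) + (1 - y) * math.log(1.0 - h))


def stable_loss(z, y):
    """Algebraically equal form: max(z, 0) - z*y + log(1 + exp(-|z|))."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


# For a moderate z the two forms agree
z_small, y_small = 1.5, 1
diff = abs(naive_loss(z_small, y_small) - stable_loss(z_small, y_small))

# For a large z the sigmoid rounds to exactly 1.0, so log(1 - h) becomes log(0)
try:
    naive_loss(40.0, 0)
    naive_failed = False
except ValueError:
    naive_failed = True

stable_value = stable_loss(40.0, 0)  # close to 40
```

In Python the naive form raises an exception. In a TensorFlow graph the same situation shows up as `inf` or `NaN` in the cost instead, so computing the loss from $z$ rather than from $h$ is a common precaution.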
