Learning the LeNet Convolutional Neural Network Model in TensorFlow

1. A convolutional neural network based on the LeNet model
2. Training the convolutional neural network
3. Testing the model

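These are my notes from studying the book 《Tensorflow: 实战Google深度学习框架》 (TensorFlow: Google Deep Learning Framework in Action), kept as personal notes for later reference. The code is at the link below:
Link: https://pan.baidu.com/s/1tZWySJpzZ5EmMM-WlvEPLg
Extraction code: ofxm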

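1. A convolutional neural network based on the LeNet model

This convolutional neural network for MNIST handwritten digit recognition consists of 6 layers: convolution 1, pooling 1, convolution 2, pooling 2, fully connected 1, and fully connected 2. Each is described below.

1.1 Layer 1: convolution 1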

with tf.variable_scope('layer1-conv1'):
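        # get_variable creates the variable, or returns the shared one if it already exists
        # truncated_normal_initializer fills it with random values from a truncated normal distribution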
        conv1_weights = tf.get_variable(
            "weight", [5, 5, 1, 32],
            initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv1_biases = tf.get_variable("bias", [32], initializer=tf.constant_initializer(0.0))
        conv1 = tf.nn.conv2d(input_tensor, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
        # after conv2d, conv1 keeps the 28*28 spatial size and has depth 32
        relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_biases))

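Where:
1) input_tensor: the input training samples, the raw pixels of an MNIST image with shape [28,28,1].
2) conv1_weights: the convolution kernel, size 5*5, 1 input channel, depth 32; its values are initialized from a truncated normal distribution.
3) conv1_biases: the bias terms, 32 parameters (one per output channel), initialized to the constant 0.0.
4) conv1: the result of the 2-D convolution. Because padding is SAME and the stride is 1, the spatial size stays 28*28 and the depth is 32.
5) relu1: the values after the relu activation function, used as the input of the next layer.
So for this layer:
Input shape: [1,28,28,1]
Output shape: [1,28,28,32]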
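
As a quick check of the shapes above, here is a minimal sketch in plain Python. It assumes TensorFlow's documented rule for SAME padding (output size is the input size divided by the stride, rounded up); the helper names are mine and are not part of the book's code.

```python
import math

def same_padding_output_size(input_size, stride):
    """Spatial output size of a SAME-padded convolution or pooling layer."""
    return math.ceil(input_size / stride)

def same_padding_total(input_size, kernel_size, stride):
    """Total number of zeros added along one dimension under SAME padding."""
    output_size = same_padding_output_size(input_size, stride)
    return max((output_size - 1) * stride + kernel_size - input_size, 0)

# Layer 1: 28 pixels wide, 5*5 kernel, stride 1
conv1_out = same_padding_output_size(28, stride=1)
conv1_pad = same_padding_total(28, kernel_size=5, stride=1)
print(conv1_out, conv1_pad)
```

With a 5*5 kernel and stride 1, four zeros are added per dimension (two on each side), which is why the 28*28 size is preserved.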

1.2 Layer 2: pooling 1

This layer implements the forward pass of the second layer, a pooling layer. It uses max pooling.

with tf.name_scope("layer2-pool1"):
        pool1 = tf.nn.max_pool(relu1, ksize = [1,2,2,1],strides=[1,2,2,1],padding="SAME")

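Where:
1) relu1: the output of the previous layer, shape [1,28,28,32].
2) pool1: the result of max pooling with a 2*2 filter, zero padding (SAME), and stride 2. The output has shape [1,14,14,32].
So for this layer:
Input shape: [1,28,28,32]
Output shape: [1,14,14,32]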
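
To see what 2*2 max pooling with stride 2 does to the values, here is a small NumPy sketch. It is not the TensorFlow implementation: it handles a single image without the batch dimension and assumes an even height and width, in which case SAME padding adds no zeros. The function name is mine.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2*2 max pooling with stride 2 on an array of shape [height, width, depth].

    Assumes height and width are even, so no padding is needed.
    """
    height, width, depth = feature_map.shape
    # Split rows and columns into non-overlapping pairs, then take the max of each 2*2 block
    blocks = feature_map.reshape(height // 2, 2, width // 2, 2, depth)
    return blocks.max(axis=(1, 3))

demo = np.arange(16, dtype=np.float32).reshape(4, 4, 1)
pooled = max_pool_2x2(demo)
print(pooled[:, :, 0])
```

Each output value is the largest of the four inputs in its 2*2 window, so the height and width are halved while the depth is unchanged.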

1.3 Layer 3: convolution 2

The third layer is the forward pass of a convolution layer.

with tf.variable_scope("layer3-conv2"):
        conv2_weights = tf.get_variable(
            "weight", [5, 5, 32, 64],
            initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv2_biases = tf.get_variable("bias", [64], initializer=tf.constant_initializer(0.0))
        conv2 = tf.nn.conv2d(pool1, conv2_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu2 = tf.nn.relu(tf.nn.bias_add(conv2, conv2_biases))

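Where:
1) conv2_weights: the convolution kernel, shape [5, 5, 32, 64].
2) conv2_biases: the bias terms, shape [64].
3) conv2: the 2-D convolution of pool1 (the output of the previous layer) with conv2_weights, using stride 1 and zero padding.
4) relu2: the relu activation function applied to conv2, giving a 14*14*64 output.
So for this layer:
Input shape: [1,14,14,32]
Output shape: [1,14,14,64]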

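1.4 Layer 4: pooling 2

The fourth layer is the forward pass of a pooling layer. It takes the 14*14*64 input, applies max pooling to get a 7*7*64 result, and flattens that result into a one-dimensional vector of length 7*7*64 = 3136.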

with tf.name_scope("layer4-pool2"):
        pool2 = tf.nn.max_pool(relu2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
        pool_shape = pool2.get_shape().as_list()
        nodes = pool_shape[1] * pool_shape[2] * pool_shape[3]
        reshaped = tf.reshape(pool2, [pool_shape[0], nodes])

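Where:
1) pool2: max pooling applied to relu2, the output of the previous convolution layer, with stride 2 and a 2*2 pooling filter. pool2 has size 7*7*64.
2) pool_shape: the list of dimensions of pool2, [1,7,7,64]. In such a list, list[0] is the batch size, list[1] and list[2] are the rows and columns, and list[3] is the depth.
3) nodes: the length of the vector obtained by flattening pool2, 7*7*64 = 3136.
4) reshaped: pool2 reshaped from 7*7*64 into a 1*3136 vector. This is the input of the next layer.
So for this layer:
Input shape: [1,14,14,64]
Output shape: [1,3136]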
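
The flattening step can be reproduced with NumPy alone. This is only a sketch of the shape arithmetic; the zero-filled array stands in for the real pool2 output.

```python
import numpy as np

# Stand-in for the pool2 output: batch of 1, 7*7 spatial size, depth 64
pool2 = np.zeros((1, 7, 7, 64), dtype=np.float32)

pool_shape = list(pool2.shape)
nodes = pool_shape[1] * pool_shape[2] * pool_shape[3]
reshaped = pool2.reshape(pool_shape[0], nodes)
print(nodes, reshaped.shape)
```

Note that the TensorFlow version reads pool_shape[0] from the static shape, so it relies on the batch size being fixed in the input placeholder.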

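1.5 Layer 5: fully connected 1

The fifth layer is the forward pass of a fully connected layer. Its input is the flattened [1,3136] vector from the previous layer, and its output is a [1,512] vector.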

with tf.variable_scope('layer5-fc1'):
        fc1_weights = tf.get_variable("weight", [nodes, 512],
                                      initializer=tf.truncated_normal_initializer(stddev=0.1))
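        if regularizer is not None: tf.add_to_collection('losses', regularizer(fc1_weights))
        # FC_SIZE must equal 512 so the bias matches the second dimension of fc1_weights
        fc1_biases = tf.get_variable("bias", [FC_SIZE], initializer=tf.constant_initializer(0.1))

        fc1 = tf.nn.relu(tf.matmul(reshaped, fc1_weights) + fc1_biases)
        # dropout is applied only during training; 0.5 is the keep probability
        if train: fc1 = tf.nn.dropout(fc1, 0.5)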

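This layer computes the matrix product, adds the bias, and applies relu. If train is true, the result then goes through dropout, which randomly sets some elements to 0 and divides the remaining elements by the coefficient passed in (the keep probability). This reduces overfitting and makes the model generalize better.
A small dropout example: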

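import numpy as np
import tensorflow as tf

x = np.array([1,2,3,4,5], dtype = 'float32')

# keep each element with probability 0.5; kept elements are divided by 0.5
dropout = tf.nn.dropout(x,0.5)

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    print(sess.run(dropout))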

Result:
[image: printed output of the dropout example]
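
For reference, the same behaviour can be sketched in NumPy without a TensorFlow session. This is my own illustration of inverted dropout under the assumption that the second argument is a keep probability, as in TensorFlow 1.x; the function name and the seed are arbitrary.

```python
import numpy as np

def inverted_dropout(values, keep_prob, rng):
    """Zero each element with probability 1 - keep_prob and divide the rest by keep_prob."""
    mask = rng.random(values.shape) < keep_prob
    return np.where(mask, values / keep_prob, 0.0).astype(values.dtype)

rng = np.random.default_rng(seed=0)  # seed chosen only to make the demo repeatable
x = np.array([1, 2, 3, 4, 5], dtype="float32")
dropped = inverted_dropout(x, keep_prob=0.5, rng=rng)
print(dropped)  # every entry is either 0 or twice the corresponding input
```

Dividing the kept elements by the keep probability leaves the expected value of each element unchanged, so no rescaling is needed at test time.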
add_to_collection adds a value to a named collection. In this code, the regularization result for fc1_weights is added to the losses collection.

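1.6 Layer 6: fully connected 2

The sixth layer is again the forward pass of a fully connected layer. Its input is the length-512 vector output by the previous layer, and its output is a vector of length 10.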

with tf.variable_scope('layer6-fc2'):
        fc2_weights = tf.get_variable("weight", [512, 10],
                                      initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None: tf.add_to_collection('losses', regularizer(fc2_weights))
        fc2_biases = tf.get_variable("bias", [10], initializer=tf.constant_initializer(0.1))
        logit = tf.matmul(fc1, fc2_weights) + fc2_biases

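The implementation follows the fifth layer, except that there is no relu or dropout here: the output logit is the raw matrix product plus the bias.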
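
To get a feel for the size of the six layers, the sketch below counts the trainable parameters. Only the shapes are taken from the code in this section (with nodes = 3136); the dictionary and helper names are mine.

```python
from functools import reduce
from operator import mul

# (weight shape, number of biases) for each trainable layer; pooling layers have none
layer_shapes = {
    "layer1-conv1": ([5, 5, 1, 32], 32),
    "layer3-conv2": ([5, 5, 32, 64], 64),
    "layer5-fc1": ([3136, 512], 512),
    "layer6-fc2": ([512, 10], 10),
}

def count_parameters(weight_shape, num_biases):
    """Number of weights (product of the shape) plus number of biases."""
    return reduce(mul, weight_shape, 1) + num_biases

per_layer = {name: count_parameters(*shape) for name, shape in layer_shapes.items()}
total = sum(per_layer.values())
for name, count in per_layer.items():
    print(name, count)
print("total", total)
```

The first fully connected layer holds the large majority of the parameters, which is one reason the regularizer and dropout are applied there.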

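2. Training the convolutional neural network

Training uses L2 regularization, moving averages, sparse softmax cross entropy, an exponentially decaying learning rate, and gradient descent.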
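
Of the techniques listed, the moving average is the least obvious, so here is a minimal plain-Python sketch of the update rule. It assumes the behaviour documented for tf.train.ExponentialMovingAverage when a step counter is supplied (the effective decay is the smaller of the configured decay and (1 + step) / (10 + step)); the function names and the sample values are mine.

```python
def ema_decay(base_decay, num_updates):
    """Effective decay when a step counter is passed to the moving average."""
    return min(base_decay, (1 + num_updates) / (10 + num_updates))

def ema_update(shadow, variable, decay):
    """One moving-average step: the shadow value moves towards the variable."""
    return decay * shadow + (1 - decay) * variable

# Illustrative values: the shadow starts at the variable's initial value 0.0,
# then the variable is trained to 5.0 before the first update.
shadow = 0.0
variable = 5.0
decay0 = ema_decay(0.99, num_updates=0)
shadow = ema_update(shadow, variable, decay0)
print(decay0, shadow)
```

Early in training the effective decay is small, so the shadow value follows the variable closely; later it settles at the configured 0.99 and changes slowly.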

1) Define the parameters of the neural network

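BATCH_SIZE = 100 # number of samples per batch
LEARNING_RATE_BASE = 0.01 # initial learning rate
LEARNING_RATE_DECAY = 0.99 # decay rate for the exponential decay
REGULARIZATION_RATE = 0.0001 # regularization coefficient
TRAINING_STEPS = 6000 # number of training steps
MOVING_AVERAGE_DECAY = 0.99 # moving-average decay rate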
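
To show how the learning-rate settings interact, here is a plain-Python sketch of the staircase exponential decay formula, learning rate = base * decay_rate ^ floor(step / decay_steps). It assumes 55000 training examples, as in this example, so decay_steps is 550; the function is my own re-implementation for illustration, not the TensorFlow call.

```python
def exponential_decay(base_rate, global_step, decay_steps, decay_rate, staircase=True):
    """base_rate * decay_rate ** (global_step / decay_steps); the exponent is floored if staircase."""
    exponent = global_step / decay_steps
    if staircase:
        exponent = global_step // decay_steps
    return base_rate * decay_rate ** exponent

LEARNING_RATE_BASE = 0.01
LEARNING_RATE_DECAY = 0.99
decay_steps = 55000 / 100  # training examples / BATCH_SIZE = 550 steps per pass over the data

for step in (0, 549, 550, 6000):
    print(step, exponential_decay(LEARNING_RATE_BASE, step, decay_steps, LEARNING_RATE_DECAY))
```

With staircase decay the rate stays at 0.01 for the first 550 steps and is multiplied by 0.99 after each full pass over the training data, so over 6000 steps it is reduced only ten times.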

2) Define the placeholders for the input and output

    x = tf.placeholder(tf.float32, [
            BATCH_SIZE,
            28,
            28,
            1],
        name='x-input')
    y_ = tf.placeholder(tf.float32, [None, 10], name='y-input')
    regularizer = tf.contrib.layers.l2_regularizer(REGULARIZATION_RATE)
    # LeNet5_infernece is the .py file containing the inference code from section 1
    y = LeNet5_infernece.inference(x,False,regularizer)
    global_step = tf.Variable(0, trainable=False)
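
The regularizer created above contributes one penalty term per fully connected weight matrix. As a rough NumPy sketch, and assuming the regularizer computes scale * sum(w^2) / 2 (the definition of tf.nn.l2_loss scaled by the coefficient), the penalty for a small made-up weight matrix looks like this; the function name and demo weights are mine.

```python
import numpy as np

def l2_penalty(weights, scale):
    """scale * sum(weights ** 2) / 2 for one weight matrix."""
    return scale * np.sum(np.square(weights)) / 2.0

REGULARIZATION_RATE = 0.0001
demo_weights = np.array([[1.0, -2.0], [3.0, 0.5]])  # made-up values for illustration
penalty = l2_penalty(demo_weights, REGULARIZATION_RATE)
print(penalty)
```

These per-layer penalties are what ends up in the losses collection and is later added to the cross entropy.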

3) Set up saving of the model

    ckpt_dir = "./ckpt_dir"
    if not os.path.exists(ckpt_dir):
        os.makedirs(ckpt_dir)
    saver = tf.train.Saver()

4) Define the loss function, learning rate, moving-average operation, and training process

    # ExponentialMovingAverage: the moving-average model
    variable_averages = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
    variables_averages_op = variable_averages.apply(tf.trainable_variables())
    # sparse softmax cross entropy; argmax turns the one-hot labels into class indices
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1))
    # mean over all elements of the cross-entropy tensor
    cross_entropy_mean = tf.reduce_mean(cross_entropy)
    # get_collection returns a list of all elements in the collection; add_n sums the elements of that list
    loss = cross_entropy_mean + tf.add_n(tf.get_collection('losses'))
    # Exponentially decaying learning rate. Initial learning rate LEARNING_RATE_BASE = 0.01
    # current iteration: global_step
    # decay steps: mnist.train.num_examples / BATCH_SIZE = number of MNIST training samples / 100
    # in this example mnist.train.num_examples = 55000
    # decay rate: LEARNING_RATE_DECAY = 0.99
    learning_rate = tf.train.exponential_decay(
        LEARNING_RATE_BASE,
        global_step,
        mnist.train.num_examples / BATCH_SIZE,
        LEARNING_RATE_DECAY,
        staircase=True)
    print("mnist.train.num_examples ", mnist.train.num_examples)

    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
    with tf.control_dependencies([train_step, variables_averages_op]):
        train_op = tf.no_op(name='train')
    with tf.Session() as sess:
        ckpt = tf.train.get_checkpoint_state(ckpt_dir)
        if ckpt and ckpt.model_checkpoint_path:
            print(ckpt.model_checkpoint_path)
            saver.restore(sess, ckpt.model_checkpoint_path) # restore all variables
        else:
            tf.global_variables_initializer().run()
        for i in range(TRAINING_STEPS):
            xs, ys = mnist.train.next_batch(BATCH_SIZE)

            reshaped_xs = np.reshape(xs, (
                BATCH_SIZE,
                28,
                28,
                1))
            _, loss_value, step = sess.run([train_op, loss, global_step], feed_dict={x: reshaped_xs, y_: ys})

            if i % 1000 == 0:
                print("After %d training step(s), loss on training batch is %g." % (step, loss_value))
        saver.save(sess, ckpt_dir + "/model.ckpt")

Format of the data returned by xs, ys = mnist.train.next_batch(BATCH_SIZE), here taken with BATCH_SIZE = 10:
[image: printed contents of xs and ys for a batch of 10]
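
The loss in the training code takes one-hot labels, converts them to class indices with argmax, and feeds them to the sparse softmax cross entropy. The NumPy sketch below reproduces that computation on made-up values, with 3 classes instead of 10 to keep it short; the function name is mine.

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, class_ids):
    """Per-example cross entropy between softmax(logits) and integer class labels."""
    # Subtract the row maximum for numerical stability before exponentiating
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(class_ids)), class_ids]

one_hot = np.array([[0, 0, 1], [1, 0, 0]], dtype=np.float32)  # made-up labels
logits = np.array([[0.0, 0.0, 0.0], [2.0, 1.0, 0.1]])         # made-up network outputs
class_ids = one_hot.argmax(axis=1)
losses = sparse_softmax_cross_entropy(logits, class_ids)
print(class_ids, losses, losses.mean())
```

For the first example all logits are equal, so the loss is log(3), the value expected from a uniform guess over three classes.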

5) Run the training

def main(argv=None):
    mnist = input_data.read_data_sets("../datasets/MNIST_data", one_hot=True)
    train(mnist) # the train function consists of parts 2), 3), and 4) above

if __name__ == '__main__':
    main()

3. Testing the model

Using Python, read an image of a handwritten digit and recognize it with the saved trained model.
Complete code, test.py:


import os

from PIL import Image, ImageFilter
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

from LeNet5_train import BATCH_SIZE, REGULARIZATION_RATE, MOVING_AVERAGE_DECAY, LEARNING_RATE_DECAY, LEARNING_RATE_BASE
from LeNet5_inference import IMAGE_SIZE, NUM_CHANNELS, OUTPUT_NODE, inference

# Suppress log messages that are not wanted
# '1' hides INFO, '2' also hides WARNING, '3' also hides ERROR
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'

def imageprepare(pic_path): 
    im = Image.open(pic_path) # path of the image to read; it must be 28*28 pixels
    plt.imshow(im)  # show the image to be recognized
    plt.show()
    im = im.convert('L')
    tv = list(im.getdata())
    tva = [(255-x)*1.0/255.0 for x in tv]
    tva = np.asarray(tva, dtype='float32')
    tva = tva.reshape(28, 28, 1)
    print("input pic shape: ", tva.shape)
    return tva

def test(pic_data):
    """test verify"""
    # The prediction model built below is the same as the training model.
    # ...... (omitted here)

    saver = tf.train.Saver()

    predict_res = 0

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver.restore(sess, "E:/Study/AI/src/ckpt_dir/model.ckpt") # load the saved model; parameters must match the earlier code
        print("y:", sess.run(y, feed_dict={x: [pic_data]}))
        # The CNN outputs a vector of length 10 (scores for the ten classes 0-9); the index of the largest element is the recognized digit.
        predict_result = sess.run(tf.argmax(y,1), feed_dict={x: [pic_data]})
        predict_res = predict_result[0]

    return predict_res


if __name__ == '__main__':
    pic_path = 'C:/Users/Administrator/Pictures/5.png'
    pic_data = imageprepare(pic_path)
    result = test(pic_data)
    print("Recognition result: ", result)
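
The pixel transformation in imageprepare can be tried without an image file. The NumPy sketch below applies the same (255 - x) / 255 mapping to a synthetic 28*28 canvas; the canvas, the stroke position, and the function name are made up for illustration.

```python
import numpy as np

def normalize_pixels(gray_pixels):
    """Map 8-bit grayscale (255 = white) to [0, 1] with white as 0, like imageprepare."""
    inverted = (255 - np.asarray(gray_pixels, dtype="float32")) / 255.0
    return inverted.reshape(28, 28, 1)

# Synthetic test image: white background with a short black vertical stroke
canvas = np.full((28, 28), 255, dtype=np.uint8)
canvas[10:18, 13:15] = 0
pic = normalize_pixels(canvas)
print(pic.shape, pic.min(), pic.max())
```

The inversion matters because MNIST stores digits as bright strokes on a dark background, whereas a digit drawn in black on white has the opposite polarity.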
