[Deep Learning] [Python] Convolutional Neural Network Implementation (Annotated Version)

"Your code is pretty good, but in the next second it's mine."
Environment requirements

  • Python 3.5
  • TensorFlow 1.4
  • PyTorch 0.2.0

Only TensorFlow is actually needed by this program.
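The input_data module imported at the top of the program is the small MNIST loader that ships with the repository linked in the code header. If you do not have that file, a commonly used substitute (my suggestion, not part of the original post) is the loader bundled with TensorFlow 1.x, which exposes the same read_data_sets interface:

# Alternative MNIST loader, assuming TensorFlow 1.x is installed;
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)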
The full program follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# https://github.com/xiaohu2015/DeepLearning_tutorials

"""卷积神经网络"""
import numpy as np
import tensorflow as tf
import input_data
from logisticRegression import LogisticRegression
from mlp import HiddenLayer

"""
-------------CNN降低训练参数的2大法宝---------------
局部感受野、权值共享 
局部感受野:就是输出图像某个节点(像素点)的响应所对应的最初的输入图像的区域就是感受野. 
权值共享  :比如步长为1,如果每移动一个像素就有一个新的权值对应,那么太夸张了,需要训练的参数爆炸似增长,
            权值共享就是将每次覆盖区域的图像所赋给的权值都是卷积核对应的权值.就是说用了这个卷积核,
            则不管这个卷积核移到图像的哪个位置上,图像的被覆盖区域的所赋给的权值都是该卷积核的参数.

-------------从全连接到CNN经历了什么--------------            
演化进程: 全连接——->(全连接加上局部感受野了进化成)局部连接层———->(局部连接层加上权值共享了)卷积神经网络. 

-------------feature map----------------
同一种滤波器卷积得到的向量组合.一种滤波器提取一种特征,使用了6种滤波器,进行卷积操作,故有6层feature map.

----------------CNN训练的参数是什么-------------------
其实就是卷积核,当然还有偏置. 
"""

class ConvLayer(object):
    """卷积层"""
    def __init__(self, inpt, filter_shape, strides=(1, 1, 1, 1),
                 padding="SAME", activation=tf.nn.relu, bias_setting=True):
        """
        -----------变量说明-----------------

        inpt: tf.Tensor, 维度为 [n_examples, witdth, height, channels];
        filter_shape: 卷积核的维度, list或tuple, 形式为[witdth, height, channels, filter_nums];
        strides: list或tuple, 卷积核步长, 默认(1, 1, 1, 1);
        padding: 填充方式;CNN的两种padding方式“SAME”(必要的时候使用0进行填充)和“VALID”;(padding只是增加了边缘区域的像素点);
                 无padding情况:如果输入是a*a,filter是b*b,那么不加padding情况下,就会卷积后图像变小,变成(a−b+1)*(a−b+1)
        activation: 激活函数;
        bias_setting: 是否有偏置;
        """
        # Set the input;
        self.input = inpt
        # Initialize the convolution kernel;
        self.W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), dtype=tf.float32)
        if bias_setting:
            self.b = tf.Variable(tf.truncated_normal(filter_shape[-1:], stddev=0.1),
                                 dtype=tf.float32)
        else:
            self.b = None
        # Apply the convolution;
        conv_output = tf.nn.conv2d(self.input, filter=self.W, strides=strides,
                                   padding=padding)
        conv_output = conv_output + self.b if self.b is not None else conv_output
        # Set the output (apply the activation if one was given);
        self.output = conv_output if activation is None else activation(conv_output)
        # Collect the trainable parameters;
        self.params = [self.W, self.b] if self.b is not None else [self.W, ]


class MaxPoolLayer(object):
    """池化层"""
    def __init__(self, inpt, ksize=(1, 2, 2, 1), strides=(1, 2, 2, 1), padding="SAME"):
        """
        -----------变量说明-----------------

        inpt : tf.Tensor, 维度为 [n_examples, witdth, height, channels];
        ksize: 池化窗口的大小,取一个四维向量,一般是[1, height, width, 1],因为我们不想在batch和channels上做池化,所以这两个维度设为了1;
        strides: tuple, 池化步长, 默认(1, 2, 2, 1)(窗口在每一个维度上滑动的步长);
        padding: 填充方式;CNN的两种padding方式“SAME”(必要的时候使用0进行填充)和“VALID”;(padding只是增加了边缘区域的像素点);
                 无padding情况:如果输入是a*a,filter是b*b,那么不加padding情况下,就会卷积后图像变小,变成(a−b+1)*(a−b+1)
        """
        # Set the input;
        self.input = inpt
        # Compute the output;
        self.output = tf.nn.max_pool(self.input, ksize=ksize, strides=strides, padding=padding)
        self.params = []
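        # Worked size example (illustration, not from the original post): a [?, 28, 28, 32] input
        # pooled with ksize=[1, 2, 2, 1] and strides=[1, 2, 2, 1] becomes [?, 14, 14, 32];
        # each 2x2 window is replaced by its maximum, halving height and width.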


class FlattenLayer(object):
    """Flatten层"""
    def __init__(self, inpt, shape):
        # The flatten layer "flattens" its input, turning a multi-dimensional tensor into a
        # 1-D vector per example; it is the usual transition from conv layers to fully connected layers;
        self.input = inpt
        self.output = tf.reshape(self.input, shape=shape)
        self.params = []
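        # Worked shape example (illustration, not from the original post): reshaping a
        # [?, 7, 7, 64] tensor with shape=[-1, 7*7*64] gives a [?, 3136] matrix, one row per example.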

class DropoutLayer(object):
    """Dropout层"""
    def __init__(self, inpt, keep_prob):
        """
        -----------变量说明-----------------

        keep_prob: 设置神经元被选中的概率 float (0, 1];
        """
        # keep_prob is a placeholder so it can differ between training and prediction;
        self.keep_prob = tf.placeholder(tf.float32)
        self.input = inpt
        # Compute the output;
        self.output = tf.nn.dropout(self.input, keep_prob=self.keep_prob)
        # Feed dict used during training;
        self.train_dicts = {self.keep_prob: keep_prob}
        # Feed dict used during prediction (dropout disabled);
        self.pred_dicts = {self.keep_prob: 1.0}
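        # Added remark (not in the original post): tf.nn.dropout scales the kept activations by
        # 1/keep_prob, so feeding keep_prob=1.0 at prediction time needs no extra rescaling.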

if __name__ == "__main__":
    # MNIST dataset;
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
    # Define the input and output placeholders;
    x = tf.placeholder(tf.float32, shape=[None, 784])
    y_ = tf.placeholder(tf.float32, shape=[None, 10])
    # Reshape the input into image tensors;
    inpt = tf.reshape(x, shape=[-1, 28, 28, 1])

    # Build the network;
    # Layer settings;
    # Convolution and pooling layer 0;
    layer0_conv = ConvLayer(inpt, filter_shape=[5, 5, 1, 32], strides=[1, 1, 1, 1], activation=tf.nn.relu,
                            padding="SAME")              # [?, 28, 28, 32]
    layer0_pool = MaxPoolLayer(layer0_conv.output, ksize=[1, 2, 2, 1],
                               strides=[1, 2, 2, 1])                       # [?, 14, 14, 32]
    # Convolution and pooling layer 1;
    layer1_conv = ConvLayer(layer0_pool.output, filter_shape=[5, 5, 32, 64], strides=[1, 1, 1, 1],
                            activation=tf.nn.relu, padding="SAME")  # [?, 14, 14, 64]
    layer1_pool = MaxPoolLayer(layer1_conv.output, ksize=[1, 2, 2, 1],
                               strides=[1, 2, 2, 1])              # [?, 7, 7, 64]
    # Flatten layer;
    layer2_flatten = FlattenLayer(layer1_pool.output, shape=[-1, 7*7*64])
    # Fully connected layer;
    layer3_fullyconn = HiddenLayer(layer2_flatten.output, n_in=7*7*64, n_out=256, activation=tf.nn.relu)
    # Dropout layer;
    layer3_dropout = DropoutLayer(layer3_fullyconn.output, keep_prob=0.5)
    # Output layer;
    layer4_output = LogisticRegression(layer3_dropout.output, n_in=256, n_out=10)

    # Trainable parameters;
    params = layer0_conv.params + layer1_conv.params + layer3_fullyconn.params + layer4_output.params
    # Dropout feed dict for training;
    train_dicts = layer3_dropout.train_dicts
    # Dropout feed dict for prediction;
    pred_dicts = layer3_dropout.pred_dicts

    # Define the cost;
    cost = layer4_output.cost(y_)
    # Define the accuracy;
    accuracy = layer4_output.accuarcy(y_)
    # Define the network's prediction;
    predictor = layer4_output.y_pred
    # Define the training op;
    train_op = tf.train.AdamOptimizer(learning_rate=0.0001).minimize(
        cost, var_list=params)

    # Initialize all variables
    init = tf.global_variables_initializer()

    # Training hyper-parameters
    training_epochs = 10
    batch_size = 100
    display_step = 1

    # Start training
    print("Start to train...")
    with tf.Session() as sess:
        # Run the initialization;
        sess.run(init)
        # Loop over epochs;
        for epoch in range(training_epochs):
            # Per-epoch bookkeeping;
            avg_cost = 0.0
            batch_num = int(mnist.train.num_examples / batch_size)
            # Loop over mini-batches;
            for i in range(batch_num):
                # Fetch the current batch;
                x_batch, y_batch = mnist.train.next_batch(batch_size)
                # Add {x: x_batch, y_: y_batch} to the training feed dict;
                train_dicts.update({x: x_batch, y_: y_batch})

                # Run one training step;
                sess.run(train_op, feed_dict=train_dicts)
                # Add {x: x_batch, y_: y_batch} to the prediction feed dict;
                pred_dicts.update({x: x_batch, y_: y_batch})
                # Accumulate the cost;
                avg_cost += sess.run(cost, feed_dict=pred_dicts) / batch_num
            # Report progress
            if epoch % display_step == 0:
                pred_dicts.update({x: mnist.validation.images,
                                   y_: mnist.validation.labels})
                val_acc = sess.run(accuracy, feed_dict=pred_dicts)
                print("Epoch {0} cost: {1}, validation accuacy: {2}".format(epoch,
                                                                            avg_cost, val_acc))

        print("Finished!")
        # Start testing;
        # Fetch a few test samples;
        test_x = mnist.test.images[:10]
        test_y = mnist.test.labels[:10]
        # Compare the model's predictions with the true labels;
        print("True labels:")
        print("  ", np.argmax(test_y, 1))
        print("Prediction:")
        pred_dicts.update({x: test_x})
        print("  ", sess.run(predictor, feed_dict=pred_dicts))
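
The program depends on two helper classes, HiddenLayer (from mlp.py) and LogisticRegression (from logisticRegression.py), which live in the repository linked at the top of the file. If you only have this script, the following is a minimal sketch of the two classes, reconstructed from the way they are used above (constructor arguments, .output, .params, .cost(), .accuarcy() and .y_pred); it is my own approximation, not the repository's actual code:

import tensorflow as tf

class HiddenLayer(object):
    """Minimal fully connected layer matching the interface used above (reconstruction)."""
    def __init__(self, inpt, n_in, n_out, activation=tf.nn.relu):
        self.input = inpt
        self.W = tf.Variable(tf.truncated_normal([n_in, n_out], stddev=0.1), dtype=tf.float32)
        self.b = tf.Variable(tf.zeros([n_out]), dtype=tf.float32)
        linear = tf.matmul(self.input, self.W) + self.b
        self.output = linear if activation is None else activation(linear)
        self.params = [self.W, self.b]


class LogisticRegression(object):
    """Minimal softmax output layer matching the interface used above (reconstruction)."""
    def __init__(self, inpt, n_in, n_out):
        self.input = inpt
        self.W = tf.Variable(tf.zeros([n_in, n_out]), dtype=tf.float32)
        self.b = tf.Variable(tf.zeros([n_out]), dtype=tf.float32)
        self.output = tf.nn.softmax(tf.matmul(self.input, self.W) + self.b)
        self.y_pred = tf.argmax(self.output, axis=1)
        self.params = [self.W, self.b]

    def cost(self, y_):
        # Cross-entropy between the softmax output and the one-hot labels;
        return -tf.reduce_mean(tf.reduce_sum(y_ * tf.log(self.output + 1e-10), axis=1))

    def accuarcy(self, y_):
        # The method name keeps the spelling used in the main script above;
        correct = tf.equal(tf.argmax(self.output, axis=1), tf.argmax(y_, axis=1))
        return tf.reduce_mean(tf.cast(correct, tf.float32))

Saved as mlp.py and logisticRegression.py next to the main script, these sketches should let the program above run end to end under TensorFlow 1.x.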





Reposted from blog.csdn.net/hanss2/article/details/80999245