Handwritten Character Recognition (MNIST + CNN + TensorFlow + Modeling + Single-Digit Recognition)

Handwritten digit recognition is pretty much the introductory experiment of machine learning.

Open-source handwritten digit dataset: http://yann.lecun.com/exdb/mnist/
If you can't find it, you can download it for free from my resources.

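Got the dataset? To reach accurate recognition, I split the work into two big parts:

Training and testing.
What does training do? It trains the model.
What does testing do? It tests the model.
So we need two programs: "train.py" and "test.py".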

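I won't say more about the dataset. Let's go straight to modeling and straight to the Python code.

What does modeling need?

Import the required libraries + read the dataset + initialize parameters + build the convolutional neural network + model performance metrics + iterative training + save the deep learning model

That's all there is to it; understand these parts and you're done. No dawdling: nobody likes reading a long-winded essay.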

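(1) Import the required libraries
Handwritten digit recognition needs only a few simple imports. Note that the tensorflow.examples.tutorials module used here only ships with TensorFlow 1.x, so the code in this post is written against the 1.x API.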

import tensorflow as tf
# import input_data
from tensorflow.examples.tutorials.mnist import input_data

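(2) Read the dataset
MNIST is a public dataset with a loader built into TensorFlow, so it can be read directly.

mnist = input_data.read_data_sets('E:/MNIST_data', one_hot=True)   # replace the path with your own dataset directory; one_hot=True gives one-hot labels, e.g. [1 0 0 0]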
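To make the one_hot=True option concrete, here is a minimal sketch in plain Python of what one-hot encoding does to a digit label. The helper names one_hot and from_one_hot are made up for this illustration and are not part of TensorFlow.

```python
def one_hot(label, num_classes=10):
    """Return a list with 1.0 at position `label` and 0.0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


def from_one_hot(vec):
    """Recover the integer label: the index of the largest entry."""
    return vec.index(max(vec))


print(one_hot(3))                 # the digit 3 as a 10-element vector
print(from_one_hot(one_hot(7)))   # 7
```

With ten classes, each MNIST label becomes a 10-element vector, which is why the label placeholder in the next step has shape [None, 10].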

(3) Initialize parameters

# Placeholders
x = tf.placeholder("float", shape=[None, 784], name='x')   # input images
y_ = tf.placeholder("float", shape=[None, 10], name='y_')  # ground-truth labels

# Initialize weights
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)   # truncated normal distribution, standard deviation 0.1
    return tf.Variable(initial)

# Initialize biases
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)    # constant tensor filled with 0.1
    return tf.Variable(initial)

# Convolutional layer
def conv2d(x,W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

# Pooling layer
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')  # max pooling

These are the basic building blocks of a convolutional neural network; knowing them isn't too much to ask, right?
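With padding='SAME', the output width and height are ceil(input size / stride). The small sketch below (plain Python; same_out_size is a hypothetical helper name) traces a 28x28 image through two rounds of stride-1 convolution followed by stride-2 pooling, which is where the 7 * 7 in the flattened size 7 * 7 * 64 comes from.

```python
import math


def same_out_size(size, stride):
    """Spatial output size of a conv/pool op with padding='SAME'."""
    return math.ceil(size / stride)


size = 28
trace = [size]
for _ in range(2):                   # two conv + pool stages
    size = same_out_size(size, 1)    # conv2d, stride 1: size unchanged
    size = same_out_size(size, 2)    # max_pool_2x2, stride 2: size halved
    trace.append(size)

print(trace)  # [28, 14, 7]
```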

(4) Build the convolutional neural network

# First convolutional layer: computes 32 features for each 5x5 patch
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

x_image = tf.reshape(x, [-1, 28, 28, 1])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second convolutional layer
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Densely connected layer: the image has been reduced to 7x7; this layer processes it with 1024 neurons
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout to prevent overfitting
keep_prob = tf.placeholder("float", name='keep_prob')
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Output layer: finally add a softmax layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2, name='y_conv')

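W is a weight, b is a bias, conv stands for convolution, relu is the activation function, pool is pooling, fc is a fully connected layer, dropout is used to prevent overfitting, and softmax is the classifier.
tf.matmul is matrix multiplication.
tf.multiply multiplies the corresponding elements of two matrices (an element-wise product).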
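The difference between the two products is easy to mix up, so here is a plain-Python sketch on 2x2 lists. The functions matmul and multiply below are stand-ins written for this illustration; they only mimic what tf.matmul and tf.multiply compute for 2-D inputs of matching sizes.

```python
def matmul(a, b):
    """Matrix product: rows of `a` dotted with columns of `b`."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def multiply(a, b):
    """Element-wise product of two matrices of the same shape."""
    return [[x * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul(a, b))    # [[19, 22], [43, 50]]
print(multiply(a, b))  # [[5, 12], [21, 32]]
```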

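Next up is TensorFlow's core unit of data: the tensor (in Chinese 张量, "zhangliang"; not the person Zhang Liang, and certainly not the malatang of the same name, but Tensor).
A tensor consists of primitive values arranged in an array of any number of dimensions. A tensor's rank is its number of dimensions, and its shape is a tuple of integers giving the length of the array along each dimension.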
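As a rough illustration of rank and shape, the sketch below computes both for nested Python lists. shape_of is a hypothetical helper for this post, and it assumes the nesting is regular (every sub-list at a given depth has the same length), as it would be for a real tensor.

```python
def shape_of(value):
    """Shape of a regularly nested list, as a tuple of ints."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None   # descend into the first element
    return tuple(shape)


scalar = 5.0
vector = [1.0, 2.0, 3.0]
matrix = [[1, 2, 3], [4, 5, 6]]

for t in (scalar, vector, matrix):
    shape = shape_of(t)
    print("rank", len(shape), "shape", shape)
# rank 0 shape ()
# rank 1 shape (3,)
# rank 2 shape (2, 3)
```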

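A few special tensors and their common parameters:
tf.placeholder:
Creates a placeholder, which works like a function's formal parameter; the real data is supplied later through feed_dict. Put plainly, it's you standing in a parking spot to hold it for your father's car.

tf.Variable: creates a variable

tf.get_variable: gets an existing variable, or creates it if it does not exist yet

tf.constant: creates a constant

dtype: the data type

shape:
The shape of the data; the default is None. [None, 3] means there are 3 columns and the number of rows is not fixed.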
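To show what a None dimension permits, here is a small plain-Python check. matches_spec is an invented name for this sketch, not a TensorFlow function; it treats None as "any length", so the same placeholder shape accepts a batch of 50 images or a single image.

```python
def matches_spec(actual_shape, spec):
    """True if `actual_shape` fits `spec`, where None in `spec` matches any size."""
    if len(actual_shape) != len(spec):
        return False
    return all(s is None or s == a for a, s in zip(actual_shape, spec))


spec = (None, 784)                      # like shape=[None, 784] for the input
print(matches_spec((50, 784), spec))    # True: a batch of 50 images
print(matches_spec((1, 784), spec))     # True: a single image
print(matches_spec((50, 10), spec))     # False: wrong number of columns
```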

(5) Model performance metrics

sess = tf.InteractiveSession()   # installs a default session, so .eval() and .run() below work without passing one
cross_entropy = - tf.reduce_sum(y_ * tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess.run(tf.global_variables_initializer())

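"cross_entropy": the cross-entropy loss function

"tf.reduce_sum": sums a tensor along the given dimensions (over all elements if no axis is given) and returns the reduced result with the same data type as the input

"tf.train.AdamOptimizer": the Adam optimization algorithm

"tf.equal(A, B)": compares two matrices or vectors element by element and returns a boolean tensor of the same shape, True where the elements are equal and False where they are not.

"tf.argmax()": returns the index of the largest value along the given axis.

"tf.reduce_mean": the mean of a tensor along a given dimension (over all elements if no axis is given).

"tf.cast": the framework's type conversion function

"tf.global_variables_initializer()": the name says it all. Why use it? Because variables created in tf hold no values until their initializer has been run, so this op must be run in the session before any variable is used. Skip it and you're the loser.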
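The arithmetic behind these metrics can be followed by hand on a tiny batch. The sketch below is plain Python, not the TensorFlow graph: softmax, cross_entropy, argmax and accuracy are illustrative helpers, and the three-class logits are made-up numbers. One deliberate difference: terms whose label is 0 are skipped, which avoids evaluating log(0).

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)                               # subtract max for stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true, y_pred):
    """-sum(y * log(p)) over the whole batch, skipping terms where y is 0."""
    return -sum(t * math.log(p)
                for row_t, row_p in zip(y_true, y_pred)
                for t, p in zip(row_t, row_p) if t > 0)


def argmax(row):
    return row.index(max(row))


def accuracy(y_true, y_pred):
    """Fraction of rows where predicted and true class indices agree."""
    hits = [argmax(t) == argmax(p) for t, p in zip(y_true, y_pred)]
    return sum(hits) / len(hits)


y_true = [[0, 1, 0], [1, 0, 0]]                   # one-hot labels: classes 1 and 0
y_pred = [softmax([1.0, 3.0, 0.5]), softmax([0.2, 2.0, 0.1])]

print(accuracy(y_true, y_pred))                   # 0.5: first row right, second wrong
print(cross_entropy(y_true, y_pred))              # a positive loss value
```

Comparing argmax indices and then averaging the matches is what the tf.equal, tf.cast and tf.reduce_mean chain expresses in the graph.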

(6) Iterative model training

for i in range(2000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g"%(i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

In short: how many steps to run, how to run them, and how well the run went.
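The loop feeds keep_prob 0.5 while training and 1.0 when measuring accuracy. The plain-Python sketch below shows why that matters. It assumes the behavior of TensorFlow 1.x's tf.nn.dropout, where each element is kept with probability keep_prob and kept elements are scaled by 1 / keep_prob; the dropout function here is an illustrative stand-in, not the library code.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob, scaled by 1/keep_prob; else 0."""
    out = []
    for v in values:
        if rng.random() < keep_prob:
            out.append(v / keep_prob)   # rescale so the expected value is unchanged
        else:
            out.append(0.0)
    return out


rng = random.Random(0)                  # fixed seed so repeated runs match
activations = [1.0, 2.0, 3.0, 4.0]

print(dropout(activations, 0.5, rng))   # training: some zeros, the rest doubled
print(dropout(activations, 1.0, rng))   # evaluation: values pass through unchanged
```

With keep_prob 1.0 nothing is dropped, so evaluation uses the full network.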

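(7) Save the deep learning model

saver = tf.train.Saver()   # create a Saver for the variables in the graph
saver.save(sess, "E:/手写字符/ckpt_dir/model.ckpt")

Save the model: you can manage that, right?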

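That wraps up the training phase. What does the saved model look like? See the figure below.
[Figure: the saved model files]
Next, it's time to load the model.

Start testing

Before testing, of course, you need to prepare your own image of a handwritten digit. Find one online or write one by hand; all that matters is that you have one.
[Figure: a sample handwritten digit image]
Now for the test code.
As before, it splits into a few parts:
Libraries + read the image + import the model + result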

(1) Libraries

from PIL import Image
import tensorflow as tf

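(2) Read a single image

def imageprepare():
    file_name = 'E:/手写字符/9.png'
    myimage = Image.open(file_name)
    # Resize to 28x28 pixels and convert to grayscale.
    # Image.LANCZOS is the same filter as the older name Image.ANTIALIAS, which Pillow 10 removed.
    myimage = myimage.resize((28, 28), Image.LANCZOS).convert('L')
    tv = list(myimage.getdata())     # get the pixel values
    tva = [(255-x)*1.0/255.0 for x in tv]      # invert and rescale pixels to [0, 1]: 0 is pure white, 1 is pure black
    return tva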
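The pixel conversion can be checked on a few values without opening an image. This sketch repeats the same formula in a standalone helper (normalize_pixels is a name made up here). A white pixel (255) maps to 0.0 and a black pixel (0) maps to 1.0, matching MNIST's convention of bright strokes on a dark background.

```python
def normalize_pixels(pixels):
    """Invert 8-bit grayscale values and rescale them to the range [0, 1]."""
    return [(255 - p) / 255.0 for p in pixels]


sample = [255, 0, 128]                 # white, black, mid-gray
print(normalize_pixels(sample))

flat = normalize_pixels([255] * 784)   # a blank white 28x28 image, flattened
print(len(flat), max(flat))            # 784 0.0
```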

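(3) Import the model

with tf.Session() as sess:
    # No variable initializer is needed here: restore() loads the saved values.
    saver = tf.train.import_meta_graph('手写字符/ckpt_dir/model.ckpt.meta')   # load the model structure
    saver.restore(sess,  '手写字符/ckpt_dir/model.ckpt')   # load the model parameters
    graph = tf.get_default_graph()  # get the restored computation graph
    x = graph.get_tensor_by_name("x:0")      # fetch the input placeholder from the model
    keep_prob = graph.get_tensor_by_name("keep_prob:0")
    y_conv = graph.get_tensor_by_name("y_conv:0")     # the key line: fetch the softmax output tensor from the model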

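(4) Result

    # This continues inside the `with tf.Session() as sess:` block from step (3).
    result = imageprepare()   # flattened pixel list of the single image
    prediction = tf.argmax(y_conv, 1)
    predint = prediction.eval(feed_dict={x: [result], keep_prob: 1.0}, session=sess)   # feed_dict supplies data to the placeholders
    print(predint[0])      # print the predicted digit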

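That completes recognition of a single digit image, one image at a time.

Want to see several images recognized at once?

Stick around: I have a flower recognition program that handles multiple images at once. If you're interested, your criticism and corrections are welcome.

Getting this far took a while, didn't it? Go take a break!

Good at nothing, so class dismissed.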
