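I. Overview
- Handwritten digit recognition is commonly used as the first example of applying deep learning to computer vision. The MNIST dataset is widely used for it, both for training and for testing model performance.
- Model input: a 28*28 image of a handwritten digit from 0 to 9, so the images fall into 10 classes.
- Model output: the classification result, a digit between 0 and 9.
- The sections below implement this first with a multilayer perceptron (MLP) and then with a convolutional neural network (CNN).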
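II. Handwritten digit recognition with a multilayer perceptron

| Input layer | Hidden layer | Output layer |
| --- | --- | --- |
| 784 neurons | 784 neurons | 10 neurons |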
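As a rough size check on this architecture, each fully connected layer has (inputs x units) weights plus one bias per unit. The sketch below does that arithmetic in plain Python for the layer sizes listed above; the helper name `dense_params` is made up for illustration and is not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer (hypothetical helper)."""
    return n_inputs * n_units + n_units


# Layer sizes listed above: 784 inputs -> 784 hidden units -> 10 outputs
hidden = dense_params(784, 784)   # 784*784 weights + 784 biases
output = dense_params(784, 10)    # 784*10 weights + 10 biases
total = hidden + output

print(hidden, output, total)
```

Almost all of the parameters sit in the hidden layer, because its weight matrix is 784 x 784.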
import numpy as np

def loadData(path="mnist.npz"):
    # Load the MNIST arrays from a local .npz archive
    f = np.load(path)
    x_train, y_train = f['x_train'], f['y_train']
    x_test, y_test = f['x_test'], f['y_test']
    f.close()
    return (x_train, y_train), (x_test, y_test)

# The test split of the archive is used as the validation set
(x_train, y_train), (x_validation, y_validation) = loadData()
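Before training, it can help to confirm that the archive has the expected keys and shapes. The following is a minimal sketch, assuming an `.npz` file with the same four keys as above. It writes a tiny synthetic archive (random data, not MNIST) to a temporary directory so that it runs without the real `mnist.npz`, and it opens the file with `np.load` as a context manager so the file is closed even if a key is missing. The names `load_npz_splits` and `demo_mnist.npz` are hypothetical.

```python
import os
import tempfile

import numpy as np


def load_npz_splits(path):
    """Return ((x_train, y_train), (x_test, y_test)) from an .npz archive."""
    # The object returned by np.load for .npz files can be used in a
    # with-statement, so no explicit close() is needed.
    with np.load(path) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)


# Build a tiny stand-in archive: 6 training and 2 test "images" of 28 x 28
rng = np.random.RandomState(7)
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo_mnist.npz")
np.savez(
    demo_path,
    x_train=rng.randint(0, 256, size=(6, 28, 28)).astype('uint8'),
    y_train=rng.randint(0, 10, size=6).astype('uint8'),
    x_test=rng.randint(0, 256, size=(2, 28, 28)).astype('uint8'),
    y_test=rng.randint(0, 10, size=2).astype('uint8'),
)

(x_tr, y_tr), (x_te, y_te) = load_npz_splits(demo_path)
print(x_tr.shape, y_tr.shape, x_te.shape, y_te.shape)
```

With the real archive, the first dimension would be the number of samples in each split instead of 6 and 2.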
import matplotlib.pyplot as plt
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils

def loadData(path="mnist.npz"):
    # Load the MNIST arrays from a local .npz archive
    f = np.load(path)
    x_train, y_train = f['x_train'], f['y_train']
    x_test, y_test = f['x_test'], f['y_test']
    f.close()
    return (x_train, y_train), (x_test, y_test)

(x_train, y_train), (x_validation, y_validation) = loadData()

# Show the first four training images in a 2 x 2 grid
plt.subplot(221)
plt.imshow(x_train[0], cmap=plt.get_cmap('gray'))
plt.subplot(222)
plt.imshow(x_train[1], cmap=plt.get_cmap('gray'))
plt.subplot(223)
plt.imshow(x_train[2], cmap=plt.get_cmap('gray'))
plt.subplot(224)
plt.imshow(x_train[3], cmap=plt.get_cmap('gray'))
plt.show()

# Seed NumPy's random number generator
seed = 7
np.random.seed(seed)

# Flatten each 28 x 28 image into a vector of 784 pixels
num_pixels = x_train.shape[1] * x_train.shape[2]
print(num_pixels)
x_train = x_train.reshape(x_train.shape[0], num_pixels).astype('float32')
x_validation = x_validation.reshape(x_validation.shape[0], num_pixels).astype('float32')

# Scale pixel values from 0-255 to 0-1
x_train = x_train / 255
x_validation = x_validation / 255

# One-hot encode the labels
y_train = np_utils.to_categorical(y_train)
y_validation = np_utils.to_categorical(y_validation)
num_classes = y_validation.shape[1]
print(num_classes)

def create_model():
    # One hidden layer with 784 units, followed by a 10-way softmax output
    model = Sequential()
    model.add(Dense(units=num_pixels, input_dim=num_pixels, kernel_initializer='normal', activation='relu'))
    model.add(Dense(units=num_classes, kernel_initializer='normal', activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = create_model()
model.fit(x_train, y_train, epochs=10, batch_size=200)

# Evaluate on the validation set; score[1] is the accuracy
score = model.evaluate(x_validation, y_validation)
print('MLP: %.2f%%' % (score[1]*100))
784
10
Epoch 1/10
200/60000 [..............................] - ETA: 4:32 - loss: 2.3038 - acc: 0.1100
600/60000 [..............................] - ETA: 1:37 - loss: 2.0529 - acc: 0.3283
1000/60000 [..............................] - ETA: 1:02 - loss: 1.8041 - acc: 0.4710
...
9472/10000 [===========================>..] - ETA: 0s
10000/10000 [==============================] - 1s 112us/step
MLP: 98.07%
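The preprocessing steps in the script above (flattening each image, scaling pixels to the range 0 to 1, and one-hot encoding the labels) can be illustrated on a toy batch. This is a sketch in plain NumPy on made-up data; `one_hot` is a hypothetical stand-in written for illustration, not the Keras implementation of `to_categorical`.

```python
import numpy as np


def one_hot(labels, num_classes):
    """One-hot encode integer labels (illustrative stand-in, not Keras code)."""
    encoded = np.zeros((len(labels), num_classes), dtype='float32')
    # Set column `label` to 1 in each row
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded


# Toy batch: 3 fake grayscale "images" of 28 x 28, all pixels at 255
images = np.full((3, 28, 28), 255, dtype='uint8')
labels = np.array([5, 0, 9])

num_pixels = images.shape[1] * images.shape[2]            # 28 * 28
flat = images.reshape(images.shape[0], num_pixels).astype('float32')
flat = flat / 255                                         # scale to 0..1
targets = one_hot(labels, num_classes=10)

print(flat.shape, flat.max(), targets.shape)
print(targets[0])
```

Each row of `targets` has a single 1 at the index of its label, which is the target format that the `categorical_crossentropy` loss expects.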
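III. Handwritten digit recognition with a convolutional neural network

| Input layer | Convolutional layer | Pooling layer | Dropout layer | Flatten layer | Fully connected layer | Output layer |
| --- | --- | --- | --- | --- | --- | --- |
| 1 x 28 x 28 input | 32 maps, 5 x 5 | 2 x 2 | 20% | | 128 neurons | 10 neurons |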
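To see how these layer settings determine the tensor sizes, the sketch below traces the shapes by hand in plain Python. It assumes the Keras default settings for these layers: a convolution with no padding and stride 1, and a 2 x 2 max pooling whose stride equals the pool size. The helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Output width/height of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1


def pool_out(size, pool):
    """Output width/height of non-overlapping pooling (stride == pool size)."""
    return size // pool


side = conv_out(28, 5)             # 28 -> 24 after the 5 x 5 convolution
side = pool_out(side, 2)           # 24 -> 12 after 2 x 2 max pooling
flat = 32 * side * side            # 32 feature maps flattened into one vector

conv_params = 5 * 5 * 1 * 32 + 32      # 32 kernels over 1 input channel + biases
dense_params = flat * 128 + 128        # fully connected layer with 128 units
out_params = 128 * 10 + 10             # softmax output layer
total = conv_params + dense_params + out_params

print(side, flat, conv_params, dense_params, out_params, total)
```

Dropout and Flatten add no trainable parameters, so under these assumptions nearly all of the weights belong to the 128-unit fully connected layer.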
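Flatten layer: the Flatten layer "flattens" its input, turning a multi-dimensional input into a one-dimensional one. It is commonly used in the transition from convolutional layers to fully connected layers. For example:

| Input size | ->  | Output size |
| --- | --- | --- |
| 32 x 32 x 3 | Flatten -> | 3072 |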
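The same size bookkeeping can be reproduced with a NumPy reshape. This is only an illustration of the idea, assuming a made-up batch of two arrays of shape 32 x 32 x 3; it is not how Keras implements the `Flatten` layer internally.

```python
import numpy as np

# A batch of 2 arrays, each 32 x 32 x 3 (made-up data)
batch = np.arange(2 * 32 * 32 * 3, dtype='float32').reshape(2, 32, 32, 3)

# Keep the batch axis and collapse all remaining axes into one
flattened = batch.reshape(batch.shape[0], -1)

print(batch.shape, '->', flattened.shape)
```

The batch axis is preserved, so each sample becomes one vector of 32 * 32 * 3 = 3072 values.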
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.utils import np_utils
from keras import backend

# Use the (channels, height, width) layout for image tensors
backend.set_image_data_format('channels_first')

def loadData(path="mnist.npz"):
    # Load the MNIST arrays from a local .npz archive
    f = np.load(path)
    x_train, y_train = f['x_train'], f['y_train']
    x_test, y_test = f['x_test'], f['y_test']
    f.close()
    return (x_train, y_train), (x_test, y_test)

(x_train, y_train), (x_validation, y_validation) = loadData()

# Seed NumPy's random number generator
seed = 7
np.random.seed(seed)

# Reshape to (samples, 1 channel, 28, 28)
x_train = x_train.reshape(x_train.shape[0], 1, 28, 28).astype('float32')
x_validation = x_validation.reshape(x_validation.shape[0], 1, 28, 28).astype('float32')

# Scale pixel values from 0-255 to 0-1
x_train = x_train / 255
x_validation = x_validation / 255

# One-hot encode the labels
y_train = np_utils.to_categorical(y_train)
y_validation = np_utils.to_categorical(y_validation)

def create_model():
    # Conv -> max pooling -> dropout -> flatten -> dense -> softmax output
    model = Sequential()
    model.add(Conv2D(32, (5, 5), input_shape=(1, 28, 28), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(units=128, activation='relu'))
    model.add(Dense(units=10, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = create_model()
model.fit(x_train, y_train, epochs=10, batch_size=200, verbose=2)

# Evaluate on the validation set; score[1] is the accuracy
score = model.evaluate(x_validation, y_validation, verbose=0)
print('CNN_Small: %.2f%%' % (score[1]*100))
Epoch 1/10
- 165s - loss: 0.2226 - acc: 0.9367
Epoch 2/10
- 163s - loss: 0.0713 - acc: 0.9785
Epoch 3/10
- 165s - loss: 0.0512 - acc: 0.9841
Epoch 4/10
- 165s - loss: 0.0391 - acc: 0.9880
Epoch 5/10
- 166s - loss: 0.0325 - acc: 0.9900
Epoch 6/10
- 162s - loss: 0.0268 - acc: 0.9917
Epoch 7/10
- 164s - loss: 0.0221 - acc: 0.9928
Epoch 8/10
- 161s - loss: 0.0190 - acc: 0.9943
Epoch 9/10
- 162s - loss: 0.0156 - acc: 0.9950
Epoch 10/10
- 162s - loss: 0.0143 - acc: 0.9959
CNN_Small: 98.87%
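For a softmax classifier with one-hot targets, the accuracy figure is the fraction of samples whose most probable class matches the label. A minimal sketch of that calculation in NumPy, on made-up softmax outputs rather than real model predictions:

```python
import numpy as np

# Made-up softmax outputs for 4 samples over 3 classes (each row sums to 1)
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
# One-hot targets in the same format that to_categorical produces
targets = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
])

predicted = probs.argmax(axis=1)       # most probable class per sample
expected = targets.argmax(axis=1)      # recover the integer labels
accuracy = (predicted == expected).mean()

print(predicted, accuracy)
```

In this toy example three of the four predictions match, so the accuracy is 0.75.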