1. The LeNet Network Architecture
In what follows, Cx denotes a convolutional layer, Sx a sub-sampling layer, and Fx a fully-connected layer.
- C1: a convolutional layer with 6 feature maps and 5×5 convolution kernels.
- S2: a sub-sampling layer with pooling size (2, 2) and stride (2, 2), but it differs from today's max pooling: LeNet sums the four inputs in each 2×2 neighborhood, multiplies the sum by a trainable coefficient, and adds a trainable bias, so each of the 6 feature maps contributes 2 parameters, for 12 trainable parameters in total. In other words, modern max pooling has no parameters, whereas LeNet's sub-sampling is trainable (a TensorFlow sketch appears at the end of this section).
- C3: a convolutional layer with 16 feature maps and 5×5 kernels. C3 has 1,516 trainable parameters and 151,600 connections (each C3 map connects only to a subset of S2's maps, per the paper's connection table).
- S4: same as S2; S4 has 32 trainable parameters and 2,000 connections.
- C5: a convolutional layer with 120 feature maps and 5×5 kernels. Since the feature maps of S4 are themselves 5×5, C5's output is 1×1, so C5 is in effect fully connected to S4; it is still labeled a convolutional layer only because a larger input would produce feature maps bigger than 1×1.
- F6: a fully-connected layer with 84 neurons (the reason for this number comes from the design of the output layer, explained below), fully connected to C5. It has 10,164 trainable parameters (84 × (120 + 1)).
- Output: a layer of Gaussian connections. A fully-connected layer would take the dot product of F6's output with a weight vector, add a bias, and pass the result through a sigmoid unit to produce a state; a Gaussian connection instead computes

$$y_i = \sum_{j=1}^{n}(x_j - w_{ji})^2 \hspace{1.0cm} i \in \{0, 1, \cdots, 9\}$$

where, for this network, j ∈ {0, 1, ⋯, 83}, i.e. each unit takes all n = 84 outputs of F6. In other words, the output layer computes Euclidean RBF distances: it consists of Euclidean Radial Basis Function (RBF) units, one per class, each with 84 inputs. Each output RBF unit computes the Euclidean distance between its input vector and its parameter vector; the further the input is from the parameter vector, the larger the RBF output. An RBF output can therefore be understood as a penalty term measuring how well the input pattern matches a model of the class associated with that unit. At the same time, the RBF parameter vectors play the role of target vectors for layer F6; their components are +1 or -1, which also keeps the F6 sigmoids away from saturation (sketched at the end of this section).
Loss function: MLE (maximum likelihood estimation), which for this network amounts to MSE (minimum mean squared error).
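As a minimal sketch of S2's trainable sub-sampling in TensorFlow 1.x (the function name trainable_subsample and the coefficient/bias initializations are illustrative assumptions, not from the paper): each 2×2 block is summed, scaled by a per-map trainable coefficient, shifted by a per-map trainable bias, and squashed with a sigmoid.

import tensorflow as tf

def trainable_subsample(feature_maps, num_maps):
    # Sum over each 2x2 neighborhood: average pooling times 4.
    pooled = tf.nn.avg_pool(feature_maps, ksize=[1, 2, 2, 1],
                            strides=[1, 2, 2, 1], padding="VALID") * 4.0
    # One trainable coefficient and one trainable bias per feature map:
    # 2 parameters per map, so 6 maps give the 12 parameters quoted above.
    coef = tf.Variable(tf.ones([num_maps]), name="subsample_coef")
    bias = tf.Variable(tf.zeros([num_maps]), name="subsample_bias")
    return tf.sigmoid(pooled * coef + bias)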
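And a minimal sketch of the Gaussian/RBF output layer (using the same tensorflow import), with the parameter vectors initialized randomly for illustration; the paper instead fixes them to ±1 bitmaps of stylized digit images, and rbf_output is an assumed name:

def rbf_output(f6, num_classes=10, num_inputs=84):
    # w[i] is the parameter (template) vector for class i.
    w = tf.Variable(tf.truncated_normal([num_classes, num_inputs], stddev=0.1),
                    name="rbf_w")
    # y_i = sum_j (x_j - w_ji)^2: squared Euclidean distance from the F6
    # output to each class template. Shapes: f6 is (batch, 84), w broadcasts
    # to (batch, 10, 84), and the result is (batch, 10).
    diff = tf.expand_dims(f6, 1) - w
    return tf.reduce_sum(tf.square(diff), axis=2)  # smaller = better match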
2. A TensorFlow Implementation of LeNet
The implementation below stays close to the original LeNet-5:
2.1 Code
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from tqdm import tqdm

mnist = input_data.read_data_sets("MNIST_data", one_hot=True)

batch_size = 128
epochs = 10

with tf.variable_scope("input"):
    x = tf.placeholder(shape=(None, 784), dtype=tf.float32, name="input_x")
    y = tf.placeholder(shape=(None, 10), dtype=tf.float32, name="input_y")

def lenet(input):
    with tf.variable_scope("reshape"):
        # Reshape the flat 784-vector into a 28x28 single-channel image.
        input = tf.reshape(input, [-1, 28, 28, 1], name="reshape_input_x")
    with tf.variable_scope("conv_1"):
        # C1: 6 feature maps, 5x5 kernels. SAME padding keeps 28x28
        # (the paper pads MNIST to 32x32 instead).
        weights1 = tf.Variable(tf.truncated_normal(shape=[5, 5, 1, 6], mean=0, stddev=0.1), name="weights1")
        bias1 = tf.Variable(tf.truncated_normal(shape=[6], mean=0, stddev=0.1), name="bias1")
        c1 = tf.nn.conv2d(input=input, filter=weights1, strides=[1, 1, 1, 1], padding="SAME") + bias1
        # S2: 2x2 pooling with stride 2 -> 14x14. Max pooling stands in
        # for the paper's trainable sub-sampling.
        s2 = tf.nn.max_pool(c1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="VALID")
    with tf.variable_scope("conv_2"):
        # C3: 16 feature maps, 5x5 kernels, VALID padding -> 10x10.
        weights2 = tf.Variable(tf.truncated_normal(shape=[5, 5, 6, 16], mean=0, stddev=0.1), name="weights2")
        bias2 = tf.Variable(tf.truncated_normal(shape=[16], mean=0, stddev=0.1), name="bias2")
        c3 = tf.nn.conv2d(input=s2, filter=weights2, strides=[1, 1, 1, 1], padding="VALID") + bias2
        # S4: 2x2 pooling with stride 2 -> 5x5.
        s4 = tf.nn.max_pool(c3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="VALID")
    with tf.variable_scope("conv_3"):
        # C5: 120 feature maps; the 5x5 kernel covers all of S4, so the
        # output is 1x1 and the layer is effectively fully connected.
        weights3 = tf.Variable(tf.truncated_normal(shape=[5, 5, 16, 120], mean=0, stddev=0.1), name="weights3")
        bias3 = tf.Variable(tf.truncated_normal(shape=[120], mean=0, stddev=0.1), name="bias3")
        c5 = tf.nn.conv2d(input=s4, filter=weights3, strides=[1, 1, 1, 1], padding="VALID") + bias3
        c5_shape_li = c5.get_shape().as_list()
    with tf.variable_scope("fc_1"):
        # F6: fully-connected layer with 84 units.
        weights4 = tf.Variable(tf.truncated_normal(shape=[120, 84], mean=0, stddev=0.1), name="weights4")
        bias4 = tf.Variable(tf.truncated_normal(shape=[84], mean=0, stddev=0.1), name="bias4")
        with tf.variable_scope("flatten"):
            o5_flatten = tf.reshape(tensor=c5, shape=[-1, c5_shape_li[1] * c5_shape_li[2] * c5_shape_li[3]], name="flatten")
        f6 = tf.matmul(o5_flatten, weights4) + bias4
    with tf.variable_scope("output"):
        # Output: a plain 10-way linear layer in place of the paper's
        # Euclidean RBF units.
        weights5 = tf.Variable(tf.truncated_normal(shape=[84, 10], mean=0, stddev=0.1), name="weights5")
        bias5 = tf.Variable(tf.truncated_normal(shape=[10], mean=0, stddev=0.1), name="bias5")
        output = tf.matmul(f6, weights5) + bias5
    return output

predict = lenet(x)

with tf.variable_scope("loss"):
    # Softmax cross-entropy replaces the paper's MSE-over-RBF loss.
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=predict, labels=y))
with tf.variable_scope("train"):
    optimizer = tf.train.AdamOptimizer(learning_rate=0.0001).minimize(loss)

correct_train = tf.equal(tf.argmax(predict, 1), tf.argmax(y, 1))
accuracy_train = tf.reduce_mean(tf.cast(correct_train, "float"))

with tf.Session() as sess:
    writer = tf.summary.FileWriter("../log/", sess.graph)
    sess.run(tf.global_variables_initializer())
    for epoch in tqdm(range(epochs)):
        epoch_loss = 0
        for _ in range(int(mnist.train.num_examples / batch_size)):
            epoch_x, epoch_y = mnist.train.next_batch(batch_size)
            _, c = sess.run([optimizer, loss], feed_dict={x: epoch_x, y: epoch_y})
            epoch_loss += c
        print("accuracy", sess.run(accuracy_train, feed_dict={x: mnist.test.images, y: mnist.test.labels}),
              'loss:', epoch_loss)
    correct = tf.equal(tf.argmax(predict, -1), tf.argmax(y, -1))
    accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
    print('Accuracy:', accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
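Note how this implementation departs from the paper, as described in section 1: S2 and S4 use parameter-free max pooling rather than trainable sub-sampling, the Euclidean RBF output layer is replaced by a plain linear layer, and the MSE-style loss is replaced by softmax cross-entropy trained with Adam.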
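To classify a single image with the trained graph, a snippet like the following could be appended inside the same tf.Session() block after training; this usage example is an assumption, relying only on the x placeholder and predict tensor defined above:

sample = mnist.test.images[0:1]                      # shape (1, 784)
logits = sess.run(predict, feed_dict={x: sample})    # raw class scores
print("predicted digit:", logits.argmax(axis=1)[0])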
2.2 Architecture Diagram:
[Link](https://blog.csdn.net/u012897374/article/details/78575594)