Getting Familiar with CNNs: The Basic CNN Structure


A first look at a CNN structure implemented in TensorFlow.

The structure below contains an embedding layer, three convolution + max-pooling layers, a dropout layer, and an output layer:

import tensorflow as tf


def neural_network():
    # embedding layer
    with tf.device('/cpu:0'), tf.name_scope("embedding"):
        embedding_size = 128
        W = tf.Variable(tf.random_uniform([input_size, embedding_size], -1.0, 1.0))
        embedded_chars = tf.nn.embedding_lookup(W, X)
        embedded_chars_expanded = tf.expand_dims(embedded_chars, -1)
    # convolution + maxpool layer
    num_filters = 128
    filter_sizes = [3, 4, 5]
    pooled_outputs = []
    for i, filter_size in enumerate(filter_sizes):
        with tf.name_scope("conv-maxpool-%s" % filter_size):
            filter_shape = [filter_size, embedding_size, 1, num_filters]
            W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1))
            b = tf.Variable(tf.constant(0.1, shape=[num_filters]))
            conv = tf.nn.conv2d(embedded_chars_expanded, W, strides=[1, 1, 1, 1], padding="VALID")
            h = tf.nn.relu(tf.nn.bias_add(conv, b))
            pooled = tf.nn.max_pool(h, ksize=[1, input_size - filter_size + 1, 1, 1], strides=[1, 1, 1, 1],
                                    padding='VALID')
            pooled_outputs.append(pooled)

    # concatenate the pooled features from all filter sizes and flatten them
    num_filters_total = num_filters * len(filter_sizes)
    h_pool = tf.concat(pooled_outputs, 3)
    h_pool_flat = tf.reshape(h_pool, [-1, num_filters_total])
    # dropout
    with tf.name_scope("dropout"):
        h_drop = tf.nn.dropout(h_pool_flat, dropout_keep_prob)
    # output
    with tf.name_scope("output"):
        W = tf.get_variable("W", shape=[num_filters_total, num_classes],
                            initializer=tf.contrib.layers.xavier_initializer())
        b = tf.Variable(tf.constant(0.1, shape=[num_classes]))
        output = tf.nn.xw_plus_b(h_drop, W, b)

    return output
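
The snippet above reads X, input_size, num_classes, and dropout_keep_prob as module-level names. A minimal sketch of how they might be defined and wired to a training objective is shown below; the concrete sizes are assumptions for illustration, not values from the original code.

# assumed placeholder definitions and training objective for the network above
input_size = 56                                   # assumed padded sequence length
num_classes = 2                                   # assumed number of target classes
X = tf.placeholder(tf.int32, [None, input_size], name="input_x")
Y = tf.placeholder(tf.float32, [None, num_classes], name="input_y")
dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob")

logits = neural_network()
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)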

Embedding layer

This comes from word embeddings in NLP and is itself mainly used for natural language processing. An embedding layer converts positive integers (indices) into dense vectors of a fixed size; put plainly, each word in the vocabulary is mapped, according to its index, to a particular vector that represents that word. This addresses the following points:

1. One-hot encoding is too sparse and inefficient, and it treats words as unrelated even though relationships between them actually exist.
2. The embedding vectors themselves are learnable parameters.
3. Word embedding learns a distributed representation automatically from the data.

  • Input / output
    Input shape
    A 2D tensor of shape (number of samples, sequence length);
    Output shape
    A 3D tensor of shape (number of samples, sequence length, embedding dimension)
    For example: [[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]]

How is the mapping learned? See word vectors, word2vec, and related techniques.
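
As a minimal sketch of the index-to-vector lookup itself (the vocabulary size of 30 and embedding dimension of 2 are assumed toy values, and the vectors here are random rather than trained):

import tensorflow as tf

vocab_size, embedding_dim = 30, 2                     # assumed toy sizes
ids = tf.constant([[4], [20]])                        # 2D input: (samples, sequence length)
embeddings = tf.Variable(tf.random_uniform([vocab_size, embedding_dim], -1.0, 1.0))
vectors = tf.nn.embedding_lookup(embeddings, ids)     # 3D output: (samples, sequence length, embedding_dim)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(vectors).shape)                    # (2, 1, 2)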

Convolution and pooling layers

This is where the W*x + b computation is defined; it involves computing and setting the stride and padding.
The pooling layer defines how the resulting feature maps are downsampled.
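
As a rough sketch of the shape bookkeeping for one conv-maxpool branch in the first snippet (input_size = 56 is an assumed sequence length): with 'VALID' padding and stride 1, the filter slides over every full window, and the pooling window then collapses all of those positions into a single value per filter.

# assumed sizes, for illustration only
input_size, embedding_size = 56, 128
filter_size, num_filters = 3, 128

# conv2d with padding='VALID' and stride 1 over an (input_size, embedding_size) "image":
conv_len = input_size - filter_size + 1     # 54 valid positions per filter
# (with padding='SAME' the output length would instead be ceil(input_size / stride))
# max_pool with ksize=[1, conv_len, 1, 1] reduces those 54 values to 1 per filter,
# so the branch outputs a tensor of shape (batch, 1, 1, num_filters)
print(conv_len)                             # 54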

Dropout layer

Dropout randomly disables a certain proportion of the connections in the network. Its advantage is that many weak models combined can beat a single strong one; the idea is borrowed from bagging.
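
A minimal usage sketch (the keep probability of 0.5 is an assumed value; at evaluation time it is typically fed as 1.0 so that nothing is dropped):

import tensorflow as tf

keep_prob = tf.placeholder(tf.float32)               # e.g. 0.5 while training, 1.0 at evaluation
activations = tf.ones([4, 8])                        # stand-in for the layer being regularized
dropped = tf.nn.dropout(activations, keep_prob)      # surviving units are scaled by 1/keep_prob

with tf.Session() as sess:
    print(sess.run(dropped, feed_dict={keep_prob: 0.5}))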

Below are two different examples that illustrate how a CNN structure can be implemented:

  • The plain-TensorFlow implementation:

# define the neural network to be trained (n_output_layer, the number of output classes, is assumed to be defined elsewhere)
def convolutional_neural_network(data):
    weights = {'w_conv1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
               'w_conv2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
               'w_fc': tf.Variable(tf.random_normal([7 * 7 * 64, 1024])),
               'out': tf.Variable(tf.random_normal([1024, n_output_layer]))}

    biases = {'b_conv1': tf.Variable(tf.random_normal([32])),
              'b_conv2': tf.Variable(tf.random_normal([64])),
              'b_fc': tf.Variable(tf.random_normal([1024])),
              'out': tf.Variable(tf.random_normal([n_output_layer]))}

    data = tf.reshape(data, [-1, 28, 28, 1])

    conv1 = tf.nn.relu(
        tf.add(tf.nn.conv2d(data, weights['w_conv1'], strides=[1, 1, 1, 1], padding='SAME'), biases['b_conv1']))
    conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    conv2 = tf.nn.relu(
        tf.add(tf.nn.conv2d(conv1, weights['w_conv2'], strides=[1, 1, 1, 1], padding='SAME'), biases['b_conv2']))
    conv2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    fc = tf.reshape(conv2, [-1, 7 * 7 * 64])
    fc = tf.nn.relu(tf.add(tf.matmul(fc, weights['w_fc']), biases['b_fc']))

    # dropout: randomly disable some "neurons"
    # fc = tf.nn.dropout(fc, 0.8)

    output = tf.add(tf.matmul(fc, weights['out']), biases['out'])
    return output

  • The tflearn implementation: tflearn is similar to Keras, and describes the network structure very concisely.
# define the neural network model
conv_net = input_data(shape=[None, 28, 28, 1], name='input')
conv_net = conv_2d(conv_net, 32, 2, activation='relu')
conv_net = max_pool_2d(conv_net, 2)
conv_net = conv_2d(conv_net, 64, 2, activation='relu')
conv_net = max_pool_2d(conv_net, 2)
conv_net = fully_connected(conv_net, 1024, activation='relu')
conv_net = dropout(conv_net, 0.8)
conv_net = fully_connected(conv_net, 10, activation='softmax')
conv_net = regression(conv_net, optimizer='adam', loss='categorical_crossentropy', name='output')

model = tflearn.DNN(conv_net)

# train
model.fit({'input':train_x}, {'output':train_y}, n_epoch=13,
          validation_set=({'input':test_x}, {'output':test_y}),
          snapshot_step=300, show_metric=True, run_id='mnist')

References
https://www.zhihu.com/question/45027109 (a detailed discussion of word embedding)
https://fuhailin.github.io/Embedding/
https://blog.csdn.net/jiangpeng59/article/details/77533309

Reprinted from blog.csdn.net/u012384285/article/details/90675819