Figure 1: the RNN data transformation process
With a batch size of 5, each step processes a 5*28 slice of the data, over 28 time steps in total. The result we want is the output of the final step.
Figure 2: the internal computation of the RNN
At each step the computed result _LSTM_O serves both as one output _O and as one of the inputs to the next step; this is what gives the RNN its memory.
Functions used:
1. tf.transpose(input, [dimension_1, dimension_2, ..., dimension_n]):
Permutes the dimensions of the input tensor; for a 2-D tensor this is an ordinary transpose. Each dimension_i is an integer index, so a 3-D tensor uses 0, 1, and 2, and each position in the list says which original dimension moves there. For example, [2, 1, 0] swaps the first and third dimensions of the input tensor.
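The same permutation semantics can be checked with NumPy, whose np.transpose takes an identical perm argument (a sketch of the axis shuffle only, not TensorFlow itself):

```python
import numpy as np

# A batch shaped [batchsize, nsteps, diminput], as in the MNIST code below.
a = np.zeros((16, 28, 28))

# perm=[1, 0, 2] swaps the first two axes: position i of the list names
# which old axis supplies new axis i.
b = np.transpose(a, (1, 0, 2))
print(b.shape)  # (28, 16, 28): now [nsteps, batchsize, diminput]

# perm=[2, 1, 0] swaps the first and third axes.
c = np.transpose(np.zeros((2, 3, 4)), (2, 1, 0))
print(c.shape)  # (4, 3, 2)
```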
2. tf.split(value, num_split, axis):
axis selects which dimension of the input tensor to cut along; 0 means cut along the 0th dimension. num_split is the number of pieces: with 2, the input is cut into 2 equal parts, returned as a list of tensors. (Note that in TensorFlow 1.x the argument order is tf.split(value, num_or_size_splits, axis), which is what the code below uses; older releases took tf.split(split_dim, num_split, value).)
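NumPy's np.split has the same semantics (array first, then the number of pieces, then the axis), so it can serve as a quick sanity check of what the split in step 4 of the code produces:

```python
import numpy as np

# A small stand-in for the [nsteps*batchsize, dimhidden] stack: 4 rows, 3 cols.
h = np.arange(12).reshape(4, 3)

# Split along axis 0 into 2 equal chunks -> a list of two [2, 3] arrays.
chunks = np.split(h, 2, axis=0)
print(len(chunks))      # 2
print(chunks[0].shape)  # (2, 3)
```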
3. tf.nn.rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0, state_is_tuple=True):
n_hidden is the number of hidden units. forget_bias is a constant added to the bias of the LSTM's forget gate: the larger it is, the less the cell forgets early in training (the default of 1.0 keeps the forget gate mostly open initially, while 0 lets the cell forget freely). state_is_tuple defaults to True, which is also what the official docs recommend: the returned state is a tuple.
The cell also provides a state-initialization method, zero_state(batch_size, dtype), with two parameters: batch_size is the number of samples per input batch, and dtype is the data type.
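With state_is_tuple=True, the state is a (c, h) pair of cell state and hidden state, and zero_state(batch_size, dtype) simply builds a zero tensor of shape [batch_size, n_hidden] for each. A NumPy sketch of that initial state (the function and names here are illustrative, not the TensorFlow implementation):

```python
import numpy as np

def zero_state(batch_size, n_hidden, dtype=np.float32):
    # Mimics what BasicLSTMCell.zero_state returns with state_is_tuple=True:
    # one cell state c and one hidden state h per sample, all zeros.
    c = np.zeros((batch_size, n_hidden), dtype=dtype)
    h = np.zeros((batch_size, n_hidden), dtype=dtype)
    return (c, h)

c, h = zero_state(batch_size=16, n_hidden=128)
print(c.shape, h.shape)  # (16, 128) (16, 128)
```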
On understanding the RNN cell (adapted from https://blog.csdn.net/mydear_11000/article/details/52414342):
The context box in the figure is one cell: it takes input(t) and context(t-1) as inputs and produces output(t). In a task like ours that stacks multiple RNN cells, the output of the current layer's cell also serves as the input to the next layer's cell, from which it follows that every cell's input and output must have the same shape.
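That recurrence can be sketched in NumPy with a vanilla tanh cell (simpler than an LSTM; the weight names W, U, b are made up for illustration). It shows why the per-step output has the same shape as the state, so it can be fed both to the next time step and to the next stacked layer:

```python
import numpy as np

batch, dim_in, dim_hid = 16, 28, 128
rng = np.random.default_rng(0)
W = rng.standard_normal((dim_in, dim_hid)) * 0.01   # input -> hidden weights
U = rng.standard_normal((dim_hid, dim_hid)) * 0.01  # hidden -> hidden weights
b = np.zeros(dim_hid)

def cell_step(x_t, h_prev):
    # output(t) = context(t) = f(input(t), context(t-1))
    return np.tanh(x_t @ W + h_prev @ U + b)

h = np.zeros((batch, dim_hid))       # initial context, all zeros
for t in range(28):                  # 28 time steps, one row of pixels each
    x_t = rng.standard_normal((batch, dim_in))
    h = cell_step(x_t, h)            # same shape every step
print(h.shape)  # (16, 128)
```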
The code is below; input_data was introduced in the first post of this series.
#!/usr/bin/env python
# -*- coding:utf-8 -*-
# @Time  : 2018/12/14 19:11
# @Author: little bear
# @File  : tf_RNN.py
import tensorflow as tf
import input_data
import numpy as np
import matplotlib.pyplot as plt

print("Packages imported")
mnist = input_data.read_data_sets("data/", one_hot=True)
trainimgs, trainlabels, testimgs, testlabels \
    = mnist.train.images, mnist.train.labels, mnist.test.images, mnist.test.labels
ntrain, ntest, dim, nclasses \
    = trainimgs.shape[0], testimgs.shape[0], trainimgs.shape[1], trainlabels.shape[1]
print("MNIST loaded")

diminput = 28
dimhidden = 128
dimoutput = nclasses
nsteps = 28
weights = {
    'hidden': tf.Variable(tf.random_normal([diminput, dimhidden])),
    'out': tf.Variable(tf.random_normal([dimhidden, dimoutput]))
}
biases = {
    'hidden': tf.Variable(tf.random_normal([dimhidden])),
    'out': tf.Variable(tf.random_normal([dimoutput]))
}

def _RNN(_X, _W, _b, _nsteps, _name):
    # 1. Permute input from [batchsize, nsteps, diminput] (e.g. 16*28*28)
    #    => [nsteps, batchsize, diminput]
    _X = tf.transpose(_X, [1, 0, 2])
    # 2. Reshape input to [nsteps*batchsize, diminput]
    _X = tf.reshape(_X, [-1, diminput])
    # 3. Input layer => Hidden layer
    _H = tf.matmul(_X, _W['hidden']) + _b['hidden']
    # 4. Split data into 'nsteps' chunks; the i-th chunk holds the i-th
    #    time step for the whole batch: a list of [batchsize, dimhidden]
    _Hs = tf.split(_H, _nsteps, 0)
    # 5. Get the LSTM's outputs (_LSTM_O) and final state (_LSTM_S).
    #    _Hs is a Python list of per-step tensors, so static_rnn is the
    #    right API here (dynamic_rnn expects a single [batch, steps, input]
    #    tensor). Only the last output is used for the prediction.
    with tf.variable_scope(_name, reuse=tf.AUTO_REUSE):
        lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=dimhidden, forget_bias=1.0)
        _LSTM_O, _LSTM_S = tf.nn.static_rnn(lstm_cell, _Hs, dtype=tf.float32)
    # 6. Output: project the last step's hidden state to the classes
    _O = tf.matmul(_LSTM_O[-1], _W['out']) + _b['out']
    # Return!
    return {
        'X': _X, 'H': _H, 'Hsplit': _Hs,
        'LSTM_O': _LSTM_O, 'LSTM_S': _LSTM_S, 'O': _O
    }

print("Network ready")
lr = 0.001
x = tf.placeholder("float", [None, nsteps, diminput])
y = tf.placeholder("float", [None, dimoutput])
myrnn = _RNN(x, weights, biases, nsteps, 'basic')
pred = myrnn['O']
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optm = tf.train.GradientDescentOptimizer(learning_rate=lr).minimize(cost)
accr = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1)), tf.float32))
init = tf.global_variables_initializer()
print("Network Ready!")

training_epochs = 5
batch_size = 16
display_step = 1
sess = tf.Session()
sess.run(init)
print("Start optimization")
for epoch in range(training_epochs):
    avg_cost = 0.
    # total_batch = int(mnist.train.num_examples / batch_size)
    total_batch = 100
    # Loop over all batches
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        batch_xs = batch_xs.reshape((batch_size, nsteps, diminput))
        # Fit training using batch data
        feeds = {x: batch_xs, y: batch_ys}
        sess.run(optm, feed_dict=feeds)
        # Compute average loss
        avg_cost += sess.run(cost, feed_dict=feeds) / total_batch
    # Display logs per epoch step
    if epoch % display_step == 0:
        print("Epoch: %03d/%03d cost: %.9f" % (epoch, training_epochs, avg_cost))
        feeds = {x: batch_xs, y: batch_ys}
        train_acc = sess.run(accr, feed_dict=feeds)
        print(" Training accuracy: %.3f" % (train_acc))
        testimgs = testimgs.reshape((ntest, nsteps, diminput))
        feeds = {x: testimgs, y: testlabels}
        test_acc = sess.run(accr, feed_dict=feeds)
        print(" Test accuracy: %.3f" % (test_acc))
print("Optimization Finished.")