Notes from studying the "TensorFlow Getting Started Tutorial"
1. Loading the dataset
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
# This module ships with TensorFlow 1.x only.
from tensorflow.examples.tutorials.mnist import input_data

# Download (if needed) and load MNIST into data/, with one-hot encoded labels.
mnist = input_data.read_data_sets('data/', one_hot=True)
print("Type: %s" % (type(mnist)))
print("Number of training examples: %d" % (mnist.train.num_examples))
print("Number of test examples: %d" % (mnist.test.num_examples))
Output:
Type: <class 'tensorflow.contrib.learn.python.learn.datasets.base.Datasets'>
Number of training examples: 55000
Number of test examples: 10000
Note: if loading MNIST fails, you can download the dataset yourself and place it in the data folder under the current path.
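When the automatic download fails, it can help to see which files are still missing before retrying. The following is a minimal sketch, assuming the loader expects the four standard MNIST archive names inside the data folder. The helper name missing_mnist_files is hypothetical and is not part of TensorFlow.

```python
import os

# Standard file names of the MNIST distribution (gzip-compressed IDX files).
MNIST_FILES = [
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
]


def missing_mnist_files(data_dir="data/"):
    """Hypothetical helper: list the MNIST archives not yet present in data_dir."""
    return [name for name in MNIST_FILES
            if not os.path.isfile(os.path.join(data_dir, name))]


missing = missing_mnist_files("data/")
if missing:
    print("Still missing from data/:", ", ".join(missing))
else:
    print("All four MNIST archives are present.")
```

If the list is empty, the loader should be able to read the local copies instead of downloading them again.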
2. Dataset specifications
trainimg = mnist.train.images
trainlabel = mnist.train.labels
testimg = mnist.test.images
testlabel = mnist.test.labels
# Each image is 28 * 28 * 1, stored as a flat row of 784 values.
print("Data type is %s" % (type(trainimg)))
print("Label type is %s" % (type(trainlabel)))
print("Shape of the training set: %s" % (trainimg.shape,))
print("Shape of the training labels: %s" % (trainlabel.shape,))
print("Shape of the test set: %s" % (testimg.shape,))
print("Shape of the test labels: %s" % (testlabel.shape,))
Output:
Data type is <class 'numpy.ndarray'>
Label type is <class 'numpy.ndarray'>
Shape of the training set: (55000, 784)
Shape of the training labels: (55000, 10)
Shape of the test set: (10000, 784)
Shape of the test labels: (10000, 10)
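The label arrays have 10 columns because the labels were loaded with one_hot=True: each row holds a single 1 at the index of the digit. The sketch below reproduces that encoding on three made-up labels with plain NumPy. The one_hot function here is an illustrative helper, not the routine TensorFlow uses internally.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Illustrative helper: turn integer class labels into one-hot rows."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float64)
    # Set exactly one entry per row: column index = class label.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


digits = np.array([3, 0, 9])           # made-up labels for illustration
encoded = one_hot(digits)
print(encoded.shape)                    # (3, 10)
print(np.argmax(encoded, axis=1))       # [3 0 9]
```

np.argmax along axis 1 recovers the original digit from a one-hot row.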
3. What the data looks like
nsample = 2
# Pick nsample random row indices from the training set.
randidx = np.random.randint(trainimg.shape[0], size=nsample)
for i in randidx:
    curr_img = np.reshape(trainimg[i, :], (28, 28))  # 28 by 28 matrix
    curr_label = np.argmax(trainlabel[i, :])  # one-hot row -> digit label
    plt.matshow(curr_img, cmap=plt.get_cmap('gray'))
    print(str(i) + "th training example, label is " + str(curr_label))
    plt.show()
Result:
48118th training example, label is 0
8268th training example, label is 0
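The reshape step above works because each image row stores its 28 * 28 pixels in row-major order. The following sketch checks this on random stand-in data rather than the real MNIST arrays. fake_images is a made-up placeholder for mnist.train.images, and the fixed seed is only there so the same rows are picked on every run.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed: same "random" rows each run

# Stand-in for mnist.train.images: 5 fake images of 784 values each.
fake_images = rng.random((5, 784))

# Pick 2 random row indices, as np.random.randint does in the notes above.
picked = rng.integers(0, fake_images.shape[0], size=2)
for i in picked:
    grid = fake_images[i].reshape(28, 28)  # flat vector -> 28 x 28 grid
    # Row-major layout: pixel (r, c) of the grid is element r * 28 + c of the row.
    assert grid[3, 7] == fake_images[i, 3 * 28 + 7]
    print(i, grid.shape)
```

Flattening the grid again with grid.reshape(-1) gives back the original 784-value row.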