Notes on 25 Large Datasets
Sources for the dataset list:
https://blog.csdn.net/Mbx8X9u/article/details/79849738
https://www.analyticsvidhya.com/blog/2018/03/comprehensive-collection-deep-learning-datasets/?spm=a2c4e.11153959.blogcont576274.69.16b330274pLaMG
1. MNIST dataset
Download: http://yann.lecun.com/exdb/mnist/
TensorFlow tutorial reference: http://wiki.jikexueyuan.com/project/tensorflow-zh/tutorials/mnist_download.html
GitHub code that downloads and extracts the data: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/tutorials/mnist
The training set has 60k 28x28 images (55k used for training, 5k for validation); the test set has 10k 28x28 images.
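The 55k/5k split can be reproduced by slicing the 60k training examples. The following is a minimal sketch, assuming the validation examples are taken from the front of the training set; the function name and the default size are illustrative and do not come from the dataset files.

```python
def split_train_validation(examples, validation_size=5000):
    """Split a sequence into (train, validation) parts.

    Assumption: the first `validation_size` examples become the
    validation set and the remainder is used for training.
    """
    if not 0 <= validation_size <= len(examples):
        raise ValueError("validation_size must be between 0 and len(examples)")
    validation = examples[:validation_size]
    train = examples[validation_size:]
    return train, validation
```

Applied to the 60k training examples, this leaves 55k for training and 5k for validation.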
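The files use the storage format below. Integers are stored MSB first (big-endian), the byte order used by non-Intel processors: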
TRAINING SET LABEL FILE (train-labels-idx1-ubyte):
[offset] [type]          [value]           [description]
0000     32 bit integer  0x00000801(2049)  magic number (MSB first)
0004     32 bit integer  60000             number of items
0008     unsigned byte   ??                label
0009     unsigned byte   ??                label
........
xxxx     unsigned byte   ??                label
The label values are 0 to 9.
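The label file layout above can be read with the standard `struct` module. This is a minimal sketch, not the TensorFlow loader: the function name `parse_idx1_labels` is hypothetical, and it expects the already decompressed file contents as bytes.

```python
import struct

LABEL_MAGIC = 0x00000801  # 2049, as listed in the format table


def parse_idx1_labels(raw):
    """Parse the bytes of an idx1-ubyte label file into a list of ints."""
    # Two 32-bit unsigned integers, MSB first (big-endian): magic, item count.
    magic, count = struct.unpack(">II", raw[:8])
    if magic != LABEL_MAGIC:
        raise ValueError("unexpected magic number: %d" % magic)
    body = raw[8:8 + count]
    if len(body) != count:
        raise ValueError("file is truncated")
    # Each label is a single unsigned byte.
    return list(body)
```

If the file is still gzip-compressed, the bytes can be obtained with `gzip.open(path, "rb").read()` before calling the function.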
TRAINING SET IMAGE FILE (train-images-idx3-ubyte):
[offset] [type]          [value]           [description]
0000     32 bit integer  0x00000803(2051)  magic number
0004     32 bit integer  60000             number of images
0008     32 bit integer  28                number of rows
0012     32 bit integer  28                number of columns
0016     unsigned byte   ??                pixel
0017     unsigned byte   ??                pixel
........
xxxx     unsigned byte   ??                pixel
Each pixel is a grayscale value: 0 means white and 255 means black.
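The image file can be decoded the same way as the label file. The sketch below assumes the pixels are stored row by row, one image after another, as described on the MNIST page; `parse_idx3_images` is a hypothetical name, and the input is the decompressed file contents as bytes.

```python
import struct

IMAGE_MAGIC = 0x00000803  # 2051, as listed in the format table


def parse_idx3_images(raw):
    """Parse the bytes of an idx3-ubyte image file.

    Returns a list of images; each image is a list of rows and each
    row is a list of pixel values in the range 0..255.
    """
    # Four 32-bit unsigned integers, MSB first: magic, count, rows, columns.
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError("unexpected magic number: %d" % magic)
    size = rows * cols
    body = raw[16:16 + count * size]
    if len(body) != count * size:
        raise ValueError("file is truncated")
    images = []
    for i in range(count):
        flat = body[i * size:(i + 1) * size]
        images.append([list(flat[r * cols:(r + 1) * cols]) for r in range(rows)])
    return images
```

Plain nested lists keep the example free of dependencies; for real training code an array library would normally be used instead.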
2. COCO dataset
Dataset information: http://cocodataset.org/#download
An API is provided with the dataset.
115k training images / 5k validation images, 40k test images.
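The bundled API is the usual way to read the annotations. The sketch below does not use that API; it only illustrates, under the assumption that an instance annotation file has already been loaded with `json.load`, how the `images`, `annotations`, and `categories` lists of such a file relate to each other. Both helper names are hypothetical.

```python
from collections import defaultdict


def index_annotations(coco):
    """Group the annotation records of a COCO-style dict by image id."""
    by_image = defaultdict(list)
    for ann in coco["annotations"]:
        by_image[ann["image_id"]].append(ann)
    return dict(by_image)


def category_names(coco):
    """Map each category id to its name."""
    return {cat["id"]: cat["name"] for cat in coco["categories"]}
```

Each annotation's `bbox` field holds `[x, y, width, height]` in pixels, so a lookup by image id gives all boxes for that image.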
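3. ImageNet dataset
Training, validation, and test sets: images are stored in one folder per class.
Bounding-box annotations are provided for the training set and the validation set.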
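With one folder per class, labels can be derived from the directory names. This is a minimal sketch assuming a layout of `root/<class_folder>/<image_file>`; the function name and the choice to number classes in sorted order are illustrative assumptions.

```python
import os


def list_labeled_images(root):
    """Return ([(image_path, class_index), ...], {class_name: class_index})."""
    # Every sub-directory of root is treated as one class.
    classes = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    class_to_index = {name: i for i, name in enumerate(classes)}
    samples = []
    for name in classes:
        folder = os.path.join(root, name)
        for fname in sorted(os.listdir(folder)):
            path = os.path.join(folder, fname)
            if os.path.isfile(path):
                samples.append((path, class_to_index[name]))
    return samples, class_to_index
```

The returned mapping should be saved alongside a trained model so that predicted indices can be turned back into class folder names.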
4. Street View House Numbers (SVHN) dataset
Data source: http://ufldl.stanford.edu/housenumbers/
5. CIFAR-10 dataset download
CIFAR-10 (Python version): http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
CIFAR-100 (Python version): http://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz
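The Python archives contain pickled batch files. The sketch below assumes the layout described on the CIFAR page: each batch is a dict with a `data` entry of 3072-value rows (1024 red, then 1024 green, then 1024 blue values, each plane stored row by row as 32x32) and a `labels` entry (`fine_labels` in CIFAR-100). The function names are hypothetical, and unpickling the real files additionally requires NumPy to be installed.

```python
import pickle


def load_batch(fileobj):
    """Unpickle one batch file object and return (data_rows, labels)."""
    # encoding="bytes" is needed because the files were written by Python 2,
    # so the dict keys come back as bytes.
    batch = pickle.load(fileobj, encoding="bytes")
    labels = batch.get(b"labels", batch.get(b"fine_labels"))
    return batch[b"data"], labels


def row_to_rgb_planes(row, side=32):
    """Split one 3072-value row into [red, green, blue] planes of side x side."""
    n = side * side
    if len(row) != 3 * n:
        raise ValueError("expected %d values, got %d" % (3 * n, len(row)))
    planes = []
    for c in range(3):
        channel = row[c * n:(c + 1) * n]
        planes.append([list(channel[r * side:(r + 1) * side]) for r in range(side)])
    return planes
```

A batch file extracted from the archive would be opened in binary mode, for example `with open(path, "rb") as f: data, labels = load_batch(f)`.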