Problem: the images I need to store have different sizes, so writing them to an h5 (HDF5) file directly fails with the following error:
TypeError: Object dtype dtype('O') has no native HDF5 equivalent
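Cause:
A single h5 dataset cannot hold arrays of different shapes. Images with different shapes can only be kept together as a NumPy object array (dtype('O')), and that dtype has no HDF5 equivalent. There are two ways to deal with this: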
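To make the cause concrete, here is a minimal sketch that should reproduce the error. The image shapes, the dataset name 'images' and the file name 'demo.h5' are made up for illustration, and the file is kept in memory through h5py's 'core' driver so nothing is written to disk.

```python
import numpy as np
import h5py

# Two dummy "images" with different shapes (illustrative values only).
ragged = np.empty(2, dtype=object)
ragged[0] = np.zeros((4, 4, 3), dtype=np.uint8)
ragged[1] = np.zeros((2, 5, 3), dtype=np.uint8)

caught = None
# driver='core' with backing_store=False keeps the file purely in memory.
with h5py.File('demo.h5', 'w', driver='core', backing_store=False) as f:
    try:
        # An object array cannot be mapped to an HDF5 type.
        f.create_dataset('images', data=ragged)
    except TypeError as err:
        caught = err

print(type(caught).__name__, caught)
```

Each element on its own is a regular uint8 array and can be stored without trouble; only the object array that wraps them together is rejected.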
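1. Separate datasets:
Put arrays of the same shape into the same dataset, i.e. split the original data across several datasets.
PS: In my case I simply created one dataset per image, so there are as many datasets as images.
Code:
Writing the data: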
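import h5py

# self.root_dir and data_set come from the surrounding class (not shown)
f = h5py.File(self.root_dir + "/data/" + data_set + '_dataset.h5', 'w')
.......
# one dataset per image, named 'train0', 'train1', ...
for i in range(len(train_img_list)):
    f['train' + str(i)] = train_img_list[i]
f.close()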
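Reading the data:
fd = h5py.File(self.root_dir + "/data/" + data_set + '_dataset.h5', 'r')
# for each entry of filename, the first field goes to _roidb and the second is a float score
for k, i in enumerate(filename):
    self._roidb.append(i.split()[0])
    self.scores.append(float(i.split()[1]))
    # .value was removed in h5py 3.0; [()] reads the whole dataset
    self.img_data.append(fd['train' + str(k)][()])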
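The code above creates one dataset per image. The grouping by shape described at the start of this method could look roughly like the sketch below. The dummy images, the group names such as 'train_4x4x3' and the extra 'index' dataset (used to restore the original order) are my own assumptions for illustration, not part of the original code.

```python
import os
import tempfile
from collections import defaultdict

import numpy as np
import h5py

# Dummy images: two share a shape, one differs (illustrative values only).
images = [
    np.zeros((4, 4, 3), dtype=np.uint8),
    np.ones((2, 5, 3), dtype=np.uint8),
    np.full((4, 4, 3), 7, dtype=np.uint8),
]

# Bucket the images by shape and remember each image's original position.
buckets = defaultdict(list)
for idx, img in enumerate(images):
    buckets[img.shape].append((idx, img))

path = os.path.join(tempfile.mkdtemp(), 'grouped_dataset.h5')

# Write: one group per shape, holding a stacked array plus the original indices.
with h5py.File(path, 'w') as f:
    for shape, items in buckets.items():
        name = 'train_' + 'x'.join(str(d) for d in shape)
        grp = f.create_group(name)
        grp.create_dataset('images', data=np.stack([img for _, img in items]))
        grp.create_dataset('index', data=np.array([idx for idx, _ in items]))

# Read: walk over the groups and put every image back in its original slot.
restored = [None] * len(images)
with h5py.File(path, 'r') as f:
    group_names = sorted(f.keys())
    for name in group_names:
        imgs = f[name]['images'][()]
        idxs = f[name]['index'][()]
        for idx, img in zip(idxs, imgs):
            restored[int(idx)] = img
```

Compared with one dataset per image, this keeps the number of datasets equal to the number of distinct shapes, which may matter when there are many images but only a few sizes.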
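2. Single dataset:
Store each split as one dataset. Note that create_dataset(data=...) only works if the array is regular, i.e. all images in it already share one shape.
Writing the data: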
# Create a new file
f = h5py.File('data.h5', 'w')
f.create_dataset('X_train', data=Train_image)
f.create_dataset('y_train', data=Train_label)
f.create_dataset('X_test', data=Test_image)
f.create_dataset('y_test', data=Test_label)
f.close()
Reading the data:
import numpy as np

# Load hdf5 dataset
train_dataset = h5py.File('data.h5', 'r')
train_set_x_orig = np.array(train_dataset['X_train'][:])  # your train set features
train_set_y_orig = np.array(train_dataset['y_train'][:])  # your train set labels
test_set_x_orig = np.array(train_dataset['X_test'][:])  # your test set features
test_set_y_orig = np.array(train_dataset['y_test'][:])  # your test set labels
train_dataset.close()  # close the handle opened above
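If the images cannot be brought to one common shape, a single dataset can still hold them through h5py's variable-length dtype. This is a different technique from the code above, shown only as a hedged sketch: each image is flattened into a variable-length uint8 row, and its shape goes into a second dataset so the image can be rebuilt on read. The dataset names 'X_train_flat' and 'X_train_shape' and the dummy images are assumptions for illustration; h5py.vlen_dtype needs h5py 2.10 or later.

```python
import os
import tempfile

import numpy as np
import h5py

# Dummy images of different sizes (illustrative values only).
images = [
    np.random.randint(0, 256, size=(4, 6, 3), dtype=np.uint8),
    np.random.randint(0, 256, size=(5, 3, 3), dtype=np.uint8),
]

path = os.path.join(tempfile.mkdtemp(), 'vlen_demo.h5')

# Write: one variable-length row per flattened image, plus a table of shapes.
with h5py.File(path, 'w') as f:
    vlen_uint8 = h5py.vlen_dtype(np.dtype('uint8'))
    flat = f.create_dataset('X_train_flat', shape=(len(images),), dtype=vlen_uint8)
    shapes = f.create_dataset('X_train_shape', shape=(len(images), 3), dtype='int64')
    for i, img in enumerate(images):
        flat[i] = img.reshape(-1)   # 1-D view of the pixel data
        shapes[i] = img.shape       # (height, width, channels)

# Read: reshape every flat row back using its stored shape.
restored = []
with h5py.File(path, 'r') as f:
    for i in range(len(f['X_train_flat'])):
        shape = tuple(int(d) for d in f['X_train_shape'][i])
        restored.append(f['X_train_flat'][i].reshape(shape))
```

The trade-off is that the images are no longer sliceable as one N-dimensional array, so every item has to be reshaped when it is read.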
References:
[1] https://blog.csdn.net/qq_41120234/article/details/88543039