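Transfer Learning with Keras
Overview
When a dataset is small, training a network from scratch often gives unsatisfactory results. How can we do better? Building on the previous post, we use a pretrained network, that is, we apply transfer learning to improve classification accuracy.
There are two ways to use a pretrained network: feature extraction and fine-tuning.
Feature extraction
Feature extraction itself comes in two forms. The first runs the pretrained network over the data to extract features directly, then feeds those features into a new classifier that is trained from scratch.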
Here we use VGG16 as the pretrained model. First, instantiate VGG16:
from keras.applications import VGG16

image_size = 200
conv_base = VGG16(weights='imagenet',
                  include_top=False,
                  input_shape=(image_size, image_size, 3))
conv_base.summary()
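weights specifies the weight checkpoint used to initialize the model. include_top controls whether the densely connected classifier at the top of the network is loaded; for transfer learning, set it to False. input_shape is the shape of the input images; if it is omitted, the network accepts inputs of any size (here we do need to define it). conv_base.summary() prints the details of each layer. Pay most attention to the last layer, since it determines the shape of the output.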
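To see where the output shape of the last layer comes from, here is a minimal sketch in plain Python. It assumes the standard VGG16 layout: five blocks, each ending in a 2x2 max-pooling layer with stride 2, and 512 channels in the last block. The helper name vgg16_feature_shape is made up for illustration and is not part of Keras.

```python
def vgg16_feature_shape(image_size, num_pool_blocks=5, final_channels=512):
    """Estimate the output shape of VGG16 with include_top=False.

    Assumes 'same'-padded convolutions (spatial size unchanged) and
    2x2 max pooling with stride 2 at the end of each block, which
    halves the spatial size and rounds down.
    """
    size = image_size
    for _ in range(num_pool_blocks):
        size //= 2
    return (size, size, final_channels)


# Hypothetical input sizes, printed for comparison
for s in (150, 200, 224):
    print(s, vgg16_feature_shape(s))
```

Under these assumptions a 200x200 input shrinks as 200, 100, 50, 25, 12, 6, so each sample yields a (6, 6, 512) feature map. The value reported by conv_base.summary() remains the authoritative one.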
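Feature extraction without data augmentation
The first step is to extract features from the train, validation, and test sets. With VGG16 and 200x200 inputs, the last layer (block5_pool) has output shape (None, 6, 6, 512), so the features for each sample have shape (6, 6, 512).
# Per-sample feature shape, read from the model instead of being hard-coded
Last_Layer = list(conv_base.output_shape[1:])  # [6, 6, 512] for VGG16 at 200x200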
import os
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

base_dir = './cats_and_dogs_small'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')

# Only rescale pixel values to [0, 1]; no augmentation here
datagen = ImageDataGenerator(rescale=1./255)
batch_size = 20

def extract_features(directory, sample_count):
    # Preallocate arrays for the conv base outputs and the labels
    features = np.zeros(shape=(sample_count, Last_Layer[0], Last_Layer[1], Last_Layer[2]))
    labels = np.zeros(shape=(sample_count))
    generator = datagen.flow_from_directory(
        directory,
        target_size=(image_size, image_size),
        batch_size=batch_size,
        class_mode='binary')
    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = conv_base.predict(inputs_batch)
        features[i * batch_size : (i + 1) * batch_size] = features_batch
        labels[i * batch_size : (i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= sample_count:
            # Note that since generators yield data indefinitely in a loop,
            # we must `break` after every image has been seen once.
            break
    return features, labels

train_features, train_labels = extract_features(train_dir, 2000)
validation_features, validation_labels = extract_features(validation_dir, 1000)
test_features, test_labels = extract_features(test_dir, 1000)
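The collect-and-break pattern used in extract_features can be tried out without Keras or any image files. The sketch below is only an illustration: endless_batches is a hypothetical stand-in for the directory iterator, and transform stands in for conv_base.predict. Unlike the function above, it also trims the final batch, so it still works when sample_count is not a multiple of the batch size.

```python
import itertools
import numpy as np


def endless_batches(data, labels, batch_size):
    """Hypothetical stand-in for a Keras iterator: yields batches forever."""
    n = len(data)
    for start in itertools.cycle(range(0, n, batch_size)):
        yield data[start:start + batch_size], labels[start:start + batch_size]


def collect(generator, sample_count, feature_shape, transform):
    """Fill preallocated arrays batch by batch, then stop the endless generator."""
    features = np.zeros((sample_count,) + tuple(feature_shape))
    labels = np.zeros(sample_count)
    seen = 0
    for inputs_batch, labels_batch in generator:
        batch = transform(inputs_batch)
        # Never write past the end of the preallocated arrays
        take = min(len(batch), sample_count - seen)
        features[seen:seen + take] = batch[:take]
        labels[seen:seen + take] = labels_batch[:take]
        seen += take
        if seen >= sample_count:
            break  # the generator would otherwise loop indefinitely
    return features, labels


# Dummy data: 10 samples with 4 values each, batch size 4 (last batch has 2)
data = np.arange(40, dtype=float).reshape(10, 4)
targets = np.arange(10) % 2
feats, labs = collect(endless_batches(data, targets, 4), 10, (4,), lambda x: x * 2)
print(feats.shape, labs.shape)
```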
Because the features go straight into a densely connected classifier, they must be flattened first:
train_features = np.reshape(train_features, (2000, Last_Layer[0] * Last_Layer[1] * Last_Layer[2]))
validation_features = np.reshape(validation_features, (1000, Last_Layer[0] * Last_Layer[1] * Last_Layer[2]))
test_features = np.reshape(test_features, (1000, Last_Layer[0] * Last_Layer[1] * Last_Layer[2]))
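As a small illustration of the flattening step, the following sketch reshapes a few dummy samples. It assumes a per-sample feature shape of (6, 6, 512), which is what VGG16 should produce for 200x200 inputs; the sample count of 4 is arbitrary.

```python
import numpy as np

feature_shape = (6, 6, 512)                  # assumed conv base output per sample
features = np.zeros((4,) + feature_shape)    # 4 dummy samples

# -1 lets NumPy infer the flattened size, here 6 * 6 * 512
flat = features.reshape(len(features), -1)
print(flat.shape)
```

Using -1 avoids repeating the product of the three dimensions by hand, at the cost of making the expected size less explicit.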
Define your own classifier and train it:
from keras import models, layers, optimizers

model = models.Sequential()
model.add(layers.Dense(256, activation='relu',
                       input_dim=Last_Layer[0] * Last_Layer[1] * Last_Layer[2]))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation='sigmoid'))

# Newer Keras releases call the `lr` argument `learning_rate`
model.compile(optimizer=optimizers.RMSprop(lr=2e-5),
              loss='binary_crossentropy',
              metrics=['acc'])

history = model.fit(train_features, train_labels,
                    epochs=30,
                    batch_size=20,
                    validation_data=(validation_features, validation_labels))
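To get a feel for the size of this classifier, here is a rough parameter count in plain Python. It assumes a flattened input of 6 * 6 * 512 values (VGG16 features for 200x200 images); dense_params is a helper written for this sketch, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


input_dim = 6 * 6 * 512             # assumed flattened feature size
hidden = dense_params(input_dim, 256)
output = dense_params(256, 1)       # Dropout adds no parameters
total = hidden + output
print(hidden, output, total)
```

With only 2000 training samples and several million weights in the first Dense layer, the Dropout layer is a reasonable guard against overfitting.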
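Feature extraction with data augmentation
This approach extends the conv_base model with new layers and runs the whole model end to end on the input data. Because every image passes through the convolutional base on each pass, data augmentation can be used.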
base_dir = './cats_and_dogs_small'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.summary()
Be sure to freeze conv_base before compiling and training the model. To do so, set conv_base.trainable = False:
print('This is the number of trainable weights '
'before freezing the conv base:', len(model.trainable_weights))
conv_base.trainable = False
print('This is the number of trainable weights '
'after freezing the conv base:', len(model.trainable_weights))
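Note that len(model.trainable_weights) counts weight tensors, not individual parameters. A back-of-the-envelope sketch of what the two print statements should report, assuming VGG16's 13 convolutional layers and the two Dense layers added above, each contributing one kernel and one bias tensor:

```python
conv_layers = 13        # convolutional layers in VGG16 (pooling layers have no weights)
dense_layers = 2        # Dense(256) and Dense(1) added on top
tensors_per_layer = 2   # one kernel and one bias per layer

before_freezing = (conv_layers + dense_layers) * tensors_per_layer
after_freezing = dense_layers * tensors_per_layer
print(before_freezing, after_freezing)
```

If the numbers printed by the real model differ, the freeze probably did not take effect as intended.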
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# Note that the validation data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    # This is the target directory
    train_dir,
    # All images will be resized to 200x200
    target_size=(200, 200),
    batch_size=20,
    # Since we use binary_crossentropy loss, we need binary labels
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(200, 200),
    batch_size=20,
    class_mode='binary')
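Two of the generator options are easy to reproduce by hand, which helps to see what the augmentation does to the pixel array. The NumPy sketch below works on a random dummy image and covers only rescale and horizontal_flip; it does not try to imitate the rotation, shift, shear, or zoom options.

```python
import numpy as np

rng = np.random.default_rng(0)
# Dummy 200x200 RGB image with pixel values in [0, 255]
image = rng.integers(0, 256, size=(200, 200, 3)).astype("float64")

rescaled = image * (1.0 / 255)     # what rescale=1./255 does: map pixels to [0, 1]
flipped = rescaled[:, ::-1, :]     # horizontal flip: reverse the width axis

print(rescaled.min(), rescaled.max(), flipped.shape)
```

In the generator the flip is applied at random, so only some of the images in each batch are mirrored.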
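# Newer Keras releases call the `lr` argument `learning_rate`
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=2e-5),
              metrics=['acc'])

# In recent Keras versions, model.fit accepts generators directly
# and fit_generator is deprecated
history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50,
    verbose=2)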
model.save('cats_and_dogs_small_3.h5')
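The values steps_per_epoch=100 and validation_steps=50 follow from the dataset sizes and the batch size. A minimal sketch of that arithmetic, assuming 2000 training and 1000 validation images (the counts passed to extract_features earlier) and a batch size of 20; steps_for is a helper name invented for this example:

```python
import math


def steps_for(num_samples, batch_size):
    """Batches needed to see every sample once (the last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)


train_steps = steps_for(2000, 20)
val_steps = steps_for(1000, 20)
print(train_steps, val_steps)
```

If the batch size or the number of images changes, the step counts should be recomputed so that one epoch still covers the whole dataset once.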
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
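Fine-tuning the model
Fine-tuning complements feature extraction. Feature extraction freezes the entire pretrained model, whereas fine-tuning unfreezes its top few layers and trains them together with the new classifier.
The following code keeps every layer before block5_conv1 frozen and unfreezes all layers from block5_conv1 onward: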
conv_base.trainable = True

set_trainable = False
for layer in conv_base.layers:
    # Once block5_conv1 is reached, this layer and all later ones become trainable
    if layer.name == 'block5_conv1':
        set_trainable = True
    layer.trainable = set_trainable
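The flag logic of this loop can be checked without loading any weights. The sketch below replaces the real layers with simple stand-in objects; the list of layer names is an assumption based on the standard VGG16 naming, and the name of the input layer in particular may differ in a real model.

```python
from types import SimpleNamespace

# Assumed VGG16 layer names (include_top=False); stand-ins for real Keras layers
layer_names = [
    "input_1",
    "block1_conv1", "block1_conv2", "block1_pool",
    "block2_conv1", "block2_conv2", "block2_pool",
    "block3_conv1", "block3_conv2", "block3_conv3", "block3_pool",
    "block4_conv1", "block4_conv2", "block4_conv3", "block4_pool",
    "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool",
]
fake_layers = [SimpleNamespace(name=n, trainable=True) for n in layer_names]

set_trainable = False
for layer in fake_layers:
    if layer.name == "block5_conv1":
        set_trainable = True
    layer.trainable = set_trainable

trainable_names = [layer.name for layer in fake_layers if layer.trainable]
print(trainable_names)
```

Only the block5 layers end up trainable. In Keras, a change to the trainable flags takes effect only after the model is compiled again.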