Training VGG16 from scratch doesn't improve accuracy in Keras
Question
I'm trying to train VGG16 models both with transfer learning and from scratch. I have a dataset of 7,000 images per category across 4 different categories. I came up with the transfer-learning code without a problem; however, the same program adapted for training from scratch does not seem to be working.
Creating the model for transfer learning:
base_model = apps.VGG16(
    include_top=False,  # This is if we want the final FC layers
    weights="imagenet",
    input_shape=input_shape,
    classifier_activation="softmax",
    pooling=pooling,
)
# Freeze the base model
for layer in base_model.layers:
    layer.trainable = False

# Convert the output of the base model to a 1D vector
x = Flatten()(base_model.output)

# We create fc_count fully connected layers, relu for all but the last
x = Dense(units=4096, activation='relu')(x)  # relu avoids the vanishing-gradient problem
x = Dense(units=4096, activation='relu')(x)
# The final layer is a softmax layer
prediction = Dense(4, activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=prediction)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.Adam(learning_rate=0.001),
              metrics=['accuracy'])
Meanwhile, for training from scratch:
model = apps.VGG16(
    include_top=True,  # This is if we want the final FC layers
    weights=None,
    input_shape=input_shape,
    classifier_activation="softmax",
    pooling=pooling,
    classes=4,  # set the number of outputs to the required count
)
model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.Adam(learning_rate=0.1),  # I've experimented with values as low as 0.001
              metrics=['accuracy'])
model.summary()
and the training is done via:
history = model.fit(train_images,
                    validation_data=val_images,
                    epochs=epochs,
                    verbose=1,
                    callbacks=callbacks)
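The construction of train_images and val_images isn't shown above. A typical Keras generator setup for a 4-class directory dataset might look like the following sketch; the directory paths, image size, and batch size are assumptions, and preprocess_input matches the preprocessing the imagenet VGG16 weights expect:

from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Hypothetical generator setup; paths and sizes are placeholders
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_images = datagen.flow_from_directory(
    'data/train',            # hypothetical path
    target_size=(224, 224),  # VGG16's default input size
    batch_size=32,
    class_mode='categorical',
)
val_images = datagen.flow_from_directory(
    'data/val',              # hypothetical path
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical',
)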
Transfer learning converges in around 10 epochs, whereas when training from scratch I've gone up to 20 epochs, with accuracy and val_accuracy both stuck at exactly 0.2637. I also use a ReduceLROnPlateau callback, which does make a difference during transfer learning.
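The ReduceLROnPlateau setup isn't shown above; a common configuration looks like the sketch below, where the monitored quantity, factor, and patience values are assumptions:

from tensorflow.keras.callbacks import ReduceLROnPlateau

# Reduce the learning rate once val_loss stops improving
reduce_lr = ReduceLROnPlateau(
    monitor='val_loss',  # quantity to watch for a plateau
    factor=0.1,          # multiply the learning rate by this on plateau
    patience=3,          # epochs without improvement before reducing
    min_lr=1e-6,         # lower bound on the learning rate
)
callbacks = [reduce_lr]  # passed to model.fit above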
I'm training on an NVIDIA GeForce RTX 3060 Laptop GPU.
EDIT: I should mention that the loss becomes nan when training from scratch.
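A nan loss usually means training has diverged, and a large Adam learning rate such as 0.1 on a from-scratch VGG16 is a plausible trigger. One way to stop a run the moment this happens is Keras's built-in TerminateOnNaN callback; a minimal sketch, assuming the callbacks list from above:

from tensorflow.keras.callbacks import TerminateOnNaN

# Abort training immediately when the loss becomes nan,
# instead of burning further epochs on a diverged model
callbacks = [TerminateOnNaN(), reduce_lr]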
Answer 1
Score: 0
The problem was resolved by switching to the SGD optimizer.
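The answer doesn't include code; a minimal sketch of the switch, reusing the compile call from the question (the SGD learning rate and momentum here are assumptions, not values from the answer):

from tensorflow.keras import optimizers

model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.SGD(learning_rate=0.01, momentum=0.9),
              metrics=['accuracy'])

For context, the original VGG networks were trained with SGD with momentum, and an overly large Adam learning rate can easily diverge to nan, which is consistent with the fix reported here.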