state_dict() got an unexpected keyword argument 'destination'
Question
I am training an ensemble of two deep learning models. The workflow runs fine and saves the '.pth' file when I train on an M1 Mac, but on a GPU I get the error below. I tried changing the torch version to 1.12.0 and got the same error. The full error message and the relevant code snippets follow. Please assist. Thank you.
'''
save model function outside Trainer Class
'''
def save_model(model, model_filename, path='.'):
    if not model_filename.endswith('.pth'):
        model_filename += '.pth'
    filepath = os.path.join(path, model_filename)
    io.create_path(filepath)
    torch.save(model.state_dict(), filepath)

'''
save_model method inside Trainer class
'''
def save_model(self, model_filename=None, *, models_dir='models'):
    if model_filename is None and self.model_filename is None:
        raise ValueError('Param "model_filename" is None')
    elif model_filename is None:
        model_filename = self.model_filename
    path = os.path.join(self.path, models_dir)
    save_model(self.model, model_filename, path)

'''
last part of training method where it saves the model
'''
if (valid_loss is not None and scheduler is not None
        and isinstance(scheduler, lr_scheduler.ReduceLROnPlateau)):
    scheduler.step(valid_loss)

if self.primary_metric in scores:
    # save model checkpoint
    is_better = scores[self.primary_metric] > best_score
    if is_better:
        best_score = scores[self.primary_metric]
        best_state_dict = deepcopy(self.model.state_dict())
        # best_state_dict = self.model.state_dict()
        if self.model_filename is not None:
            print(f'Epoch {epoch+1} - Save Checkpoint with Best Recall Score: {best_score:.6f}')
            self.save_model(self.model_filename, models_dir='models')

if best_state_dict is not None:
    self.model.load_state_dict(best_state_dict)
The error is:
Traceback (most recent call last):
  File "/notebooks/ensamble_model/ensamble_models/fine_tune.py", line 129, in <module>
    trainer.train(no_epochs=config.no_epochs, lr=config.learning_rate)
  File "/notebooks/ensamble_model/ensamble_models/src/core/training.py", line 259, in train
    best_state_dict = self.model.state_dict()
  File "/notebooks/.env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1381, in state_dict
    module.state_dict(destination=destination, prefix=prefix + name + '.', keep_vars=keep_vars)
TypeError: state_dict() got an unexpected keyword argument 'destinatio
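Since the same code runs on one machine and fails on another, comparing the installed package versions in both environments may help narrow things down. The snippet below is only a minimal sketch: the helper name `installed_versions` is made up for illustration, and the choice of `torch` and `timm` as the packages to check is an assumption based on the libraries mentioned in this thread.

```python
from importlib import metadata


def installed_versions(packages=("torch", "timm")):
    """Return {package name: installed version string, or None if not installed}.

    Hypothetical helper for illustration; it reads package metadata only and
    does not import the packages themselves.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Run this on both machines and compare the printed lines.
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If the two environments report different versions of either package, aligning them is a cheap experiment before changing any model code.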
Answer 1
Score: 1
I found that the error comes from the pre-trained architecture I was using in my model class. I was combining two models, one of which was "vit_base_patch16_384" from the timm library. When I switch to "vit_small_patch16_384", or to another architecture such as "efficientnet_b0", my code works fine even on the GPU. So the problem was the architecture I was using in my new model class.
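The traceback shows a parent module calling `state_dict(destination=...)` on a child, so one way to locate the offending submodule, rather than swapping architectures by trial and error, is to check which submodules expose a `state_dict` method that cannot take that keyword. The following is a rough diagnostic sketch, not a confirmed explanation of the root cause: the function name `modules_rejecting_keyword` and the two stand-in classes are hypothetical and only exist to make the example runnable without PyTorch.

```python
import inspect


def modules_rejecting_keyword(named_modules, keyword="destination"):
    """Return names of modules whose state_dict() cannot accept `keyword` by name.

    `named_modules` is any iterable of (name, module) pairs.
    Hypothetical helper for illustration only.
    """
    by_name = (inspect.Parameter.POSITIONAL_OR_KEYWORD,
               inspect.Parameter.KEYWORD_ONLY)
    rejecting = []
    for name, module in named_modules:
        params = inspect.signature(module.state_dict).parameters
        has_var_kwargs = any(p.kind is inspect.Parameter.VAR_KEYWORD
                             for p in params.values())
        param = params.get(keyword)
        accepts_named = param is not None and param.kind in by_name
        if not (accepts_named or has_var_kwargs):
            rejecting.append(name)
    return rejecting


# Stand-in classes (assumptions, not real PyTorch or timm modules).
class AcceptsDestination:
    def state_dict(self, *args, destination=None, prefix="", keep_vars=False):
        return {}


class RejectsDestination:
    def state_dict(self):
        return {}


if __name__ == "__main__":
    demo = [("ok", AcceptsDestination()), ("broken", RejectsDestination())]
    print(modules_rejecting_keyword(demo))
```

With a real model, the pairs can come from `model.named_modules()`. Any name the function returns points at a submodule whose `state_dict` signature is worth inspecting by hand.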