state_dict() got an unexpected keyword argument 'destination'

Question

I am training an ensemble of two deep learning models. The workflow runs fine and saves the '.pth' file when I train on an M1 Mac, but on a GPU I get the error below. I tried changing the torch version to 1.12.0 and got the same error. The full error message and the relevant code snippets follow. Please assist. Thank you.

    '''
    save model function outside Trainer Class
    '''
    def save_model(model, model_filename, path='.'):
        if not model_filename.endswith('.pth'):
            model_filename += '.pth'
        filepath = os.path.join(path, model_filename)
        io.create_path(filepath)
        torch.save(model.state_dict(), filepath)

    '''
    save_model method inside Trainer class
    '''
    def save_model(self, model_filename=None, *, models_dir='models'):
        if model_filename is None and self.model_filename is None:
            raise ValueError('Param "model_filename" is None')
        elif model_filename is None:
            model_filename = self.model_filename
        path = os.path.join(self.path, models_dir)
        save_model(self.model, model_filename, path)

    '''
    last part of training method where it saves the model
    '''
    if (valid_loss is not None and scheduler is not None
            and isinstance(scheduler, lr_scheduler.ReduceLROnPlateau)):
        scheduler.step(valid_loss)
    if self.primary_metric in scores:
        # save model checkpoint
        is_better = scores[self.primary_metric] > best_score
        if is_better:
            best_score = scores[self.primary_metric]
            best_state_dict = deepcopy(self.model.state_dict())
            # best_state_dict = self.model.state_dict()
            if self.model_filename is not None:
                print(f'Epoch {epoch+1} - Save Checkpoint with Best Recall Score: {best_score:.6f}')
                self.save_model(self.model_filename, models_dir='models')
    if best_state_dict is not None:
        self.model.load_state_dict(best_state_dict)

The error is:

    Traceback (most recent call last):
      File "/notebooks/ensamble_model/ensamble_models/fine_tune.py", line 129, in <module>
        trainer.train(no_epochs=config.no_epochs, lr=config.learning_rate)
      File "/notebooks/ensamble_model/ensamble_models/src/core/training.py", line 259, in train
        best_state_dict = self.model.state_dict()
      File "/notebooks/.env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1381, in state_dict
        module.state_dict(destination=destination, prefix=prefix + name + '.', keep_vars=keep_vars)
    TypeError: state_dict() got an unexpected keyword argument 'destinatio

Answer 1

Score: 1

I found that the error is related to the pre-trained architecture I was using in my model class. I was combining two models, one of which was "vit_base_patch16_384" from the timm library. When I switched to "vit_small_patch16_384", or to another architecture such as "efficientnet_b0", my code worked fine even on the GPU. So the problem was the architecture I was using in my new model class.
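
The traceback in the question shows the parent `state_dict()` forwarding `destination=` to each child module, which suggests that some submodule defines a `state_dict` that does not accept that keyword. A minimal sketch for narrowing down which submodule that might be follows. The helper names `accepts_keyword` and `find_incompatible_state_dict` and the two stand-in classes are hypothetical, written only for illustration, and are not part of torch or timm.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        kind = params[name].kind
        return kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                        inspect.Parameter.KEYWORD_ONLY)
    # A **kwargs parameter swallows any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def find_incompatible_state_dict(named_modules, keyword="destination"):
    """Collect (name, class name) for modules whose state_dict rejects `keyword`."""
    offenders = []
    for name, module in named_modules:
        method = getattr(module, "state_dict", None)
        if callable(method) and not accepts_keyword(method, keyword):
            offenders.append((name, type(module).__name__))
    return offenders


# Hypothetical stand-ins so the sketch runs without torch installed.
class GoodBlock:
    def state_dict(self, destination=None, prefix="", keep_vars=False):
        return {}


class BadBlock:
    def state_dict(self):  # overridden without the `destination` keyword
        return {}


pairs = [("backbone", GoodBlock()), ("head", BadBlock())]
offenders = find_incompatible_state_dict(pairs)
print(offenders)
```

With a real model, the iterable would presumably be `model.named_modules()`, which yields `(name, module)` pairs. Any entry reported this way is a candidate to inspect, not a confirmed cause.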

huangapple
  • Posted on May 14, 2023, 01:43:32
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76244130.html