state_dict() got an unexpected keyword argument 'destination'

Question


I am training an ensemble of two deep learning models. The workflow runs fine and saves the '.pth' file when I train on an M1 Mac, but on a GPU I get the error below. I tried changing the torch version to 1.12.0 and got the same error. The full error and the relevant code snippets follow. Please assist. Thank you.

'''
save model function outside Trainer Class
'''

def save_model(model, model_filename, path='.'):
    if not model_filename.endswith('.pth'):
        model_filename += '.pth'
    filepath = os.path.join(path, model_filename)
    io.create_path(filepath)
    torch.save(model.state_dict(), filepath)

'''
save_model method inside Trainer class
'''

def save_model(self, model_filename=None, *, models_dir='models'):
    if model_filename is None and self.model_filename is None:
        raise ValueError('Param "model_filename" is None')
    elif model_filename is None:
        model_filename = self.model_filename
    path = os.path.join(self.path, models_dir)
    save_model(self.model, model_filename, path)

'''
last part of training method where it saves the model
'''

            if (valid_loss is not None and scheduler is not None
                and isinstance(scheduler, lr_scheduler.ReduceLROnPlateau)):
                scheduler.step(valid_loss)

            if self.primary_metric in scores:
                # save model checkpoint
                is_better = scores[self.primary_metric] > best_score
                if is_better:
                    best_score = scores[self.primary_metric]
                    best_state_dict = deepcopy(self.model.state_dict())
                    # best_state_dict = self.model.state_dict()
                    if self.model_filename is not None:
                        print(f'Epoch {epoch+1} - Save Checkpoint with Best Recall Score: {best_score:.6f}')
                        self.save_model(self.model_filename, models_dir='models')

        if best_state_dict is not None:
            self.model.load_state_dict(best_state_dict)

The error is:

Traceback (most recent call last):
  File "/notebooks/ensamble_model/ensamble_models/fine_tune.py", line 129, in <module>
    trainer.train(no_epochs=config.no_epochs, lr=config.learning_rate)
  File "/notebooks/ensamble_model/ensamble_models/src/core/training.py", line 259, in train
    best_state_dict = self.model.state_dict()
  File "/notebooks/.env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1381, in state_dict
    module.state_dict(destination=destination, prefix=prefix + name + '.', keep_vars=keep_vars)
TypeError: state_dict() got an unexpected keyword argument 'destination'

Answer 1

Score: 1

I found that the error comes from the pre-trained architecture I use in my model class. I was combining two models, one of which was "vit_base_patch16_384" from the timm library. If I switch to "vit_small_patch16_384", or to another architecture such as "efficientnet_b0" (which I also tried), my code works fine even on the GPU. So the problem was the architecture I was using in my new model class.
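The traceback shows the parent module calling `module.state_dict(destination=..., prefix=..., keep_vars=...)` on a child, so some submodule exposes a `state_dict` that does not accept `destination`. To see which one, rather than swapping architectures by trial and error, a small diagnostic along these lines might help. This is only a sketch: the helper name `find_state_dict_overrides` is made up for illustration, and the post does not confirm which submodule was actually responsible. It relies only on `named_modules()`, which every `torch.nn.Module` provides.

```python
import inspect


def find_state_dict_overrides(model):
    """List submodules whose state_dict() cannot take destination=...

    `model` is expected to expose named_modules() the way torch.nn.Module
    does. Returns a list of (qualified_name, class_name) tuples.
    Hypothetical helper, not part of torch or timm.
    """
    offenders = []
    for name, module in model.named_modules():
        try:
            params = inspect.signature(module.state_dict).parameters
        except (TypeError, ValueError):
            # Signature not introspectable; skip instead of guessing.
            continue
        takes_kwargs = any(
            p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
        )
        if "destination" not in params and not takes_kwargs:
            offenders.append((name or "<root>", type(module).__name__))
    return offenders
```

Calling `find_state_dict_overrides(self.model)` before training starts would print nothing suspicious for stock modules; any entry it returns points at a class whose `state_dict` signature is worth inspecting.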

huangapple
  • Posted on May 14, 2023, 01:43:32
  • When reposting, please keep the link to this post: https://go.coder-hub.com/76244130.html