Score not improving in iterative training using PyKEEN pipeline

Question

I am trying to implement an iterative training process using the PyKEEN library in Python. My goal is to improve the score in each iteration by training the model using the previous iteration's trained model. However, I have encountered an issue where the score does not improve after the first iteration.

from pykeen.pipeline import pipeline
from pykeen.datasets import Nations
from pykeen.models import TransE

dataset = Nations()
initial_model = TransE
model = initial_model
seed = 0
epochs = 10
budget = 4
scores = []

for i in range(budget):
    test = pipeline(
        dataset=dataset,
        model=model,
        training_kwargs=dict(
            num_epochs=epochs,
            use_tqdm_batch=False,
        ),
        random_seed=seed,
        negative_sampler='basic'
    )

    model = test.model

    scores.append(round(test.metric_results.get_metric("hits_at_10"), 3))

print(scores)

In each iteration, I train the model using the PyKEEN pipeline and update the model variable with the trained model from the current iteration. However, the scores remain the same after the first iteration, indicating that the model is not improving.

I suspect that assigning model = test.model in each iteration might be causing the issue. Is there a different approach I should take to ensure that the pipeline trains using the model from the previous iteration?

I tried setting model.training = True, but this had no effect.

I also checked whether the model changes across iterations, and it does not. The model after the first iteration differs from the initial one, but all later iterations have the same model, even after training. This suggests the model does not train at all after the first iteration, which matches the scores, since they also stay the same in subsequent iterations.

I would appreciate any insights or suggestions on how to resolve this issue. Thank you!

Answer 1

Score: 0

I suggest explicitly saving the model and reloading it inside your loop. PyKEEN's pipeline result provides save_to_directory() for writing the trained model to a path:

import torch

for i in range(budget):
    test = pipeline(
        dataset=dataset,
        model=model,
        training_kwargs=dict(
            num_epochs=epochs,
            use_tqdm_batch=False,
        ),
        random_seed=seed,
        negative_sampler='basic'
    )

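    scores.append(round(test.metric_results.get_metric("hits_at_10"), 3))

    # Persist the pipeline result; the trained model is written as trained_model.pkl.
    test.save_to_directory('./path/to/your/directory')
    # Reload the saved model for the next iteration. Recent PyTorch releases may
    # need weights_only=False here to unpickle a full model object.
    model = torch.load('./path/to/your/directory/trained_model.pkl')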

print(scores)
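
The save-and-reload idea can be tried out in isolation with a minimal sketch that uses only the standard library. The dictionary trained_state is a hypothetical stand-in for a trained model, and pickle stands in for whatever serialization the library uses; nothing here is PyKEEN's API. It only shows that an object read back from disk is a separate object with the same contents.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a trained model's state (not a PyKEEN object).
trained_state = {"weights": [0.1, 0.2]}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "trained_model.pkl"
    # Save the state to disk.
    with path.open("wb") as f:
        pickle.dump(trained_state, f)
    # Load it back as a new object.
    with path.open("rb") as f:
        reloaded_state = pickle.load(f)

same_contents = reloaded_state == trained_state
same_object = reloaded_state is trained_state

# Changing the reloaded copy leaves the original untouched.
reloaded_state["weights"][0] = 9.9
print(same_contents, same_object, trained_state["weights"])
```

Because the reloaded object is independent, later changes to one side do not leak into the other, which is the property the loop above relies on.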

Answer 2

Score: 0

I found that the results did not change because of how I stored the model. Instead of holding a copy, every stored entry referenced the same object, so the results were identical. I used copy.deepcopy() to solve this:

import copy
model = copy.deepcopy(test.model)  # independent snapshot of the trained model
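
To make the difference between storing a reference and storing a copy concrete, here is a minimal sketch in plain Python. TinyModel and train_step are hypothetical stand-ins invented for illustration; they are not PyKEEN classes or methods. The sketch assumes that training mutates the model in place, as an optimizer step would.

```python
import copy


class TinyModel:
    """Hypothetical stand-in for a trainable model; not a PyKEEN class."""

    def __init__(self):
        self.weights = [0.0, 0.0]

    def train_step(self):
        # Mutate the weights in place, as an optimizer step would.
        for i in range(len(self.weights)):
            self.weights[i] += 1.0


model = TinyModel()
by_reference = []  # every entry is the same object
by_copy = []       # every entry is an independent snapshot

for _ in range(3):
    model.train_step()
    by_reference.append(model)
    by_copy.append(copy.deepcopy(model))

reference_weights = [m.weights for m in by_reference]
snapshot_weights = [m.weights for m in by_copy]
print(reference_weights)  # three identical entries showing the final state
print(snapshot_weights)   # one distinct entry per iteration
```

Comparing entries in by_reference always reports "no change", because each one is the current model. Only the deep copies record what the model looked like at each iteration.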

huangapple
  • Published on July 3, 2023 at 23:20:13
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76606108.html