July 3, 2023, 23:20:13

Score not improving in iterative training using PyKEEN pipeline

Question

I am trying to implement an iterative training process using the PyKEEN library in Python. My goal is to improve the score in each iteration by training the model using the previous iteration's trained model. However, I have encountered an issue where the score does not improve after the first iteration.

from pykeen.pipeline import pipeline
from pykeen.datasets import Nations
from pykeen.models import TransE

dataset = Nations()
initial_model = TransE
model = initial_model
seed = 0
epochs = 10
budget = 4
scores = []

for i in range(budget):
    test = pipeline(
        dataset=dataset,
        model=model,
        training_kwargs=dict(
            num_epochs=epochs,
            use_tqdm_batch=False,
        ),
        random_seed=seed,
        negative_sampler='basic'
    )

    model = test.model

    scores.append(round(test.metric_results.get_metric("hits_at_10"), 3))

print(scores)

In each iteration, I train the model using the PyKEEN pipeline and update the model variable with the trained model from the current iteration. However, the scores remain the same after the first iteration, indicating that the model is not improving.

I suspect that assigning model = test.model in each iteration might be causing the issue. Is there a different approach I should take to ensure that the pipeline trains using the model from the previous iteration?

I tried setting model.training = True, but this had no effect.

I also checked whether the model changes across iterations, and it does not. The model after the first iteration differs from the initial one, but every later iteration ends with the same model, even after training. This suggests that the model is not being trained at all after the first iteration, which matches the unchanged scores.

I would appreciate any insights or suggestions on how to resolve this issue. Thank you!

Answer 1

Score: 0

I suggest explicitly saving the model and reloading it inside your loop. PyKEEN provides a way to save the pipeline result to a directory:

import torch

for i in range(budget):
    test = pipeline(
        dataset=dataset,
        model=model,
        training_kwargs=dict(
            num_epochs=epochs,
            use_tqdm_batch=False,
        ),
        random_seed=seed,
        negative_sampler='basic'
    )

    scores.append(round(test.metric_results.get_metric("hits_at_10"), 3))

    # Persist the pipeline result, then reload the trained model for the next iteration.
    test.save_to_directory('./path/to/your/directory')
    model = torch.load('./path/to/your/directory/trained_model.pkl')

print(scores)

Answer 2

Score: 0

I found out that the results did not change because of how I stored the object. Instead of making a copy, every stored entry referenced the same object, which made the results identical. I used copy.deepcopy() to solve this:

import copy

model = copy.deepcopy(test.model)
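
To illustrate the aliasing issue described in this answer, here is a minimal sketch in plain Python. TinyModel and its train_step method are hypothetical stand-ins for a trained model and are not part of PyKEEN; the sketch only shows why storing the same object repeatedly yields identical entries, while copy.deepcopy() keeps independent snapshots.

```python
import copy


class TinyModel:
    """Hypothetical stand-in for a trainable model; not a PyKEEN class."""

    def __init__(self):
        self.weights = [0.0, 0.0]

    def train_step(self):
        # Update the weights in place, much like an optimizer step would.
        for i in range(len(self.weights)):
            self.weights[i] += 1.0


model = TinyModel()
aliases, snapshots = [], []

for _ in range(3):
    model.train_step()
    aliases.append(model)                   # stores a reference to the same object
    snapshots.append(copy.deepcopy(model))  # stores an independent copy of the current state

alias_weights = [m.weights for m in aliases]
snapshot_weights = [m.weights for m in snapshots]

print(alias_weights)     # [[3.0, 3.0], [3.0, 3.0], [3.0, 3.0]]
print(snapshot_weights)  # [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
```

Every entry in aliases points at the one model object, so all of them show its latest state. The deep copies preserve the state at the moment each copy was taken, which is what a per-iteration comparison needs.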
