Sentence-Transformer Training and Validation Loss

Question

I am fine-tuning a Sentence-Transformers model (with PyTorch) on a custom dataset that is in the same format as the Semantic Textual Similarity (STS) dataset.

I am unable to get (or print) the training or validation loss during training. I have looked through the documentation and tried several suggested solutions to monitor the loss during or after training, but none of them has worked so far.

Below is the training part of the code. How can I see the training and validation loss while training SBERT?

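from sentence_transformers import losses
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

# model, train_dataloader and val_set are created earlier (not shown)
train_loss = losses.MultipleNegativesRankingLoss(model)
evaluator = EmbeddingSimilarityEvaluator.from_input_examples(val_set, name='sts-dev')
num_epochs = 20
# warm up the learning rate over the first 10% of all training steps
warmup_steps = int(len(train_dataloader) * num_epochs * 0.1)

# the evaluator runs on the validation set every 1000 training steps
model.fit(train_objectives=[(train_dataloader, train_loss)],
          evaluator=evaluator,
          epochs=num_epochs,
          evaluation_steps=1000,
          warmup_steps=warmup_steps,
          show_progress_bar=True)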

You can see the progress bar of training for reference

Answer 1

Score: 1

At the time of writing, issues #336, #510 and #1021 are open on the SentenceTransformers GitHub repository requesting this functionality. Some of the comments on those threads offer interim solutions while pull request #1606 is being worked on to provide a generic framework for loss tracking.

In general, the interim solutions take one of two routes. The first modifies the behaviour of the fit() function in SentenceTransformer.py, via the loss function passed in, so that the loss is displayed. The second implements a custom evaluator based on SentenceEvaluator that provides the same functionality as one of the concrete evaluators (e.g. EmbeddingSimilarityEvaluator) but also displays/logs the loss.
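
As a rough illustration of the first route, here is a minimal sketch that leaves fit() untouched and instead attaches a standard PyTorch forward hook to the loss module. It is not taken from the GitHub threads. The LossLogger class and its log_every parameter are hypothetical names introduced here, and the sketch assumes that the loss module returns a single scalar tensor, as MultipleNegativesRankingLoss does.

```python
from statistics import mean


class LossLogger:
    """Forward hook that records the scalar returned by a loss module.

    Hypothetical helper, not part of sentence-transformers.
    """

    def __init__(self, log_every=100):
        self.log_every = log_every
        self.history = []  # one float per forward pass of the loss module

    def __call__(self, module, inputs, output):
        # PyTorch calls a forward hook as hook(module, inputs, output).
        # Returning None leaves the loss tensor itself unchanged.
        self.history.append(float(output.item()))
        if len(self.history) % self.log_every == 0:
            recent = self.history[-self.log_every:]
            print(f"step {len(self.history)}: mean training loss {mean(recent):.4f}")


# Usage with the objects from the question (assumed to exist, not run here):
# loss_logger = LossLogger(log_every=100)
# handle = train_loss.register_forward_hook(loss_logger)
# model.fit(train_objectives=[(train_dataloader, train_loss)], ...)
# handle.remove()  # detach the hook once training is done
```

After training, loss_logger.history holds the per-step training loss and can be plotted or written to a file. This only covers the training loss. A validation loss would still need something like the custom-evaluator route described above.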

huangapple
  • Posted on March 7, 2023, 05:15:11
  • Link to this article: https://go.coder-hub.com/75655918.html