Question

I am fine-tuning a Sentence-Transformers model (using PyTorch) on a custom dataset that has the same format as the Semantic Textual Similarity (STS) dataset.

I am unable to get (or print) the training or validation error during training. I have looked for a way to monitor these errors during or after training, gone through various documentation, and tried several suggested solutions, but I still cannot monitor them.

Below is the training part of my code. How can I see the training and validation error while training SBERT?

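from sentence_transformers import losses
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

# model, train_dataloader and val_set are defined earlier (not shown)
train_loss = losses.MultipleNegativesRankingLoss(model)
evaluator = EmbeddingSimilarityEvaluator.from_input_examples(val_set, name='sts-dev')
num_epochs = 20
# Warm up the learning rate over the first 10% of all training steps
warmup_steps = int(len(train_dataloader) * num_epochs * 0.1)

model.fit(train_objectives=[(train_dataloader, train_loss)],
          evaluator=evaluator,
          epochs=num_epochs,
          evaluation_steps=1000,  # run the evaluator every 1000 training steps
          warmup_steps=warmup_steps,
          show_progress_bar=True)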

You can see the progress bar of training for reference

Answer 1

Score: 1

At the time of writing, issues #336, #510 and #1021 are open on the SentenceTransformers GitHub repository requesting this functionality. Some of the comments on those threads offer interim solutions while pull request #1606 is being worked on to provide a generic framework for loss tracking.

In general, the interim solutions either modify the behaviour of the fit() function in SentenceTransformer.py, using the loss function passed in to display the loss, or implement a custom evaluator based on SentenceEvaluator that provides the same functionality as one of the concrete evaluators (e.g. EmbeddingSimilarityEvaluator) but also displays/logs the loss.
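
As a rough illustration of the first route (going through the loss function), the sketch below records the training loss without editing the library source. It is not taken from the linked issues: it swaps in a standard PyTorch forward hook, and it assumes that the loss object is a torch.nn.Module (as the classes in sentence_transformers.losses are) and that the training loop invokes it by calling the module. LossTracker and attach_loss_tracker are hypothetical names made up for this example. It only covers the training loss; a validation loss would still need a custom evaluator.

```python
from collections import deque


class LossTracker:
    """Hypothetical helper: stores loss values and prints a running mean."""

    def __init__(self, log_every=100, window=100):
        self.log_every = log_every            # print once every `log_every` steps
        self.history = []                     # every recorded loss value, in order
        self._recent = deque(maxlen=window)   # last `window` values only

    def running_mean(self):
        # Mean over the most recent window; NaN if nothing was recorded yet.
        if not self._recent:
            return float("nan")
        return sum(self._recent) / len(self._recent)

    def record(self, value):
        self.history.append(value)
        self._recent.append(value)
        step = len(self.history)
        if step % self.log_every == 0:
            print(f"step {step}: mean train loss "
                  f"(last {len(self._recent)} steps) = {self.running_mean():.4f}")


def attach_loss_tracker(loss_module, tracker):
    """Register a forward hook so each call to the loss module is recorded.

    Assumes `loss_module` is a torch.nn.Module whose forward() returns a
    scalar tensor. Returns the hook handle; call handle.remove() to detach.
    """
    def hook(module, inputs, output):
        # .item() copies the scalar loss to a Python float
        # (on GPU this forces a synchronisation at every step).
        tracker.record(output.item())

    return loss_module.register_forward_hook(hook)
```

With the variables from the question, this would be used by creating tracker = LossTracker(log_every=100) and calling attach_loss_tracker(train_loss, tracker) before model.fit(...). After training, tracker.history holds one value per training step and can be plotted or saved.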
