Does OpenAI GPT fine-tuning consider the prompt in the loss function?
Question
The OpenAI API includes a fine-tuning service that divides each training example into a "prompt" and a "completion":
https://platform.openai.com/docs/guides/fine-tuning
The documentation says that the accuracy metrics are calculated with respect to the completion, but for the loss it only says that it is calculated "on the training batch".
My understanding is that the initial training of a GPT model always happens in batches of the maximum available size, using a special token to separate contexts but always asking the model to predict the next token at every position, so there the loss function is the obvious cross-entropy over all the outputs. In fine-tuning, however, there is a choice: learn to predict the "template prompt" or not. Both decisions can be sensible; learning the template amounts to training a parser, while masking the template can avoid overfitting.
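For concreteness, here is a minimal PyTorch sketch of the two options I mean (the tensor shapes and prompt length are made up for illustration): cross-entropy over every position, versus masking the prompt tokens out of the loss.

```python
# Minimal sketch (illustrative shapes only) of the two fine-tuning choices:
# (1) cross-entropy over every position, as in pre-training;
# (2) masking the prompt so only completion tokens contribute to the loss.
import torch
import torch.nn.functional as F

vocab_size, seq_len, prompt_len = 50257, 12, 8      # first 8 tokens are the "template prompt"
logits = torch.randn(seq_len, vocab_size)            # stand-in for model outputs
targets = torch.randint(0, vocab_size, (seq_len,))   # stand-in for next-token labels

# Option 1: every position contributes to the loss.
loss_all = F.cross_entropy(logits, targets)

# Option 2: prompt positions are excluded (ignore_index skips them).
masked = targets.clone()
masked[:prompt_len] = -100
loss_completion_only = F.cross_entropy(logits, masked, ignore_index=-100)
```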
So, what is the current practice at OpenAI?
Answer 1
Score: 2
The OpenAI API has a parameter prompt_loss_weight, whose default is 0.01, compared to the completion, which always has a weight of 1.0. So yes, it considers the prediction of the prompt as part of the loss function.
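OpenAI does not publish the exact formula, but a natural reading of prompt_loss_weight is a per-token weight on the cross-entropy, with prompt positions weighted 0.01 and completion positions weighted 1.0. A minimal PyTorch sketch of that interpretation (the function name and shapes are mine, not OpenAI's):

```python
# Sketch of how a prompt_loss_weight could enter the objective
# (illustrative only; OpenAI does not document the exact formula).
import torch
import torch.nn.functional as F

def weighted_lm_loss(logits, targets, prompt_mask, prompt_loss_weight=0.01):
    """Per-token cross-entropy, down-weighted on prompt positions.

    logits: (seq_len, vocab_size); targets: (seq_len,);
    prompt_mask: bool (seq_len,), True where the token belongs to the prompt.
    """
    per_token = F.cross_entropy(logits, targets, reduction="none")
    weights = torch.where(prompt_mask,
                          torch.full_like(per_token, prompt_loss_weight),
                          torch.ones_like(per_token))
    return (weights * per_token).sum() / weights.sum()
```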
This usage seems different from fine-tuning tutorials for other tools such as the Hugging Face transformers library, which allow a mask to discard part of the output from the loss but do not apply different weights to it.
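For comparison, the usual Hugging Face recipe hard-masks the prompt by setting those label positions to -100, which the built-in cross-entropy ignores, rather than down-weighting them. A small sketch (the model and strings are placeholders, and it assumes the prompt tokenizes to a prefix of the full text):

```python
# Hugging Face-style masking: prompt tokens are excluded from the loss entirely.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Translate to French: cheese ->"
completion = " fromage"

prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
full_ids = tokenizer(prompt + completion, return_tensors="pt").input_ids

labels = full_ids.clone()
labels[:, :prompt_len] = -100   # -100 positions are ignored by the loss

loss = model(input_ids=full_ids, labels=labels).loss
```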