Difference between AutoModelForSeq2SeqLM and AutoModelForCausalLM

Question


As per the title, how are these two Auto classes on Hugging Face different from each other?

I tried reading the documentation but did not find any information that differentiates them.

Answer 1

Score: 11

"AutoModelForSeq2SeqLM" 直观地用于具有编码器-解码器架构的语言模型,例如 T5 和 BART,而 "AutoModelForCausalLM" 用于自回归语言模型,如所有的 GPT 模型。

这两个类是概念性的 API,用于自动推断这两种类型模型的特定模型类,例如,使用 "AutoModelForCausalLM.from_pretrained('gpt2')" 可以使用 "GPT2LMHeadModel"。例如,您可以查看所有推断模型的源代码MODEL_FOR_CAUSAL_LM_MAPPINGMODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING)。

英文:

Intuitively, AutoModelForSeq2SeqLM is used for language models with an encoder-decoder architecture, such as T5 and BART, while AutoModelForCausalLM is used for auto-regressive, decoder-only language models such as the GPT models.

These two classes are conceptual APIs that automatically infer the specific model class for each of the two model types; for example, AutoModelForCausalLM.from_pretrained('gpt2') resolves to GPT2LMHeadModel. You can see the full list of inferred model classes in the source code (MODEL_FOR_CAUSAL_LM_MAPPING and MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING).
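For a concrete illustration, here is a minimal sketch (assuming the transformers library is installed, and using the public 'gpt2' and 't5-small' checkpoints purely as examples) of how each Auto class resolves to a specific model class:

    from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM

    # Decoder-only, auto-regressive model: AutoModelForCausalLM resolves the
    # 'gpt2' checkpoint to the concrete class GPT2LMHeadModel.
    causal_lm = AutoModelForCausalLM.from_pretrained("gpt2")
    print(type(causal_lm).__name__)  # GPT2LMHeadModel

    # Encoder-decoder model: AutoModelForSeq2SeqLM resolves the 't5-small'
    # checkpoint to the concrete class T5ForConditionalGeneration.
    seq2seq_lm = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
    print(type(seq2seq_lm).__name__)  # T5ForConditionalGeneration

Attempting to load 'gpt2' with AutoModelForSeq2SeqLM would instead raise an error, because GPT-2 is only registered in the causal-LM mapping.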
