Difference between AutoModelForSeq2SeqLM and AutoModelForCausalLM

Question

As per the title, how are these two Auto Classes on Huggingface different from each other?

I tried reading the documentation but did not find any differentiating information.

Answer 1

Score: 11

"AutoModelForSeq2SeqLM" 直观地用于具有编码器-解码器架构的语言模型,例如 T5 和 BART,而 "AutoModelForCausalLM" 用于自回归语言模型,如所有的 GPT 模型。

这两个类是概念性的 API,用于自动推断这两种类型模型的特定模型类,例如,使用 "AutoModelForCausalLM.from_pretrained('gpt2')" 可以使用 "GPT2LMHeadModel"。例如,您可以查看所有推断模型的源代码MODEL_FOR_CAUSAL_LM_MAPPINGMODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING)。

英文:

Intuitively, AutoModelForSeq2SeqLM is used for language models with an encoder-decoder architecture, such as T5 and BART, while AutoModelForCausalLM is used for auto-regressive (decoder-only) language models such as the GPT models.
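
Below is a minimal sketch of that practical difference, assuming the standard transformers library with a PyTorch backend installed; the checkpoints ("gpt2", "t5-small") and the prompts are just illustrative choices, not anything prescribed by the classes themselves:

    from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer

    # Causal LM (decoder-only): generation simply continues the prompt.
    gpt2_tok = AutoTokenizer.from_pretrained("gpt2")
    gpt2 = AutoModelForCausalLM.from_pretrained("gpt2")
    inputs = gpt2_tok("The weather today is", return_tensors="pt")
    out = gpt2.generate(**inputs, max_new_tokens=10)
    print(gpt2_tok.decode(out[0], skip_special_tokens=True))

    # Seq2seq LM (encoder-decoder): the encoder reads the input and the decoder
    # produces a separate output sequence (here driven by a T5 translation prefix).
    t5_tok = AutoTokenizer.from_pretrained("t5-small")
    t5 = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
    inputs = t5_tok("translate English to German: The house is small.", return_tensors="pt")
    out = t5.generate(**inputs, max_new_tokens=20)
    print(t5_tok.decode(out[0], skip_special_tokens=True))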

These two classes are convenience APIs that automatically infer the specific model class for each of the two model types; for example, AutoModelForCausalLM.from_pretrained('gpt2') gives you a GPT2LMHeadModel. You can see all the inferred model classes in the source code (MODEL_FOR_CAUSAL_LM_MAPPING and MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING).
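
If you want to see that inference directly, you can inspect the mappings yourself; a quick check, noting that these are internal names in transformers.models.auto.modeling_auto and may differ across transformers versions:

    from transformers.models.auto.modeling_auto import (
        MODEL_FOR_CAUSAL_LM_MAPPING_NAMES,
        MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING_NAMES,
    )

    # Each mapping goes from a model type string to the concrete class name
    # that the corresponding Auto class will instantiate.
    print(MODEL_FOR_CAUSAL_LM_MAPPING_NAMES["gpt2"])           # GPT2LMHeadModel
    print(MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING_NAMES["t5"])  # T5ForConditionalGeneration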
