How do I get a pretrained model from hugging face running on my own data?

Question

I found a pretrained model in this repository: https://github.com/causalNLP/logical-fallacy and I want to get it running on my own data locally.

In the description it says:

import transformers
from transformers import AutoTokenizer, AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained('path_to_saved_model', num_labels=3)
tokenizer = AutoTokenizer.from_pretrained('path_to_tokenizer', do_lower_case=True)

What is meant by path_to_saved_model and path_to_tokenizer?

Answer 1

Score: 1

I recommend reading the documentation on the Hugging Face website.

To answer your question: for AutoTokenizer, path_to_tokenizer is the value passed as the first argument of from_pretrained, which the documentation describes as follows:

pretrained_model_name_or_path (str or os.PathLike) — Can be either:

  • A string, the model id of a predefined tokenizer hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  • A path to a directory containing vocabulary files required by the tokenizer, for instance saved using the save_pretrained() method, e.g., ./my_model_directory/.
  • A path or URL to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file (like Bert or XLNet), e.g., ./my_model_directory/vocab.txt. (Not applicable to all derived classes)

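The same applies to AutoModelForSequenceClassification: path_to_saved_model is either a model id hosted on huggingface.co or a path to a local directory containing the saved model, for instance one written by save_pretrained().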
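
For the local case, the idea can be sketched roughly as follows. This is only a minimal sketch, not the repository's own loading code: the helper names check_local_checkpoint and load_classifier are hypothetical, and it assumes the model directory was written by save_pretrained(), so that it contains a config.json file. The from_pretrained calls and their num_labels=3 and do_lower_case=True arguments are taken from the snippet in the question.

```python
from pathlib import Path


def check_local_checkpoint(path: str) -> Path:
    """Return the directory as a Path, or raise if it does not look like a saved model.

    Hypothetical helper: it only checks for config.json, which save_pretrained()
    writes next to the model weights.
    """
    directory = Path(path).expanduser()
    if not directory.is_dir():
        raise FileNotFoundError(f"{directory} is not a directory")
    if not (directory / "config.json").is_file():
        raise FileNotFoundError(f"{directory} has no config.json")
    return directory


def load_classifier(model_dir: str, tokenizer_dir: str, num_labels: int = 3):
    """Load a sequence classifier and its tokenizer from local directories."""
    # Imported lazily so the path check above can be used without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # from_pretrained accepts a str or os.PathLike pointing at the saved directory.
    model = AutoModelForSequenceClassification.from_pretrained(
        check_local_checkpoint(model_dir), num_labels=num_labels
    )
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_dir, do_lower_case=True)
    return model, tokenizer
```

If the model and tokenizer were saved into the same folder, both arguments can point to it, e.g. load_classifier("./my_model_directory/", "./my_model_directory/"), where the directory name is just the placeholder used in the documentation excerpt above.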

Posted by huangapple on May 6, 2023 at 20:28:08