Hugging Face Transformers: model Bio_ClinicalBERT not trained for any of the tasks?


Question


This may be the most beginner question of all :sweat:.

I just started learning about NLP and Hugging Face. The first thing I'm trying to do is to apply one of the bioBERT models to some clinical note data and see what I can do, before moving on to fine-tuning the model. And "emilyalsentzer/Bio_ClinicalBERT" looks like the closest model for my data.

But as I try to use it for any of the analyses I always get this warning.
> Some weights of the model checkpoint at emilyalsentzer/Bio_ClinicalBERT were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight']

From chapter 2 of the Hugging Face course I understand what this means.
> This is because BERT has not been pretrained on classifying pairs of sentences, so the head of the pretrained model has been discarded and a new head suitable for sequence classification has been added instead. The warnings indicate that some weights were not used (the ones corresponding to the dropped pretraining head) and that some others were randomly initialized (the ones for the new head). It concludes by encouraging you to train the model, which is exactly what we are going to do now.
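
For context, here is a minimal sketch of my reading of that explanation (this is not from the course, and the two-label head is just an illustrative assumption): one loading path reuses the pretrained MLM head, the other discards it and attaches a fresh classification head, which is what triggers the warning.

from transformers import AutoModelForMaskedLM, AutoModelForSequenceClassification

checkpoint = "emilyalsentzer/Bio_ClinicalBERT"

# Reuses the masked-language-modeling head the checkpoint was pretrained with;
# only the unused next-sentence-prediction weights are skipped, nothing
# task-specific has to be randomly initialized.
mlm_model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# Drops the pretraining head and attaches a new, randomly initialized
# classification head (num_labels=2 is an arbitrary choice here) -- this is
# the load that prints the warning above.
clf_model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)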

So I went on to test which NLP tasks I can use "emilyalsentzer/Bio_ClinicalBERT" for, out of the box.

from transformers import pipeline

checkpoint = "emilyalsentzer/Bio_ClinicalBERT"

# Try to build a pipeline for each task from the same checkpoint and see
# which ones load without complaints.
nlp_task = ['conversational', 'feature-extraction', 'fill-mask', 'ner',
            'question-answering', 'sentiment-analysis', 'text-classification',
            'token-classification', 'zero-shot-classification']

for task in nlp_task:
    print(task)
    process = pipeline(task=task, model=checkpoint)

And I got the same warning message for all the NLP tasks, so it appears to me that I shouldn't, or am advised not to, use the model for any of these tasks. This really confuses me. The original Bio_ClinicalBERT paper stated that they got good results on a few different tasks, so surely the model was trained for those tasks. I have a similar issue with other models as well, i.e. a blog or research paper says a model obtained good results on a specific task, but when I try to apply it with a pipeline I get this warning message. Is there any reason why the head layers were not included in the model?

I only have a few hundred clinical notes (also unannotated :frowning_face:), so it doesn't look like that's big enough for training. Is there any way I could use the model on my data without training?

Thank you for your time.

Answer 1

Score: 0


This Bio_ClinicalBERT model is trained for the masked language modeling (MLM) task. That task is basically used to learn the semantic relations of tokens in the language/domain. For a downstream task, you can fine-tune the model's head with your small dataset, or you can use an already fine-tuned model such as Bio_ClinicalBERT-finetuned-medicalcondition, which is a fine-tuned version of the same model. You can find all the fine-tuned models on Hugging Face by searching for 'bio-clinicalBERT', as in the link.
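
As a rough sketch of what you can do without any training (assuming a standard transformers install; the example sentences are made up): the fill-mask pipeline reuses the pretrained MLM head directly, and the feature-extraction pipeline gives you contextual embeddings for your unannotated notes.

from transformers import pipeline

checkpoint = "emilyalsentzer/Bio_ClinicalBERT"

# Masked-language-model inference: the pretraining head is reused as-is,
# so there is no randomly initialized head and no task-specific training.
fill = pipeline("fill-mask", model=checkpoint)
print(fill("The patient was started on [MASK] for hypertension."))

# Feature extraction: contextual embeddings for unannotated notes, usable for
# clustering or similarity search without any labels.
embed = pipeline("feature-extraction", model=checkpoint)
vectors = embed("Patient presents with shortness of breath.")
print(len(vectors[0]), len(vectors[0][0]))  # number of tokens x hidden size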
