What is the best approach to creating a question generation model using GPT and BERT architectures?
Question
I want to build a question generation model that generates questions from example questions as well as context. Should I use GPT-based models or BERT-based architectures?

GPT is able to perform the task but sometimes returns vague questions that are not grounded in the context itself. When I used WizardLM (7B), I was able to get generalized questions from the context which sounded more natural and were nearly to the point, as long as I kept to a limit of 3 questions.
Answer 1
Score: 1
When dealing with text generation, it is more straightforward to work with Transformer decoder models such as the GPT-* models. Although BERT-like models are also capable of text generation, it is a quite convoluted process and not something that follows naturally from the tasks for which these models were pretrained.

I assume you are comparing GPT-2 and WizardLM (7B). Performance on this task is expected to improve as you scale up the number of parameters by using larger models. I would recommend trying LLMs such as Alpaca-LoRA, Dolly, or GPT-J (see here for how to run GPT-J on Colab Pro).
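As a concrete starting point, below is a minimal sketch of prompting a decoder-only model for question generation with the Hugging Face transformers library. The model name (gpt2), the prompt wording, and the generation parameters are illustrative assumptions rather than anything from the original answer; for better results you would swap in one of the larger instruction-tuned checkpoints mentioned above.

```python
from transformers import pipeline

# Minimal sketch: a small decoder-only model used as a placeholder.
# Replace "gpt2" with a larger instruction-tuned checkpoint for better questions.
generator = pipeline("text-generation", model="gpt2")

context = (
    "The Transformer architecture relies on self-attention to model "
    "relationships between all tokens in a sequence."
)

# Prompt the model to produce a small number of questions grounded in the context.
prompt = f"Context: {context}\n\nWrite three questions about the context above:\n1."

outputs = generator(
    prompt,
    max_new_tokens=80,      # limit the length of the generated questions
    num_return_sequences=1,
    do_sample=True,
    temperature=0.7,        # moderate sampling to reduce off-context questions
)

print(outputs[0]["generated_text"])
```

Keeping the requested number of questions small in the prompt (as noted in the question, around 3) tends to keep the generated questions closer to the given context.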