如何通过Langchain获得更详细的结果来源。

huangapple go评论100阅读模式
英文:

How to get more detailed results sources with Langchain

问题

我正在尝试使用Langchain创建一个简单的“带来源的问答”,并使用特定的URL作为数据源。该URL包含一个单独的页面,上面有大量信息。

问题是,“RetrievalQAWithSourcesChain”只返回整个URL作为结果的来源,这在这种情况下并不是很有用。

是否有一种方法可以获取更详细的来源信息?
也许是页面上特定部分的标题?
甚至提供到页面正确部分的可点击URL会更有帮助!

我不太确定“结果来源”的生成是语言模型、URL加载器的功能,还是仅仅是“RetrievalQAWithSourcesChain”本身。

我尝试使用“UnstructuredURLLoader”和“SeleniumURLLoader”,希望更详细地读取和输入数据可能会有所帮助 - 可惜没有。

相关代码摘录:

llm = ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo')
chain = RetrievalQAWithSourcesChain.from_llm(llm=llm, retriever=VectorStore.as_retriever())

result = chain({"question": question})

print(result['answer'])
print("\n Sources : ", result['sources'])

希望这有所帮助!

英文:

I am trying to put together a simple "Q&A with sources" using Langchain and a specific URL as the source data. The URL consists of a single page with quite a lot of information on it.

The problem is that RetrievalQAWithSourcesChain is only giving me the entire URL back as the source of the results, which is not very useful in this case.

Is there a way to get more detailed source info?
Perhaps the heading of the specific section on the page?
A clickable URL to the correct section of the page would be even more helpful!

I am slightly unsure whether the generating of the result source is a function of the language model, URL loader or simply RetrievalQAWithSourcesChain alone.

I have tried using UnstructuredURLLoader and SeleniumURLLoader with the hope that perhaps more detailed reading and input of the data would help - sadly not.

Relevant code excerpt:

llm = ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo')
chain = RetrievalQAWithSourcesChain.from_llm(llm=llm, retriever=VectorStore.as_retriever())

result = chain({"question": question})

print(result['answer'])
print("\n Sources : ",result['sources'] )

答案1

得分: 1

ChatGPT非常灵活,你越明确,结果就越好。这个链接显示你正在使用的函数的文档。有一个参数是langchain.prompts.BasePromptTemplate,可以让你给ChatGPT提供更明确的指示。

它看起来基本的提示模板是这样的

使用以下知识三元组来回答末尾的问题。如果你不知道答案,就说你不知道,不要试图编造答案。\n\n{context}\n\n问题:{question}\n有用的回答:

你可以再加一句话,给ChatGPT更明确的指示

请使用JSON格式化回答,形式为 { "answer": "{your_answer}", "relevant_quotes": ["引用列表"] }。将your_answer替换为问题的答案,同时在列表中包括来自来源材料的相关引用。

你可能需要稍微调整一下,让ChatGPT的回应更好。然后你应该能解析它。

ChatGPT在API中有3种消息类型

  • User - 用户发给模型的消息
  • model - 模型发给用户的消息
  • system - 提示工程师发给模型的消息,以添加指示。Lang chain不使用这个,因为它是一次性提示

我强烈推荐这些课程关于ChatGPT,因为它们来自Andrew Ng,质量很高。

英文:

ChatGPT is very flexible, and the more explicit you are better results you can get. This link show the docs for the function you are using. there is a parameter for langchain.prompts.BasePromptTemplate that allows you to give ChatGPT more explicit instructions.

It looks like the base prompt template is this

>Use the following knowledge triplets to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\n{context}\n\nQuestion: {question}\nHelpful Answer:

You can add in another sentence giving ChatGPT more clear instructions

> Please format the answer with JSON of the form { "answer": "{your_answer}", "relevant_quotes": ["list of quotes"] }. Substitutde your_answer as the answer to the question, but also include relevant quotes from the source material in the list.

You may need to tweak it a little bit to get ChatGPT responding well. Then you should be able to parse it.

ChatGPT has 3 message types in the API

  • User - a message from an end user to the model
  • model - a message from the model to the end user
  • system - a message from the prompt engineer to model to add instructions. Lang chain doesn't use this since it's a one-shot prompt

I strongly recommend these courses on ChatGPT since they are from Andrew Ng and very high quality.

huangapple
  • 本文由 发表于 2023年6月15日 19:31:53
  • 转载请务必保留本文链接:https://go.coder-hub.com/76482024.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定