2023年6月15日 19:31:53go评论100阅读模式

英文:

How to get more detailed results sources with Langchain

问题

我正在尝试使用Langchain创建一个简单的“带来源的问答”，并使用特定的URL作为数据源。该URL包含一个单独的页面，上面有大量信息。

问题是，“RetrievalQAWithSourcesChain”只返回整个URL作为结果的来源，这在这种情况下并不是很有用。

是否有一种方法可以获取更详细的来源信息？
也许是页面上特定部分的标题？
甚至提供到页面正确部分的可点击URL会更有帮助！

我不太确定“结果来源”的生成是语言模型、URL加载器的功能，还是仅仅是“RetrievalQAWithSourcesChain”本身。

我尝试使用“UnstructuredURLLoader”和“SeleniumURLLoader”，希望更详细地读取和输入数据可能会有所帮助 - 可惜没有。

相关代码摘录：

llm = ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo')
chain = RetrievalQAWithSourcesChain.from_llm(llm=llm, retriever=VectorStore.as_retriever())

result = chain({"question": question})

print(result['answer'])
print("\n Sources : ", result['sources'])

希望这有所帮助！

英文:

I am trying to put together a simple "Q&A with sources" using Langchain and a specific URL as the source data. The URL consists of a single page with quite a lot of information on it.

The problem is that RetrievalQAWithSourcesChain is only giving me the entire URL back as the source of the results, which is not very useful in this case.

Is there a way to get more detailed source info?
Perhaps the heading of the specific section on the page?
A clickable URL to the correct section of the page would be even more helpful!

I am slightly unsure whether the generating of the result source is a function of the language model, URL loader or simply RetrievalQAWithSourcesChain alone.

I have tried using UnstructuredURLLoader and SeleniumURLLoader with the hope that perhaps more detailed reading and input of the data would help - sadly not.

Relevant code excerpt:

llm = ChatOpenAI(temperature=0, model_name=&#39;gpt-3.5-turbo&#39;)
chain = RetrievalQAWithSourcesChain.from_llm(llm=llm, retriever=VectorStore.as_retriever())

result = chain({&quot;question&quot;: question})

print(result[&#39;answer&#39;])
print(&quot;\n Sources : &quot;,result[&#39;sources&#39;] )

答案1

得分: 1

ChatGPT非常灵活，你越明确，结果就越好。这个链接显示你正在使用的函数的文档。有一个参数是langchain.prompts.BasePromptTemplate，可以让你给ChatGPT提供更明确的指示。

它看起来基本的提示模板是这样的

使用以下知识三元组来回答末尾的问题。如果你不知道答案，就说你不知道，不要试图编造答案。\n\n{context}\n\n问题：{question}\n有用的回答：

你可以再加一句话，给ChatGPT更明确的指示

请使用JSON格式化回答，形式为 { "answer": "{your_answer}", "relevant_quotes": ["引用列表"] }。将your_answer替换为问题的答案，同时在列表中包括来自来源材料的相关引用。

你可能需要稍微调整一下，让ChatGPT的回应更好。然后你应该能解析它。

ChatGPT在API中有3种消息类型

User - 用户发给模型的消息
model - 模型发给用户的消息
system - 提示工程师发给模型的消息，以添加指示。Lang chain不使用这个，因为它是一次性提示

我强烈推荐这些课程关于ChatGPT，因为它们来自Andrew Ng，质量很高。

英文:

ChatGPT is very flexible, and the more explicit you are better results you can get. This link show the docs for the function you are using. there is a parameter for langchain.prompts.BasePromptTemplate that allows you to give ChatGPT more explicit instructions.

It looks like the base prompt template is this

>Use the following knowledge triplets to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\n{context}\n\nQuestion: {question}\nHelpful Answer:

You can add in another sentence giving ChatGPT more clear instructions

> Please format the answer with JSON of the form { "answer": "{your_answer}", "relevant_quotes": ["list of quotes"] }. Substitutde your_answer as the answer to the question, but also include relevant quotes from the source material in the list.

You may need to tweak it a little bit to get ChatGPT responding well. Then you should be able to parse it.

ChatGPT has 3 message types in the API

User - a message from an end user to the model
model - a message from the model to the end user
system - a message from the prompt engineer to model to add instructions. Lang chain doesn't use this since it's a one-shot prompt

I strongly recommend these courses on ChatGPT since they are from Andrew Ng and very high quality.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

如何通过Langchain获得更详细的结果来源。

问题

答案1

返回一个列表的列表 Pytorch

将行类别转换为列，同时保留数据框的其余部分，使用Python。

从 clutch.io 收集数据：在 Colab 上使用 BS4 时出现了一些问题。

Access to fetch at https://api-test-license.onrender.com/licenses'from origin https://license-frontend.onrender.com has been blocked by CORS policy

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论