How can I optimize my OpenAI-based chatbot's NLP processing time in Python?


Question

I ran the following code, which is based on Langchain and Chroma, and it is supposed to function as a client-facing chatbot in a production environment, so I need it to run in one or two seconds. Instead, this portion alone takes about half a minute, which is far too slow for my use case. If you have any advice on making it quicker, I would really appreciate it. Thank you for your valuable time and assistance!

def generate_chain():
    # Build a QA-with-sources chain that uses OpenAI with the map_reduce strategy
    chain = load_qa_with_sources_chain(
        OpenAI(temperature=0, openai_api_key=openai.api_key),
        chain_type="map_reduce"
    )
    return chain

def ask_docs(relevant_documents, query):
    chain = generate_chain()  # a fresh chain is built on every call
    sourced_answer_obj = chain(
        {"input_documents": [relevant_document[0] for relevant_document in relevant_documents],
         "question": query}, return_only_outputs=True)
    sourced_answer_str = sourced_answer_obj['output_text'].strip()
    return sourced_answer_str

I tried the code above expecting it to take about a second or less, but it ended up taking half a minute.


Answer 1

Score: 3

There are a few potential optimizations that can be made to improve the performance of your code. Here are some suggestions:

Avoid regenerating the chain every time the ask_docs function is called:
Currently, the generate_chain function is called each time ask_docs is invoked. Generating the chain involves loading the QA model and its associated resources, which can be a time-consuming operation. To improve performance, you can generate the chain once and reuse it for subsequent queries.

For example:

# Define the chain once, outside the ask_docs function
chain = generate_chain()

# Call ask_docs multiple times, reusing the same chain
answer1 = ask_docs(relevant_documents1, query1)
answer2 = ask_docs(relevant_documents2, query2)

Load the OpenAI API key outside the generate_chain function:
In generate_chain, the OpenAI API key is loaded on every call. This can be optimized by loading the API key once outside the function and passing it in as an argument, so it is reused rather than re-read each time, as in the sketch below.
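A minimal sketch of that change, using the same LangChain calls as above (the api_key parameter name is just illustrative):

# Read the API key once at module load time
openai_api_key = openai.api_key

def generate_chain(api_key):
    # The key arrives as an argument instead of being looked up on each call
    return load_qa_with_sources_chain(
        OpenAI(temperature=0, openai_api_key=api_key),
        chain_type="map_reduce"
    )

chain = generate_chain(openai_api_key)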

Batch the input documents:
Instead of passing each relevant document to the chain individually, consider batching the documents together. Batching can help reduce the number of API calls and potentially improve performance. Modify the ask_docs function to accept a list of relevant documents rather than a single document, and pass the batched documents to the chain in one call, as sketched below.
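Concretely, the difference looks something like this (a sketch; the commented-out lines show the per-document pattern to avoid):

# Per-document (slow): one chain call, and one round of API calls, per document
# answers = [chain({"input_documents": [doc[0]], "question": query},
#                  return_only_outputs=True) for doc in relevant_documents]

# Batched (preferred): a single chain call covering all relevant documents
docs = [doc[0] for doc in relevant_documents]  # take the document object from each item
sourced_answer_obj = chain({"input_documents": docs, "question": query},
                           return_only_outputs=True)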

Here's an updated version of your code with these optimizations applied:

# Load the OpenAI API key once, outside the functions
openai_api_key = openai.api_key

def generate_chain():
    chain = load_qa_with_sources_chain(
        OpenAI(temperature=0, openai_api_key=openai_api_key),
        chain_type="map_reduce"
    )
    return chain

# Generate the chain once, outside the function
chain = generate_chain()

def ask_docs(relevant_documents, query):
    sourced_answer_obj = chain(
        {"input_documents": [doc[0] for doc in relevant_documents],
         "question": query}, return_only_outputs=True)
    sourced_answer_str = sourced_answer_obj['output_text'].strip()
    return sourced_answer_str

By applying these optimizations, you should observe improved performance when running your code. Remember to adapt the changes to your specific use case and verify the results accordingly.
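One way to verify the improvement is to time a couple of consecutive queries (a minimal sketch; relevant_documents and query are assumed to be defined elsewhere):

import time

for label in ("first call", "second call"):
    start = time.perf_counter()
    ask_docs(relevant_documents, query)
    # With the chain reused, both calls should take roughly the same time
    print(f"{label}: {time.perf_counter() - start:.1f}s")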

