
ERROR: The prompt size exceeds the context window size and cannot be processed

Question


I have been trying to create a document QA chatbot using GPT4ALL as the LLM and Hugging Face's instructor-large model for embeddings. I was able to create the index, but I get the following as a response. It's not really an error, since there is no traceback; it just shows me the following:

    ERROR: The prompt size exceeds the context window size and cannot be processed.ERROR: The prompt size exceeds the context window size and cannot be processed

This is a follow-up to the following parent question (which was resolved). Here is my code:

    from llama_index import VectorStoreIndex, SimpleDirectoryReader
    from InstructorEmbedding import INSTRUCTOR
    from llama_index import PromptHelper, ServiceContext
    from llama_index import LangchainEmbedding
    from langchain.chat_models import ChatOpenAI
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.llms import OpenLLM
    # from langchain.chat_models.human import HumanInputChatModel
    from langchain import PromptTemplate, LLMChain
    from langchain.llms import GPT4All
    from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

    documents = SimpleDirectoryReader(r'C:\Users\avish.wagde\Documents\work_avish\LLM_trials\instructor_large').load_data()
    print('document loaded in memory.......')

    model_id = 'hkunlp/instructor-large'
    # raw string so the backslashes in the path are not read as escape sequences
    model_path = r"..\models\GPT4All-13B-snoozy.ggmlv3.q4_0.bin"

    callbacks = [StreamingStdOutCallbackHandler()]
    # Verbose is required to pass to the callback manager
    llm = GPT4All(model=model_path, callbacks=callbacks, verbose=True)
    print('llm model ready.............')

    embed_model = LangchainEmbedding(HuggingFaceEmbeddings(model_name=model_id))
    print('embedding model ready.............')

    # define prompt helper
    # set maximum input size
    max_input_size = 4096
    # set number of output tokens
    num_output = 256
    # set maximum chunk overlap
    max_chunk_overlap = 0.2
    prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

    service_context = ServiceContext.from_defaults(chunk_size=1024, llm=llm, prompt_helper=prompt_helper, embed_model=embed_model)
    print('service context set...........')

    index = VectorStoreIndex.from_documents(documents, service_context=service_context)
    print('indexing done................')

    query_engine = index.as_query_engine()
    print('query set...........')
    response = query_engine.query("What is apple's financial situation")
    print(response)


I checked on GitHub; many people have raised this issue, but I couldn't find anything that resolves it.
The GitHub link for the query

Answer 1

Score: 0


GPT4All seems to have a maximum input size (context window) of 2048 tokens(?), but you are setting max_input_size to 4096.

(I can't totally confirm this size; it's based on comments I found via Google: https://github.com/nomic-ai/gpt4all/issues/178)
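
If you want to sanity-check the prompt length yourself, you can count tokens before querying. A minimal sketch using tiktoken as a rough approximation (GPT4All-13B-snoozy is a LLaMA-family model with a SentencePiece tokenizer, not OpenAI's BPE, so the count is only an estimate; prompt_text is a stand-in for the fully assembled prompt):

    # Rough sanity check (approximation only): LLaMA-family models use a
    # SentencePiece tokenizer, not OpenAI's BPE, so this count is an estimate.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    prompt_text = "..."  # stand-in: the assembled prompt (retrieved chunks + query + template)
    n_tokens = len(enc.encode(prompt_text))
    # the prompt tokens plus the reserved num_output must fit in the context window
    print(n_tokens, "tokens; should stay under", 2048 - 256)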

You can re-adjust your chunk_size and max_input_size to account for this:

    # define prompt helper
    # set maximum input size
    max_input_size = 2048
    # set number of output tokens
    num_output = 256
    # set maximum chunk overlap
    max_chunk_overlap = 0.2
    prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
    service_context = ServiceContext.from_defaults(chunk_size=512, llm=llm, prompt_helper=prompt_helper, embed_model=embed_model)
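
If the prompt still overflows after shrinking the chunks, retrieving fewer chunks per query also helps. A minimal sketch (assuming the same legacy LlamaIndex API as in the question; similarity_top_k controls how many retrieved chunks get packed into the prompt):

    # Retrieve fewer chunks per query so the packed prompt stays
    # within the model's 2048-token context window.
    query_engine = index.as_query_engine(similarity_top_k=1)
    response = query_engine.query("What is apple's financial situation")
    print(response)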
