
ERROR: The prompt size exceeds the context window size and cannot be processed

Question


I have been trying to create a document QA chatbot using GPT4ALL as the LLM and Hugging Face's instructor-large model for embeddings. I was able to create the index, but I get the following as a response. It's not really an error, since there is no traceback; it just shows me the following:

    ERROR: The prompt size exceeds the context window size and cannot be processed.ERROR: The prompt size exceeds the context window size and cannot be processed

This is a follow-up to the following parent question (which was resolved). Here is my code:

    from llama_index import VectorStoreIndex, SimpleDirectoryReader
    from InstructorEmbedding import INSTRUCTOR
    from llama_index import PromptHelper, ServiceContext
    from llama_index import LangchainEmbedding
    from langchain.chat_models import ChatOpenAI
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.llms import OpenLLM
    # from langchain.chat_models.human import HumanInputChatModel
    from langchain import PromptTemplate, LLMChain
    from langchain.llms import GPT4All
    from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

    documents = SimpleDirectoryReader(r'C:\Users\avish.wagde\Documents\work_avish\LLM_trials\instructor_large').load_data()
    print('document loaded in memory.......')

    model_id = 'hkunlp/instructor-large'
    # raw string so the backslashes in the path are not read as escape sequences
    model_path = r"..\models\GPT4All-13B-snoozy.ggmlv3.q4_0.bin"

    callbacks = [StreamingStdOutCallbackHandler()]
    # Verbose is required to pass to the callback manager
    llm = GPT4All(model=model_path, callbacks=callbacks, verbose=True)
    print('llm model ready.............')

    embed_model = LangchainEmbedding(HuggingFaceEmbeddings(model_name=model_id))
    print('embedding model ready.............')

    # define prompt helper
    # set maximum input size
    max_input_size = 4096
    # set number of output tokens
    num_output = 256
    # set maximum chunk overlap
    max_chunk_overlap = 0.2
    prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)

    service_context = ServiceContext.from_defaults(chunk_size=1024, llm=llm, prompt_helper=prompt_helper, embed_model=embed_model)
    print('service context set...........')

    index = VectorStoreIndex.from_documents(documents, service_context=service_context)
    print('indexing done................')

    query_engine = index.as_query_engine()
    print('query set...........')
    response = query_engine.query("What is apple's financial situation")
    print(response)


I checked on GitHub; many people have raised this issue, but I couldn't find anything that resolves it.
The GitHub link for the query

Answer 1

Score: 0


GPT4All seems to have a maximum input size (context window) of 2048 tokens(?), but you are setting max_input_size to 4096.

(I can't totally confirm this size; it's based on comments I found via Google: https://github.com/nomic-ai/gpt4all/issues/178)
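
If you want to sanity-check the prompt length yourself, you can count tokens before querying. A minimal sketch using tiktoken as a rough approximation (GPT4All-13B-snoozy is a LLaMA-family model with a SentencePiece tokenizer, not OpenAI's BPE, so the count is only an estimate; prompt_text is a stand-in for the fully assembled prompt):

    # Rough sanity check (approximation only): LLaMA-family models use a
    # SentencePiece tokenizer, not OpenAI's BPE, so this count is an estimate.
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    prompt_text = "..."  # stand-in: the assembled prompt (retrieved chunks + query + template)
    n_tokens = len(enc.encode(prompt_text))
    # the prompt tokens plus the reserved num_output must fit in the context window
    print(n_tokens, "tokens; should stay under", 2048 - 256)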

You can re-adjust your chunk_size and max_input_size to account for this:

    # define prompt helper
    # set maximum input size
    max_input_size = 2048
    # set number of output tokens
    num_output = 256
    # set maximum chunk overlap
    max_chunk_overlap = 0.2
    prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
    service_context = ServiceContext.from_defaults(chunk_size=512, llm=llm, prompt_helper=prompt_helper, embed_model=embed_model)
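
If the prompt still overflows after shrinking the chunks, retrieving fewer chunks per query also helps. A minimal sketch (assuming the same legacy LlamaIndex API as in the question; similarity_top_k controls how many retrieved chunks get packed into the prompt):

    # Retrieve fewer chunks per query so the packed prompt stays
    # within the model's 2048-token context window.
    query_engine = index.as_query_engine(similarity_top_k=1)
    response = query_engine.query("What is apple's financial situation")
    print(response)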
