How can we add a list of documents to an existing index in llama-index?


Question

I have an existing index created with GPTVectorStoreIndex. However, when I try to add new documents to the existing index using the insert method, I get the following error:

AttributeError: 'list' object has no attribute 'get_text'

My code for updating the index is as follows:

max_input_size = 4096
num_outputs = 5000
max_chunk_overlap = 256
chunk_size_limit = 3900
prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap, chunk_size_limit=chunk_size_limit)
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="gpt-3.5-turbo", max_tokens=num_outputs))
    
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)

directory_path = "./trial_docs"
file_metadata = lambda x: {"filename": x}
reader = SimpleDirectoryReader(directory_path, file_metadata=file_metadata)
    
documents = reader.load_data()
print(type(documents))
index.insert(document=documents, service_context=service_context)

Answer 1

Score: 2

I figured it out: my mistake was passing the documents as a whole, which is a list object. The correct way to update the index is as follows:

max_input_size = 4096
num_outputs = 5000
max_chunk_overlap = 256
chunk_size_limit = 3900
prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap, chunk_size_limit=chunk_size_limit)
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="gpt-3.5-turbo", max_tokens=num_outputs))

service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)

directory_path = "./trial_docs"
file_metadata = lambda x: {"filename": x}
reader = SimpleDirectoryReader(directory_path, file_metadata=file_metadata)

documents = reader.load_data()
print(type(documents))  # <class 'list'>

# insert expects a single document, so add them one at a time
for d in documents:
    index.insert(document=d, service_context=service_context)
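To see why the original call failed, here is a minimal, library-free sketch of the failure mode. The Doc class below is a toy stand-in for a llama-index Document, not the real class:

```python
class Doc:
    """Toy stand-in for a llama-index Document (not the real class)."""
    def __init__(self, text):
        self.text = text

    def get_text(self):
        return self.text


documents = [Doc("alpha"), Doc("beta")]  # load_data() returns a plain list

# Passing the list itself reproduces the error from the question:
try:
    documents.get_text()
except AttributeError as e:
    print(e)  # 'list' object has no attribute 'get_text'

# Handling one document at a time works:
texts = [d.get_text() for d in documents]
print(texts)  # ['alpha', 'beta']
```

The index only knows how to process a single document object, so any code path that calls a Document method (such as get_text) on the bare list raises the AttributeError above.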
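A small convenience wrapper (a hypothetical helper, not part of the llama-index API) makes the per-document loop reusable and accepts either a single document or a list:

```python
def insert_all(index, docs, **kwargs):
    """Insert one document or a list of documents into `index`, one at a time.

    `index` is any object exposing insert(document=..., ...), e.g. a
    GPTVectorStoreIndex; extra keyword arguments such as service_context
    are forwarded to every insert call. Returns the number of inserts.
    """
    if not isinstance(docs, (list, tuple)):
        docs = [docs]
    for d in docs:
        index.insert(document=d, **kwargs)
    return len(docs)
```

With the objects from the answer, the loop then becomes a single call: insert_all(index, documents, service_context=service_context).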

huangapple
  • Posted on 2023-05-22 16:36:27
  • Please keep the original link when reposting: https://go.coder-hub.com/76304374.html