How to pass LangChain Vector Store plus entire documents to OpenAI?

Question

I have some code copied pretty much verbatim from the LangChain docs.

The code creates a vector store from a list of .txt documents. My assumption is that the code that follows finds what it needs in the store relative to the question and uses only the selected data in the API call, instead of passing hundreds of lines of .txt into the payload.

This is useful because I have many documents I need the AI to reference, but there is one document in particular that specifies some specific tasks I would like the AI to perform. I need to pass this full document, along with whatever comes back from the store, in the request to OpenAI. Does anyone know if LangChain supports this kind of task? I'm very new to this, so I'm looking for some guidance.

    const docs = await textSplitter.createDocuments(txtFiles)

    // Create a vector store from the documents.
    const vectorStore = await HNSWLib.fromDocuments(docs, embeddings)

    // Create a chain that uses the OpenAI LLM and HNSWLib vector store.
    const chain = RetrievalQAChain.fromLLM(model, vectorStore.asRetriever())

    // Call the chain with the prompt.
    const chatGptRes = await chain.call({
      query: prompt,
    })

Answer 1

Score: 1

This is how the documentation suggests doing it; hope this helps.

The LangChain Discord is pretty good too, well worth a look.

One example might be to use the DirectoryLoader to load all the .json files in a directory, for example ./docs:

    import { OpenAI } from "langchain/llms/openai";
    import { OpenAIEmbeddings } from "langchain/embeddings/openai";
    import { HNSWLib } from "langchain/vectorstores/hnswlib";
    import { RetrievalQAChain } from "langchain/chains";
    import { DirectoryLoader } from "langchain/document_loaders/fs/directory";
    import { JSONLoader } from "langchain/document_loaders/fs/json";

    // Add your OpenAI instance here
    const llm = new OpenAI();

    // Get the query from the request
    const userPrompt = "how big is the universe";

    // The directory holding your docs: ./docs
    const directoryLoader = new DirectoryLoader("docs", {
      ".json": (path) => new JSONLoader(path),
    });

    // Load the documents into the docs variable
    const docs = await directoryLoader.load();

    // Load the vector store from the directory
    const directory = "./vectorstore";
    const loadedVectorStore = await HNSWLib.load(
      directory,
      new OpenAIEmbeddings()
    );

    const chain = RetrievalQAChain.fromLLM(
      llm,
      loadedVectorStore.asRetriever()
    );

    const res = await chain.call({
      input_documents: docs,
      query: userPrompt,
    });
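
Note that RetrievalQAChain pulls its context from the retriever, so if you want to guarantee that the one "tasks" document is always sent along with whatever the retriever returns, a minimal sketch would be to fetch the relevant chunks yourself and stuff them, together with the full document, into a single QA chain. This builds on the variables above; docs/tasks.txt, alwaysIncludedDocs, retrievedDocs, and qaChain are hypothetical names, and it isn't tested against your setup:

    import { loadQAStuffChain } from "langchain/chains";
    import { TextLoader } from "langchain/document_loaders/fs/text";

    // Load the one document that must always be included
    // ("docs/tasks.txt" is a placeholder path).
    const alwaysIncludedDocs = await new TextLoader("docs/tasks.txt").load();

    // Let the retriever pick the chunks relevant to the question.
    const retrievedDocs = await loadedVectorStore
      .asRetriever()
      .getRelevantDocuments(userPrompt);

    // Stuff the fixed document plus the retrieved chunks into one prompt.
    const qaChain = loadQAStuffChain(llm);
    const fullRes = await qaChain.call({
      input_documents: [...alwaysIncludedDocs, ...retrievedDocs],
      question: userPrompt,
    });

Keep in mind that stuffing extra documents increases the token count of each request, so the always-included file should stay reasonably short.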


