How to specify a similarity threshold in a LangChain FAISS retriever?

Question

I would like to pass a similarity threshold to the retriever. So far I have only figured out how to pass a k value, which is not what I want. How can I pass a threshold instead?

from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

def get_conversation_chain(vectorstore):
    llm = ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo')
    # the retriever currently takes only k, the number of documents to return
    qa = ConversationalRetrievalChain.from_llm(llm=llm, retriever=vectorstore.as_retriever(search_kwargs={'k': 2}), return_source_documents=True, verbose=True)
    return qa

loader = PyPDFLoader("sample.pdf")
# load the PDF and split it into pages
pages = loader.load_and_split()
# build the FAISS index from the loaded pages
faiss_index = FAISS.from_documents(pages, OpenAIEmbeddings())
# create conversation chain
chat_history = []
qa = get_conversation_chain(faiss_index)
query = "What is a sunflower?"
result = qa({"question": query, "chat_history": chat_history})

Answer 1

Score: 1

This was the answer, from the API docs: search_kwargs={'score_threshold': 0.3}
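
The fragment above goes into the `search_kwargs` argument of `as_retriever`. A minimal sketch of assembling those arguments is shown here; the helper name `threshold_retriever_kwargs` is hypothetical, and the sketch assumes the retriever is also given `search_type="similarity_score_threshold"` and that the threshold is expressed on LangChain's 0-to-1 relevance scale.

```python
from typing import Any, Dict


def threshold_retriever_kwargs(score_threshold: float, k: int = 4) -> Dict[str, Any]:
    """Build keyword arguments intended for vectorstore.as_retriever(...).

    Hypothetical helper: it only assembles and validates a dictionary,
    it does not call LangChain itself.
    """
    # Assumption: relevance scores are normalized to the range 0..1.
    if not 0.0 <= score_threshold <= 1.0:
        raise ValueError("score_threshold should be between 0 and 1")
    if k < 1:
        raise ValueError("k should be at least 1")
    return {
        "search_type": "similarity_score_threshold",
        "search_kwargs": {"score_threshold": score_threshold, "k": k},
    }


kwargs = threshold_retriever_kwargs(0.3, k=2)
# Possible usage with the index from the question (not executed here):
# retriever = faiss_index.as_retriever(**kwargs)
```

The resulting retriever could then replace the `vectorstore.as_retriever(search_kwargs={'k': 2})` call inside `get_conversation_chain`.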

Answer 2

Score: 1

You can use a VectorStoreRetriever as you describe, but with the search_type parameter set as well. Here dbFAISS is the FAISS vector store and top_k is the maximum number of documents to return.

retriever = dbFAISS.as_retriever(search_type="similarity_score_threshold",
                                 search_kwargs={"score_threshold": .5,
                                                "k": top_k})

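To make the interaction between score_threshold and k concrete, here is a conceptual sketch in plain Python. It is not LangChain's or FAISS's actual code: the function names, the toy two-dimensional vectors, and the use of cosine similarity as the score are all assumptions chosen for illustration. The idea it shows is that the k best matches are fetched first, and any of them scoring below the threshold are then dropped.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve_with_threshold(
    query: Sequence[float],
    docs: List[Tuple[str, Sequence[float]]],
    score_threshold: float,
    k: int,
) -> List[Tuple[str, float]]:
    """Return at most k (text, score) pairs with score >= score_threshold, best first."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    top_k = scored[:k]  # step 1: keep the k best matches
    # step 2: drop any of those that fall below the threshold
    return [(text, score) for text, score in top_k if score >= score_threshold]


# Toy "embeddings" (hypothetical data)
docs = [
    ("sunflower", [1.0, 0.0]),
    ("daisy", [0.8, 0.6]),
    ("tractor", [0.0, 1.0]),
]
query = [1.0, 0.0]

hits = retrieve_with_threshold(query, docs, score_threshold=0.5, k=3)
print([text for text, _ in hits])
```

With k=3 all three documents are candidates, but "tractor" is filtered out because its score is below 0.5. Note that a threshold can therefore return fewer than k documents, or none at all.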
huangapple
  • Published on July 20, 2023 at 20:18:22
  • If you repost this article, please keep the link to it: https://go.coder-hub.com/76729793.html