Closed
Hi everyone,
I'm trying to do something and I haven't found enough information online to make it work properly with LangChain. Here it is:
I want to build a QA chat that uses Markdown documents as its knowledge source, restricting the relevant documents to those belonging to a specific documentation version that the user chooses from a select box. To achieve that:
- I've built a FAISS vector store from documents located in two different folders, representing the documentation's versions. The folder structure looks like this:
.
├── 4.14.2
│   ├── folder1
│   │   └── file1.md
│   ├── folder2
│   │   └── file2.md
└── 4.18.1
    ├── folder1
    │   └── file3.md
    └── folder2
        └── file4.md
- Each document's metadata looks something like this:
{'source': 'app/docs-versions/4.14.2/folder1/file1.md'}
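- With all this, I'm using a ConversationalRetrievalChain to retrieve information from the vector store and an LLM to answer the questions entered in the prompt:
import streamlit as st
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# llm, store (the FAISS vector store) and version (the select-box value) are defined elsewhere
memory = st.session_state.memory = ConversationBufferMemory(
    memory_key="chat_history", return_messages=True, output_key="answer"
)
# Intended as a directory prefix for the selected version
source_filter = f'app/docs-versions/{version}/'
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=store.as_retriever(
        search_kwargs={'filter': {'source': source_filter}}
    ),
    memory=memory,
    verbose=False,
    return_source_documents=True,
)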
In short, my goal is to restrict the retrieved documents to those contained in a given directory, which represents the documentation version.
Does anyone know how I can achieve this? The approach I've tried doesn't seem to work: the retrieved documents come from both folders.
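One direction that might work, sketched here under an assumption I have not confirmed: that the dict passed as 'filter' is compared against each metadata value by equality, not as a path prefix. If so, a directory prefix can never equal a full 'source' path, and storing the version as its own metadata key before building the store would give the filter something it can match exactly. The 'version' key and the helper names below are hypothetical, and matches() only imitates the assumed equality check; it is not the library's code.

```python
from pathlib import PurePosixPath
from typing import Optional

# Root directory taken from the metadata example in this issue.
DOCS_ROOT = PurePosixPath("app/docs-versions")


def version_from_source(source: str) -> Optional[str]:
    """Return the first directory under DOCS_ROOT (e.g. '4.14.2'), or None."""
    try:
        relative = PurePosixPath(source).relative_to(DOCS_ROOT)
    except ValueError:
        # The path is not located under DOCS_ROOT.
        return None
    # Require at least '<version>/<something>' below the root.
    return relative.parts[0] if len(relative.parts) > 1 else None


def tag_version(metadata: dict) -> dict:
    """Return a copy of the metadata with a hypothetical 'version' key added."""
    tagged = dict(metadata)
    tagged["version"] = version_from_source(metadata.get("source", ""))
    return tagged


def matches(metadata: dict, flt: dict) -> bool:
    """Imitate the ASSUMED behaviour of a dict filter: plain equality per key."""
    return all(metadata.get(key) == value for key, value in flt.items())


meta = {"source": "app/docs-versions/4.14.2/folder1/file1.md"}

# A directory prefix is not equal to the full path, so nothing matches.
print(matches(meta, {"source": "app/docs-versions/4.14.2/"}))  # False

# With an explicit 'version' key, an equality filter can select one version.
print(matches(tag_version(meta), {"version": "4.14.2"}))  # True
print(matches(tag_version(meta), {"version": "4.18.1"}))  # False
```

If this assumption holds, applying tag_version to each document's metadata before the FAISS store is built and then passing search_kwargs={'filter': {'version': version}} to as_retriever might restrict retrieval to the selected version. It may also be worth checking whether the filter is applied only after the initial nearest-neighbour fetch, in which case a larger number of fetched candidates could be needed.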