
Filtering retrieval with ConversationalRetrievalChain #7474

@jorrgme


Hi everyone,

I'm trying to do something with LangChain and I haven't found enough information online to make it work properly. Here it is:

I want to build a QA chat that uses Markdown documents as its knowledge source. Only the documents that belong to the documentation version the user picks from a select box should be treated as relevant. To achieve that:

  1. I've built a FAISS vector store from documents located in two different folders, representing the documentation's versions. The folder structure looks like this:
.
├── 4.14.2
│   ├── folder1
│   │   └── file1.md
│   ├── folder2
│   │   └── file2.md
└── 4.18.1
    ├── folder1
    │   └── file3.md
    └── folder2
        └── file4.md
  2. Each document's metadata looks something like this: {'source': 'app/docs-versions/4.14.2/folder1/file1.md'}
  3. With all this, I'm using a ConversationalRetrievalChain to retrieve information from the vector store and an LLM to answer the questions entered at the prompt:
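import streamlit as st  # assumed: `st` is Streamlit, given st.session_state
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# `llm`, `store` (the FAISS vector store) and `version` (the select-box value)
# are defined elsewhere in the app.
memory = st.session_state.memory = ConversationBufferMemory(
    memory_key="chat_history", return_messages=True, output_key="answer"
)

# Intended filter: keep only documents under the selected version's folder.
source_filter = f'app/docs-versions/{version}/'
chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=store.as_retriever(
        search_kwargs={'filter': {'source': source_filter}}
    ),
    memory=memory,
    verbose=False,
    return_source_documents=True,
)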

In short, my goal is to restrict retrieval to the documents contained in a given directory, which represents the documentation version.

Does anyone know how I can achieve this? The approach I've tried doesn't work as intended: the retrieved documents come from both folders.
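
One possible direction, offered as a sketch and not a confirmed fix: if the store's metadata filter compares values for equality (an assumption about the installed LangChain version), a folder prefix such as 'app/docs-versions/4.14.2/' will never equal a full 'source' path. Deriving a separate version field from the path would allow an exact match. The helper names and the 'version' metadata key below are hypothetical; only the path layout comes from the metadata example above.

```python
from pathlib import PurePosixPath
from typing import Optional

# Root folder taken from the metadata example in this post.
DOCS_ROOT = PurePosixPath("app/docs-versions")


def version_from_source(source: str, root: PurePosixPath = DOCS_ROOT) -> Optional[str]:
    """Return the first folder name under `root`, or None if `source` is not inside a version folder."""
    try:
        relative = PurePosixPath(source).relative_to(root)
    except ValueError:
        # `source` does not live under `root` at all.
        return None
    # Require at least "<version>/<something>" so a file sitting directly in root is ignored.
    return relative.parts[0] if len(relative.parts) > 1 else None


def with_version(metadata: dict) -> dict:
    """Return a copy of `metadata` with a hypothetical 'version' key derived from 'source'."""
    tagged = dict(metadata)
    tagged["version"] = version_from_source(metadata.get("source", ""))
    return tagged


def matches_version(metadata: dict, version: str) -> bool:
    """Predicate form of the same check: does this document belong to `version`?"""
    return version_from_source(metadata.get("source", "")) == version
```

If each document's metadata were passed through with_version before the FAISS store is built, the retriever could then be given search_kwargs={'filter': {'version': version}} in place of the 'source' prefix. Whether the filter is honoured at all, and whether a predicate such as matches_version can be passed directly, depends on the LangChain version in use, so both are worth checking against its documentation.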
