Checked other resources
I added a very descriptive title to this question.
I searched the LangChain documentation with the integrated search.
I used the GitHub search to find a similar question and didn't find it.
Commit to Help
I commit to help with one of those options 👆
Example Code
import uuid
from langchain.retrievers.multi_vector import MultiVectorRetriever
from langchain.storage import InMemoryStore, LocalFileStore
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_google_vertexai import ChatVertexAI, VertexAIEmbeddings

def create_multi_vector_retriever(store, vectorstore, id_key, text_summaries=None, texts=None):
    """Create a multi-vector retriever that indexes summaries and stores the raw contents.

    Args:
        store: Docstore that holds the raw documents (texts, images, etc.).
        vectorstore: Vector store that indexes the summaries.
        id_key: Metadata key linking each summary to its raw document.
        text_summaries: List of text summaries for new documents (optional).
        texts: List of corresponding texts/images/etc. for new documents (optional).

    Returns:
        The multi-vector retriever instance.
    """
    # Create the multi-vector retriever
    retriever = MultiVectorRetriever(
        vectorstore=vectorstore,
        docstore=store,
        id_key=id_key,
        search_kwargs={"k": 5},
    )

    def add_documents(retriever, doc_summaries, doc_contents):
        doc_ids = [str(uuid.uuid4()) for _ in doc_contents]
        summary_docs = [
            Document(page_content=s, metadata={id_key: doc_ids[i]})
            for i, s in enumerate(doc_summaries)
        ]
        retriever.vectorstore.add_documents(summary_docs)
        retriever.docstore.mset(list(zip(doc_ids, doc_contents)))
        print("added")

    # Filter out empty text summaries
    non_empty_text_summaries = [summary for summary in text_summaries if summary.strip()]

    # Add texts, tables, and images if summaries are not empty
    if non_empty_text_summaries:
        add_documents(retriever, non_empty_text_summaries, texts)

    return retriever

vectorstore = Chroma(
    collection_name="second",
    persist_directory="/content/embeddings12",
    embedding_function=VertexAIEmbeddings(model_name="textembedding-gecko@latest"),
)
docstore = InMemoryStore()
id_key = "doc_id"

# Create retriever
retriever_multi_vector_img = create_multi_vector_retriever(
    docstore,
    vectorstore,
    id_key,
    doc_summaries,        # descriptions of the images
    doc_img_base64_list,  # the actual images
)

def multi_modal_rag_chain(retriever):
    """Multi-modal RAG chain."""
    # Multi-modal LLM
    model = ChatVertexAI(
        temperature=0, model_name="gemini-pro-vision", max_output_tokens=2048, safety_settings=safety_settings
    )

    # RAG pipeline
    chain = (
        {
            "context": retriever | RunnableLambda(split_image_text_types),
            "question": RunnablePassthrough(),
        }
        | RunnableLambda(img_prompt_func)
        | model
        | StrOutputParser()
    )
    return chain

# Create RAG chain
chain_multimodal_rag = multi_modal_rag_chain(retriever_multi_vector_img)
Description
I have been using LangChain's MultiVectorRetriever to store images and their descriptions in my embeddings DB: I embed the descriptions and store the images as-is for retrieval. The documents are keyed by doc_id and stored in an InMemoryStore, but I would like this store to live in a persistent directory (or in a variable I can save to disk).
I have tried using LocalFileStore, but it works with bytes-like objects while the docstore needs values in str format (per the documentation), so that approach threw a TypeError.
Is there any way to implement this functionality? Please help me; I am just a beginner with LangChain and LLMs.
Above is my code for the retriever and the LLM chain.
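Something like the following is what I had in mind (an untested sketch, assuming EncoderBackedStore takes a store, a key encoder, a value serializer, and a value deserializer; the /content/docstore path is just a placeholder): wrap LocalFileStore, which persists bytes on disk, in an EncoderBackedStore from langchain.storage that converts my base64 strings to bytes on write and back to str on read.

# Untested sketch: LocalFileStore persists bytes, so wrap it so the retriever
# can keep reading/writing plain strings (base64-encoded images in my case).
from langchain.storage import EncoderBackedStore, LocalFileStore

byte_store = LocalFileStore("/content/docstore")  # placeholder path

persistent_docstore = EncoderBackedStore(
    byte_store,
    lambda key: key,                        # doc_id keys are already strings
    lambda value: value.encode("utf-8"),    # str -> bytes on write
    lambda data: data.decode("utf-8"),      # bytes -> str on read
)

# Used in place of InMemoryStore when building the retriever above
retriever_multi_vector_img = create_multi_vector_retriever(
    persistent_docstore,
    vectorstore,
    id_key,
    doc_summaries,
    doc_img_base64_list,
)

Would something along these lines work, or is there a recommended way to persist the docstore?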
Thanks!
System Info
Running the code on Google Colab.