A smart medical chatbot webapp that leverages RAG (Retrieval-Augmented Generation) techniques to deliver accurate, context-aware responses using a custom knowledge base and advanced language models.
This project is a smart medical chatbot web application built using the following technologies:
- LangChain for the Retrieval-Augmented Generation (RAG) pipeline.
- Context-aware embeddings for building a vector database.
- Quantized Llama-2 Large Language Model (LLM) for efficient query answering.
- Web interface with a Python backend (Flask or FastAPI) and an HTML/CSS frontend.
By retrieving relevant documents as context before generation, the solution boosts response accuracy by up to 20% over generation alone.
RAG_GENAI_DocBot/
├── data/ # Contains datasets or documents used to build the knowledge base
├── model/ # Pre-trained and quantized models (Llama-2 or others)
├── research/ # Notebooks or scripts for experiments and testing
├── src/ # Source code for core logic (retrieval, embedding, RAG pipeline)
├── static/ # Static files (CSS, JS, images) for the frontend
├── templates/ # HTML templates for the web interface
├── .gitignore # Files and folders to be ignored by Git
├── LICENSE # License information
├── README.md # Project overview and instructions
├── app.py # Main application (runs the web server)
├── requirements.txt # Project dependencies
├── setup.py # Setup script for packaging (optional for installation)
├── store_index.py # Script to create/store vector embeddings and index
└── template.py # Utility template Python file
- Retrieval-Augmented Generation (RAG): Retrieves relevant documents from a custom knowledge base before generating responses.
- Context-aware Embeddings: Encode queries and documents in a shared vector space so retrieval surfaces passages specific to the user's query (see the sketch after this list).
- Quantized Llama-2 LLM: Efficient, lightweight model serving for quick responses.
- Web Application Interface: User-friendly interface for entering queries and viewing answers.
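To make the embedding-based retrieval idea concrete, here is a minimal sketch of embedding a query and ranking documents by cosine similarity. It assumes the sentence-transformers package; the model name and the sample texts are illustrative assumptions, not values taken from this repo.

```python
# Minimal embedding-similarity sketch (assumes sentence-transformers is
# installed; the model name below is an assumption, not the repo's choice).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

docs = [
    "Aspirin is commonly used to reduce fever and relieve mild pain.",
    "Ibuprofen is a nonsteroidal anti-inflammatory drug (NSAID).",
]
query = "What can I take for a fever?"

# Encode documents and query into the same vector space.
doc_vecs = model.encode(docs, convert_to_tensor=True)
query_vec = model.encode(query, convert_to_tensor=True)

# Cosine similarity ranks documents by relevance to the query.
scores = util.cos_sim(query_vec, doc_vecs)[0]
best = int(scores.argmax())
print(f"Best match ({float(scores[best]):.2f}): {docs[best]}")
```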
- Python 3.8+
- Git
- Virtual Environment (optional but recommended)
git clone https://github.com/Aniketkumar121/RAG_GENAI_DocBot.git
cd RAG_GENAI_DocBot
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
python store_index.py
This generates embeddings for the documents in data/ and stores them in the Pinecone vector database.
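For reference, a script like store_index.py typically follows the pattern sketched below. This is a sketch only, assuming the classic langchain and pinecone-client v2 APIs; the index name, chunk sizes, embedding model, and environment-variable names are assumptions rather than values read from this repo.

```python
# Sketch of an indexing script (classic langchain + pinecone-client v2 APIs
# assumed; index name, chunk sizes, and env var names are assumptions).
import os

import pinecone
from langchain.document_loaders import PyPDFDirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Pinecone

# Load the knowledge-base documents from data/ and split them into chunks.
docs = PyPDFDirectoryLoader("data/").load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=500, chunk_overlap=50
).split_documents(docs)

# Embed each chunk (the embedding model choice is an assumption).
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# Connect to Pinecone and upsert the embedded chunks into the index.
pinecone.init(
    api_key=os.environ["PINECONE_API_KEY"],
    environment=os.environ["PINECONE_API_ENV"],
)
Pinecone.from_documents(chunks, embeddings, index_name="medical-chatbot")
```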
python app.py
This will start a local web server.
Open your browser and visit:
http://127.0.0.1:5000
to interact with the chatbot UI.
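For orientation, app.py in a setup like this is often a small Flask app along the lines of the sketch below; the /get route, the chat.html template, and the answer_query helper are hypothetical placeholders, not this repo's actual interface.

```python
# Minimal Flask sketch; the /get route, chat.html template, and
# answer_query() helper are hypothetical placeholders for the real app.
from flask import Flask, render_template, request

app = Flask(__name__)


def answer_query(question: str) -> str:
    # Placeholder: the real app would call the RAG chain here
    # (retrieve from the vector database, then generate with Llama-2).
    return f"Echo: {question}"


@app.route("/")
def index():
    return render_template("chat.html")


@app.route("/get", methods=["POST"])
def chat():
    return answer_query(request.form["msg"])


if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000, debug=True)
```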
- The user enters a query via the web interface.
- The query is embedded and compared against the pre-stored document embeddings in the vector database.
- The top-k most relevant documents are retrieved.
- The LLM (Llama-2) takes the retrieved documents as context and generates an accurate, grounded response.
- The response is displayed back to the user (the sketch below traces this flow in code).
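The steps above map onto a retrieval-QA chain roughly as sketched here. This assumes the classic langchain API, a Pinecone index, and a GGML-quantized Llama-2 served through ctransformers; the model filename, index name, and parameter values are assumptions.

```python
# Sketch of the query-time flow (classic langchain API assumed; the model
# filename, index name, and k value below are assumptions).
import os

import pinecone
from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import CTransformers
from langchain.vectorstores import Pinecone

pinecone.init(
    api_key=os.environ["PINECONE_API_KEY"],
    environment=os.environ["PINECONE_API_ENV"],
)
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
store = Pinecone.from_existing_index("medical-chatbot", embeddings)

# Quantized Llama-2 loaded via ctransformers (filename is an assumption).
llm = CTransformers(
    model="model/llama-2-7b-chat.ggmlv3.q4_0.bin",
    model_type="llama",
    config={"max_new_tokens": 512, "temperature": 0.5},
)

# Retrieve the top-k chunks, then answer with them stuffed into the prompt.
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=store.as_retriever(search_kwargs={"k": 3}),
)
print(qa.run("What are common symptoms of anemia?"))
```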
- LangChain
- Pinecone (vector database)
- Quantized Llama-2 LLM
- Flask (or FastAPI) for serving the web app
- HTML + CSS for the frontend
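For this stack, requirements.txt would typically list packages along these lines; the exact set and any version pins in this repo may differ, so treat this as an assumption-laden sample:

```text
flask
langchain
pinecone-client
sentence-transformers
ctransformers
pypdf
python-dotenv
```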
- Add user authentication.
- Track chat history.
- Improve vector store management.
- Deploy to the cloud (AWS, GCP, Azure, etc.).