๐น Live Demo (https://covid-rag.streamlit.app)
๐ฏ Quick Demo: Upload a PDF โ Ask questions โ Get intelligent, source-cited answers in seconds
Two Intelligent Modes in One App:
- ๐ฉบ COVID-19 Assistant: Pre-loaded with NIH treatment guidelines
- ๐ง Custom Document Bot: Upload your own PDFs/docs for instant Q&A
Key Features:
- โก Real-time RAG: Semantic search + GPT responses with conversation memory
- ๐ Smart Citations: Every answer includes source page references
- ๐ง Context Awareness: Remembers conversation history for follow-up questions
- ๐ Multi-format Support: PDF, TXT, Markdown files
- ๐จ Clean UI: Intuitive Streamlit interface with mode switching
- ๐พ Persistent Storage: ChromaDB vector database with session management
๐ Documents โ ๐ Text Splitting โ ๐งฎ Embeddings โ ๐๏ธ ChromaDB
โ
๐ค LLM Response โ ๐ Prompt Engineering โ ๐ Semantic Search
Tech Stack:
- Backend: Python, LangChain, ChromaDB
- Embeddings: HuggingFace (all-MiniLM-L6-v2)
- LLM: OpenAI GPT-3.5-turbo via OpenRouter
- Frontend: Streamlit
- Vector Store: ChromaDB with cosine similarity
git clone https://github.com/fomativeh/RAG-Chatbot.git
cd RAG-Chatbot
pip install -r requirements.txt
Create .env
file:
OPENROUTER_API_KEY=your_openrouter_api_key_here
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
COVID_DOC_URL=https://www.ncbi.nlm.nih.gov/books/NBK570371/pdf/Bookshelf_NBK570371.pdf
streamlit run app.py
Option A - COVID Mode:
- Add PDF files to
data/covid_docs/
directory - Select "COVID-19 Assistant" mode
- Start asking medical guideline questions
Option B - Custom Mode:
- Select "Your Custom Bot" mode
- Upload PDFs via the sidebar
- Ask questions about your documents
- Semantic search using HuggingFace embeddings
- Top-K document retrieval (configurable)
- Metadata preservation for accurate citations
- Maintains context across multiple exchanges
- Configurable conversation history depth
- Smart prompt engineering with context injection
- Recursive text splitting for optimal chunks
- Multi-format support (PDF, TXT, MD)
- Automatic metadata extraction (source, page numbers)
- Persistent ChromaDB storage
- Session-based collections for uploaded files
- Automatic cleanup and error handling
streamlit>=1.28.0
langchain>=0.1.0
langchain-openai>=0.0.5
langchain-community>=0.0.13
chromadb>=0.4.15
sentence-transformers>=2.2.2
PyPDF2>=3.0.1
python-dotenv>=1.0.0
requests>=2.31.0
- Mode Switching: Toggle between COVID and custom document modes
- File Upload: Drag-and-drop interface with progress indicators
- Real-time Status: Loading spinners, success/error messages
- Citation Display: Source references with page numbers
- Conversation Management: Clear chat functionality
- Responsive Design: Clean, professional Streamlit interface
Modify prompts in /prompts/
directory:
covid_assistant_prompt.txt
- COVID mode system promptcustom_bot_prompt.txt
- Custom document mode prompt