A robust Retrieval-Augmented Generation (RAG) system template built with LangChain and ChromaDB.
Suitable for local use, internal enterprise deployments, or as a foundation for building scalable, production-ready RAG applications. While not yet fully production-hardened, it demonstrates enterprise-level best practices in code quality, testing, and extensibility.
Demo: Web UI in Action
- Features
- Requirements
- Installation
- Configuration
- Usage
- Project Structure
- Testing
- Development
- Monitoring
- Deployment
- Contributing
- License
- Acknowledgments
- Support
- Consultation
- Multi-format Document Support: PDF, TXT, DOCX, MD, CSV, XLSX
- Advanced Text Processing: Intelligent chunking with configurable overlap
- Vector Storage: ChromaDB integration with persistent storage
- Embedding Generation: Sentence Transformers models via Hugging Face with customizable model selection
- Dual Interface: Command-line and Streamlit web UI
- Structured Logging: structlog-based logging with correlation IDs
- Type Safety: Comprehensive type hints with MyPy integration
- Testing: Comprehensive test suite with enterprise-grade fixtures
- Code Quality: Black, isort, flake8, autoflake, and bandit security scanning
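The configurable chunk-size-plus-overlap idea can be sketched in a few lines of plain Python; `chunk_text` below is an illustrative stand-in, not the project's actual text-processor API:

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of at most `chunk_size` characters,
    each sharing `overlap` characters with the previous chunk so
    that context is preserved across chunk boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("The quick brown fox jumps over the lazy dog. " * 4)
```

A larger overlap improves retrieval recall at the cost of storing more near-duplicate text; the system exposes both knobs via configuration.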
- Python 3.12+ (required - the project uses Python 3.12 features)
- CUDA-compatible GPU (optional, for faster embedding generation)
# 1. Clone the repository
git clone <repository-url>
cd llm-rag-chroma-demo
# 2. Create virtual environment with Python 3.12
python3.12 -m venv venv312 # For Python 3.12 (required)
# OR
python -m venv venv # Uses your default Python (must be 3.12+)
# 3. Activate the virtual environment you created above
source venv312/bin/activate # On Linux/Mac (if using venv312)
# OR
source venv/bin/activate # On Linux/Mac (if using venv)
# OR
venv312\Scripts\activate # On Windows (if using venv312)
# OR
venv\Scripts\activate # On Windows (if using venv)
# 4. Install dependencies
make install-dev
# 5. Verify installation
make info
# Navigate to your existing project directory
cd llm-rag-chroma-demo
# Activate your existing virtual environment (must be Python 3.12+)
source venv/bin/activate # On Linux/Mac
# OR
venv\Scripts\activate # On Windows
# Install dependencies
make install-dev
The project is designed to work out-of-the-box with sensible defaults. Configuration is managed via environment variables, which you can set up using a .env file.
A template file, .env.default, is provided in the project root. To get started, copy this file to .env:
cp .env.default .env
You can then edit .env to customize your settings as needed.
- The project will run with or without an OpenAI API key.
With an OpenAI API Key:
If you provide an OpenAI API key, the system can combine your enterprise's private or customer documents with powerful LLMs (like OpenAI) to deliver richer, more accurate, and context-aware responses. This enables advanced inference by leveraging both your internal knowledge base and state-of-the-art language models.
Without an OpenAI API Key:
If you do not provide an OpenAI API key, the RAG system will still function as a robust query engine over your embedded document store. In this mode, responses are generated purely from your indexed documents, without LLM-powered augmentation. This is suitable for environments where external API calls are restricted or not desired, but may result in less nuanced or generative answers.
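A minimal sketch of that decision, assuming a helper of this shape (`select_mode` and the mode names are hypothetical, not the project's actual API):

```python
def select_mode(env: dict[str, str]) -> str:
    """Choose the answer-generation mode from the environment
    (hypothetical helper illustrating the two modes above)."""
    if env.get("OPENAI_API_KEY"):
        return "llm-augmented"   # retrieved chunks + LLM synthesis
    return "retrieval-only"      # answers come straight from the indexed documents

print(select_mode({}))  # retrieval-only
```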
To use an OpenAI API key, add it to your .env file:
OPENAI_API_KEY=your-openai-api-key
You can further customize the system by editing other variables in your .env file, such as:
- Logging level
- Supported file types
- Chunk size and overlap
- ChromaDB settings
- Embedding model
All available options are documented in .env.default with comments.
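Reading those variables in code typically follows the pattern below. LOG_LEVEL and CHROMA_PERSIST_DIRECTORY appear elsewhere in this README; CHUNK_SIZE and CHUNK_OVERLAP are illustrative names, and the defaults shown are assumptions:

```python
import os

# Env-driven configuration with fallbacks to sensible defaults.
config = {
    "log_level": os.getenv("LOG_LEVEL", "INFO"),
    "persist_directory": os.getenv("CHROMA_PERSIST_DIRECTORY", "chroma_db"),
    "chunk_size": int(os.getenv("CHUNK_SIZE", "1000")),       # illustrative name/default
    "chunk_overlap": int(os.getenv("CHUNK_OVERLAP", "200")),  # illustrative name/default
}
```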
Summary:
- Copy .env.default to .env and edit as needed.
- An OpenAI API key is optional, but recommended for best inference quality.
- The project works out-of-the-box with default settings.
# Run the demo
make run-demo
# Or start the web interface
make run-ui
The CLI supports the following commands:
# Interactive mode
rag-demo interactive
# Ingest all documents
rag-demo ingest
# Query the system
rag-demo query "What are the HR policies?"
# Get system statistics
rag-demo stats
# Clear the database
rag-demo clear
# Start Streamlit UI
make run-ui
# or
streamlit run rag_web_interface.py
from rag_system import RAGSystem
# Initialize the system
rag = RAGSystem()
# Ingest documents
stats = rag.ingest_documents()
print(f"Processed {stats['documents_stored']} documents")
# Query the system
results = rag.query("What are the vacation policies?")
for doc in results:
    print(f"Source: {doc.metadata['source']}")
    print(f"Content: {doc.page_content[:200]}...")
rag-system/
├── rag_system/                 # Main package
│   ├── core/                   # Core functionality
│   │   ├── config.py           # Configuration management
│   │   └── logging.py          # Structured logging
│   ├── ingestion/              # Document processing
│   │   ├── document_loader.py  # Multi-format document loading
│   │   ├── text_processor.py   # Text chunking and embedding
│   │   └── vector_store.py     # ChromaDB integration
│   ├── ui/                     # User interfaces
│   │   └── streamlit_app.py    # Streamlit web UI
│   ├── cli.py                  # Command-line interface
│   └── rag_system.py           # Main orchestrator
├── tests/                      # Test suite
│   ├── test_core_config.py     # Configuration tests
│   └── test_ingestion.py       # Ingestion component tests
├── data/                       # Document storage
├── rag_demo.py                 # Demo script
├── rag_web_interface.py        # Web interface
├── pyproject.toml              # Project configuration
├── Makefile                    # Development workflow
└── README.md                   # This file
# Run all tests
make test
# Run tests with coverage
make test-cov
# Run tests in watch mode
make test-watch
# Quick test cycle (format + lint + test)
make quick-test
# Run specific test file
pytest tests/test_ingestion.py -v
# Run tests with specific marker
pytest -m "slow" -v
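A marked test might look like the sketch below (the test itself is illustrative, not taken from the project's suite). Custom markers like `slow` are typically registered under `[tool.pytest.ini_options]` so that `pytest -m "slow"` can select them without warnings:

```python
import pytest

@pytest.mark.slow
def test_chunk_overlap_preserved() -> None:
    """Illustrative slow-marked test: adjacent chunks share the overlap."""
    text = "abcdefghij" * 10                 # 100 characters
    chunk_size, overlap = 30, 5
    step = chunk_size - overlap
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), step)]
    assert chunks[0][-overlap:] == chunks[1][:overlap]
```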
This project maintains enterprise-grade code quality with:
- Comprehensive Type Annotations: All functions, methods, and variables have type hints
- Static Type Checking: MyPy integration ensures type safety across the codebase
- Code Formatting: Black and isort ensure consistent code style
- Linting: Flake8 and autoflake maintain code quality
- Security Scanning: Bandit identifies potential security issues
- Testing: Comprehensive test suite with clean fixtures
# Format code (black + isort)
make format
# Run linting (flake8 + autoflake)
make lint
# Type checking (mypy)
make type-check
# Security scanning (bandit)
make security-check
# All quality checks
make check-all
# Clean unused imports and variables
make clean-imports
# Pre-commit validation (runs automatically on staged files)
# The custom pre-commit hook runs the same tools as make check-all
# but only on staged files during git commit
# Complete development setup
make dev-setup
# Clean build artifacts
make clean
# (Recommended) Unset all environment variables from .env in your current shell
source clean-env.sh
# Build package
make build
# System information
make info
The system includes comprehensive logging:
import structlog
# Structured logging with correlation IDs
logger = structlog.get_logger(__name__)
logger.info("Processing document",
document_id="doc_123",
file_type="pdf",
chunk_count=15)
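structlog can bind fields such as a correlation ID once so every subsequent log line carries them. As a dependency-free illustration of the same idea, the sketch below attaches a correlation ID with the stdlib's `logging.LoggerAdapter`; this is a stand-in for the concept, not the project's actual logging module:

```python
import logging
import uuid

def logger_with_correlation_id(name: str = "rag") -> logging.LoggerAdapter:
    """Return a logger whose records all carry the same correlation_id."""
    cid = uuid.uuid4().hex[:8]  # short random ID for this request/task
    return logging.LoggerAdapter(logging.getLogger(name),
                                 extra={"correlation_id": cid})

log = logger_with_correlation_id()
log.info("Processing document doc_123")  # record carries log.extra["correlation_id"]
```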
- Environment Configuration:
  export PRODUCTION=true
  export LOG_LEVEL=WARNING
  export CHROMA_PERSIST_DIRECTORY=chroma_db
- Install Production Dependencies:
  make install
- Initialize System:
  rag-demo ingest
# Build image
make docker-build
# Run container
make docker-run
We welcome contributions! Please see our CONTRIBUTING.md for guidelines and best practices.
- Fork the repository
- Create a feature branch:
git checkout -b feature/amazing-feature
- Make your changes and add tests
- Run quality checks:
make check-all
- Commit your changes:
git commit -m 'Add amazing feature'
- Push to the branch:
git push origin feature/amazing-feature
- Open a Pull Request
- Use snake_case for files and functions
- Use PascalCase for classes
- Write comprehensive docstrings
- Maintain 80%+ test coverage
- Follow comprehensive type hints throughout
- Raise exceptions rather than returning error strings
- Never commit .env files; use .env.default as a template
- Run make check-all before committing
- Pre-commit hooks run automatically on staged files during commit
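Several of these conventions (snake_case names, docstrings, type hints, raising instead of returning error strings) come together in one small example; `load_chunk` is illustrative, not project code:

```python
def load_chunk(chunk_id: str, chunks: dict[str, str]) -> str:
    """Return the text of a chunk by ID.

    Raises:
        KeyError: if `chunk_id` is unknown. We raise rather than
        returning an error string, per the guidelines above.
    """
    if chunk_id not in chunks:
        raise KeyError(f"unknown chunk: {chunk_id}")
    return chunks[chunk_id]
```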
This project is licensed under the GNU General Public License v3 - see the LICENSE file for details.
- LangChain for the RAG framework
- ChromaDB for vector storage
- Sentence Transformers for embeddings via Hugging Face
- Streamlit for the web interface
For support and questions:
- Create an issue
- Check the documentation
- Review the examples
Need help implementing this RAG system in your organization? I offer:
- Custom RAG Solutions: Tailored implementations for your specific use case
- Architecture Review: Optimize your AI/ML infrastructure design
- Production Deployment: From prototype to production-ready systems
- Team Training: Workshops on RAG systems and best practices
- Ongoing Support: Maintenance, optimization, and feature development
Ready to accelerate your AI initiatives? Book a consultation to discuss your project requirements.
Built with ❤️ for production AI applications