GenAI Labs BE/AI Candidate Challenge - Complete Implementation
An AI-powered research assistant with semantic search, content generation, and enterprise-grade authentication built with FastAPI and modern frontend technologies.
- Python 3.8+
- Node.js 16+
- OpenAI API Key
- Pinecone Local (Docker)
# Start Pinecone Local container
docker run -p 8080:8080 pinecone/pinecone-local:latest
# Navigate to backend directory
cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
cp .env.example .env
# Edit .env with your API keys:
# OPENAI_API_KEY=your_openai_key
# PINECONE_API_KEY=pclocal
# PINECONE_ENVIRONMENT=localhost:8080
# Clear database for fresh start (optional)
python clear_database.py
# Start the backend server
uvicorn main:app --host 0.0.0.0 --port 8006 --reload
# Navigate to frontend directory
cd frontend
# Install dependencies
npm install
# Start the development server
npm run dev
- Frontend: http://localhost:3001
- Backend API: http://localhost:8006
- API Documentation: http://localhost:8006/docs
- Demo Video: demo_video/demo_video.mov
The system includes enterprise-grade JWT authentication with role-based access control.
Username | Password | Role | Permissions |
---|---|---|---|
admin | admin123 | Admin | Full access (upload, search, generate, analytics) |
researcher | research123 | User | Read + Generate (search, content generation, analytics) |
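The role split in the table above could be modelled as a simple permission map. The following is a minimal Python sketch, assuming the permission names `upload`, `search`, `generate`, and `analytics` as read from the table. The names `ROLE_PERMISSIONS` and `has_permission` are hypothetical and are not taken from `auth_service.py`.

```python
# Hypothetical permission map derived from the demo-account table above.
# The real auth_service.py may store roles and permissions differently.
ROLE_PERMISSIONS = {
    "admin": {"upload", "search", "generate", "analytics"},
    "user": {"search", "generate", "analytics"},
}


def has_permission(role: str, action: str) -> bool:
    """Return True if the given role may perform the action.

    Unknown roles get an empty permission set, so they are denied by default.
    """
    return action in ROLE_PERMISSIONS.get(role.lower(), set())
```

Under these assumptions, `has_permission("User", "upload")` is false, which matches the admin-only upload route listed among the endpoints.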
- Navigate to http://localhost:3001
- A login screen appears (the dashboard stays hidden until you authenticate)
- Click "Admin Demo" or "Researcher Demo" for quick access
- Or use the Login/Register buttons for custom authentication
- Dashboard appears after successful authentication
- Document Upload & Processing - /api/upload
- Semantic Search - /api/similarity_search
- Document Retrieval - /api/{journal_id}
- Frontend UI - Complete React-like interface with modern design
- Content Generation APIs - Document summarization, paper comparison, research insights
- Persistent Usage Tracking - Database analytics with usage statistics
- Unit Tests - Comprehensive testing suite for all components
- Authentication & Roles - JWT-based auth with role-based permissions
backend/
├── main.py                       # FastAPI application entry point
├── models.py                     # Pydantic data models
├── config.py                     # Configuration management
├── requirements.txt              # Python dependencies
├── services/
│   ├── vector_service.py         # Pinecone vector database operations
│   ├── embedding_service.py      # OpenAI embeddings generation
│   ├── rag_service.py            # RAG pipeline with content generation
│   └── auth_service.py           # JWT authentication & user management
└── tests/
    └── test_auth_service.py      # Authentication service unit tests (22/22 passing)
integration_tests/
├── test_basic.py                 # Basic functionality tests
├── test_api_basic.py             # API endpoint tests
├── test_upload.py                # Upload functionality tests
├── test_full_system.py           # End-to-end system tests
├── test_openai_connection.py     # OpenAI API tests
└── test_api_endpoints.sh         # API testing script
frontend/
├── index.html                    # Main HTML file
├── package.json                  # Node.js dependencies
├── vite.config.js                # Vite build configuration
├── js/
│   ├── main.js                   # Application initialization
│   ├── api.js                    # API communication layer
│   ├── ui.js                     # UI management & authentication
│   ├── analytics.js              # Analytics dashboard
│   └── config.js                 # Frontend configuration
└── styles/
    ├── main.css                  # Core styles & design system
    ├── components.css            # Component-specific styles
    └── responsive.css            # Mobile responsiveness
- PUT /api/upload - Upload and process document chunks
- POST /api/similarity_search - Semantic search with AI responses
- GET /api/{journal_id} - Retrieve specific document
- POST /api/generate/summary - Generate document summaries
- POST /api/generate/compare - Compare multiple papers
- POST /api/generate/insights - Generate research insights
- POST /api/auth/login - User authentication
- POST /api/auth/register - User registration
- GET /api/auth/me - Get current user
- PUT /api/protected/upload - Admin-only upload
- GET /api/analytics/usage - Usage statistics
- GET /api/analytics/popular - Popular papers
curl -X POST http://localhost:8006/api/generate/summary \
-H "Content-Type: application/json" \
-d '{
"journal_ids": ["extension_brief_mucuna.pdf"],
"summary_type": "abstract"
}'
curl -X POST http://localhost:8006/api/generate/compare \
-H "Content-Type: application/json" \
-d '{
"journal_ids": ["paper1.pdf", "paper2.pdf"],
"comparison_aspects": ["methodology", "results"]
}'
curl -X POST http://localhost:8006/api/generate/insights \
-H "Content-Type: application/json" \
-d '{
"topic": "machine learning in agriculture",
"time_range": "2020-2023",
"insight_type": "trends"
}'
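The first curl call above can also be issued from Python. This is a minimal sketch using only the standard library. The URL and payload fields are copied from the curl example, while the helper names `build_summary_request` and `send` are hypothetical, and the assumption that the response body is JSON is not confirmed by this README.

```python
import json
import urllib.request

API_BASE_URL = "http://localhost:8006"  # backend address from the setup steps


def build_summary_request(journal_ids, summary_type="abstract"):
    """Build (but do not send) a POST request for /api/generate/summary."""
    payload = {"journal_ids": journal_ids, "summary_type": summary_type}
    return urllib.request.Request(
        url=f"{API_BASE_URL}/api/generate/summary",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the response body, assumed to be JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running on port 8006, `send(build_summary_request(["extension_brief_mucuna.pdf"]))` would perform the same call as the curl command. Keeping request construction separate from sending makes the payload easy to inspect without a live server.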
- Video: demo_video/demo_video.mov - Complete demonstration
- GIF: demo_video/demo.gif - Quick visual overview
- Content: Complete demonstration of all required APIs from PDF challenge
- Shows: Upload sample data, similarity search, document retrieval, frontend interface
# Clear database for fresh demo (from backend directory)
cd backend && python clear_database.py
# This script will:
# - Connect to Pinecone Local
# - Show current vector count
# - Clear all vectors for clean demo start
# - Provide next steps for demo recording
# Run backend unit tests (22/22 passing - 100% success rate)
cd backend && python -m pytest tests/ -v
# Run specific authentication tests
cd backend && python -m pytest tests/test_auth_service.py -v
# Basic functionality tests (no API keys needed)
python integration_tests/test_basic.py # 5/5 tests passed
python integration_tests/test_api_basic.py # 4/4 tests passed
# API connection tests (requires API keys)
python integration_tests/test_openai_connection.py # 2/2 tests passed
# Full system tests (requires all services)
python integration_tests/test_full_system.py # Complete system test passed
python integration_tests/test_upload.py # Upload workflow test
# Live API endpoint tests (requires running server)
bash integration_tests/test_api_endpoints.sh # All endpoints working
- Backend Unit Tests: 22/22 passing (100% success rate)
- Integration Tests: 5 test files, all working perfectly
- API Endpoints: 8+ endpoints tested and working with live server
- System Components: Vector DB, OpenAI, RAG pipeline all validated
- Professional Structure: Clean separation of unit vs integration tests
- Backend Unit Tests: Authentication service (JWT, passwords, permissions)
- Integration Tests: API endpoints, upload workflows, system components
- End-to-End Tests: Complete system functionality with real APIs
- Live API Tests: All endpoints working with running server
- Service Tests: Vector DB, embeddings, RAG pipeline, health checks
- JWT Authentication with 30-minute token expiration
- Password Hashing using bcrypt
- Role-Based Access Control (Admin vs User permissions)
- CORS Protection with configurable origins
- Input Validation using Pydantic models
- Authentication Guards protecting frontend routes
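To illustrate the 30-minute token expiration listed above, here is a minimal sketch of an HS256-signed JWT built with the Python standard library only. It is an assumption-laden illustration, not the code in `auth_service.py`: the claim names `sub`, `role`, and `exp`, the functions `create_token` and `verify_token`, and the choice of HS256 are all hypothetical. A maintained JWT library is the usual choice in a real service.

```python
import base64
import hashlib
import hmac
import json
import time

TOKEN_TTL_SECONDS = 30 * 60  # 30-minute expiration, as listed above


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then base64url-decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_token(username, role, secret, now=None):
    """Create a signed token carrying the user's role and an expiry time."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": username, "role": role, "exp": issued_at + TOKEN_TTL_SECONDS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token, secret, now=None):
    """Return the claims if the signature matches and the token is unexpired, else None."""
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
        expected = _sign(f"{header_b64}.{claims_b64}", secret)
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(claims_b64))
    except ValueError:  # malformed token, bad base64, or bad JSON
        return None
    if not isinstance(claims, dict):
        return None
    current = time.time() if now is None else now
    return claims if current < claims.get("exp", 0) else None
```

The optional `now` parameter exists only so that expiry can be checked deterministically. A token created at time `t` is accepted up to, but not including, `t + 1800` seconds.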
The system includes a real-time analytics dashboard showing:
- Document indexing statistics
- Search performance metrics
- System health monitoring
- Usage tracking and popular papers
- Modern Design with gradient themes and smooth animations
- Responsive Layout supporting desktop, tablet, and mobile
- Dark Mode design with professional styling
- Loading States with animated progress indicators
- Error Handling with user-friendly messages
- Accessibility with proper ARIA labels and keyboard navigation
OPENAI_API_KEY=your_openai_api_key
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_ENVIRONMENT=your_pinecone_environment
PINECONE_INDEX_NAME=research-assistant
DEBUG=true
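The variables above could be read on the Python side roughly as follows. This is a minimal sketch and not the project's `config.py`: the `Settings` class, the `load_settings` function, the choice of which variables are required, and the default index name (taken from the listing above) are all assumptions. It reads from the process environment only and does not parse a `.env` file itself.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    pinecone_api_key: str
    pinecone_environment: str
    pinecone_index_name: str
    debug: bool


def load_settings(env=None) -> Settings:
    """Read the variables listed above, failing early if a required one is missing."""
    env = os.environ if env is None else env
    # Assumption: these three have no sensible default and must be set.
    required = ("OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_ENVIRONMENT")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        pinecone_api_key=env["PINECONE_API_KEY"],
        pinecone_environment=env["PINECONE_ENVIRONMENT"],
        pinecone_index_name=env.get("PINECONE_INDEX_NAME", "research-assistant"),
        debug=env.get("DEBUG", "false").strip().lower() in ("1", "true", "yes"),
    )
```

Passing a plain dictionary as `env` makes the loader easy to exercise in tests without touching the real environment.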
const CONFIG = {
API_BASE_URL: 'http://localhost:8006',
DEFAULT_SEARCH: {
k: 10,
min_score: 0.25
}
};
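The `k` and `min_score` defaults above suggest a top-k search with a similarity threshold. The sketch below shows one plausible reading in Python, assuming each match is a dictionary with `id` and `score` keys and that `min_score` is an inclusive lower bound. The function name `select_matches` is hypothetical, and this README does not state whether the filtering happens in the backend or the frontend.

```python
def select_matches(matches, k=10, min_score=0.25):
    """Keep matches scoring at least min_score, best first, capped at k."""
    kept = [m for m in matches if m["score"] >= min_score]
    kept.sort(key=lambda m: m["score"], reverse=True)
    return kept[:k]
```

For example, given scores of 0.9, 0.1, 0.25, and 0.5, the defaults would drop the 0.1 match and return the rest in descending score order.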
This project is licensed under the MIT License.
- GenAI Labs Candidate - Full Stack Implementation - Research Assistant AI System
Built for GenAI Labs BE/AI Candidate Challenge | All 5 Optional Stretch Features Completed