Skip to content

Deadsec69/research-assistant-ai-challenge

Repository files navigation

Research Assistant AI System

GenAI Labs BE/AI Candidate Challenge - Complete Implementation

An AI-powered research assistant with semantic search, content generation, and enterprise-grade authentication built with FastAPI and modern frontend technologies.

πŸš€ Quick Start

Prerequisites

  • Python 3.8+
  • Node.js 16+
  • OpenAI API Key
  • Pinecone Local (Docker)

1. Start Pinecone Local

# Start Pinecone Local container
docker run -p 8080:8080 pinecone/pinecone-local:latest

2. Backend Setup

# Navigate to backend directory
cd backend

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Set up environment variables
cp .env.example .env
# Edit .env with your API keys:
# OPENAI_API_KEY=your_openai_key
# PINECONE_API_KEY=pclocal
# PINECONE_ENVIRONMENT=localhost:8080

# Clear database for fresh start (optional)
python clear_database.py

# Start the backend server
uvicorn main:app --host 0.0.0.0 --port 8006 --reload

3. Frontend Setup

# Navigate to frontend directory
cd frontend

# Install dependencies
npm install

# Start the development server
npm run dev

4. Access the Application

πŸ” Authentication

The system includes enterprise-grade JWT authentication with role-based access control.

Demo Accounts

Username Password Role Permissions
admin admin123 Admin Full access (upload, search, generate, analytics)
researcher research123 User Read + Generate (search, content generation, analytics)

Login Flow

  1. Navigate to http://localhost:3001
  2. You'll see a beautiful login screen (dashboard is hidden until authenticated)
  3. Click "Admin Demo" or "Researcher Demo" for quick access
  4. Or use the Login/Register buttons for custom authentication
  5. Dashboard appears after successful authentication

πŸ“š Features Implemented

βœ… Core Requirements (100% Complete)

  1. Document Upload & Processing - /api/upload
  2. Semantic Search - /api/similarity_search
  3. Document Retrieval - /api/{journal_id}

βœ… Optional Stretch Features (5/5 - 100% Complete)

  1. Frontend UI - Complete React-like interface with modern design
  2. Content Generation APIs - Document summarization, paper comparison, research insights
  3. Persistent Usage Tracking - Database analytics with usage statistics
  4. Unit Tests - Comprehensive testing suite for all components
  5. Authentication & Roles - JWT-based auth with role-based permissions

πŸ›  Architecture

Backend (FastAPI)

backend/
β”œβ”€β”€ main.py                 # FastAPI application entry point
β”œβ”€β”€ models.py              # Pydantic data models
β”œβ”€β”€ config.py              # Configuration management
β”œβ”€β”€ requirements.txt       # Python dependencies
β”œβ”€β”€ services/
β”‚   β”œβ”€β”€ vector_service.py     # Pinecone vector database operations
β”‚   β”œβ”€β”€ embedding_service.py  # OpenAI embeddings generation
β”‚   β”œβ”€β”€ rag_service.py       # RAG pipeline with content generation
β”‚   └── auth_service.py      # JWT authentication & user management
└── tests/
    └── test_auth_service.py     # Authentication service unit tests (22/22 passing)

Integration Tests

integration_tests/
β”œβ”€β”€ test_basic.py          # Basic functionality tests
β”œβ”€β”€ test_api_basic.py      # API endpoint tests
β”œβ”€β”€ test_upload.py         # Upload functionality tests
β”œβ”€β”€ test_full_system.py    # End-to-end system tests
β”œβ”€β”€ test_openai_connection.py # OpenAI API tests
└── test_api_endpoints.sh  # API testing script

Frontend (Modern Web Technologies)

frontend/
β”œβ”€β”€ index.html            # Main HTML file
β”œβ”€β”€ package.json          # Node.js dependencies
β”œβ”€β”€ vite.config.js       # Vite build configuration
β”œβ”€β”€ js/
β”‚   β”œβ”€β”€ main.js             # Application initialization
β”‚   β”œβ”€β”€ api.js              # API communication layer
β”‚   β”œβ”€β”€ ui.js               # UI management & authentication
β”‚   β”œβ”€β”€ analytics.js        # Analytics dashboard
β”‚   └── config.js           # Frontend configuration
└── styles/
    β”œβ”€β”€ main.css            # Core styles & design system
    β”œβ”€β”€ components.css      # Component-specific styles
    └── responsive.css      # Mobile responsiveness

πŸ”Œ API Endpoints

Core Endpoints

  • PUT /api/upload - Upload and process document chunks
  • POST /api/similarity_search - Semantic search with AI responses
  • GET /api/{journal_id} - Retrieve specific document

Content Generation

  • POST /api/generate/summary - Generate document summaries
  • POST /api/generate/compare - Compare multiple papers
  • POST /api/generate/insights - Generate research insights

Authentication

  • POST /api/auth/login - User authentication
  • POST /api/auth/register - User registration
  • GET /api/auth/me - Get current user
  • PUT /api/protected/upload - Admin-only upload

Analytics

  • GET /api/analytics/usage - Usage statistics
  • GET /api/analytics/popular - Popular papers

πŸ“Š Content Generation Examples

Document Summarization

curl -X POST http://localhost:8006/api/generate/summary \
  -H "Content-Type: application/json" \
  -d '{
    "journal_ids": ["extension_brief_mucuna.pdf"],
    "summary_type": "abstract"
  }'

Paper Comparison

curl -X POST http://localhost:8006/api/generate/compare \
  -H "Content-Type: application/json" \
  -d '{
    "journal_ids": ["paper1.pdf", "paper2.pdf"],
    "comparison_aspects": ["methodology", "results"]
  }'

Research Insights

curl -X POST http://localhost:8006/api/generate/insights \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "machine learning in agriculture",
    "time_range": "2020-2023",
    "insight_type": "trends"
  }'

🎬 Demo Materials

Demo Video & GIF

  • Video: demo_video/demo_video.mov - Complete demonstration
  • GIF: demo_video/demo.gif - Quick visual overview
  • Content: Complete demonstration of all required APIs from PDF challenge
  • Shows: Upload sample data, similarity search, document retrieval, frontend interface

Demo GIF

Database Management

# Clear database for fresh demo (from backend directory)
cd backend && python clear_database.py

# This script will:
# - Connect to Pinecone Local
# - Show current vector count
# - Clear all vectors for clean demo start
# - Provide next steps for demo recording

πŸ§ͺ Testing

Backend Unit Tests βœ…

# Run backend unit tests (22/22 passing - 100% success rate)
cd backend && python -m pytest tests/ -v

# Run specific authentication tests
cd backend && python -m pytest tests/test_auth_service.py -v

Integration Tests βœ…

# Basic functionality tests (no API keys needed)
python integration_tests/test_basic.py                    # 5/5 tests passed
python integration_tests/test_api_basic.py                # 4/4 tests passed

# API connection tests (requires API keys)
python integration_tests/test_openai_connection.py        # 2/2 tests passed

# Full system tests (requires all services)
python integration_tests/test_full_system.py              # Complete system test passed
python integration_tests/test_upload.py                   # Upload workflow test

# Live API endpoint tests (requires running server)
bash integration_tests/test_api_endpoints.sh              # All endpoints working

Test Results Summary βœ…

  • Backend Unit Tests: 22/22 passing (100% success rate)
  • Integration Tests: 5 test files, all working perfectly
  • API Endpoints: 8+ endpoints tested and working with live server
  • System Components: Vector DB, OpenAI, RAG pipeline all validated
  • Professional Structure: Clean separation of unit vs integration tests

Test Coverage

  • Backend Unit Tests: Authentication service (JWT, passwords, permissions)
  • Integration Tests: API endpoints, upload workflows, system components
  • End-to-End Tests: Complete system functionality with real APIs
  • Live API Tests: All endpoints working with running server
  • Service Tests: Vector DB, embeddings, RAG pipeline, health checks

πŸ›‘ Security Features

  • JWT Authentication with 30-minute token expiration
  • Password Hashing using bcrypt
  • Role-Based Access Control (Admin vs User permissions)
  • CORS Protection with configurable origins
  • Input Validation using Pydantic models
  • Authentication Guards protecting frontend routes

πŸ“ˆ Analytics Dashboard

The system includes a real-time analytics dashboard showing:

  • Document indexing statistics
  • Search performance metrics
  • System health monitoring
  • Usage tracking and popular papers

🎨 UI/UX Features

  • Modern Design with gradient themes and smooth animations
  • Responsive Layout supporting desktop, tablet, and mobile
  • Dark Mode design with professional styling
  • Loading States with animated progress indicators
  • Error Handling with user-friendly messages
  • Accessibility with proper ARIA labels and keyboard navigation

πŸ”§ Configuration

Backend Configuration (backend/.env)

OPENAI_API_KEY=your_openai_api_key
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_ENVIRONMENT=your_pinecone_environment
PINECONE_INDEX_NAME=research-assistant
DEBUG=true

Frontend Configuration (frontend/js/config.js)

const CONFIG = {
  API_BASE_URL: 'http://localhost:8006',
  DEFAULT_SEARCH: {
    k: 10,
    min_score: 0.25
  }
};

πŸ“„ License

This project is licensed under the MIT License.

πŸ‘₯ Authors

  • GenAI Labs Candidate - Full Stack Implementation - Research Assistant AI System

Built for GenAI Labs BE/AI Candidate Challenge | All 5 Optional Stretch Features Completed βœ…

About

Research assistant task code and its implmentation

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published