LLM Gateway

A centralized platform for managing access to various LLM providers (OpenAI, AWS Bedrock, Ollama, etc.) through an OpenAI-compatible REST API.

Features

  • REST API compatible with the OpenAI spec
  • Chat requests with tool calling support
  • OAuth authentication and API key management
  • Rate limiting and usage tracking
  • API key proxying to upstream providers
  • Admin panel for managing models, keys, permissions, and metrics
  • Caching and prompt enhancement capabilities

Current Status: Stage 4 Complete ✅

Implemented Features

  • Foundation & Configuration (Stage 1)

    • Project structure and dependencies
    • Configuration management with environment variables
    • Logging setup with tracing
    • Docker and docker-compose setup
  • Data Models & Database (Stage 2)

    • Database schema with migrations
    • Rust structs for API keys, chat requests, and request logs
    • SQLx integration with PostgreSQL
  • Core Services (Stage 3)

    • API key service with CRUD operations
    • OpenAI service (mock implementation)
    • Metrics service for request logging
    • Service initialization and dependency injection
  • HTTP Server & API Endpoints (Stage 4)

    • Axum HTTP server with proper state management
    • Health check endpoint (GET /health)
    • Chat completions endpoint (POST /v1/chat/completions)
    • Request/response handling with proper error types
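The "proper error types" mentioned above can be sketched as a small, std-only error enum that maps each failure mode to an HTTP status code. The variant names and messages here are illustrative assumptions, not the gateway's actual types:

```rust
use std::fmt;

// Hypothetical error type illustrating the kind of request/response
// error handling Stage 4 describes; the real gateway's types may differ.
#[derive(Debug)]
enum ApiError {
    /// The client sent a request the gateway could not parse or validate.
    BadRequest(String),
    /// The requested model is not configured in the gateway.
    ModelNotFound(String),
    /// Something failed on the gateway side.
    Internal(String),
}

impl ApiError {
    /// The HTTP status code each variant would map to in a handler.
    fn status_code(&self) -> u16 {
        match self {
            ApiError::BadRequest(_) => 400,
            ApiError::ModelNotFound(_) => 404,
            ApiError::Internal(_) => 500,
        }
    }
}

impl fmt::Display for ApiError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ApiError::BadRequest(msg) => write!(f, "bad request: {msg}"),
            ApiError::ModelNotFound(model) => write!(f, "model not found: {model}"),
            ApiError::Internal(msg) => write!(f, "internal error: {msg}"),
        }
    }
}

impl std::error::Error for ApiError {}

fn main() {
    let err = ApiError::ModelNotFound("gpt-3.5-turbo".into());
    println!("{} {}", err.status_code(), err);
}
```

Centralizing the error-to-status mapping in one place keeps individual handlers free of ad-hoc status-code logic.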

Next Steps (Stage 5+)

  • 🔄 Authentication & Middleware (Stage 5)

    • API key authentication middleware
    • Rate limiting middleware
    • Request logging middleware
    • CORS configuration
  • 📊 Admin Panel (Stage 6)

    • API key management interface
    • Usage metrics and analytics
    • Model configuration
  • 🔗 OpenAI Integration (Stage 7)

    • Real OpenAI API integration
    • Request/response proxying
    • Error handling and retries
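As a rough idea of the planned rate-limiting middleware, here is a std-only fixed-window limiter keyed by API key. This is a sketch of one possible design, not the implementation Stage 5 will ship; the clock is passed in as a plain seconds value so the logic stays testable:

```rust
use std::collections::HashMap;

// Hypothetical fixed-window rate limiter in the spirit of the planned
// Stage 5 middleware. A real middleware would read the current time and
// the API key from the request.
struct RateLimiter {
    limit: u32,                           // max requests per window
    window_secs: u64,                     // window length in seconds
    counts: HashMap<String, (u64, u32)>,  // key -> (window index, count)
}

impl RateLimiter {
    fn new(limit: u32, window_secs: u64) -> Self {
        Self { limit, window_secs, counts: HashMap::new() }
    }

    /// Returns true if the request arriving at `now_secs` is allowed for `key`.
    fn allow(&mut self, key: &str, now_secs: u64) -> bool {
        let window = now_secs / self.window_secs;
        let entry = self.counts.entry(key.to_string()).or_insert((window, 0));
        if entry.0 != window {
            *entry = (window, 0); // new window: reset the count
        }
        if entry.1 < self.limit {
            entry.1 += 1;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut rl = RateLimiter::new(2, 60); // 2 requests per 60 seconds
    println!("{}", rl.allow("key-1", 0));  // true
    println!("{}", rl.allow("key-1", 1));  // true
    println!("{}", rl.allow("key-1", 2));  // false (limit hit)
    println!("{}", rl.allow("key-1", 61)); // true (new window)
}
```

A fixed window is the simplest scheme; a production middleware might prefer a sliding window or token bucket to avoid bursts at window boundaries.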

Quick Start

Prerequisites

  • Rust 1.70+
  • Docker and Docker Compose
  • PostgreSQL (via Docker)

Environment Setup

  1. Clone the repository:

    git clone <repository-url>
    cd llm-gateway
  2. Set up environment variables:

    # Copy example config
    cp config.yaml.example config.yaml
    
    # Set your OpenAI API key
    export OPENAI_API_KEY="your-openai-api-key-here"
  3. Start the database:

    make db-start
  4. Run database migrations:

    make migrate
  5. Build and run:

    make run

The server will start on http://localhost:3000.

Testing the API

Health Check:

curl http://localhost:3000/health

Chat Completions (Mock):

curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "temperature": 0.7,
    "max_tokens": 100
  }'

Development

Available Make Commands

  • make build - Build the project
  • make run - Run the application
  • make check - Run cargo check
  • make test - Run tests
  • make db-start - Start PostgreSQL database
  • make db-stop - Stop PostgreSQL database
  • make migrate - Run database migrations
  • make clean - Clean build artifacts

Project Structure

src/
├── api/           # HTTP API endpoints
│   ├── health.rs  # Health check endpoint
│   └── chat/      # Chat completions endpoint
├── config/        # Configuration management
├── database/      # Database connection and setup
├── models/        # Data models and database structs
├── services/      # Business logic services
├── providers/     # LLM provider integrations
└── utils/         # Utilities (logging, errors)

Configuration

The application uses environment variables for configuration:

  • DATABASE_URL - PostgreSQL connection string
  • OPENAI_API_KEY - OpenAI API key
  • OPENAI_BASE_URL - OpenAI API base URL (default: https://api.openai.com)
  • PORT - HTTP server port (default: 3000)
  • RUST_LOG - Logging level (default: info)
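The variables above can be loaded into a typed config struct with only the standard library. This is an illustrative sketch, assuming the documented defaults; the gateway's real configuration management (Stage 1) may be structured differently. Lookup is injected as a closure so the logic can be exercised without mutating the process environment:

```rust
use std::env;

// Hypothetical config struct mirroring the documented environment variables.
#[derive(Debug)]
struct Config {
    database_url: String,
    openai_api_key: String,
    openai_base_url: String,
    port: u16,
}

impl Config {
    /// Build a Config from any key -> value lookup (e.g. the process env).
    fn from_lookup(get: impl Fn(&str) -> Option<String>) -> Result<Self, String> {
        Ok(Config {
            database_url: get("DATABASE_URL")
                .ok_or_else(|| "DATABASE_URL must be set".to_string())?,
            openai_api_key: get("OPENAI_API_KEY")
                .ok_or_else(|| "OPENAI_API_KEY must be set".to_string())?,
            // Documented default: https://api.openai.com
            openai_base_url: get("OPENAI_BASE_URL")
                .unwrap_or_else(|| "https://api.openai.com".to_string()),
            // Documented default: 3000
            port: get("PORT")
                .and_then(|p| p.parse().ok())
                .unwrap_or(3000),
        })
    }

    fn from_env() -> Result<Self, String> {
        Self::from_lookup(|key| env::var(key).ok())
    }
}

fn main() {
    match Config::from_env() {
        Ok(cfg) => println!("listening on port {}", cfg.port),
        Err(e) => eprintln!("config error: {e}"),
    }
}
```

Failing fast on the two required variables surfaces misconfiguration at startup rather than on the first request.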

API Documentation

Endpoints

GET /health

Health check endpoint that returns application status.

Response:

{
  "status": "healthy",
  "timestamp": "2025-07-08T22:36:52.493744+00:00",
  "version": "0.1.0"
}

POST /v1/chat/completions

Chat completions endpoint (currently returns mock response).

Request:

{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "user", "content": "Hello, how are you?"}
  ],
  "temperature": 0.7,
  "max_tokens": 100
}

Response:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1752014221,
  "model": "gpt-3.5-turbo",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "This is a mock response from the LLM Gateway. OpenAI integration coming soon!"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 20,
    "total_tokens": 30
  }
}
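Per the OpenAI convention, `total_tokens` in the usage object is the sum of `prompt_tokens` and `completion_tokens`. A metrics service like the one from Stage 3 might accumulate these per API key; the sketch below is a hypothetical illustration of that bookkeeping, not the gateway's actual implementation:

```rust
use std::collections::HashMap;

// Field names mirror the OpenAI usage object shown above.
#[derive(Default, Debug, Clone, Copy)]
struct Usage {
    prompt_tokens: u64,
    completion_tokens: u64,
}

impl Usage {
    fn total_tokens(&self) -> u64 {
        self.prompt_tokens + self.completion_tokens
    }
}

// Hypothetical per-key usage accumulator in the spirit of the metrics service.
#[derive(Default)]
struct UsageTracker {
    by_key: HashMap<String, Usage>,
}

impl UsageTracker {
    /// Add one request's usage to the running totals for `api_key`.
    fn record(&mut self, api_key: &str, usage: Usage) {
        let entry = self.by_key.entry(api_key.to_string()).or_default();
        entry.prompt_tokens += usage.prompt_tokens;
        entry.completion_tokens += usage.completion_tokens;
    }

    /// Running totals for `api_key` (zero if the key has no recorded usage).
    fn totals(&self, api_key: &str) -> Usage {
        self.by_key.get(api_key).copied().unwrap_or_default()
    }
}

fn main() {
    let mut tracker = UsageTracker::default();
    tracker.record("key-1", Usage { prompt_tokens: 10, completion_tokens: 20 });
    tracker.record("key-1", Usage { prompt_tokens: 5, completion_tokens: 7 });
    println!("total: {}", tracker.totals("key-1").total_tokens()); // total: 42
}
```

Storing the two components and deriving the total avoids the totals drifting out of sync.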

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

License

[Add your license here]
