Build an AI phone assistant that actually understands and responds naturally to your callers.
This project connects SignalWire's telephony platform with OpenAI's GPT-4 Realtime API to create voice assistants that can answer phone calls, have natural conversations, and help callers with real information, all in real time.
- Introduction
- Prerequisites
- Quick Start
- How It Works
- Configuration
- Production Deployment
- Development
- Troubleshooting
- Project Structure
This application creates a bidirectional audio streaming bridge between phone calls and OpenAI's Realtime API. The result is an AI assistant that can:
- Have natural, flowing conversations with zero buffering delays
- Answer questions and provide information in real-time
- Check the weather for any US city
- Tell the current time
- Handle interruptions naturally (no more talking over each other!)
All with crystal-clear HD voice quality and true real-time bidirectional communication.
Technical Overview
- Incoming Call → SignalWire receives the call and streams audio via WebSocket to our server
- Audio Processing → Our TypeScript server forwards the audio stream to OpenAI's Realtime API using the official SDK
- Function Call Processing → When the AI needs information (weather, time, etc.), function calls are processed locally on our server
- AI Response → OpenAI processes speech and function results in real time, generating audio responses
- Audio Feedback → AI responses stream back through our WebSocket server to SignalWire
- Caller Hears AI → SignalWire feeds the AI audio directly back into the call
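To make the media flow above concrete, here is a minimal sketch of the kind of demultiplexing the server performs on SignalWire's stream messages. The event and field names follow SignalWire's cXML `<Stream>` WebSocket message format as commonly documented (`media.payload` carrying base64 audio); treat them as assumptions and verify against your actual payloads and the project's real types.

```typescript
import { Buffer } from "node:buffer";

// Assumed message shapes: "media" events carry a base64-encoded audio frame,
// while "start"/"stop" are control events. Illustrative only, not the
// project's exact type definitions.
type StreamEvent =
  | { event: "start"; start: { streamSid: string } }
  | { event: "media"; media: { payload: string } } // base64 audio frame
  | { event: "stop" };

// Extract raw audio bytes from one WebSocket message, or null if the
// message is not an audio frame.
function extractAudio(raw: string): Buffer | null {
  const msg = JSON.parse(raw) as StreamEvent;
  return msg.event === "media"
    ? Buffer.from(msg.media.payload, "base64")
    : null;
}
```

In the real bridge, each decoded frame would be forwarded to the OpenAI Realtime session, and AI audio would travel the reverse path back to SignalWire.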
- @openai/agents - OpenAI's official SDK for GPT-4 Realtime API
- @openai/agents-realtime - Real-time audio streaming with OpenAI
- Fastify - High-performance web framework
- TypeScript - Type-safe JavaScript
You'll need:
- Node.js 20+ - Download here
- OpenAI API Key - Get one here (requires paid account)
- SignalWire Account - Sign up free (for phone integration)
- ngrok (for local development) - Install ngrok to expose your local server
- Docker (optional) - Install Docker for containerized deployment
Follow these three high-level steps to get your AI voice assistant running:
Step 1: Configure SignalWire for Voice Streaming
Follow the SignalWire Getting Started Guide to:
- Create your SignalWire project
- Set up your workspace
Sign up for free at SignalWire
Before you can assign webhook URLs, you need to create a cXML webhook resource:
- In your SignalWire dashboard, go to My Resources
- Click Create Resource
- Select Script as the resource type, then select cXML
- Set the resource's Handle Using option to External Url
- Set the Primary Script URL to your server's webhook endpoint (you'll configure this in Step 3): `https://your-ngrok-url.ngrok.io/incoming-call`
  Critical: you MUST include `/incoming-call` at the end of your URL
- Give it a descriptive name (e.g., "AI Voice Assistant")
- Create the resource
Learn More: SignalWire Call Fabric Resources Guide
To test your AI assistant, create a SIP address that connects to your cXML resource:
- From the resource page of the resource you just created, click the Addresses & Phone Numbers tab
- Click Add to create a new address
- Select SIP Address as the address type
- Fill out the address information
- Save the configuration
Learn More: SignalWire Call Fabric Addresses Guide
Tip: You can also purchase a regular phone number and link it to your cXML resource if you prefer traditional phone number calling.
Step 2: Install and Set Up Your Code
Option 1: Try in Replit
Note: Clicking the button above will take you to Replit where you can import this GitHub repository. After importing, you'll need to configure your OpenAI API key as a Replit Secret: add `OPENAI_API_KEY` as a secret in your Repl.
Option 2: Clone Locally
git clone <repository-url>
cd cXML-realtime-agent-stream
npm install
Choose ONE method based on how you'll run the app:
Option A: Replit (using Replit Secrets)
- Go to the "Secrets" tab in your Repl (lock icon in sidebar)
- Add a new secret: `OPENAI_API_KEY` with your API key value
- Learn more about Replit Secrets
Option B: Local Development (using .env file)
cp .env.example .env
# Edit .env and add your OpenAI API key:
# OPENAI_API_KEY=sk-your-actual-api-key-here
Option C: Docker Deployment (using secrets folder)
mkdir -p secrets
echo "sk-your-actual-api-key-here" > secrets/openai_api_key.txt
Note: Use only ONE method. Replit uses Secrets, local development uses .env, and Docker uses the secrets folder.
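Since only one method is active at a time, the server-side lookup boils down to: check the environment first (covers Replit Secrets and `.env`), then fall back to the Docker secrets file. The sketch below illustrates that precedence; `resolveApiKey` is an illustrative name, not necessarily what this project's code uses.

```typescript
import * as fs from "node:fs";
import * as process from "node:process";

// Illustrative sketch: prefer the environment variable (.env / Replit
// Secrets), then fall back to the Docker secrets file. The default
// secretPath matches the secrets/openai_api_key.txt convention above.
function resolveApiKey(secretPath = "secrets/openai_api_key.txt"): string {
  const fromEnv = process.env.OPENAI_API_KEY;
  if (fromEnv && fromEnv.trim() !== "") return fromEnv.trim();
  if (fs.existsSync(secretPath)) {
    return fs.readFileSync(secretPath, "utf8").trim();
  }
  throw new Error("Missing OPENAI_API_KEY");
}
```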
Get Your API Key: OpenAI Platform (requires paid account)
Step 3: Expose Your Local Server & Test
For Local Development:
npm run build
npm start
For Docker:
docker-compose up --build signalwire-assistant
Your AI assistant is now running at http://localhost:5050/incoming-call
In a new terminal, run:
npx ngrok http 5050
You'll get a public URL like: https://abc123.ngrok.io
- Go back to your SignalWire cXML resource (from Step 1)
- Update the Primary Script URL to: `https://abc123.ngrok.io/incoming-call`
- Save the configuration
Important: ngrok URLs change each time ngrok restarts. Update your SignalWire webhook URL whenever you restart ngrok.
Call the SIP address you created in Step 1:
- Using a SIP phone or softphone, dial the SIP address you created in Step 1
The call flow will be:
Your SIP call → SignalWire → ngrok → Your local server → OpenAI → Response → Caller
Alternative: If you purchased a regular phone number and linked it to your cXML resource, you can call that number directly.
Phone Call → SignalWire → Your Server → OpenAI → Real-time Response → Caller
- Someone calls your SignalWire number
- SignalWire streams the audio to your server via WebSocket
- Your server forwards it to OpenAI's Realtime API
- OpenAI processes speech and generates responses instantly
- Responses stream back to the caller in real-time
The magic is in the real-time streaming: there's no "recording, processing, playing back." It's a continuous, natural conversation.
Environment Variables
Configure your assistant using the following variables. Each variable is handled differently depending on your deployment method:
| Variable | Local Development | Docker Deployment | Type | Required |
|---|---|---|---|---|
| `OPENAI_API_KEY` | `.env` file | Docker secrets file (`secrets/openai_api_key.txt`) | Secret | Yes |
| `PORT` | `.env` file | `docker-compose` environment section | Environment variable | No |
| `AUDIO_FORMAT` | `.env` file | `docker-compose` environment section | Environment variable | No |
For Local Development:
Create a `.env` file in your project root:
OPENAI_API_KEY=sk-your-actual-api-key-here
PORT=5050 # optional, defaults to 5050
AUDIO_FORMAT=pcm16 # optional
For Docker Deployment:
- `OPENAI_API_KEY`: create `secrets/openai_api_key.txt` with your API key
- `PORT`: already configured in `docker-compose.yml` (can be modified there)
- `AUDIO_FORMAT`: already configured in `docker-compose.yml` (can be modified there)
`AUDIO_FORMAT` accepts:
- `pcm16` - High Definition audio (24 kHz): crystal-clear voice quality, best for demos
- `g711_ulaw` - Standard telephony (8 kHz): traditional phone quality (default)
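The defaults above (`PORT` 5050, `AUDIO_FORMAT` `g711_ulaw`) can be expressed as a small validating loader. This is a sketch; `loadConfig` is an illustrative name, not the project's actual function.

```typescript
type AudioFormat = "pcm16" | "g711_ulaw";

// Sketch of env-var handling with the defaults documented above:
// PORT defaults to 5050, AUDIO_FORMAT defaults to g711_ulaw, and any
// unrecognized format fails fast instead of degrading audio silently.
function loadConfig(env: Record<string, string | undefined>) {
  const format = env.AUDIO_FORMAT ?? "g711_ulaw";
  if (format !== "pcm16" && format !== "g711_ulaw") {
    throw new Error(`Unsupported AUDIO_FORMAT: ${format}`);
  }
  return {
    port: Number(env.PORT ?? "5050"),
    audioFormat: format as AudioFormat,
  };
}
```

Failing fast on a typo like `AUDIO_FORMAT=pcm-16` is kinder than letting the call connect with broken audio.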
Security note: Docker uses secrets for sensitive data like API keys, while regular environment variables are used for configuration options.
Customize Your Assistant
Edit `src/config.ts` to change your AI's personality:
export const AGENT_CONFIG = {
voice: 'alloy', // Choose: alloy, echo, fable, onyx, nova, shimmer
instructions: `Your custom personality here...`
}
Add New Capabilities
Create new tools in `src/tools/`; see `weather.tool.ts` for an example.
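For orientation, a tool boils down to a name, a description the model selects on, and an execute function that runs locally on your server. The plain-object shape below is illustrative only; the real files in `src/tools/` use the `@openai/agents` tool helper, so follow `weather.tool.ts` for the exact API.

```typescript
// Illustrative tool shape (not the @openai/agents API itself): the model
// picks the tool by name/description, and execute() runs on the server
// when the model calls it, with the result fed back into the conversation.
const timeTool = {
  name: "get_current_time",
  description: "Tell the caller the current date and time.",
  execute: async (): Promise<string> => new Date().toISOString(),
};
```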
For production deployment, we recommend using Docker. See the Docker Setup Guide for:
- External secrets management
- Health checks and monitoring
- Docker Swarm configuration
- Troubleshooting tips
# Development with hot reload
npm run dev
# Type checking
npm run typecheck
# View debug logs
DEBUG=openai-agents:* npm run dev
Common Issues & Solutions
"Missing OPENAI_API_KEY"
- Make sure your `.env` file exists and contains your actual API key
"SignalWire client connection error"
- Ensure your webhook URL is publicly accessible (use ngrok for local testing)
- Check that port 5050 is not blocked
Audio quality issues
- HD voice requires the L16@24000h codec in the SignalWire webhook
- Standard quality: remove the codec parameter
Can't receive calls
- Verify the SignalWire webhook is set to your public URL with the `/incoming-call` endpoint
- Check that ngrok is still running and the URL hasn't changed
- Common mistake: using the base URL without `/incoming-call` (calls won't work!)
- Look at console logs for connection messages
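Since the missing `/incoming-call` suffix is the most common mistake above, a quick pre-flight check like this can catch it before you paste the URL into SignalWire (the URL value is a placeholder for your ngrok URL):

```shell
# Replace with your actual ngrok URL before checking
URL="https://abc123.ngrok.io/incoming-call"

case "$URL" in
  */incoming-call) echo "OK: webhook path looks right" ;;
  *) echo "ERROR: URL must end with /incoming-call" >&2; exit 1 ;;
esac
```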
src/
├── config.ts          # AI assistant configuration
├── index.ts           # Server setup
├── routes/            # HTTP endpoints
│   ├── webhook.ts     # Handles incoming calls
│   ├── streaming.ts   # WebSocket audio streaming
│   └── health.ts      # Health check endpoint
├── tools/             # AI capabilities (weather, time, etc.)
└── transports/        # SignalWire ↔ OpenAI bridge
Built with TypeScript, Fastify, and WebSockets. MIT Licensed.