Agent Voice Response (AVR)

Transform Your IVR with AI-Powered Voice Conversations

Agent Voice Response (AVR) is an open-source platform that turns traditional Interactive Voice Response (IVR) systems into conversational AI agents. Built on Asterisk's AudioSocket, AVR integrates with your existing telephony infrastructure and lets you choose which AI services to use and how to deploy them.

🚀 Quick Start

Get AVR running in minutes with our pre-configured Docker Compose templates:

# Clone the infrastructure repository
git clone https://github.com/agentvoiceresponse/avr-infra.git
cd avr-infra

# Choose your AI provider combination and start
docker-compose -f docker-compose-openai.yml up -d

# Test with SIP client (username: 1000, password: 1000)
# Call extension 5001 to interact with your AI agent

Need help choosing? See the Configuration section below.

📋 Overview

AVR (Agent Voice Response) is a comprehensive platform that bridges the gap between traditional telephony systems and modern AI capabilities. At its core, AVR manages real-time voice communication between callers and AI-powered conversational agents through Asterisk's AudioSocket protocol.
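
For orientation, AudioSocket is a small framed protocol over TCP: each message is a 1-byte type, a 2-byte big-endian payload length, and then the payload. The sketch below parses such frames from a receive buffer. The type codes follow the Asterisk AudioSocket documentation as we understand it; the names `Frame`, `parse_frames`, `encode_frame`, and `call_uuid` are illustrative and are not taken from AVR's code.

```python
import struct
import uuid
from dataclasses import dataclass
from typing import List, Tuple

# Frame type codes as described in the Asterisk AudioSocket documentation.
KIND_HANGUP = 0x00  # peer asks to terminate the connection
KIND_UUID = 0x01    # payload is the 16-byte binary call UUID
KIND_AUDIO = 0x10   # payload is 16-bit signed linear PCM, 8 kHz, mono
KIND_ERROR = 0xFF   # payload carries an error code

HEADER_SIZE = 3  # 1-byte kind + 2-byte big-endian payload length


@dataclass
class Frame:
    kind: int
    payload: bytes


def encode_frame(kind: int, payload: bytes = b"") -> bytes:
    """Build one frame: header followed by the payload."""
    return struct.pack(">BH", kind, len(payload)) + payload


def parse_frames(buffer: bytes) -> Tuple[List[Frame], bytes]:
    """Return all complete frames in buffer plus any trailing partial bytes."""
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER_SIZE:
        kind, length = struct.unpack_from(">BH", buffer, offset)
        end = offset + HEADER_SIZE + length
        if end > len(buffer):
            break  # frame is incomplete; keep it for the next read
        frames.append(Frame(kind, buffer[offset + HEADER_SIZE:end]))
        offset = end
    return frames, buffer[offset:]


def call_uuid(frame: Frame) -> uuid.UUID:
    """Decode the call identifier carried by a KIND_UUID frame."""
    if frame.kind != KIND_UUID or len(frame.payload) != 16:
        raise ValueError("not a UUID frame")
    return uuid.UUID(bytes=frame.payload)
```

A TCP read can end in the middle of a frame, so `parse_frames` returns the unconsumed tail; a caller would prepend it to the next chunk read from the socket.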

Core Components

AVR orchestrates five key AI services to create seamless voice conversations:

🎤 ASR (Automatic Speech Recognition): Converts incoming audio streams into accurate text transcriptions
🧠 LLM (Large Language Model): Processes text input and generates intelligent, context-aware responses
🔊 TTS (Text-to-Speech): Transforms AI-generated text into natural-sounding speech
📝 STT (Speech-to-Text): Alternative transcription service for providers without native ASR support
🗣️ STS (Speech-to-Speech): Direct voice-to-voice AI communication for ultra-low latency interactions

Why AVR?

🔌 Universal Compatibility: Works with any ASR, LLM, or TTS provider via standard HTTP APIs
🏢 Enterprise Ready: Seamlessly integrates with existing Asterisk, FreePBX, VitalPBX, and Vicidial systems
💰 Cost Effective: Mix and match providers to optimize costs and performance
🚀 Rapid Deployment: Docker-based architecture for quick setup and scaling
🛡️ Production Grade: Built for reliability with comprehensive error handling and monitoring

🏗️ Architecture

AVR follows a modular, microservices architecture that enables flexible deployment and easy scaling.

Traditional Flow (ASR + LLM + TTS)

graph LR
    A[Caller] --> B[Asterisk PBX]
    B --> C[AVR Core]
    C --> D[ASR Service]
    D --> E[LLM Service]
    E --> F[TTS Service]
    F --> C
    C --> B
    B --> A

Step-by-step process:

📞 Call Initiation: Customer dials extension, Asterisk answers and generates UUID
🔗 AudioSocket Connection: Asterisk establishes TCP connection with AVR Core
🎤 Speech Recognition: AVR Core streams audio to ASR service for real-time transcription
🧠 AI Processing: Transcribed text sent to LLM service for intelligent response generation
🔊 Voice Synthesis: AI response converted to speech via TTS service
📢 Audio Playback: Synthesized voice streamed back to caller through Asterisk
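
The steps above can be sketched as a chain of three pluggable stages. This is a minimal illustration of the data flow only, not AVR Core's implementation: the type aliases, `run_pipeline`, and the `fake_*` stages are hypothetical names, and real stages would call the ASR, LLM, and TTS services over HTTP.

```python
from typing import Callable, Iterable, Iterator

# Each stage is a plain callable, so any provider client can be plugged in.
Transcriber = Callable[[bytes], str]        # audio chunk -> text ("" while nothing is recognized)
Responder = Callable[[str], Iterable[str]]  # user text -> streamed reply fragments
Synthesizer = Callable[[str], bytes]        # reply fragment -> audio bytes


def run_pipeline(
    audio_chunks: Iterable[bytes],
    asr: Transcriber,
    llm: Responder,
    tts: Synthesizer,
) -> Iterator[bytes]:
    """Yield synthesized audio for every utterance found in the caller's audio."""
    for chunk in audio_chunks:
        text = asr(chunk)
        if not text:
            continue  # nothing recognized yet, keep listening
        for fragment in llm(text):
            # Synthesize each fragment as it arrives instead of waiting
            # for the full reply, which keeps perceived latency low.
            yield tts(fragment)


# Stand-in stages for a dry run (hypothetical, no network calls).
def fake_asr(chunk: bytes) -> str:
    return chunk.decode("ascii")


def fake_llm(text: str) -> Iterable[str]:
    yield "You said: "
    yield text


def fake_tts(fragment: str) -> bytes:
    return fragment.encode("ascii")


if __name__ == "__main__":
    for audio in run_pipeline([b"", b"hello"], fake_asr, fake_llm, fake_tts):
        print(audio)
```

Because every stage is just a callable, swapping providers means swapping one argument, which mirrors the mix-and-match idea behind the separate service URLs.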

Modern Flow (Speech-to-Speech)

graph LR
    A[Caller] --> B[Asterisk PBX]
    B --> C[AVR Core]
    C --> D[STS Service]
    D --> C
    C --> B
    B --> A

Ultra-low latency process:

📞 Direct Connection: Customer connects via Asterisk AudioSocket
🗣️ Real-time Processing: Audio directly processed by STS service (OpenAI Realtime, Ultravox, etc.)
⚡ Instant Response: AI-generated speech immediately streamed back to caller
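
In the speech-to-speech flow, audio moves in both directions at once, so the relay is naturally two concurrent pumps. The sketch below models that with `asyncio` queues standing in for the AudioSocket connection and the STS WebSocket. All names here (`pump`, `fake_sts`, `demo`) are hypothetical, and the `None` end-of-call sentinel is an assumption made for the example.

```python
import asyncio


async def pump(source: asyncio.Queue, sink: asyncio.Queue) -> None:
    """Forward items from source to sink; stop after forwarding the None sentinel."""
    while True:
        item = await source.get()
        await sink.put(item)
        if item is None:
            return


async def fake_sts(inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """Stand-in for an STS provider: 'answers' each chunk by upper-casing it."""
    while (chunk := await inbox.get()) is not None:
        await outbox.put(chunk.upper())
    await outbox.put(None)


async def demo() -> list:
    caller_in, sts_in, sts_out, caller_out = (asyncio.Queue() for _ in range(4))
    for item in (b"hi", b"there", None):
        caller_in.put_nowait(item)
    # Both directions run concurrently, so replies can start streaming
    # back while the caller's audio is still being forwarded.
    await asyncio.gather(
        pump(caller_in, sts_in),
        fake_sts(sts_in, sts_out),
        pump(sts_out, caller_out),
    )
    received = []
    while not caller_out.empty():
        received.append(caller_out.get_nowait())
    return received


if __name__ == "__main__":
    print(asyncio.run(demo()))
```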

✨ Key Features

🔌 Universal Provider Support

50+ AI Providers: Pre-built integrations with OpenAI, Anthropic, Google, Deepgram, ElevenLabs, and more
Cloud & Local Options: Choose from cloud services or self-hosted solutions (Vosk, Ollama, CoquiTTS)
Mix & Match: Combine different providers for optimal cost and performance
Custom Integration: Easy HTTP API integration for any provider

🚀 Performance & Reliability

Real-time Streaming: Sub-second latency with optimized audio processing
Voice Activity Detection: Intelligent speech detection for natural conversations
Multi-codec Support: Automatic detection of μ-law, A-law, and PCM audio formats
Error Recovery: Robust error handling and automatic retry mechanisms
Horizontal Scaling: Docker-based microservices for easy scaling
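
To make the codec and voice-activity items more concrete, here is a minimal sketch that decodes G.711 μ-law bytes to 16-bit PCM and applies a naive energy gate. It does not perform codec auto-detection, and the energy threshold is a simplification chosen for illustration; AVR's actual detection and VAD logic may work differently. The function names and the threshold value are assumptions.

```python
import math
import struct

ULAW_BIAS = 0x84  # bias constant from the G.711 mu-law companding scheme


def ulaw_to_linear(byte: int) -> int:
    """Expand one mu-law byte to a signed 16-bit linear sample."""
    byte = ~byte & 0xFF          # mu-law bytes are stored bit-inverted
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if sign else magnitude


def decode_ulaw(data: bytes) -> bytes:
    """Decode a mu-law buffer into little-endian 16-bit PCM."""
    samples = [ulaw_to_linear(b) for b in data]
    return struct.pack(f"<{len(samples)}h", *samples)


def rms(pcm16: bytes) -> float:
    """Root-mean-square amplitude of a little-endian 16-bit PCM buffer."""
    count = len(pcm16) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm16[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def is_speech(pcm16: bytes, threshold: float = 500.0) -> bool:
    """Naive energy gate; the threshold is an arbitrary illustrative value."""
    return rms(pcm16) > threshold


if __name__ == "__main__":
    silence = decode_ulaw(b"\xff" * 160)   # 0xFF encodes a zero sample
    loud = decode_ulaw(b"\x80\x00" * 80)   # alternating full-scale samples
    print(is_speech(silence), is_speech(loud))
```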

🎛️ Advanced Capabilities

Function Calling: Support for OpenAI and Anthropic function calling
Ambient Noise: Configurable background sounds for realistic environments
Webhook Integration: Real-time event notifications for call analytics
Multi-language: Support for 100+ languages and dialects
Custom Voices: Personalized AI voice characteristics and personalities
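
Function calling generally works like this: the model returns a tool name plus arguments, and the application runs the matching function and sends the result back. The sketch below shows only the dispatch step. The registry, the `tool` decorator, and the `transfer_call` tool are hypothetical examples and do not describe how AVR registers tools.

```python
import json
from typing import Any, Callable, Dict, Union

TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name so the model can request it."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def transfer_call(extension: str) -> dict:
    # Hypothetical tool: a real one would instruct the PBX to transfer the call.
    return {"action": "transfer", "extension": extension}


def dispatch(name: str, arguments: Union[str, dict]) -> str:
    """Run the requested tool and return a JSON string for the model.

    Some providers deliver arguments as a JSON string and others as an
    already-parsed object, so both forms are accepted.
    """
    fn = TOOLS.get(name)
    if fn is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        args = json.loads(arguments) if isinstance(arguments, str) else arguments
        return json.dumps(fn(**args))
    except (TypeError, json.JSONDecodeError) as exc:
        # Bad or malformed arguments are reported back instead of raising,
        # so the model can correct itself on the next turn.
        return json.dumps({"error": str(exc)})


if __name__ == "__main__":
    print(dispatch("transfer_call", '{"extension": "600"}'))
```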

🛡️ Enterprise Features

Asterisk Integration: Native support for Asterisk 18+ with AudioSocket
PBX Compatibility: Works with FreePBX, VitalPBX, Vicidial, and custom setups
Security: End-to-end encryption and secure API communication
Monitoring: Comprehensive logging and performance metrics
High Availability: Built-in redundancy and failover capabilities

📊 Analytics & Monitoring

Call Metrics: Detailed performance and usage statistics
Real-time Dashboards: Live monitoring of system health and call quality
Webhook Events: Custom event notifications for integration with external systems
Debugging Tools: Comprehensive logging for troubleshooting

🔗 Explore all available integrations: Agent Voice Response Integrations

🔧 Installation

Prerequisites

Before installing AVR, ensure you have the following components:

🐳 Docker & Docker Compose: Latest versions installed and running
📞 Asterisk Server: Version 18+ with AudioSocket module enabled (included in Docker setup)
🔑 API Credentials: Access keys for your chosen AI service providers
🌐 Network Access: Internet connectivity for cloud-based AI services (optional for local deployments)

Quick Installation

📥 Clone the Infrastructure Repository

git clone https://github.com/agentvoiceresponse/avr-infra.git
cd avr-infra

⚙️ Configure Environment Variables

# Copy the example environment file
cp .env.example .env

# Edit with your preferred editor
nano .env

🚀 Choose Your Deployment

# For OpenAI + Deepgram (recommended for beginners)
docker-compose -f docker-compose-openai.yml up -d

# For local/self-hosted (no API keys needed)
docker-compose -f docker-compose-vosk.yml up -d

# For Speech-to-Speech (ultra-low latency)
docker-compose -f docker-compose-openai-realtime.yml up -d

✅ Verify Installation

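# Check if all services are running (pass the same compose file you started with)
docker-compose -f docker-compose-openai.yml ps

# View logs for troubleshooting
docker-compose -f docker-compose-openai.yml logs avr-core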

Detailed Setup Instructions

For comprehensive setup guides, advanced configurations, and provider-specific instructions, visit our detailed documentation at wiki.agentvoiceresponse.com.

⚙️ Configuration

Provider Combinations

AVR supports multiple deployment patterns to match your needs:

Use Case	Recommended Setup	Benefits
🆕 Getting Started	OpenAI + Deepgram	Easy setup, excellent quality
💰 Cost Optimized	Vosk + Ollama + CoquiTTS	Free, self-hosted solution
⚡ Ultra-Low Latency	OpenAI Realtime STS	<200ms response times
🌐 Multi-Language	Google + OpenRouter	100+ languages supported
🏢 Enterprise	Anthropic + ElevenLabs	Advanced features, compliance

Environment Configuration

Key environment variables for different deployment types:

# Traditional ASR + LLM + TTS
ASR_URL=http://avr-asr-deepgram:6001/speech-to-text-stream
LLM_URL=http://avr-llm-openai:6005/prompt-stream
TTS_URL=http://avr-tts-deepgram:6003/text-to-speech-stream

# Speech-to-Speech (STS)
STS_URL=ws://avr-sts-openai:6030

# Optional Advanced Features
WEBHOOK_URL=https://your-webhook-endpoint.com/avr-events
AMBIENT_NOISE_FILE=/path/to/background.mp3
AMBIENT_NOISE_LEVEL=0.2
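
Before starting the containers, it can help to sanity-check the `.env` file. The sketch below assumes a deployment needs either `STS_URL` or all three of `ASR_URL`, `LLM_URL`, and `TTS_URL`, and that the URL schemes match the examples above (HTTP for the pipeline services, WebSocket for STS). Both rules are inferred from this README rather than from AVR's code, and the helper names are hypothetical.

```python
from typing import Dict, List
from urllib.parse import urlparse

PIPELINE_KEYS = ("ASR_URL", "LLM_URL", "TTS_URL")

# Assumed from the example values in this README, not from AVR's source.
EXPECTED_SCHEMES = {
    "ASR_URL": ("http", "https"),
    "LLM_URL": ("http", "https"),
    "TTS_URL": ("http", "https"),
    "STS_URL": ("ws", "wss"),
}


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def check_env(env: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; empty means it looks fine."""
    problems = []
    missing = [key for key in PIPELINE_KEYS if not env.get(key)]
    if not env.get("STS_URL") and missing:
        problems.append(
            "set STS_URL or all of ASR_URL, LLM_URL, TTS_URL; missing: "
            + ", ".join(missing)
        )
    for key, schemes in EXPECTED_SCHEMES.items():
        value = env.get(key)
        if value and urlparse(value).scheme not in schemes:
            problems.append(f"{key} should use one of {schemes}: {value}")
    return problems


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        for problem in check_env(parse_env(handle.read())):
            print("WARNING:", problem)
```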

SIP Client Testing

Once deployed, test your setup:

📱 Install SIP Client: Use Telephone, MicroSIP, or any SIP client
🔐 Register: Username 1000, Password 1000, Server localhost:5060
📞 Test Basic Connectivity: Call extension 600 (echo test)
🤖 Test AI Agent: Call extension 5001 to interact with your AI

🎯 Use Cases

🏢 Customer Service

24/7 Support: Automated customer service with natural conversations
Call Routing: Intelligent call routing based on customer intent
FAQ Handling: Automated responses to common questions
Escalation: Seamless handoff to human agents when needed

📞 Sales & Marketing

Lead Qualification: Automated lead scoring and qualification
Appointment Scheduling: AI-powered appointment booking
Product Information: Detailed product explanations and recommendations
Follow-up Calls: Automated follow-up and nurturing campaigns

🏥 Healthcare

Appointment Scheduling: Patient appointment management
Symptom Triage: Basic health screening and routing
Prescription Refills: Automated prescription renewal requests
Health Reminders: Medication and appointment reminders

🏦 Financial Services

Account Inquiries: Balance and transaction information
Loan Applications: Initial loan screening and information collection
Fraud Detection: Automated fraud monitoring and alerts
Payment Processing: Automated payment collection and processing

🎓 Education

Student Services: Course information and enrollment assistance
Campus Information: General information and directions
Emergency Notifications: Automated emergency communication
Alumni Relations: Alumni engagement and donation campaigns

🔍 Troubleshooting

Common Issues & Solutions

🚨 Connection Issues

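# Check service status (pass the same compose file you started with)
docker-compose -f docker-compose-openai.yml ps

# View detailed logs
docker-compose -f docker-compose-openai.yml logs avr-core
docker-compose -f docker-compose-openai.yml logs avr-asterisk

# Restart services
docker-compose -f docker-compose-openai.yml restart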

🎵 Audio Quality Problems

Echo Issues: Check Asterisk echo cancellation settings
Poor Transcription: Verify ASR service configuration and audio codec
Delayed Responses: Monitor network latency and service performance
Audio Dropouts: Check bandwidth and Docker resource allocation

⚡ Performance Optimization

# Monitor resource usage
docker stats

# Scale services if needed (pass the same compose file you started with)
docker-compose -f docker-compose-openai.yml up -d --scale avr-llm-openai=3

# Optimize LLM prompts for faster responses

🔧 Configuration Issues

API Key Errors: Verify credentials in .env file
Service Unreachable: Check network connectivity and firewall settings
Wrong Extensions: Verify Asterisk dialplan configuration
Audio Codec Mismatch: Ensure codec compatibility between Asterisk and AVR
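
For the "Service Unreachable" case, a quick TCP probe of each configured URL can tell you whether the problem is networking or the service itself. This is a minimal sketch with hypothetical helper names. Note that Docker service hostnames such as `avr-asr-deepgram` normally resolve only from inside the same Docker network, so run the check from a container attached to it.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse

DEFAULT_PORTS = {"http": 80, "https": 443, "ws": 80, "wss": 443}


def endpoint(url: str) -> Tuple[str, int]:
    """Extract (host, port) from a service URL, applying scheme defaults."""
    parts = urlparse(url)
    if parts.hostname is None:
        raise ValueError(f"no host in URL: {url}")
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    if port is None:
        raise ValueError(f"no port and unknown scheme in URL: {url}")
    return parts.hostname, port


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = endpoint(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refusals, and timeouts
        return False


if __name__ == "__main__":
    for url in (
        "http://avr-asr-deepgram:6001/speech-to-text-stream",
        "ws://avr-sts-openai:6030",
    ):
        print(url, "reachable" if is_reachable(url) else "UNREACHABLE")
```

A successful connection only proves the port is open; it does not validate API keys or the service's HTTP endpoints.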

Getting Help

📚 Documentation: Check our comprehensive wiki
❓ FAQ: Review frequently asked questions
💬 Community: Join our Discord server for real-time help
🐛 Issues: Report bugs on GitHub

🤝 Community & Support

Join our growing community of developers, businesses, and AI enthusiasts building the future of voice AI:

📚 Resources

🌐 Website: agentvoiceresponse.com - Official website with demos and case studies
📖 Documentation: wiki.agentvoiceresponse.com - Comprehensive guides and tutorials
🐳 Docker Hub: hub.docker.com/u/agentvoiceresponse - Official Docker images
📦 NPM Packages: npmjs.com/~agentvoiceresponse - Node.js packages and tools

💬 Community Channels

💬 Discord: discord.gg/DFTU69Hg74 - Real-time chat, support, and discussions
🐙 GitHub: github.com/agentvoiceresponse - Source code, issues, and contributions
📧 Email: [email protected] - Business inquiries and partnerships

🚀 Get Involved

⭐ Star our repositories to show your support
🐛 Report bugs and request features on GitHub
💡 Share your use cases and success stories
🤝 Contribute to the project with code, documentation, or testing
📢 Spread the word about AVR in your network

💖 Support AVR Development

AVR is 100% free and open-source for both personal and commercial use. If you find AVR valuable for your projects or business, consider supporting its continued development.

Your support helps us:

🚀 Maintain and improve the core platform
🔧 Add new AI provider integrations
📚 Create better documentation and tutorials
🐛 Fix bugs and add requested features
🌍 Make AI voice technology accessible to everyone

📄 License

MIT License - Free for personal and commercial use

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Built with ❤️ by the Agent Voice Response team