Agent Voice Response (AVR), agentvoiceresponse.com, is an advanced IVR solution that integrates with AI, providing a voicebot interface through Asterisk's AudioSocket application. This architecture lets AI-powered conversational agents replace traditional IVR systems.
AVR Core manages real-time voice communication between customers and an Asterisk AudioSocket application, and interacts with various AI services:
- ASR (Automatic Speech Recognition): Transcribes the incoming audio stream from the customer into text.
- LLM (Large Language Model): Interprets the text and generates an appropriate response.
- TTS (Text-to-Speech): Converts the generated text response back into speech, which is then played to the customer.
- STT (Speech-to-Text): Provides accurate transcription of spoken language into text, supporting multiple languages and dialects.
- STS (Speech-to-Speech): Enables direct voice-to-voice communication with AI agents, creating natural and fluid conversations.
AVR Core is designed to be flexible, allowing users to integrate any ASR, LLM, and TTS services by interacting via HTTP API Streams. This modularity allows you to develop your own middleware between AVR Core and the services of your choice. In recent versions, AVR Core has been enhanced with STT (Speech-to-Text) integration to support providers that don't yet offer ASR capabilities, and STS (Speech-to-Speech) integration for direct connection with Conversational AI services like OpenAI Realtime, bypassing the need for separate ASR, LLM, and TTS components.
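As an illustration of the middleware idea, here is a minimal sketch of an LLM-side service that AVR Core could call over a streaming HTTP endpoint. The /prompt-stream path and port 6005 come from the example LLM_URL in this README. The JSON request body with a "message" field, the plain-text chunked response, and the names generate_reply, PromptStreamHandler, and make_server are assumptions made for this sketch, not AVR Core's documented contract; check the README of the integration you are replacing for the actual payload format.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def generate_reply(text):
    """Placeholder for a real LLM call; yields the reply in small pieces."""
    for word in f"You said: {text}".split():
        yield word + " "


class PromptStreamHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # chunked transfer encoding requires HTTP/1.1

    def do_POST(self):
        if self.path != "/prompt-stream":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        # Assumed request shape: {"message": "<transcribed text>"}
        payload = json.loads(self.rfile.read(length) or b"{}")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()
        for piece in generate_reply(payload.get("message", "")):
            data = piece.encode("utf-8")
            if not data:
                continue  # an empty chunk would end the stream early
            # Each chunk: hex length, CRLF, data, CRLF
            self.wfile.write(f"{len(data):X}\r\n".encode() + data + b"\r\n")
        self.wfile.write(b"0\r\n\r\n")  # terminating chunk

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=6005):
    """Build a server bound to localhost; call serve_forever() to run it."""
    return ThreadingHTTPServer(("127.0.0.1", port), PromptStreamHandler)
```

To try it, call make_server().serve_forever() and point LLM_URL at http://localhost:6005/prompt-stream. Replacing generate_reply with a call to your own model is the only change the handler needs.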
- Asterisk sends the audio stream to the AVR Core.
- AVR Core forwards the audio to an ASR service for transcription (e.g., ASR_URL=http://localhost:6001/speech-to-text-stream).
- Once transcription is received, AVR Core sends the text to an LLM service (e.g., LLM_URL=http://localhost:6005/prompt-stream).
- The LLM generates a response, which is sent to a TTS service for voice synthesis (e.g., TTS_URL=http://localhost:6003/text-to-speech-stream).
- The synthesized voice is played back to the customer via Asterisk.
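The flow above can be sketched as a chain of three streaming stages. This is a minimal illustration, not AVR Core's actual implementation: run_turn and the fake_asr, fake_llm, and fake_tts stand-ins are hypothetical names, and a real deployment would replace the stand-ins with HTTP calls to the ASR_URL, LLM_URL, and TTS_URL endpoints.

```python
from typing import Callable, Iterable, Iterator

Transcriber = Callable[[Iterable[bytes]], Iterator[str]]
Responder = Callable[[str], Iterator[str]]
Synthesizer = Callable[[str], Iterator[bytes]]


def run_turn(
    audio: Iterable[bytes],
    transcribe: Transcriber,
    respond: Responder,
    synthesize: Synthesizer,
) -> Iterator[bytes]:
    """Chain ASR -> LLM -> TTS, yielding audio as soon as each stage produces output."""
    for transcript in transcribe(audio):
        for reply_text in respond(transcript):
            yield from synthesize(reply_text)


# Stand-in stages so the sketch runs without any external service.
def fake_asr(audio: Iterable[bytes]) -> Iterator[str]:
    yield f"{sum(len(chunk) for chunk in audio)} bytes heard"


def fake_llm(text: str) -> Iterator[str]:
    yield f"Reply to: {text}"


def fake_tts(text: str) -> Iterator[bytes]:
    yield text.encode("utf-8")


frames = [b"\x00" * 320, b"\x00" * 320]
for chunk in run_turn(frames, fake_asr, fake_llm, fake_tts):
    print(chunk)
```

Because every stage is a generator, speech synthesis can start on the first piece of LLM output instead of waiting for the full response, which is the point of streaming between the services.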
- Plug-and-play architecture: Easily swap out different ASR, LLM, and TTS services.
- Real-time voice-to-text and text-to-voice streaming: Handles customer interactions seamlessly via HTTP API streams.
- Scalable design: Integrate your own AI services using custom middleware.
- Multi-language support: Handle conversations in multiple languages.
- Customizable voice personalities: Configure different voice characteristics for your AI agents.
- Detailed analytics: Monitor and analyze call metrics and performance.
- Secure communication: End-to-end encryption for all voice and data streams.
For a list of available integrations, check Agent Voice Response Integrations.
Before installing AVR, ensure you have the following components:
- Docker and Docker Compose
- An Asterisk server with AudioSocket module enabled
- Access and credentials to ASR, LLM, and TTS services
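A quick way to confirm the command-line prerequisites is to look them up on PATH. The sketch below is only a convenience check, assuming the tools are installed as docker and docker-compose executables (newer Docker releases may ship Compose as the docker compose plugin instead). check_prerequisites is a hypothetical helper name, and the check says nothing about Asterisk or your service credentials.

```python
import shutil


def check_prerequisites(commands=("docker", "docker-compose")):
    """Map each command name to True if an executable with that name is on PATH."""
    return {name: shutil.which(name) is not None for name in commands}


for name, found in check_prerequisites().items():
    print(f"{name}: {'found' if found else 'missing'}")
```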
- Clone the AVR Infrastructure: Clone the avr-infra repository from GitHub:
git clone https://github.com/agentvoiceresponse/avr-infra.git
cd avr-infra
- Follow the Instructions in the README: Inside the cloned repository, follow the setup and configuration steps described in the README.md file to launch your AVR agent with the desired ASR, LLM, and TTS providers.
If you encounter issues during installation or usage:
Connection Issues:
- Ensure all services are running with docker-compose ps
- Check logs for specific errors: docker-compose logs avr-core
- Verify network connectivity between services
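To verify connectivity between services, one option is a plain TCP check against each configured endpoint. The sketch below assumes the example URLs from this README; host_and_port, is_reachable, and check_services are hypothetical helper names. A successful connection only shows that the port accepts connections, not that the service behind it is healthy.

```python
import socket
from urllib.parse import urlparse

# Example endpoints taken from this README; adjust them to your deployment.
SERVICES = {
    "ASR_URL": "http://localhost:6001/speech-to-text-stream",
    "LLM_URL": "http://localhost:6005/prompt-stream",
    "TTS_URL": "http://localhost:6003/text-to-speech-stream",
}


def host_and_port(url):
    """Extract (host, port) from a URL, falling back to the scheme's default port."""
    parsed = urlparse(url)
    default_port = 443 if parsed.scheme == "https" else 80
    return parsed.hostname, parsed.port or default_port


def is_reachable(url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    try:
        with socket.create_connection(host_and_port(url), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services=SERVICES):
    """Map each service name to its reachability."""
    return {name: is_reachable(url) for name, url in services.items()}
```

Calling print(check_services()) reports which endpoints accept connections. When the services run in Docker containers, run the check from a host or container on the same network, since localhost inside a container refers to that container only.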
Audio Quality Issues:
- Verify the audio codec settings in Asterisk
- Check the ASR service compatibility with your audio format
- Ensure proper audio device configuration
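When checking audio format compatibility, it can help to inspect a test recording before sending it through the ASR service. The sketch below reads a WAV header with Python's standard wave module and compares it with an expected format. The expected values (mono, 16-bit, 8 kHz) reflect how AudioSocket audio is commonly described and are an assumption here; confirm them against your Asterisk version and your ASR provider's requirements. describe_wav and format_mismatches are hypothetical helper names.

```python
import wave

# Assumed target format: mono, 16-bit samples, 8 kHz. Verify for your setup.
EXPECTED = {"channels": 1, "sample_width": 2, "frame_rate": 8000}


def describe_wav(source):
    """Read the header of a WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width": wav.getsampwidth(),  # bytes per sample
            "frame_rate": wav.getframerate(),    # samples per second
        }


def format_mismatches(actual, expected=EXPECTED):
    """Return {field: (actual, expected)} for every field that differs."""
    return {
        key: (actual[key], want)
        for key, want in expected.items()
        if actual[key] != want
    }
```

For example, format_mismatches(describe_wav("sample.wav")) returns an empty dict when the recording matches the expected format, and otherwise lists each differing field with its actual and expected value.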
Performance Issues:
- Consider scaling resources for components handling high traffic
- Optimize LLM prompts for faster response times
- Monitor system resource usage
Common Solutions:
- Check our Troubleshooting Guide
- Review the FAQ
Join our growing community of developers and users to share ideas, get help, and collaborate on projects:
- Discord Server - Connect with other AVR users and the development team
- Documentation Wiki - Comprehensive guides and tutorials
- Website - Latest updates and feature announcements
Copyright (c) 2024 Agent Voice Response
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.