ElatoAI is a GitHub repository that provides a framework for running real-time voice AI on ESP32 microcontrollers programmed with the Arduino framework. The project is designed to support more than 100 voice AI models, for building AI-powered toys, companions, and other devices. It uses secure WebSockets and edge functions to keep conversations uninterrupted from anywhere in the world.
Key features of ElatoAI include real-time speech-to-speech conversion and support for custom AI agents, each with its own personality and voice. The system uses server-side Voice Activity Detection (VAD) to manage conversation flow, Opus audio compression for efficient, high-quality audio streaming, and Deno Edge Functions for low-latency performance worldwide. It is built on the ESP32 Arduino framework, which makes it accessible for hardware integration. The project also lists integrations with large language model (LLM), text-to-speech (TTS), and speech-to-text (STT) providers and models such as OpenAI, Gemini, xAI, Deepgram, and Whisper.
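To make the idea of server-side VAD concrete, here is a minimal TypeScript sketch of an energy-based detector that could run in an edge function over incoming 16-bit PCM frames. It is an illustration only and not ElatoAI's implementation: the names `frameRms` and `EnergyVad`, the RMS threshold, and the hangover frame count are all assumptions, and a production system may rely on a provider's built-in VAD instead.

```typescript
// Hypothetical energy-based VAD sketch; names and defaults are assumptions,
// not taken from the ElatoAI repository.

/** Root-mean-square level of one PCM16 frame, normalized to the range 0..1. */
function frameRms(frame: Int16Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (const sample of frame) {
    const normalized = sample / 32768;
    sumSquares += normalized * normalized;
  }
  return Math.sqrt(sumSquares / frame.length);
}

type VadEvent = "speech_start" | "speech_end" | null;

class EnergyVad {
  private speaking = false;
  private silentFrames = 0;
  private readonly threshold: number;
  private readonly hangoverFrames: number;

  // threshold: assumed RMS level above which a frame counts as speech.
  // hangoverFrames: assumed number of consecutive quiet frames that ends a turn.
  constructor(threshold = 0.02, hangoverFrames = 10) {
    this.threshold = threshold;
    this.hangoverFrames = hangoverFrames;
  }

  /** Feed one frame; returns an event only when the speaking state changes. */
  process(frame: Int16Array): VadEvent {
    const loud = frameRms(frame) >= this.threshold;
    if (loud) {
      this.silentFrames = 0;
      if (!this.speaking) {
        this.speaking = true;
        return "speech_start";
      }
      return null;
    }
    if (this.speaking) {
      this.silentFrames += 1;
      if (this.silentFrames >= this.hangoverFrames) {
        this.speaking = false;
        this.silentFrames = 0;
        return "speech_end";
      }
    }
    return null;
  }
}
```

In a sketch like this, the server would call `process` on each decoded audio frame and treat `speech_end` as the cue to hand the buffered utterance to the model. The hangover count keeps short pauses between words from ending the turn early.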