Voice Providers
Patter supports three voice AI architectures. Each offers different tradeoffs between latency, voice quality, and customization.
OpenAI Realtime (Default)
End-to-end voice processing powered by OpenAI’s Realtime API. Audio goes directly to OpenAI, which handles speech recognition, language understanding, and speech synthesis in a single round trip.
agent = phone.agent(
    system_prompt="You are a helpful assistant.",
    provider="openai_realtime",  # default
    model="gpt-4o-mini-realtime-preview",
    voice="alloy",
)
Audio Encoding
OpenAI Realtime handles audio encoding automatically based on your telephony provider:
| Telephony Provider | Audio Format | Sample Rate |
|---|---|---|
| Twilio | G.711 mu-law | 8 kHz |
| Telnyx | PCM 16-bit | 16 kHz |
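Patter performs this conversion for you, so nothing below is required to use the library. As an illustration of what the G.711 mu-law row in the table means, here is a minimal pure-Python sketch of the standard companding step that turns one signed 16-bit PCM sample into one 8-bit mu-law byte. The function names `linear_to_mulaw` and `encode_pcm16` are hypothetical and are not part of Patter's API.

```python
import struct

BIAS = 0x84   # 132, added before locating the segment (standard G.711 mu-law bias)
CLIP = 32635  # largest magnitude that can take the bias without overflowing 15 bits


def linear_to_mulaw(sample: int) -> int:
    """Compress one signed 16-bit PCM sample into one mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS

    # The exponent (segment) is the position of the highest set bit,
    # counted from bit 7 (exponent 0) up to bit 14 (exponent 7).
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1

    # The mantissa is the four bits just below the leading bit.
    mantissa = (magnitude >> (exponent + 3)) & 0x0F

    # mu-law bytes are transmitted bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def encode_pcm16(pcm: bytes) -> bytes:
    """Encode little-endian 16-bit PCM bytes as mu-law bytes."""
    count = len(pcm) // 2
    samples = struct.unpack(f"<{count}h", pcm[: count * 2])
    return bytes(linear_to_mulaw(s) for s in samples)
```

Each 16-bit sample becomes a single byte, which is why mu-law at 8 kHz fits in 64 kbit/s.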
Available Voices
"alloy", "echo", "fable", "onyx", "nova", "shimmer"
Requirements
openai_key in the Patter constructor (local mode)
ElevenLabs Conversational AI
Uses ElevenLabs’ Conversational AI platform for natural, expressive voices. Ideal when voice quality is the top priority.
agent = phone.agent(
    system_prompt="You are a warm and friendly concierge.",
    provider="elevenlabs_convai",
    voice="rachel",
)
Configuration
When using ElevenLabs ConvAI, you can configure additional provider-specific parameters through the agent:
| Parameter | Description |
|---|---|
| voice | ElevenLabs voice ID or name (e.g., "rachel", "adam") |
| model | Model identifier for ElevenLabs |
Requirements
elevenlabs_key in the Patter constructor (local mode)
Pipeline Mode
Build a custom voice pipeline by combining separate STT (speech-to-text) and TTS (text-to-speech) providers. This gives you full control over each stage of the audio processing chain.
agent = phone.agent(
    system_prompt="You are a helpful assistant.",
    provider="pipeline",
    stt=Patter.deepgram(api_key="dg_..."),
    tts=Patter.elevenlabs(api_key="el_...", voice="rachel"),
)
In pipeline mode, the on_message callback receives the transcribed text and returns the response to synthesize:
async def handle_message(event) -> str:
    return f"You said: {event['text']}. How can I help?"
await phone.serve(agent, on_message=handle_message)
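A handler usually does more than echo the transcript. The sketch below shows one possible shape for it, assuming only what the snippet above shows: the event is a dict with a "text" key, and the returned string is what gets synthesized. The empty-transcript check and the "hours" keyword rule are illustrative placeholders, not Patter behavior.

```python
import asyncio


async def handle_message(event: dict) -> str:
    """Return the reply to synthesize for one transcribed utterance."""
    # Assumption: the transcript lives under event["text"], as in the snippet above.
    original = event.get("text", "")
    text = original.strip().lower()

    # Guard against empty transcripts (e.g. background noise).
    if not text:
        return "Sorry, I didn't catch that. Could you repeat it?"

    # Hypothetical keyword rule; replace with your own LLM or business logic.
    if "hours" in text:
        return "We are open from nine to five, Monday through Friday."

    return f"You said: {original.strip()}. How can I help?"


if __name__ == "__main__":
    # Exercise the handler locally, without a phone call.
    print(asyncio.run(handle_message({"text": "What are your hours?"})))
```

Because the handler is a plain coroutine, it can be tested with `asyncio.run` before it is passed to `phone.serve`.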
Requirements
Pipeline mode requires both an STT and a TTS provider. If you don’t pass stt/tts explicitly, Patter falls back to deepgram_key and elevenlabs_key from the constructor.
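The fallback rule can be checked up front, before any call is placed. The following is a rough sketch of that rule as described above, not Patter's internal code: `resolve_stage` is a hypothetical helper that reports whether a stage would use the explicitly passed provider or the constructor key, and fails early if neither exists.

```python
from typing import Optional


def resolve_stage(explicit: Optional[object], fallback_key: Optional[str], stage: str) -> str:
    """Report which source a pipeline stage would use: 'explicit' or 'fallback'.

    explicit     -- the provider passed as stt=... or tts=... (None if omitted)
    fallback_key -- the matching constructor key (deepgram_key or elevenlabs_key)
    stage        -- a label used only in the error message, e.g. "STT"
    """
    if explicit is not None:
        return "explicit"
    if fallback_key:
        return "fallback"
    raise ValueError(
        f"Pipeline mode needs a {stage} provider: pass one explicitly "
        "or set the fallback key in the constructor."
    )


if __name__ == "__main__":
    # No stt= argument, but a Deepgram key was given to the constructor.
    print(resolve_stage(None, "dg_example", "STT"))
```

Running a check like this at startup surfaces a missing key as a configuration error rather than a failure in the middle of a call.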
STT Providers
Use these factory methods to configure speech-to-text:
Patter.deepgram()
stt = Patter.deepgram(api_key="dg_...", language="en")
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | required | Your Deepgram API key. |
| language | str | "en" | BCP-47 language code. |
Patter.whisper()
stt = Patter.whisper(api_key="sk-...", language="en")
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | required | Your OpenAI API key. |
| language | str | "en" | BCP-47 language code. |
TTS Providers
Use these factory methods to configure text-to-speech:
Patter.elevenlabs()
tts = Patter.elevenlabs(api_key="el_...", voice="rachel")
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | required | Your ElevenLabs API key. |
| voice | str | "rachel" | Voice name or ID. |
Patter.openai_tts()
tts = Patter.openai_tts(api_key="sk-...", voice="alloy")
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | required | Your OpenAI API key. |
| voice | str | "alloy" | Voice name ("alloy", "echo", "fable", "onyx", "nova", "shimmer"). |
OpenAI TTS returns audio at 24 kHz. Patter automatically resamples it to 16 kHz for telephony compatibility.
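To make the 24 kHz to 16 kHz step concrete, here is a minimal sketch of sample-rate conversion by linear interpolation. It is an illustration only: how Patter resamples internally is not documented here, and a production resampler would normally low-pass filter before downsampling to limit aliasing. The function name `resample_linear` is hypothetical.

```python
def resample_linear(samples: list[int], src_rate: int, dst_rate: int) -> list[int]:
    """Resample PCM samples from src_rate to dst_rate by linear interpolation."""
    if not samples:
        return []

    n_out = len(samples) * dst_rate // src_rate
    step = src_rate / dst_rate  # input samples advanced per output sample (1.5 for 24k -> 16k)

    out = []
    for i in range(n_out):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        # Weighted average of the two neighbouring input samples.
        out.append(round(samples[left] * (1 - frac) + samples[right] * frac))
    return out


if __name__ == "__main__":
    # Six input samples at 24 kHz become four output samples at 16 kHz.
    print(resample_linear([0, 30, 60, 90, 120, 150], 24_000, 16_000))
```

Going from 24 kHz to 16 kHz keeps two output samples for every three input samples, so every other output value falls halfway between two inputs.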
Provider Comparison
| Feature | OpenAI Realtime | ElevenLabs ConvAI | Pipeline |
|---|---|---|---|
| Latency | Lowest | Low | Medium |
| Voice quality | Good | Best | Configurable |
| Customization | Limited | Medium | Full |
| on_message callback | No | No | Yes |
| Requires AI key | OpenAI | ElevenLabs | STT + TTS keys |
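One way to read the table is as a decision rule. The sketch below encodes it in a small helper; `choose_provider` and its two flags are hypothetical and exist only to illustrate the tradeoffs, while the returned strings are the `provider` values used on this page.

```python
def choose_provider(needs_on_message: bool, prioritize_voice_quality: bool) -> str:
    """Pick a provider string from the comparison table's tradeoffs."""
    # Only pipeline mode exposes the on_message callback.
    if needs_on_message:
        return "pipeline"
    # ElevenLabs ConvAI is rated best for voice quality.
    if prioritize_voice_quality:
        return "elevenlabs_convai"
    # Otherwise take the default, which has the lowest latency.
    return "openai_realtime"


if __name__ == "__main__":
    print(choose_provider(needs_on_message=False, prioritize_voice_quality=True))
```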
Complete Pipeline Example
import os
import asyncio

from dotenv import load_dotenv
from patter import Patter

load_dotenv()

phone = Patter(
    twilio_sid=os.environ["TWILIO_SID"],
    twilio_token=os.environ["TWILIO_TOKEN"],
    phone_number=os.environ["PHONE_NUMBER"],
    webhook_url=os.environ["WEBHOOK_URL"],
)

agent = phone.agent(
    system_prompt="You are a helpful assistant.",
    provider="pipeline",
    stt=Patter.deepgram(api_key=os.environ["DEEPGRAM_KEY"]),
    tts=Patter.elevenlabs(api_key=os.environ["ELEVENLABS_KEY"], voice="rachel"),
)

async def handle_message(event) -> str:
    user_text = event["text"]
    # Add your own LLM logic here
    return f"I heard you say: {user_text}"

async def main():
    await phone.serve(agent, on_message=handle_message, port=8000)

asyncio.run(main())