Voice Agents API
AI-powered phone calls, audio transcription, and clinical note generation.
The MedSync Platform API provides two core voice capabilities:
- Agent Calls — initiate AI voice agent phone calls that can handle appointment reminders, follow-ups, confirmations, or fully custom conversations.
- Audio Transcription — upload audio recordings and receive speaker-diarized transcripts with AI-generated clinical notes.
Base URL: https://api.gomedsync.com
Architecture
Agent Call Flow
When you initiate a call, MedSync builds a prompt from the purpose and context you provide and sends it to ElevenLabs Conversational AI, which handles voice synthesis and conversation logic. ElevenLabs then places the actual phone call through Twilio. After the call completes, the transcript and AI-generated summary are available via the ElevenLabs conversation API.
Transcription Flow
Audio is transcribed by Deepgram with speaker diarization, then Gemini generates a structured clinical note based on the selected template.
Authentication
All API requests require an X-API-Key header. API keys are provisioned per client through the admin panel.
curl -H "X-API-Key: your_api_key_here" \
  https://api.gomedsync.com/v1/usage
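The same request can be sketched in Python with only the standard library. The helper name `build_request` is hypothetical and not part of any MedSync SDK; it only constructs the request object, so sending it is left to the caller.

```python
import urllib.request

BASE_URL = "https://api.gomedsync.com"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request carrying the X-API-Key header (hypothetical helper)."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={"X-API-Key": api_key},
        method="GET",
    )


req = build_request("/v1/usage", "your_api_key_here")
# To actually send it: urllib.request.urlopen(req, timeout=10)
```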
Agent Calls
The agent call system lets you trigger AI-powered outbound phone calls. The AI agent follows a prompt you define and has a natural conversation with the person who answers.
Initiate a Call
Place an outbound AI voice call to a phone number.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| patient_phone | string | Yes | Phone number in E.164 format (e.g. +14155551234) |
| patient_name | string | No | Name of the person being called. Used in the greeting. |
| purpose | string | No | One of: appointment_reminder, follow_up, appointment_confirmation, custom. Defaults to appointment_reminder. |
| custom_prompt | string | No | Custom system prompt when purpose is custom. |
| context | object | No | Key-value pairs appended to the prompt. If context.prompt is set and purpose is custom, it overrides custom_prompt. |
| language | string | No | Language code: en, es, ca, fr, pt. Defaults to en. |
| voice_id | string | No | ElevenLabs voice ID override. |
| callback_url | string | No | URL to receive a POST with call results when the call completes. |
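Checking these fields client-side before sending can catch malformed requests early. The sketch below is one possible way to do that; `build_call_payload` is a hypothetical helper, and the server may apply different or stricter validation than shown here.

```python
import re

# E.164: a leading '+', a non-zero first digit, at most 15 digits in total.
E164 = re.compile(r"\+[1-9][0-9]{1,14}")
PURPOSES = {"appointment_reminder", "follow_up", "appointment_confirmation", "custom"}
LANGUAGES = {"en", "es", "ca", "fr", "pt"}


def build_call_payload(patient_phone, purpose="appointment_reminder",
                       language="en", **optional):
    """Validate the documented fields and return a JSON-ready dict."""
    if not E164.fullmatch(patient_phone):
        raise ValueError(f"not an E.164 number: {patient_phone!r}")
    if purpose not in PURPOSES:
        raise ValueError(f"unknown purpose: {purpose!r}")
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    payload = {"patient_phone": patient_phone, "purpose": purpose, "language": language}
    # Optional fields (patient_name, context, voice_id, callback_url, ...) are
    # passed through unchecked; None values are dropped.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, `build_call_payload("+14155551234", patient_name="Maria Garcia")` returns a dict that can be serialized with `json.dumps` and used as the request body.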
Example Request
curl -X POST https://api.gomedsync.com/v1/agent-call \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "patient_phone": "+14155551234",
    "patient_name": "Maria Garcia",
    "purpose": "appointment_reminder",
    "context": {
      "appointment_date": "May 22, 2026 at 10:00 AM",
      "doctor_name": "Dr. Smith",
      "clinic_name": "Sunrise Medical"
    },
    "language": "en",
    "callback_url": "https://yourapp.com/webhooks/call-complete"
  }'
Response
{
  "call_sid": "CA1234567890abcdef...",
  "conversation_id": "conv_abc123def456...",
  "status": "initiated",
  "message": "Agent call initiated to +14155551234"
}
| Field | Description |
|---|---|
| call_sid | Twilio call SID for tracking the phone call. |
| conversation_id | ElevenLabs conversation ID. Use this to fetch the transcript and summary after the call. |
| status | Always initiated on success. |
Custom Prompt Example
For fully custom call scripts, set purpose to custom and provide your prompt:
curl -X POST https://api.gomedsync.com/v1/agent-call \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "patient_phone": "+34612345678",
    "patient_name": "Carlos",
    "purpose": "custom",
    "context": {
      "prompt": "You are a friendly receptionist calling to inform the patient that their lab results are ready for pickup. Ask them to come to the clinic during business hours (9 AM to 6 PM). Be warm and brief."
    },
    "language": "es"
  }'
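The request body table states that context.prompt overrides custom_prompt when purpose is custom. The sketch below illustrates that precedence on the client side; `effective_custom_prompt` is a hypothetical helper, and the server's actual prompt assembly may differ in detail.

```python
from typing import Optional


def effective_custom_prompt(payload: dict) -> Optional[str]:
    """Return the prompt a custom call would use, per the documented precedence."""
    if payload.get("purpose") != "custom":
        return None  # custom prompts only apply when purpose is "custom"
    context_prompt = (payload.get("context") or {}).get("prompt")
    # context.prompt wins over custom_prompt when both are present.
    return context_prompt or payload.get("custom_prompt")
```

Running this over a payload before sending makes it easy to confirm which of the two fields will take effect.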
Get Call Status
Retrieve the current status of a call by its Twilio call SID.
Example
curl https://api.gomedsync.com/v1/agent-call/CA1234567890abcdef \
  -H "X-API-Key: your_api_key"
Response
{
  "call_sid": "CA1234567890abcdef",
  "status": "completed",
  "duration_seconds": 47.0,
  "purpose": "appointment_reminder",
  "patient_phone": "+14155551234",
  "summary": "",
  "created_at": "2026-05-20T14:30:00"
}
Possible status values: initiated, ringing, in-progress, completed, failed, busy, no-answer.
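Client code usually needs to decide whether a status is still in flight, finished, or worth retrying. The grouping below is an assumption based on common telephony semantics (busy and no-answer treated as retryable), not a behavior documented by MedSync; `classify_status` is a hypothetical helper.

```python
# Assumption: this grouping is illustrative, not a MedSync guarantee.
PENDING = {"initiated", "ringing", "in-progress"}
RETRYABLE = {"busy", "no-answer"}


def classify_status(status: str) -> str:
    """Map a call status to 'pending', 'done', 'retry', or 'failed'."""
    if status in PENDING:
        return "pending"
    if status == "completed":
        return "done"
    if status in RETRYABLE:
        return "retry"
    if status == "failed":
        return "failed"
    raise ValueError(f"unknown call status: {status!r}")
```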
List Call Purposes
Returns the available built-in call purposes with descriptions.
curl https://api.gomedsync.com/v1/agent-call/purposes \
  -H "X-API-Key: your_api_key"
{
  "purposes": [
    {"id": "appointment_reminder", "description": "Remind a patient about their upcoming appointment..."},
    {"id": "follow_up", "description": "Check on a patient after their recent visit..."},
    {"id": "appointment_confirmation", "description": "Confirm a newly booked appointment..."},
    {"id": "custom", "description": "Custom prompt"}
  ]
}
Status Webhook
Twilio status callback endpoint (internal). Receives call status updates and triggers the client callback.
This endpoint is called by Twilio automatically. You do not call it directly. When a call completes and you provided a callback_url, your URL receives a POST with:
{
  "call_sid": "CA1234567890abcdef",
  "status": "completed",
  "duration_seconds": 47.0,
  "patient_phone": "+14155551234",
  "purpose": "appointment_reminder",
  "summary": ""
}
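On the receiving side, the callback body can be decoded into a typed record before further processing. This is a minimal sketch that assumes the payload has exactly the fields shown above; `CallResult` and `parse_call_result` are hypothetical names, and wiring the function into a web framework is left out.

```python
import json
from dataclasses import dataclass


@dataclass
class CallResult:
    call_sid: str
    status: str
    duration_seconds: float
    patient_phone: str
    purpose: str
    summary: str


def parse_call_result(body: bytes) -> CallResult:
    """Decode the JSON body POSTed to callback_url into a CallResult."""
    data = json.loads(body)
    return CallResult(
        call_sid=data["call_sid"],
        status=data["status"],
        # Treat a missing or null duration as zero seconds.
        duration_seconds=float(data.get("duration_seconds") or 0.0),
        patient_phone=data["patient_phone"],
        purpose=data["purpose"],
        summary=data.get("summary") or "",
    )
```

Required keys raise `KeyError` when absent, which a handler can translate into a 400 response.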
Conversation Data (Transcript & Summary)
After a call completes, the full transcript and an AI-generated summary are available from ElevenLabs using the conversation_id returned when you initiated the call.
Fetch conversation details from ElevenLabs directly. Requires your ElevenLabs API key.
curl https://api.elevenlabs.io/v1/convai/conversations/conv_abc123def456 \
  -H "xi-api-key: your_elevenlabs_api_key"
Key Response Fields
| Field | Description |
|---|---|
| transcript | Array of {role, message} objects. role is "agent" or "user". |
| analysis.transcript_summary | AI-generated summary of the conversation. |
| call_duration_secs | Duration in seconds. |
| call_successful | Boolean. Whether the call connected and completed. |
| status | "done", "processing", or "ongoing". |
The transcript and summary are complete only once status is "done".
Example: Parsing the Transcript
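# Python example: fetch a conversation and print its transcript and summary
import asyncio
import httpx

async def get_transcript(conversation_id: str, api_key: str) -> None:
    # Use a context manager so the client's connections are closed afterwards
    async with httpx.AsyncClient() as client:
        r = await client.get(
            f"https://api.elevenlabs.io/v1/convai/conversations/{conversation_id}",
            headers={"xi-api-key": api_key},
        )
    r.raise_for_status()  # fail early on 4xx/5xx instead of parsing an error body
    data = r.json()
    # Format transcript
    for turn in data.get("transcript", []):
        role = "Agent" if turn["role"] == "agent" else "Patient"
        print(f"{role}: {turn['message']}")
    # Get summary
    summary = (data.get("analysis") or {}).get("transcript_summary", "")
    print(f"\nSummary: {summary}")

# Usage: asyncio.run(get_transcript("conv_abc123def456", "your_elevenlabs_api_key"))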
Audio Transcription
Upload an audio recording and receive a speaker-diarized transcript with an AI-generated clinical note. Designed for doctor-patient encounter recordings.
Transcribe Audio
Upload audio and get a transcript with a clinical note.
Request (multipart/form-data)
| Field | Type | Required | Description |
|---|---|---|---|
| audio | file | Yes | Audio file. Supports WebM, MP4, WAV, MP3, OGG. Max 150 MB. |
| language | string | No | Language code (en, es, ca, fr). Defaults to en. |
| template | string | No | Clinical note template. Defaults to general_consultation. |
Example
curl -X POST https://api.gomedsync.com/v1/transcribe \
  -H "X-API-Key: your_api_key" \
  -F "audio=@recording.webm" \
  -F "language=en" \
  -F "template=general_consultation"
Response
{
  "transcript": [
    {
      "speaker": 0,
      "text": "So tell me what brings you in today.",
      "start": 0.5,
      "end": 2.8
    },
    {
      "speaker": 1,
      "text": "I've been having headaches for the past week.",
      "start": 3.1,
      "end": 5.9
    }
  ],
  "clinical_note": "## Chief Complaint\nHeadaches for one week...",
  "language": "en",
  "template": "general_consultation",
  "duration_seconds": 245.6,
  "client_id": "your_client_id"
}
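Diarized output often splits one speaker's turn across several segments. The sketch below merges consecutive segments from the same speaker and renders them as readable lines. It assumes only the segment fields shown in the response above; the helper names are hypothetical, and which speaker number corresponds to the doctor or the patient is not something the response states.

```python
def merge_turns(segments):
    """Merge consecutive segments from the same speaker into single turns."""
    turns = []
    for seg in segments:
        if turns and turns[-1]["speaker"] == seg["speaker"]:
            turns[-1]["text"] += " " + seg["text"]
            turns[-1]["end"] = seg["end"]
        else:
            turns.append(dict(seg))  # copy so the input list is not mutated
    return turns


def format_turns(turns):
    """Render turns as '[start s] Speaker N: text', one per line."""
    return "\n".join(
        f"[{t['start']:6.1f}s] Speaker {t['speaker']}: {t['text']}" for t in turns
    )
```

Typical use would be `print(format_turns(merge_turns(response["transcript"])))`, where `response` is the parsed JSON body.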
Clinical Note Templates
| Template | Use Case |
|---|---|
| general_consultation | Standard SOAP-format clinical note |
| follow_up | Follow-up visit with progress assessment |
| mental_health | Behavioral health / therapy session |
| pediatric | Pediatric visit with growth/development |
| emergency | Emergency department encounter |
Supported Languages
| Code | Language | Agent Calls | Transcription |
|---|---|---|---|
| en | English | Yes | Yes |
| es | Spanish | Yes | Yes |
| ca | Catalan | Yes (uses Spanish voice) | Yes |
| fr | French | Yes | Yes |
| pt | Portuguese | Yes | Yes |
Usage Tracking
Get a summary of your API usage over a time period.
curl "https://api.gomedsync.com/v1/usage?days=30" \
  -H "X-API-Key: your_api_key"
{
  "client_id": "your_client_id",
  "period_days": 30,
  "usage": {
    "transcription": {"count": 42, "total_duration_seconds": 12450},
    "agent_call": {"count": 15, "total_duration_seconds": 680},
    "whatsapp": {"count": 230, "total_duration_seconds": 0}
  }
}
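Durations are reported in seconds, so a small conversion helps when reporting usage in minutes. A minimal sketch, assuming the response shape shown above; `usage_minutes` is a hypothetical helper name.

```python
def usage_minutes(report: dict) -> dict:
    """Map each service in a usage response to its total minutes, rounded to 0.1."""
    return {
        service: round(stats["total_duration_seconds"] / 60, 1)
        for service, stats in report["usage"].items()
    }
```

With the sample response above, transcription comes out at 207.5 minutes and agent calls at 11.3 minutes.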
Configuration
Environment Variables
| Variable | Required | Description |
|---|---|---|
| ELEVENLABS_API_KEY | Yes (for calls) | ElevenLabs API key for voice agent calls. |
| ELEVENLABS_AGENT_ID | Yes (for calls) | Pre-configured ElevenLabs agent ID. |
| ELEVENLABS_PHONE_NUMBER_ID | Yes (for calls) | ElevenLabs phone number ID (your registered Twilio number). |
| DEEPGRAM_API_KEY | Yes (for transcription) | Deepgram speech-to-text API key. |
| GEMINI_API_KEY | Yes (for transcription) | Google Gemini API key for clinical note generation. |
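A startup check that reports which required variables are missing can surface configuration problems before the first request fails. The sketch below groups the variables from the table by feature; `missing_env` and the feature names "calls" and "transcription" are illustrative choices, not part of the platform.

```python
import os

REQUIRED_ENV = {
    "calls": ["ELEVENLABS_API_KEY", "ELEVENLABS_AGENT_ID", "ELEVENLABS_PHONE_NUMBER_ID"],
    "transcription": ["DEEPGRAM_API_KEY", "GEMINI_API_KEY"],
}


def missing_env(feature: str, env=None) -> list:
    """Return the required variables for a feature that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV[feature] if not env.get(name)]
```

Calling `missing_env("calls")` at startup and logging a non-empty result is one way to use it; passing a dict as `env` makes the check easy to test.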
ElevenLabs Setup
To use agent calls, you need to register your Twilio phone number with ElevenLabs:
- Create an ElevenLabs account and get an API key from elevenlabs.io.
- Create a conversational AI agent in the ElevenLabs dashboard (or let the API create one automatically).
- Register your Twilio phone number with ElevenLabs:
curl -X POST https://api.elevenlabs.io/v1/convai/phone-numbers/create \
  -H "xi-api-key: your_elevenlabs_key" \
  -H "Content-Type: application/json" \
  -d '{
    "phone_number": "+14155551234",
    "provider": "twilio",
    "sid": "your_twilio_account_sid",
    "token": "your_twilio_auth_token",
    "label": "My Clinic Number"
  }'
The response includes a phone_number_id — set this as your ELEVENLABS_PHONE_NUMBER_ID environment variable.
Troubleshooting
Call immediately disconnects (0 duration)
Check that your Twilio number is registered with ElevenLabs and that ELEVENLABS_PHONE_NUMBER_ID is correct. Also verify the destination country is enabled in Twilio Geo Permissions.
500 error when initiating a call
Verify that ELEVENLABS_API_KEY, ELEVENLABS_AGENT_ID, and ELEVENLABS_PHONE_NUMBER_ID are all set. A missing phone number ID is the most common cause.
Transcript is empty after call completes
ElevenLabs needs 30-60 seconds after a call ends to process the transcript. Poll the conversation endpoint until status is "done". Very short calls (under 5 seconds) may not generate a transcript.
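The polling advice above can be sketched as a small loop with a deadline. The fetch function is injected so the loop stays independent of any HTTP client; `wait_for_transcript`, the 10-second interval, and the 120-second timeout are illustrative assumptions rather than recommended values from ElevenLabs.

```python
import time
from typing import Callable


def wait_for_transcript(fetch_conversation: Callable[[], dict],
                        interval: float = 10.0, timeout: float = 120.0) -> dict:
    """Poll until the conversation status is "done" or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        conversation = fetch_conversation()
        if conversation.get("status") == "done":
            return conversation
        # Stop if the next attempt would start after the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError("conversation was not processed in time")
        time.sleep(interval)
```

Here `fetch_conversation` would be any zero-argument callable that returns the parsed JSON from the conversation endpoint.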
Robotic voice / high latency
Use the eleven_turbo_v2 model in your ElevenLabs agent configuration. Set stability to 0.3 and streaming latency optimization to 4. Choose a conversational voice like "Chris" over formal voices.
Transcription returns 413
Audio file exceeds the 150 MB limit. Compress the audio or split into shorter segments.
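A size check before uploading avoids a wasted transfer. The sketch below assumes the limit means 150,000,000 bytes, the stricter of the two common readings of "150 MB"; the exact byte threshold the server enforces is not stated here, and both helper names are hypothetical.

```python
import os

# Assumption: 150 MB is read as decimal megabytes, the more conservative reading.
MAX_UPLOAD_BYTES = 150 * 1000 * 1000


def fits_upload_limit(size_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True when a non-empty file of size_bytes is within the limit."""
    return 0 < size_bytes <= limit


def file_fits_upload_limit(path: str) -> bool:
    """Check a file on disk before uploading it for transcription."""
    return fits_upload_limit(os.path.getsize(path))
```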
Transcription returns 502
Deepgram or Gemini service error. Check that DEEPGRAM_API_KEY and GEMINI_API_KEY are valid and the services are operational.
Call status webhook not firing
Ensure the callback_url is publicly accessible. The webhook is sent after the call status changes to completed in the Twilio status callback. Check that /v1/agent-call/status is reachable from the internet.
