Hold Music
VoiceRail plays hold music during long reasoning operations, call transfers, and when connecting calls. Configure custom hold music to match your brand.
Overview
Hold music is configured at the organization level, so every assistant in your organization uses the same track. If no custom track is configured, VoiceRail plays its default track.
When hold music plays:
- During MCP tool calls that take longer than 3 seconds
- During webhook reasoning calls that take longer than 3 seconds
- While connecting outbound calls (ringing phase)
- During call transfers
Supported Formats
VoiceRail supports two audio formats optimized for telephony:
| Format | Requirements | Notes |
|---|---|---|
| MP3 | 16 kHz, mono, ID3v2 tag header | Recommended. Smaller file size. |
| WAV | 16-bit PCM, 16 kHz, mono | Larger files but simpler encoding. |
Important: ID3v2 Tag Requirement
MP3 files must include an ID3v2 tag header. Files without this tag will fail to play. Most audio software adds this automatically, but if you're using ffmpeg, ensure you include -id3v2_version 3.
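One quick way to check for the tag before uploading is to look at the first three bytes of the file, since an ID3v2 header begins with the ASCII marker ID3. The sketch below assumes a POSIX shell with head available. has_id3v2_header is a hypothetical helper name, not part of VoiceRail or FFmpeg, and it checks only for the marker at the start of the file, not the tag version.

```shell
# Hypothetical helper: succeeds (exit 0) if the file starts with the
# 3-byte "ID3" marker that opens an ID3v2 tag header.
has_id3v2_header() {
  [ "$(head -c 3 "$1" 2>/dev/null)" = "ID3" ]
}
```

For example, has_id3v2_header output.mp3 || echo "Missing ID3v2 tag, re-encode with -id3v2_version 3" flags a file that needs re-encoding.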
Converting Audio with FFmpeg
Use FFmpeg to convert any audio file to a compatible format:
Convert to MP3 (Recommended)
# Convert any audio file to VoiceRail-compatible MP3
ffmpeg -i input.wav \
-c:a libmp3lame \
-b:a 128k \
-ar 16000 \
-ac 1 \
-id3v2_version 3 \
-write_id3v2 1 \
output.mp3
# Explanation:
# -c:a libmp3lame Use LAME MP3 encoder
# -b:a 128k 128kbps bitrate (sufficient for voice/music)
# -ar 16000 16kHz sample rate (telephony standard)
# -ac 1 Mono audio (required for telephony)
# -id3v2_version 3 Include ID3v2 tag (required)
# -write_id3v2 1 Force ID3v2 header
Convert to WAV
# Convert to VoiceRail-compatible WAV
ffmpeg -i input.mp3 \
-c:a pcm_s16le \
-ar 16000 \
-ac 1 \
output.wav
# Explanation:
# -c:a pcm_s16le 16-bit signed little-endian PCM
# -ar 16000 16kHz sample rate (telephony standard)
# -ac 1 Mono audio (required for telephony)
File Size Guidelines
Keep your hold music files small for fast loading:
| Duration | MP3 (128kbps) | WAV (16-bit) |
|---|---|---|
| 30 seconds | ~480 KB | ~960 KB |
| 1 minute | ~960 KB | ~1.9 MB |
| 2 minutes | ~1.9 MB | ~3.8 MB |
Recommended: 30-60 seconds in length and under 2 MB in size.
Tip: Hold music loops automatically. A 30-second track is usually sufficient - longer tracks increase load time without benefit.
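The sizes in the table follow from simple arithmetic, which can be sketched as below. This is a rough estimate that assumes 1 KB = 1000 bytes and ignores tag and container overhead. The function names are illustrative only.

```shell
# MP3 at a constant bitrate: seconds * kbps / 8 = kilobytes.
# Args: duration in seconds, bitrate in kbps.
estimate_mp3_kb() {
  echo $(( $1 * $2 / 8 ))
}

# Uncompressed PCM WAV: seconds * sample rate * bytes per sample * channels.
# Args: duration in seconds, sample rate in Hz, bit depth, channel count.
estimate_wav_kb() {
  echo $(( $1 * $2 * ($3 / 8) * $4 / 1000 ))
}

estimate_mp3_kb 30 128         # prints 480
estimate_wav_kb 60 16000 16 1  # prints 1920
```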
Hosting Your Audio
Your hold music URL must be publicly accessible or include authentication (like a SAS token). We recommend Azure Blob Storage or AWS S3.
Azure Blob Storage
# Upload to Azure Blob Storage
az storage blob upload \
--account-name yourstorageaccount \
--container-name audio \
--name hold-music.mp3 \
--file output.mp3 \
--content-type audio/mpeg
# Generate SAS URL (valid for 1 year)
az storage blob generate-sas \
--account-name yourstorageaccount \
--container-name audio \
--name hold-music.mp3 \
--permissions r \
--expiry "$(date -u -d "+1 year" +%Y-%m-%dT%H:%MZ)" \
--full-uri
AWS S3
# Upload to AWS S3
aws s3 cp output.mp3 s3://your-bucket/audio/hold-music.mp3 \
--content-type audio/mpeg
# Generate pre-signed URL (S3 allows at most 7 days)
aws s3 presign s3://your-bucket/audio/hold-music.mp3 \
--expires-in 604800
# For longer access, make the object public or use CloudFront
URL Requirements
- HTTPS required - HTTP URLs are not supported
- Direct link - No redirects or login pages
- Stable URL - Ensure SAS tokens have long expiry (1+ year)
- CORS not required - VoiceRail fetches server-side
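The first two requirements can be checked from the command line before you save the URL. The sketch below is an assumption about how you might test them, not a VoiceRail tool: is_https_url and check_hold_music_url are hypothetical helper names, and the check treats any status other than 200 (including a redirect) as a failure. It uses a GET rather than a HEAD request because signed URLs are often valid for GET only.

```shell
# Hypothetical helper: succeeds only for URLs that start with https://
is_https_url() {
  case "$1" in
    https://*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: fetches the URL without following redirects
# (curl does not follow them unless -L is given) and expects HTTP 200.
check_hold_music_url() {
  is_https_url "$1" || { echo "URL must use HTTPS" >&2; return 1; }
  status=$(curl -s -o /dev/null -w '%{http_code}' "$1") || return 1
  [ "$status" = "200" ] || { echo "Expected HTTP 200, got $status" >&2; return 1; }
}
```

A passing check does not prove the audio itself will play, so it complements rather than replaces a real test call.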
Configuring Hold Music
Update your organization settings with your hold music URL:
curl -X PATCH https://api.voicerail.ai/v1/organizations/{org_id} \
-H "Authorization: Bearer $VOICERAIL_KEY" \
-H "X-Organization-Id: $ORG_ID" \
-H "Content-Type: application/json" \
-d '{
"holdMusicUrl": "https://your-storage.blob.core.windows.net/audio/hold-music.mp3?sv=..."
}'
To remove custom hold music and return to the default, set holdMusicUrl to null.
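Setting and clearing the URL use the same request with a different body. The following sketch wraps both cases. hold_music_payload and set_hold_music are hypothetical helper names, the endpoint and headers are copied from the request above, and the code assumes the organization ID in the path matches the X-Organization-Id header and that the URL contains no characters needing JSON escaping.

```shell
# Hypothetical helper: builds the request body.
# With no URL (or an empty one) it emits null, which restores the default track.
hold_music_payload() {
  if [ -z "${1:-}" ]; then
    printf '{"holdMusicUrl": null}\n'
  else
    printf '{"holdMusicUrl": "%s"}\n' "$1"
  fi
}

# Hypothetical helper. Args: organization ID, optional hold music URL.
# Expects VOICERAIL_KEY to be set in the environment.
set_hold_music() {
  curl -X PATCH "https://api.voicerail.ai/v1/organizations/$1" \
    -H "Authorization: Bearer $VOICERAIL_KEY" \
    -H "X-Organization-Id: $1" \
    -H "Content-Type: application/json" \
    -d "$(hold_music_payload "${2:-}")"
}
```

Calling set_hold_music "$ORG_ID" with no second argument sends the null body and returns the organization to the default track.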
Best Practices
Choose appropriate music
Select calm, non-distracting music without lyrics. Instrumental tracks work best. Avoid music that might clash with your brand or caller expectations.
Ensure seamless looping
Edit your audio so it loops cleanly without clicks or jarring transitions. The last note should flow naturally into the first.
Normalize volume levels
Match the volume of your hold music to the assistant's voice. Callers shouldn't need to adjust their volume when switching between hold and conversation.
Test on actual phones
Telephony audio sounds different from computer speakers. Test your hold music by making actual calls to ensure it sounds good over the phone network.
Respect copyright
Ensure you have rights to use your chosen music. Consider royalty-free music libraries or commissioning original compositions.
Troubleshooting
| Problem | Solution |
|---|---|
| No audio plays | Check URL accessibility. Try opening in browser. Verify SAS token hasn't expired. |
| Audio sounds distorted | Re-encode at 16kHz sample rate. Ensure mono channel. |
| MP3 fails to load | Verify ID3v2 tag is present. Re-encode with -id3v2_version 3. |
| Audio too quiet/loud | Normalize to -16 LUFS (a common loudness target for streamed audio). Use ffmpeg's loudnorm filter. |
| Long load time | Reduce file size. Use MP3 instead of WAV. Host in same region as VoiceRail (East US 2). |
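The loudness fix in the table can be combined with the MP3 settings from the conversion section in a single pass. This is a minimal sketch: loudnorm_filter and normalize_hold_music are hypothetical helper names, only the integrated loudness target (I) is set, and the filter's other parameters are left at FFmpeg's defaults. The explicit -ar 16000 matters here because loudnorm can change the sample rate internally.

```shell
# Hypothetical helper: builds the loudnorm filter string.
# Arg: integrated loudness target in LUFS (defaults to -16).
loudnorm_filter() {
  printf 'loudnorm=I=%s\n' "${1:--16}"
}

# Hypothetical helper. Args: input file, output MP3 path.
# Normalizes loudness, then encodes with the MP3 settings used earlier.
normalize_hold_music() {
  ffmpeg -i "$1" \
    -af "$(loudnorm_filter -16)" \
    -c:a libmp3lame -b:a 128k -ar 16000 -ac 1 \
    -id3v2_version 3 -write_id3v2 1 \
    "$2"
}
```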
Related documentation
- Configure assistants
- MCP integration (hold music plays during tool calls)
- API reference