Hold Music

Overview

Hold music is configured at the organization level, so every assistant in your organization uses the same track. If no hold music is configured, VoiceRail plays its default track.

When hold music plays:

During MCP tool calls that take longer than 3 seconds
During webhook reasoning calls that take longer than 3 seconds
While connecting outbound calls (ringing phase)
During call transfers

Supported Formats

VoiceRail supports two audio formats optimized for telephony:

Format	Requirements	Notes
MP3	16 kHz sample rate; mono channel; 64-128 kbps bitrate; must have ID3v2 tag	Recommended. Smaller file size.
WAV	16 kHz sample rate; mono channel; 16-bit signed PCM; little-endian byte order	Larger files but simpler encoding.

Important: ID3v2 Tag Requirement

MP3 files must include an ID3v2 tag header. Files without this tag will fail to play. Most audio software adds the tag automatically, but if you're using ffmpeg, make sure you include `-id3v2_version 3`.
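If you want to confirm the tag is present before uploading, a quick check is possible from the shell, because an ID3v2 tag always begins with the three ASCII bytes "ID3". The following is a minimal sketch; the function name `has_id3v2` is hypothetical, and it only looks for the tag marker, not for a particular ID3v2 version.

```shell
# Report whether a file begins with an ID3v2 tag header.
# An ID3v2 tag starts with the three ASCII bytes "ID3".
has_id3v2() {
  [ "$(head -c 3 "$1" 2>/dev/null)" = "ID3" ]
}

# Usage (file name is an example):
#   has_id3v2 output.mp3 && echo "ID3v2 tag present" || echo "ID3v2 tag missing"
```

A file that fails this check can be re-encoded with the FFmpeg command in the next section.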

Converting Audio with FFmpeg

Use FFmpeg to convert any audio file to a compatible format:

Convert to MP3 (Recommended)

Terminal

# Convert any audio file to VoiceRail-compatible MP3
ffmpeg -i input.wav \
  -c:a libmp3lame \
  -b:a 128k \
  -ar 16000 \
  -ac 1 \
  -id3v2_version 3 \
  -write_id3v2 1 \
  output.mp3

# Explanation:
# -c:a libmp3lame   Use LAME MP3 encoder
# -b:a 128k         128kbps bitrate (sufficient for voice/music)
# -ar 16000         16kHz sample rate (telephony standard)
# -ac 1             Mono audio (required for telephony)
# -id3v2_version 3  Write an ID3v2.3 tag (required)
# -write_id3v2 1    Force ID3v2 header

Convert to WAV

Terminal

# Convert to VoiceRail-compatible WAV
ffmpeg -i input.mp3 \
  -c:a pcm_s16le \
  -ar 16000 \
  -ac 1 \
  output.wav

# Explanation:
# -c:a pcm_s16le    16-bit signed little-endian PCM
# -ar 16000         16kHz sample rate (telephony standard)
# -ac 1             Mono audio (required for telephony)
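After converting, you may want to confirm that the WAV file really is 16 kHz, mono, 16-bit PCM. The sketch below reads those fields straight from the file header with `od`. It assumes the common layout in which the `fmt ` chunk immediately follows the RIFF header (format at byte 20, channels at 22, sample rate at 24, bits per sample at 34); files with a different chunk order would need a real parser. The function names `le_uint` and `check_wav` are hypothetical.

```shell
# Read an unsigned little-endian integer of $3 bytes at byte offset $2 of file $1.
le_uint() {
  local value=0 shift_bits=0 byte
  for byte in $(od -An -tu1 -j "$2" -N "$3" "$1"); do
    value=$(( value + (byte << shift_bits) ))
    shift_bits=$(( shift_bits + 8 ))
  done
  echo "$value"
}

# Print the header fields and succeed only for 16 kHz mono 16-bit PCM.
check_wav() {
  local format channels rate bits
  format=$(le_uint "$1" 20 2)    # 1 means uncompressed PCM
  channels=$(le_uint "$1" 22 2)
  rate=$(le_uint "$1" 24 4)
  bits=$(le_uint "$1" 34 2)
  echo "format=$format channels=$channels rate=$rate bits=$bits"
  [ "$format" -eq 1 ] && [ "$channels" -eq 1 ] &&
    [ "$rate" -eq 16000 ] && [ "$bits" -eq 16 ]
}

# Usage (file name is an example):
#   check_wav output.wav || echo "re-encode with the command above"
```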

File Size Guidelines

Keep your hold music files small for fast loading:

Duration	MP3 (128kbps)	WAV (16-bit)
30 seconds	~480 KB	~960 KB
1 minute	~960 KB	~1.9 MB
2 minutes	~1.9 MB	~3.8 MB

Recommended: 30-60 seconds of audio, under 2 MB.

Tip: Hold music loops automatically, so a 30-second track is usually sufficient. Longer tracks increase load time without benefit.
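The figures in the table follow from simple arithmetic, which is handy for estimating other durations or bitrates. The sketch below uses 1 KB = 1000 bytes, as the table does, and ignores the small header and tag overhead; the function names are hypothetical.

```shell
# Estimated MP3 size in KB: bitrate (kbps) * duration (s) / 8 bits per byte.
mp3_size_kb() {
  echo $(( $1 * $2 / 8 ))
}

# Estimated PCM WAV size in KB:
# sample rate (Hz) * bytes per sample * channels * duration (s) / 1000.
wav_size_kb() {
  echo $(( $1 * ($2 / 8) * $3 * $4 / 1000 ))
}

mp3_size_kb 128 30          # prints 480
wav_size_kb 16000 16 1 30   # prints 960
```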

Hosting Your Audio

Your hold music URL must either be publicly accessible or carry its authentication in the URL itself (for example, a SAS token). We recommend Azure Blob Storage or AWS S3.

Azure Blob Storage

Terminal

# Upload to Azure Blob Storage
az storage blob upload \
  --account-name yourstorageaccount \
  --container-name audio \
  --name hold-music.mp3 \
  --file output.mp3 \
  --content-type audio/mpeg

# Generate SAS URL (valid for 1 year; the relative date syntax requires GNU date)
az storage blob generate-sas \
  --account-name yourstorageaccount \
  --container-name audio \
  --name hold-music.mp3 \
  --permissions r \
  --expiry "$(date -u -d "+1 year" +%Y-%m-%dT%H:%MZ)" \
  --full-uri
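The `date -d` form used for `--expiry` is GNU syntax and does not work with the BSD `date` shipped on macOS. As a sketch, a small helper (the name `sas_expiry` is hypothetical) can try the GNU form first and fall back to the BSD `-v` form:

```shell
# Print a UTC timestamp one year from now in the format used for --expiry above.
# GNU date understands -d "+1 year"; BSD/macOS date uses -v+1y instead.
sas_expiry() {
  date -u -d "+1 year" '+%Y-%m-%dT%H:%MZ' 2>/dev/null ||
    date -u -v+1y '+%Y-%m-%dT%H:%MZ'
}

# Usage: pass the result to the Azure CLI, for example
#   --expiry "$(sas_expiry)"
```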

AWS S3

Terminal

# Upload to AWS S3
aws s3 cp output.mp3 s3://your-bucket/audio/hold-music.mp3 \
  --content-type audio/mpeg

# Generate pre-signed URL (S3 caps validity at 7 days = 604800 seconds)
aws s3 presign s3://your-bucket/audio/hold-music.mp3 \
  --expires-in 604800

# For longer access, make the object public or use CloudFront

URL Requirements

HTTPS required: HTTP URLs are not supported
Direct link: no redirects or login pages
Stable URL: ensure SAS tokens have a long expiry (1 year or more)
CORS not required: VoiceRail fetches the file server-side
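The first two requirements can be checked from a shell before you configure the URL. This is a rough sketch with hypothetical function names: `is_https_url` only inspects the scheme, and `hold_music_status` fetches the file with curl without following redirects, so a direct link should report 200 and a redirect shows up as a 3xx code.

```shell
# Succeed only when the URL uses the https scheme.
is_https_url() {
  case "$1" in
    https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Fetch the URL (redirects are not followed) and print the HTTP status code.
# The body is discarded; expect 200 for a directly served file.
hold_music_status() {
  curl -sS -o /dev/null -w '%{http_code}' "$1"
}

# Usage (URL is an example):
#   url="https://your-bucket.s3.amazonaws.com/audio/hold-music.mp3"
#   is_https_url "$url" && hold_music_status "$url"
```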

Configuring Hold Music

Update your organization settings with your hold music URL:

Terminal

curl -X PATCH "https://api.voicerail.ai/v1/organizations/$ORG_ID" \
  -H "Authorization: Bearer $VOICERAIL_KEY" \
  -H "X-Organization-Id: $ORG_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "holdMusicUrl": "https://your-storage.blob.core.windows.net/audio/hold-music.mp3?sv=..."
  }'

To remove custom hold music and return to the default, set holdMusicUrl to null.
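Both cases, setting a URL and resetting to the default, differ only in the request body. The sketch below builds that body; the helper name `hold_music_payload` is hypothetical, and it assumes the URL contains no double quotes or backslashes, since no JSON escaping is performed.

```shell
# Build the JSON body for the PATCH request.
# An empty argument yields null, which restores the default track.
# Assumption: the URL contains no double quotes or backslashes.
hold_music_payload() {
  if [ -n "$1" ]; then
    printf '{"holdMusicUrl": "%s"}\n' "$1"
  else
    printf '{"holdMusicUrl": null}\n'
  fi
}

# Example: reset to the default hold music (shown as a comment, not executed).
# curl -X PATCH "https://api.voicerail.ai/v1/organizations/$ORG_ID" \
#   -H "Authorization: Bearer $VOICERAIL_KEY" \
#   -H "X-Organization-Id: $ORG_ID" \
#   -H "Content-Type: application/json" \
#   -d "$(hold_music_payload "")"
```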

Best Practices

Choose appropriate music

Select calm, non-distracting music without lyrics. Instrumental tracks work best. Avoid music that might clash with your brand or caller expectations.

Ensure seamless looping

Edit your audio so it loops cleanly without clicks or jarring transitions. The last note should flow naturally into the first.

Normalize volume levels

Match the volume of your hold music to the assistant's voice. Callers shouldn't need to adjust their volume when switching between hold and conversation.
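One way to normalize is FFmpeg's `loudnorm` filter. The sketch below uses a -16 LUFS integrated loudness target with a -1.5 dBTP true-peak ceiling and a loudness range of 11 LU; the peak and range values are assumptions, not VoiceRail requirements, and the function names are hypothetical. Because `loudnorm` resamples internally, the output options set the sample rate back to 16 kHz.

```shell
# Build the loudnorm filter string.
# Arguments (all optional): integrated loudness in LUFS, true peak in dBTP,
# loudness range in LU.
loudnorm_filter() {
  printf 'loudnorm=I=%s:TP=%s:LRA=%s' "${1:--16}" "${2:--1.5}" "${3:-11}"
}

# Normalize input file $1 and write a 16 kHz mono MP3 with an ID3v2.3 tag to $2.
normalize_hold_music() {
  ffmpeg -i "$1" -af "$(loudnorm_filter)" \
    -c:a libmp3lame -b:a 128k -ar 16000 -ac 1 -id3v2_version 3 "$2"
}

# Usage (file names are examples):
#   normalize_hold_music input.wav hold-music.mp3
```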

Test on actual phones

Telephony audio sounds different from computer speakers. Test your hold music by making actual calls to ensure it sounds good over the phone network.

Respect copyright

Ensure you have rights to use your chosen music. Consider royalty-free music libraries or commissioning original compositions.

Troubleshooting

Problem	Solution
No audio plays	Check that the URL is accessible: open it in a browser and verify that the SAS token hasn't expired.
Audio sounds distorted	Re-encode at 16kHz sample rate. Ensure mono channel.
MP3 fails to load	Verify ID3v2 tag is present. Re-encode with `-id3v2_version 3`.
Audio too quiet/loud	Normalize to about -16 LUFS (a common target for podcasts and streaming audio). Use ffmpeg's `loudnorm` filter.
Long load time	Reduce file size. Use MP3 instead of WAV. Host in the same region as VoiceRail (East US 2).
