Dustin Coates in Alexa

Alexa SSML Audio Embed Gotchas

The audio tag in SSML for Alexa embeds an external audio file in the Alexa response. You can use it for a sound effect, a short piece of music, or words in another language. There are some restrictions to keep in mind, however. Amazon is up front about some of them and not about others. Here’s a list of what I’ve gathered so far:

  • Must be MP3
  • Must be hosted on HTTPS
  • Cannot be longer than 90 seconds
  • Bit rate must be 48 kbps and sample rate 16000 Hz (you can convert a file with a program like FFmpeg or Audacity)
  • A period placed right after a sound file (for example, when I’ve had a non-English word at the end of an English sentence) will be pronounced <audio> dot
  • File names cannot have a space
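
For the bit rate and sample rate item, here is a minimal sketch of driving FFmpeg from Python. It assumes `ffmpeg` is installed and on your PATH and that it was built with the `libmp3lame` encoder. The function names and file names are hypothetical, chosen only for illustration.

```python
import subprocess


def build_ffmpeg_command(src, dst):
    """Return the ffmpeg argument list that re-encodes src as an MP3
    at 48 kbps and 16000 Hz (the values from the list above)."""
    return [
        "ffmpeg",
        "-i", src,                 # input file, any format ffmpeg can read
        "-codec:a", "libmp3lame",  # encode the audio stream as MP3
        "-b:a", "48k",             # audio bit rate: 48 kbps
        "-ar", "16000",            # audio sample rate: 16000 Hz
        dst,
    ]


def convert_for_alexa(src, dst):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

Calling `convert_for_alexa("ding.wav", "ding.mp3")` would write the converted file next to the original. The output name deliberately has no space in it.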

This last one isn’t documented and tripped me up the first time. Neither Alexa nor CloudWatch tells you why the response doesn’t work; it just doesn’t. Try this checklist the next time your audio isn’t working with SSML on Alexa.
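
Part of the checklist can be checked before the response ever reaches Alexa. The sketch below is one possible approach, not anything Amazon provides: the helper names `check_audio_url` and `audio_tag` are made up for this example, and it only looks at the URL (HTTPS, an .mp3 extension, no spaces). Length, bit rate, and sample rate would need the file itself, so they are not covered here.

```python
from urllib.parse import urlparse
from xml.sax.saxutils import quoteattr


def check_audio_url(url):
    """Return a list of checklist problems found in the URL (empty if none)."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme != "https":
        problems.append("not served over HTTPS")
    if not parsed.path.lower().endswith(".mp3"):
        problems.append("file is not an .mp3")
    if " " in url:
        problems.append("file name contains a space")
    return problems


def audio_tag(url):
    """Build an SSML audio tag, refusing URLs that fail the checks above."""
    problems = check_audio_url(url)
    if problems:
        raise ValueError("; ".join(problems))
    # quoteattr escapes characters such as & and adds the surrounding quotes
    return "<audio src=%s/>" % quoteattr(url)
```

With these helpers, `audio_tag("https://example.com/ding.mp3")` returns the tag, while a URL such as `http://example.com/my sound.wav` raises a `ValueError` naming each failed check, which is more than you will get from the logs.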