APL: Alexa Presentation Language Walkthrough

Over a year ago, Amazon sent out a survey that asked about interest in a number of items, and one stood out to me: a way to have more control over display experiences on devices such as the Echo Show or the Fire TV. Some people said they saw a similar survey question even earlier. This is to say that Amazon has been working on this for a while, and now it’s here. APL, or Alexa Presentation Language, gives skill developers a way to really control what their skills show on the screen and to provide richer interactions with it. I want to share an initial look at APL. (More will certainly be on its way.)

What Does It Do?

Fundamentally, APL will become the way to create rich visual experiences for Alexa. And when I say rich, it truly is rich. This isn’t display templates version 2.0. APL gives developers state and event handlers, and it can respond to events as well. Once you dig into the challenges with visuals that differ across every single skill, you understand why it took Amazon a while to get this right. Having a set display template for lists makes handling “select the second one” (fairly) easy, but how do you do that when you have no idea what the skill put on the screen? We’ll see that later.

(What’s next for display templates? I’m not sure. I imagine that Amazon will continue to support them for backwards compatibility, but they would also like you to move over to APL. For the moment, you should support both in your skills because some devices will not yet have the update that adds APL support. Importantly, though, you should not send both a display template and APL to the device. Scary things will happen.)
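One way to follow the "support both, but send only one" advice is to branch on what the device reports about itself. The sketch below assumes the request envelope lists the device's capabilities under context.System.device.supportedInterfaces, keyed by Alexa.Presentation.APL and Display. The helper name pickVisualDirective and its two directive arguments are hypothetical.

```javascript
// Hypothetical helper: choose exactly one visual directive for the device.
// Assumes supported interfaces are reported under
// context.System.device.supportedInterfaces in the request envelope.
function pickVisualDirective(requestEnvelope, aplDirective, displayDirective) {
    const system = (requestEnvelope.context || {}).System || {};
    const interfaces = (system.device || {}).supportedInterfaces || {};

    if (interfaces['Alexa.Presentation.APL']) {
        return aplDirective;      // device understands APL
    }
    if (interfaces.Display) {
        return displayDirective;  // fall back to a display template
    }
    return null;                  // no screen support reported: send no visuals
}
```

Because the function returns a single directive or null, the handler never ends up adding both kinds of visuals to the same response.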

Of What Is It Made?

The fundamental concepts of APL are:

Layouts – where are the pieces of content placed in relation to each other?
Styles – these are things like color, size, and spacing
Data – you can also think of this as the “content” that will display on the screen

If you’re thinking of this in the context of front-end web development with HTML, CSS, and JavaScript, then you should stop doing that. Unlike that structure, where HTML holds both the data and the layout, APL separates the data from the layout. This may be a change in your mental model, but it will be worthwhile once you get used to it. Instead of needing to build the layout each time, you can build the layout just once and assemble only the content.

Let’s examine a layout, because that will teach us more about what it’s all about.

Layouts

A layout is a component that can contain other layouts, can receive information from its own parent layout, and can look different based on data or context. Here you see the same data and the same APL document leading to two different layouts depending on the shape and size of the display. (Alexa calls these small hub and medium hub, respectively.)

Round display

Rectangle display

The code that puts this together is fairly simple to understand. The best way to discuss it is to start sort of in reverse and then work down to the round/rectangle differences. All layouts must start with a mainTemplate and so must we.

mainTemplate

"mainTemplate": {
    "parameters": [
        "payload"
    ],
    "item": [
        {
            "type": "ListTemplate2",
            "backgroundImage": "${payload.listTemplate2Metadata.backgroundImage.sources[0].url}",
            "title": "${payload.listTemplate2Metadata.title}",
            "hintText": "${payload.listTemplate2Metadata.hintText}",
            "logo": "${payload.listTemplate2Metadata.logoUrl}",
            "listData": "${payload.listTemplate2ListData.listPage.listItems}"
        }
    ]
}

Each entry in parameters on a template represents an argument that the template can receive, though not necessarily one that it will receive. Every main template will receive a "payload" parameter, which represents the data the skill sends along with the APL document.

Templates will then have either an item or an array of items, with the corresponding keys depending on whether there’s one or more. Technically, this is an array of components. Components in turn are of two types. The first kind are the built-in components. These are one of Image, Text, Container, ScrollView (roughly, a container with one item that scrolls vertically), Frame (a container with a border), Sequence (a ScrollView with many items), TouchWrapper (a container users can touch), or Pager (kind of like a user-driven slide show). Some of these descriptions are simplified, but you get the general idea.

The second type of components are developer-created layouts. That’s the case here. There is no standard ListTemplate2 for APL, but the document the skill sends defines it, much as it defines mainTemplate. You’re specifying which template through the type property, and each subsequent property represents a parameter sent to the template.

I’ve clearly not had a lot of time with APL as of yet, but I would imagine that a best practice is moving away from the main template quickly. This can be used to accept the incoming payload, then hand off the heavy layout work to other templates. You can then use those templates in multiple documents.

Custom Templates

An example of the templates you should quickly move to is right here with the ListTemplate2.

"ListTemplate2": {
    "parameters": [
        {
            "name": "backgroundImage",
            "type": "string",
            "default": "https://via.placeholder.com/150"
        },
        "title",
        "logo",
        "hintText",
        {
            "name": "listData",
            "type": "array"
        }
    ],
    "items": [
        {
            "when": "${viewport.shape == 'round'}",
            "type": "Container",
            "height": "100%",
            "width": "100%",
            "items": [
                {
                    "type": "Sequence",
                    "scrollDirection": "horizontal",
                    "data": "${listData}",
                    "height": "100%",
                    "width": "100%",
                    "numbered": true,
                    "item": [
                        {
                            "type": "FullHorizontalListItem",
                            "listLength": "${listData.length}"
                        }
                    ]
                }
            ]
        },
        {
            "type": "Container",
            "height": "100vh",
            "items": [
                ...
            ]
        }
    ]
}

The data sent through from the main template appears here right away in the list of parameters. Notice one thing: some of the parameters are objects. This is a best practice, because it adds typing and the possibility of default values to the parameters.

In the items list, there’s an important new property: when. This specifies the condition under which a given item is displayed. If multiple conditions evaluate to true, then the topmost one will appear. That is the case here, where the second item has no condition and so will always evaluate to true.
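The selection rule is easy to model outside of APL. This is a toy sketch, not the APL runtime: each when is a plain JavaScript predicate standing in for a "${...}" expression string, and the names selectItem and sampleItems are made up for illustration.

```javascript
// Toy model of "the first item whose `when` passes wins".
// `when` is a plain predicate here; real APL uses "${...}" expression strings.
function selectItem(items, context) {
    for (const item of items) {
        // An item without `when` always qualifies.
        if (item.when === undefined || item.when(context)) {
            return item;
        }
    }
    return null; // nothing qualified
}

const sampleItems = [
    { name: 'roundLayout', when: (ctx) => ctx.viewport.shape === 'round' },
    { name: 'defaultLayout' } // no `when`, so it acts as the fallback
];

console.log(selectItem(sampleItems, { viewport: { shape: 'round' } }).name);
console.log(selectItem(sampleItems, { viewport: { shape: 'rectangle' } }).name);
```

Run as written, this prints roundLayout and then defaultLayout. It also shows why order matters: if the unconditional item came first, the round layout would never be chosen.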

The condition is interpolated within the string, and that can be done in any property value through the use of the ${} syntax. This interpolation can include expressions, parameter data, and resources (we’ll look at those later, but they are essentially constants). The interpolation, as implied, can be combined with text to form what the user sees on the screen.
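To make the idea of ${} substitution concrete, here is a deliberately tiny stand-in for it. It is only a sketch under heavy assumptions: it understands dotted property paths and nothing else, whereas APL's own evaluator also handles operators, array indexing, and resources. The names interpolate and sampleContext are hypothetical.

```javascript
// Toy interpolation: replaces ${a.b.c} with a value looked up in `context`.
// Assumption: only dotted property paths are handled; this is not how the
// APL runtime evaluates expressions.
function interpolate(template, context) {
    return template.replace(/\$\{([^}]+)\}/g, (match, path) => {
        const value = path.trim().split('.').reduce(
            (current, key) => (current == null ? undefined : current[key]),
            context
        );
        // Unknown paths become empty text in this sketch.
        return value === undefined ? '' : String(value);
    });
}

const sampleContext = {
    ordinal: 2,
    data: { textContent: { primaryText: { text: 'Gouda' } } }
};

console.log(interpolate('${ordinal}. ${data.textContent.primaryText.text}', sampleContext));
```

The call prints "2. Gouda" (without the quotes), mixing literal text with two substituted values the same way a Text component's text property would.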

This layout is still one composed of other layouts, and so we’re not yet seeing any styles. Let’s look at them now.

Styles

Similar to the standard relationship between CSS and HTML, styles in APL are defined with an identifier, and then referenced by the layout.

{
    "type": "Text",
    "text": "<b>${ordinal}.</b> ${data.textContent.primaryText.text}",
    "style": "textStyleSecondary",
    "maxLines": 1,
    "spacing": 12
},

This text component doesn’t have any styles on it directly, but it does reference a style named textStyleSecondary. Note carefully the name of the property. It’s the singular style, not styles. All of the style composition needs to happen where you define the styles.

The style definitions occur at the top level of the document, in a styles object with the style names as properties.

"styles": {
    "textStyleBase": {
        "description": "Base font description; set color and core font family",
        "values": [
            {
                "color": "@colorTextPrimary",
                "fontFamily": "Amazon Ember"
            }
        ]
    },
    "textStyleBase1": {
        "description": "Light version of basic font",
        "extend": "textStyleBase",
        "values": {
            "fontWeight": "300"
        }
    },
    ...
    "mixinBody": {
        "values": {
            "fontSize": "@textSizeBody"
        }
    },
    ...
    "textStyleBody": {
        "extend": [
            "textStyleBase1",
            "mixinBody"
        ]
    },
    ...
},

This snippet shows the composition that ultimately leads to the named style textStyleBody. It starts with textStyleBase, which sets the color and font family. Then textStyleBase1 extends that to add a font weight. mixinBody is defined according to a resource (again, more about that in a bit) to set a font size. Then textStyleBase1 and mixinBody combine to form textStyleBody.
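The composition can be traced by hand with a small resolver. This is a sketch of the idea rather than APL's actual algorithm. It assumes extend is either a single name or an array of names, that values is either an object or an array of objects, that later entries override earlier ones, and that conditional values are out of scope. resolveStyle and sampleStyles are hypothetical names.

```javascript
// Toy resolver for style composition via `extend`.
// Assumptions: `extend` is a string or an array of style names, `values` is
// an object or an array of objects, and later entries override earlier ones.
function resolveStyle(styles, name) {
    const style = styles[name];
    if (!style) {
        throw new Error(`Unknown style: ${name}`);
    }
    // [].concat() normalizes "one thing or an array of things" into an array.
    const parents = [].concat(style.extend || []);
    const ownValues = [].concat(style.values || []);

    // Start from the parents (in order), then layer this style's own values.
    const resolved = {};
    for (const parent of parents) {
        Object.assign(resolved, resolveStyle(styles, parent));
    }
    for (const values of ownValues) {
        Object.assign(resolved, values);
    }
    return resolved;
}

const sampleStyles = {
    textStyleBase: { values: [{ color: '@colorTextPrimary', fontFamily: 'Amazon Ember' }] },
    textStyleBase1: { extend: 'textStyleBase', values: { fontWeight: '300' } },
    mixinBody: { values: { fontSize: '@textSizeBody' } },
    textStyleBody: { extend: ['textStyleBase1', 'mixinBody'] }
};

console.log(resolveStyle(sampleStyles, 'textStyleBody'));
```

The resolved object carries color and fontFamily from textStyleBase, fontWeight from textStyleBase1, and fontSize from mixinBody, which is the chain described above.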

Something we haven’t yet touched on (I just made a pun, but you don’t know it yet) is the topic of component states.

Component State and Styling

Components can have different states when users interact with them. For example, the TouchWrapper component can have states of disabled, focused, karaoke (when text is being spoken), and pressed. There’s not much in the documentation as yet on the different states different components can have, so I’ll fill this in when I find more information.

Finally, finally, the long-teased resources.

Resources

Those values that start with an @ are resources. Again, you can think of them as constants. You’ll first, and probably most often, encounter them in style definitions.

{
    "description": "Stock color for the light theme",
    "colors": {
        "colorTextPrimary": "#151920"
    }
},
{
    "description": "Stock color for the dark theme",
    "when": "${viewport.theme == 'dark'}",
    "colors": {
        "colorTextPrimary": "#f0f1ef"
    }
},

In these two definitions, the resource @colorTextPrimary is in fact defined twice, but the second definition is conditional. Think of these as a bit like media queries in CSS. (And, I know what you’re thinking. You want to use data in the conditions. I get it. But you can only use the properties of the viewport.) Just like in CSS, these cascade, with the later ones taking precedence.
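The cascade can be imitated in a few lines. This is a toy model, not the real resolver: it only understands conditions of the exact form ${viewport.<property> == '<value>'}, and the names conditionPasses, resolveColors, and sampleResources are invented for this sketch.

```javascript
// Toy resolver for conditional resource blocks.
// Assumption: only conditions shaped like ${viewport.<prop> == '<value>'}
// are understood; real APL evaluates full expressions.
function conditionPasses(when, viewport) {
    if (when === undefined) {
        return true; // unconditional block
    }
    const match = /^\$\{viewport\.(\w+)\s*==\s*'([^']*)'\}$/.exec(when);
    return match !== null && viewport[match[1]] === match[2];
}

function resolveColors(resources, viewport) {
    const colors = {};
    for (const block of resources) {
        if (conditionPasses(block.when, viewport)) {
            Object.assign(colors, block.colors); // later blocks override earlier ones
        }
    }
    return colors;
}

const sampleResources = [
    { colors: { colorTextPrimary: '#151920' } },
    { when: "${viewport.theme == 'dark'}", colors: { colorTextPrimary: '#f0f1ef' } }
];

console.log(resolveColors(sampleResources, { theme: 'dark' }).colorTextPrimary);
console.log(resolveColors(sampleResources, { theme: 'light' }).colorTextPrimary);
```

This prints #f0f1ef for the dark theme and #151920 for the light one: the unconditional block supplies the default, and the later conditional block overrides it only when its condition holds.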

You can also group resources together for better organization.

{
    "description": "Standard font sizes",
    "dimensions": {
        "textSizeBody": 48,
        "textSizePrimary": 27,
        "textSizeSecondary": 23,
        "textSizeDetails": 20,
        "textSizeSecondaryHint": 25
    }
},

You could then refer to @textSizeBody in the styles, or any of the other four values in this grouping.

One thing you may have noticed is the keys for the resource objects. These can be one of boolean, color(s), dimension(s), or string(s). I’m not entirely clear at the moment, but I believe these are for documenting and linting the values (the developer console will warn you if you provide a color when you should have a dimension, for example).

That last type is for strings, which should show you that resources are not only for styles. You might use a resource to specify an S3 bucket prefix, or a default value for headers, or anything else. Again, these are simply constants you can use throughout your layout and styles.

What isn’t constant is the incoming data, which is the final piece of assembling the display.

Data

The data is completely separate from the document (the layouts, styles, and resources). This is shown clearly by how you send the response from the fulfillment (taken from the official documentation):

return input.responseBuilder
          .speak(speechText)
          .reprompt(repromptText)
          .addDirective({
              type: 'Alexa.Presentation.APL.RenderDocument',
              version: '1.0',
              document: myDocument,
              datasources: {}
          })
          .getResponse();

In this case, there’s no data, but you can see that it’s completely separate from the document.

The thing about the data is that it’s just JSON, and there’s no specific format to which you must hew. All of it comes through to the main template as that payload, and the layout does whatever it wishes with it.
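To show the separation in practice, here is a small sketch that reuses one document with two different data sets. The directive fields mirror the ones in the response snippet above. Everything else is an assumption made for illustration: the helper name buildRenderDirective, the stand-in sampleDocument, and the data keys cheeseList and wineList.

```javascript
// Hypothetical helper that pairs a reusable document with per-response data.
// The directive fields mirror the ones used in the response snippet above.
function buildRenderDirective(document, datasources) {
    return {
        type: 'Alexa.Presentation.APL.RenderDocument',
        version: '1.0',
        document: document,
        datasources: datasources
    };
}

// Stand-in for a full document with its layouts, styles, and resources.
const sampleDocument = {
    type: 'APL',
    version: '1.0',
    mainTemplate: { parameters: ['payload'], item: [] }
};

// Free-form JSON: these key names are made up, not required by APL.
const cheeseDirective = buildRenderDirective(sampleDocument, {
    cheeseList: { title: 'Cheeses', entries: ['Gouda', 'Brie'] }
});
const wineDirective = buildRenderDirective(sampleDocument, {
    wineList: { title: 'Wines', entries: ['Riesling'] }
});

console.log(cheeseDirective.document === wineDirective.document);
```

Both directives point at the very same document object, so the final line prints true. Only the datasources differ between the two responses.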

Transformers

One feature that does require a specific format is transformers. Transformers take an object with a specific format and transform it into another object with another format. There are three transformers to begin with, and you can’t bring your own (yet, or maybe ever). These three are ssmlToSpeech (transforms SSML so it can be spoken with a specific APL command), ssmlToText (strips out all SSML to ready it for display), and textToHint (places hint text, including the user’s chosen wake word).

To be eligible for a transformer, data needs to be of the “object” type (it’s the only type there is), which takes the format of an object with a type (of “object”) and properties. Again, from the official documentation:

{
    "datasources": {
        "catFactData": {
            "type": "object",
            "properties": {
                "title": "Cat Fact #9",
                "catFactSsml": "<speak>Not all cats like <emphasis level='strong'>catnip</emphasis>.</speak>"
            },
            "transformers": [
                {
                    "inputPath": "catFactSsml",
                    "outputName": "catFactSpeech",
                    "transformer": "ssmlToSpeech"
                },
                {
                    "inputPath": "catFactSsml",
                    "outputName": "catFact",
                    "transformer": "ssmlToText"
                }
            ]
        }
    }
}

The inputPath here references the property to be transformed, while the outputName is the property the output will then live on and the property you’ll reference inside of the document.
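The inputPath and outputName relationship can be pictured with a local approximation. To be clear, Alexa performs the real transformation, not skill code. The sketch below only imitates ssmlToText by stripping tags, treats inputPath as a plain property name, and uses hypothetical names (applySsmlToText, sampleDataSource).

```javascript
// Rough local approximation of the ssmlToText transformer, for illustration.
// Assumption: stripping tags is close enough to show the data flow; the real
// transformation is performed by Alexa, not by skill code.
function applySsmlToText(dataSource) {
    const properties = Object.assign({}, dataSource.properties);
    for (const entry of dataSource.transformers || []) {
        if (entry.transformer === 'ssmlToText') {
            const input = properties[entry.inputPath];
            // Write the result under outputName, next to the original property.
            properties[entry.outputName] = String(input).replace(/<[^>]+>/g, '');
        }
    }
    return properties;
}

const sampleDataSource = {
    type: 'object',
    properties: {
        title: 'Cat Fact #9',
        catFactSsml: "<speak>Not all cats like <emphasis level='strong'>catnip</emphasis>.</speak>"
    },
    transformers: [
        { inputPath: 'catFactSsml', outputName: 'catFact', transformer: 'ssmlToText' }
    ]
};

console.log(applySsmlToText(sampleDataSource).catFact);
```

This prints "Not all cats like catnip." (without the quotes). The original catFactSsml property is left in place, and the new catFact property sits beside it, ready to be referenced from the document.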

With APL you can put nearly anything on a user’s display. Build something nice enough, and users will want to interact with it. Those interactions happen through user events and APL commands, which are the subject of the next blog post.

What do you think of APL? Share your thoughts with me on Twitter at @dcoates.