
AWS re:Invent 2018 Day Three Recap

Here are notes from day three of AWS re:Invent 2018. If you missed day one, you can read those notes here, or read day two here. Want to continue the conversation? Reach out at dustin@dcoates.com or on Twitter.

Alexa Skill Developer Tools: Build Better Skills Faster

I spent most of the day walking around the expo hall and catching up with people, so the only session I attended was from Paul Cutsinger of Amazon and Dylan Zwick of Pulse Labs. They presented on developer tooling for building Alexa skills. Paul, of course, brings the internal Amazon perspective, while Pulse Labs builds a testing platform for voice applications. With Pulse Labs, voice developers and designers can put their applications in front of vetted users and, near magically, listen in on the interactions and see the overall successes and failures.

Because sessions were repeated, the presentations in the embedded videos might not match exactly what I present here. This is a summary of the session that I attended.

Cutsinger mentioned that we are at an inflection point in the move from keyboard to voice. He’s right, and the people coming in to build for voice need the tooling to be successful. These tools enable parts of the development cycle that fit, roughly, into four buckets: design, build, test, and launch.

Design

Design is a step that’s often overlooked by voice developers. Or all developers, to be honest about it. With voice it’s especially important, because the design informs the functionality and the capability informs the design more than it does on the web or other visual media. If a user can’t understand where to go within the application, the design needs to change to make that clear. If the SLU can’t reliably understand the invocation name, that invocation name has to change.

There are three core principles of voice design:

  • Adaptability
  • Contextuality
  • Availability

Adaptability is when the application meets users where they are. The voice designers and developers need to anticipate the language that their users will use. The example from the session was: “how many different ways are there to say yes or no?” Yes, yeah, uh huh, sure, nah, no way, etc. This clearly isn’t visual design where you choose the right label text and that’s what you’ve got.
Another example that Zwick gave was a bartender skill. The skill understood whiskey but not Johnnie Walker. These examples show that the work of making sure users can be successful rests on the builders, not the users.

One way to do this is through slots and synonyms. Most developers, especially when they start, use slots to get variable information. They are created for that, but that’s not their only use. What if… you didn’t use the slot value at all? Cutsinger showed an example where the beginning of his sample utterances had a slot whose value he threw away. Instead, it was there to take care of “well,” “so,” and other sentence starters without creating a number of near-duplicate utterances.
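As a rough sketch of what that looks like in the interaction model JSON (the intent, slot, and type names here are invented for illustration, not taken from the session), the opener slot exists only to absorb sentence starters, while the drink slot uses synonyms to map brand names back to a canonical value:

"intents": [
  {
    "name": "OrderDrinkIntent",
    "slots": [
      { "name": "opener", "type": "OpenerType" },
      { "name": "drink", "type": "DrinkType" }
    ],
    "samples": [
      "{opener} get me a {drink}",
      "get me a {drink}"
    ]
  }
],
"types": [
  {
    "name": "OpenerType",
    "values": [
      { "name": { "value": "well" } },
      { "name": { "value": "so" } },
      { "name": { "value": "okay" } }
    ]
  },
  {
    "name": "DrinkType",
    "values": [
      {
        "name": {
          "value": "whiskey",
          "synonyms": ["scotch", "johnnie walker"]
        }
      }
    ]
  }
]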

Zwick and Cutsinger summarized contextuality with “individualize your entire interaction.” One well-documented way to do this is to randomly pluck a response from a list and offer it back to the user. After all, no human says the same phrase every time. Zwick warns that pure randomness “can be dangerous.” The responses all need to make sense in the context of the conversation.
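A minimal sketch of that in Node.js (the prompts are invented for illustration):

const prompts = [
  "What else can I get you?",
  "Anything else?",
  "What's next?"
];

// Pluck one at random; every prompt must make sense at this point in the conversation.
const prompt = prompts[Math.floor(Math.random() * prompts.length)];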

Additionally, a successful skill will retain information about previous experiences. Think of this broadly. Retain information about what’s happening in the current session, what the user has done before, and what all users have done before to shape the experience to the user’s needs.
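With the ASK SDK v2 for Node.js, both kinds of memory go through the attributes manager. A sketch, assuming a persistence adapter has already been configured on the skill builder (the attribute names are invented):

// Inside an intent handler; handlerInput is supplied by the ASK SDK.
async function rememberOrder(handlerInput, drink) {
  const { attributesManager } = handlerInput;

  // What's happening in the current session.
  const session = attributesManager.getSessionAttributes();
  session.lastDrink = drink;
  attributesManager.setSessionAttributes(session);

  // What the user has done before, stored via the configured persistence adapter.
  const persistent = await attributesManager.getPersistentAttributes();
  persistent.orderCount = (persistent.orderCount || 0) + 1;
  attributesManager.setPersistentAttributes(persistent);
  await attributesManager.savePersistentAttributes();
}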

To make a skill available, make as many of the interactions top-level as possible. Don’t think in terms of a flow chart from which users can’t escape. You also don’t want a user to ever have to say “start over” or “exit” in order to change the topic of conversation. Quoting Zwick: “Wouldn’t it be nice if, when the user said ‘Find A Wrinkle In Time’ it would just find A Wrinkle In Time?”
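In ASK SDK v2 terms, that means a handler’s canHandle matches on the intent alone rather than on some stored dialog state (the intent and slot names below are hypothetical):

const FindBookIntentHandler = {
  // No check of session state: "find A Wrinkle In Time" works from anywhere.
  canHandle(handlerInput) {
    const { request } = handlerInput.requestEnvelope;
    return request.type === "IntentRequest" && request.intent.name === "FindBookIntent";
  },
  handle(handlerInput) {
    const book = handlerInput.requestEnvelope.request.intent.slots.book.value;
    return handlerInput.responseBuilder
      .speak(`Here's what I found for ${book}.`)
      .getResponse();
  }
};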

Build

Most of you know already about the Alexa Skills Kit (ASK) SDK. The CLI and the Skill Management API (which is shortened to SMAPI, and is to my surprise pronounced “Smappy,” which rhymes with happy) make the skill development process even more efficient. Those who have picked up skill development recently might have a hard time understanding how impactful SMAPI was. It made skill development feel instantly more mature.

Some useful starter CLI commands:

  • $ ask new: create a boilerplate skill project
  • $ ask deploy: deploy the skill and fulfillment
  • $ ask deploy -t lambda: deploy just the fulfillment when using AWS Lambda
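The CLI also fronts SMAPI itself through the api subcommands. Two examples from ASK CLI v1 (the version current at the time):

  • $ ask api list-skills: list the skills on your developer account
  • $ ask api get-skill -s <skill-id>: fetch a skill’s manifest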

If you don’t want to go into the CLI and you use VS Code, you can also use the ASK Visual Studio Code extension, which brings all of the CLI functionality to the text editor.

Test

The first way to start testing is through a device or the developer console. The AWS Lambda console provides a quick way to do quasi-unit testing with functionality to send an event to the Lambda function and see the response.
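For reference, here’s a trimmed-down sketch of the kind of LaunchRequest event you might paste into that test dialog; a real envelope carries more fields (application and user IDs on the session, a context object, and so on):

{
  "version": "1.0",
  "session": { "new": true, "sessionId": "amzn1.echo-api.session.example" },
  "request": {
    "type": "LaunchRequest",
    "requestId": "amzn1.echo-api.request.example",
    "timestamp": "2018-11-28T00:00:00Z",
    "locale": "en-US"
  }
}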

For what it’s worth, I generally do the same, but I do it locally. Lambda seems magical, but it’s really just invoking an exported function. That’s why I put this in every one of my projects:

// simulation.json is saved output from `ask simulate` (see below).
const simulation = require("./simulation.json");
// Pull out the request body that Alexa would have sent to the skill.
const request = simulation.result.skillExecutionInfo.invocationRequest.body;
const handler = require("./lambda/custom/index.js").handler;

// Invoke the exported Lambda handler directly: (event, context, callback).
handler(request, {}, (err, res) => {
  if (err) {
    console.log(`err ${err}`);
  } else {
    console.log(res);
  }
});

simulation.json is a JSON response that I get from running the $ ask simulate command, which simulates skill invocation.
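For example, to simulate an utterance and save the result (the utterance and locale here are placeholders):

$ ask simulate -l en-US -t "open space facts" > simulation.json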

For true unit tests, Bespoken provides a very strong suite of voice testing tools, including ones that use a “fake Alexa” to make a fake voice request.
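One of those tools is Bespoken’s virtual-alexa npm package. A minimal sketch, assuming the project layout from the local-invocation example above (check Bespoken’s docs for the current API):

const va = require("virtual-alexa");

// Point the "fake Alexa" at the exported handler and the interaction model.
// Both paths are assumptions about the project layout.
const alexa = va.VirtualAlexa.Builder()
  .handler("lambda/custom/index.handler")
  .interactionModelFile("models/en-US.json")
  .create();

// Utter a phrase as a user would and inspect the skill's response.
alexa.utter("get me a whiskey").then((payload) => {
  console.log(payload.response.outputSpeech.ssml);
});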

Unit tests show that the code works, but only users will show if the skill works. The first line is usually the people next to you, but they’re too close to provide great feedback. Then move outward in circles: ask friends and family, then go for true beta testers. How do you get those? Conveniently, Pulse Labs can help you out. Cutsinger and Zwick put it in a useful, pithy way: “the easier the testers are to get, the less valuable the feedback is.” Pulse Labs and similar services don’t make the testers less valuable by making them easy for you to get; they take on the difficulty themselves.

Launch

Nothing, though, is better than actual usage. Set up logging and see how users are moving through the skill. Cutsinger said that people should roll their own analytics. (I think the implication is that this is particularly true for a “real” project that you’re doing for money in one sense or another.) Still, Amazon provides analytics tools as well, such as overall usage and retention data, interaction paths, and ISP (in-skill purchasing) reports.
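Rolling your own can start small. With the ASK SDK v2 for Node.js, a request interceptor can log every incoming request so you can reconstruct paths later (a minimal sketch; where the logs end up is your call):

const LoggingRequestInterceptor = {
  process(handlerInput) {
    const { request } = handlerInput.requestEnvelope;
    // Log the request type and, for intents, the intent name.
    console.log(JSON.stringify({
      type: request.type,
      intent: request.intent ? request.intent.name : undefined
    }));
  }
};

// Register it when building the skill:
// Alexa.SkillBuilders.custom().addRequestInterceptors(LoggingRequestInterceptor)...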

I’ll be back tomorrow with recaps from Thursday, day four, of re:Invent 2018.