An in-depth analysis of the Alexa Skills Kit SDK for Node.js
When I first did this exercise, the Alexa Skills Kit SDK for Node.js had only been around for a very brief period and was 183 lines long.
You could say it’s grown a bit. It’s now 936 lines and has been around for years. This is going to be a lot, so I might break it up into multiple posts and link from here. Or maybe I’ll include everything on a single page. We’ll see!
For the series, I’m looking at the most recent code. The Dig Deep series will not be the place to go for a quick tutorial, but if you’re trying to comprehend a certain method that’s giving you trouble, you’re at the right place.
I’ll try to keep this up to date with any changes, but tweet at me if you see anything I haven’t gotten to.
Looking at this file’s code: alexa.js.
The require statements
This is simple enough, as the code is pulling in some packages:
- This one will be used very heavily, as we’ll see.
- If you’re unfamiliar with this in Node.js or you come from a front-end background, this is very much like the events we listen for and sometimes emit in the browser.
- This is only used once, incidentally with a method that is discouraged
- Used for internationalization, as the name implies
- Enables sprintf style string interpolation (e.g. "This is my %s string", "fantastic" becomes "This is my fantastic string"), like you might see in C
- DynamoAttributesHelper, which we’ll look at later
- responseHandlers, which we’ll also examine later
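If events and sprintf-style strings are new to you, here is a minimal sketch using only Node.js built-ins. Note that util.format is my stand-in for the sprintf package the SDK pulls in, and the event name and strings are made up for illustration.

```javascript
const EventEmitter = require('events');
const util = require('util');

const emitter = new EventEmitter();
const heard = [];

// Listen for a custom event, much like addEventListener in the browser.
emitter.on('greeting', (name) => heard.push(`hello, ${name}`));

// Emit the event; listeners run synchronously, in registration order.
emitter.emit('greeting', 'Alexa');

// sprintf-style interpolation: %s is replaced by the next argument.
const formatted = util.format('This is my %s string', 'fantastic');

console.log(heard);
console.log(formatted);
```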
Just after pulling in code from packages and external files, we’ve got this single line.
This serves simply as a constant for a key name that is used a handful of times in which to store or get the state off the session’s attributes. For example, we see it again later with
this.state = this._event.session.attributes[_StateString] || '';. It’s also exported as
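To make the lookup concrete, here is a small sketch of reading the state off the session attributes with a constant key. The key's value ('STATE'), the readState helper, and the '_GUESSMODE' state are my assumptions for illustration, not something lifted from the SDK.

```javascript
// Assumed value; what matters is that one constant is reused as the key.
const _StateString = 'STATE';

// Hypothetical helper mirroring the lookup quoted above.
function readState(event) {
  return event.session.attributes[_StateString] || '';
}

// A brand new session has no stored state, so we fall back to ''.
const fresh = readState({ session: { attributes: {} } });

// A session that already stored a state under the key returns it.
const returning = readState({
  session: { attributes: { STATE: '_GUESSMODE' } }
});

console.log(JSON.stringify(fresh), returning);
```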
Here the code is inheriting all of the properties on the prototype of
EventEmitter. Using ES6, we would instead have something like:
It’s not an issue here, but with
inherits, we would get false when checking if
EventEmitter.isPrototypeOf(AlexaRequestEmitter). You can read more here.
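Here is a quick sketch of the difference, using two throwaway constructors of my own (LegacyEmitter and ModernEmitter are not names from the SDK). Instances behave the same either way; only the link between the constructors themselves differs.

```javascript
const EventEmitter = require('events');
const util = require('util');

// Pre-ES6 style: link the prototypes with util.inherits.
function LegacyEmitter() {
  EventEmitter.call(this);
}
util.inherits(LegacyEmitter, EventEmitter);

// ES6 style: class syntax also links the constructors themselves.
class ModernEmitter extends EventEmitter {}

// Instances inherit from EventEmitter.prototype in both cases.
const legacyIsEmitter = new LegacyEmitter() instanceof EventEmitter;
const modernIsEmitter = new ModernEmitter() instanceof EventEmitter;

// Only the class version makes EventEmitter the prototype of the constructor.
const legacyCtorLinked = EventEmitter.isPrototypeOf(LegacyEmitter);
const modernCtorLinked = EventEmitter.isPrototypeOf(ModernEmitter);

console.log(legacyIsEmitter, modernIsEmitter, legacyCtorLinked, modernCtorLinked);
```

In practice the missing constructor link only matters if you rely on static properties being inherited.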
AlexaRequestEmitter is the heavy lifter of this SDK. We’ll go over it again often, so take a second to just note that it inherits from
EventEmitter and understand what that means: we can emit and respond to events.
If AlexaRequestEmitter is the heavy lifter of this code,
alexaRequestHandler is the public face. If you’ve ever used the Node.js SDK then you’ve used
alexaRequestHandler directly, though you might not realize it since it’s renamed to just
handler when it’s exported in the package. (You can see that here.)
There are dozens of lines of code, which we’ll look at bit by bit:
Some simple setup to allow for session manipulation. If there’s no
session on the
event object, we’ll set it to an object with an
attributes key. If there is that
session object but no
attributes object, we’ll set it up as an empty object.
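The session setup described above can be sketched like this. The function name ensureSessionAttributes is mine, and the SDK does this inline rather than in a helper, so treat it as an illustration of the logic only.

```javascript
// Hypothetical helper: guarantee that event.session.attributes exists.
function ensureSessionAttributes(event) {
  if (!event.session) {
    // No session at all: create one with an empty attributes object.
    event.session = { attributes: {} };
  } else if (!event.session.attributes) {
    // Session exists but has no attributes yet.
    event.session.attributes = {};
  }
  return event;
}

const noSession = ensureSessionAttributes({});
const noAttributes = ensureSessionAttributes({ session: { new: true } });
const untouched = ensureSessionAttributes({
  session: { attributes: { score: 3 } }
});

console.log(noSession, noAttributes, untouched);
```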
This creates a new instance of
AlexaRequestEmitter (which, in turn, inherits from
EventEmitter) and sets an upper bound of the number of listeners allowed at
Infinity. The default max is 10, but we’ll be listening to a lot more events than that. (Setting the max number of listeners to
0 is functionally the same, though
Infinity is clearer about the intention.)
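If you want to see the listener limit for yourself, this sketch uses a plain EventEmitter rather than the SDK's own class:

```javascript
const EventEmitter = require('events');

const emitter = new EventEmitter();

// Out of the box, Node warns once a single event has more listeners than this.
const defaultMax = emitter.getMaxListeners();

// Lift the limit entirely; 0 would have the same effect.
emitter.setMaxListeners(Infinity);
const raisedMax = emitter.getMaxListeners();

console.log(defaultMax, raisedMax);
```

The limit is only there to flag likely memory leaks, so raising it silences a warning rather than changing behavior.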
Next up we have 13 uses of
Object.defineProperty, all to define properties on
handler (the instance of
AlexaRequestEmitter). Instead of writing the code out for all of them, here’s a table:
This is set to the
event object provided from the Alexa service when the function is invoked. It cannot be changed (and is meant to be private, as denoted by the
Just like above, set to the
context object that comes as an argument from the Alexa service on function invocation. Also cannot be changed.
Once again: comes from the argument of nearly the same name
callback invoked on the function. Not overwritable, but it is useful to know that there isn’t always a callback and, when there is, it’s only called in the event of a failure or exception.
Now we’re getting into the fun stuff. This keeps track of the user’s state as they move through the skill and is set to
null at invocation, as there’s no state. It can (obviously) be overwritten.
It is also “configurable” meaning it can be deleted from the object and the
enumerable attributes can be changed. This configuration flexibility isn’t leveraged anywhere in the SDK and I can’t think of many uses for it outside of “locking” the state by making it unwritable.
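Here is a sketch of how those descriptor flags play out, on a plain object instead of the real handler. The values and the '_GUESSMODE' state are made up, and the "locking" step is my own illustration of the idea above, not something the SDK does.

```javascript
const handler = {};

// Read-only property: writable is false, so the value can't be replaced.
Object.defineProperty(handler, '_event', {
  value: { session: { attributes: {} } },
  writable: false
});

// Overwritable and configurable property, starting out as null.
Object.defineProperty(handler, 'state', {
  value: null,
  writable: true,
  configurable: true
});

// Reflect.set reports failure with false instead of throwing.
const eventOverwritten = Reflect.set(handler, '_event', {});

handler.state = '_GUESSMODE'; // works, because state is writable

// Because state is configurable, we can later "lock" it.
Object.defineProperty(handler, 'state', { writable: false });
const stateOverwritten = Reflect.set(handler, 'state', '_OTHER');

console.log(eventOverwritten, stateOverwritten, handler.state);
```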
The state is used to route the user to multiple intents with the same name of different… states. For example, a user saying “Yes” at the beginning of a skill would likely set off a different intent than one saying the same thing after they’ve completed what they came to do and Alexa has asked them “Are you sure you want to leave?”
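One way to picture state-based routing is an emitter whose event names are the intent name plus the current state. This is a simplified sketch with a plain EventEmitter; the route helper and the '_CONFIRM_EXIT' state are hypothetical, and the SDK's real routing has more moving parts.

```javascript
const EventEmitter = require('events');

const skill = new EventEmitter();
const log = [];

// Same intent name, different handler depending on the state suffix.
skill.on('AMAZON.YesIntent', () => log.push('start the game'));
skill.on('AMAZON.YesIntent_CONFIRM_EXIT', () => log.push('goodbye'));

// Hypothetical router: append the state (if any) to the intent name.
function route(intentName, state) {
  return skill.emit(intentName + (state || ''));
}

route('AMAZON.YesIntent', '');              // "Yes" at the start of the skill
route('AMAZON.YesIntent', '_CONFIRM_EXIT'); // "Yes" to "Are you sure?"

console.log(log);
```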
Likewise initially set to
null and used to validate that the function is invoked from your app and not someone else’s.
It’s not required to set the
appId, but if you don’t and someone else finds your function’s endpoint, they could use it themselves. Setting the
appId provides extra security by comparing the configured
appId to the one provided on the request.
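The comparison boils down to something like the sketch below. The function name and the application IDs are made up; the one thing I'm relying on is that Alexa requests carry the skill's ID at session.application.applicationId.

```javascript
// Hypothetical check: does the request come from the skill we expect?
function isFromExpectedSkill(event, appId) {
  if (!appId) {
    return true; // nothing configured, so there is nothing to compare
  }
  const session = event.session || {};
  const application = session.application || {};
  return application.applicationId === appId;
}

const myAppId = 'amzn1.ask.skill.example-mine';
const request = {
  session: { application: { applicationId: 'amzn1.ask.skill.example-mine' } }
};
const strangerRequest = {
  session: { application: { applicationId: 'amzn1.ask.skill.example-other' } }
};

const accepted = isFromExpectedSkill(request, myAppId);
const rejected = isFromExpectedSkill(strangerRequest, myAppId);
const unchecked = isFromExpectedSkill(strangerRequest, null);

console.log(accepted, rejected, unchecked);
```

The last call is the risky case described above: with no appId set, any caller gets through.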
We’ll come back to this later when we look at
ResponseBuilder. It provides methods to build a response rather than emitting them (e.g.
The SDK has some built-in tooling to save the current session to DynamoDB. We’ll go into more detail in another post, but just know that you don’t have to do the full work anymore. (Unless you really want to, in which case, can I recommend you rethink what makes you happy?)
saveBeforeResponse is really interesting. Interesting why? Because it’s not used at all in this file and it’s not documented in the
README. It is used in response.js to store the current state to DynamoDB (which is also done if the response should end the session via
shouldEndSession or if
:saveState is emitted with a second argument of
This stores the
i18next module. Technically this can be changed (
true), but in practice you probably wouldn’t.
i18next is used in the
execute property, which can’t be overwritten, and any internationalization/localization module you would use would also have to use
Which is to say: you won’t need to use this directly, but it’s useful to know what’s powering the localization if you ever run into problems.
Specifies which localized string should be used if specified with
resources (see below). This will never be set by you, as it’s overwritten by the request object on each intent invocation.
This probably deserves its own post, and I’ll likely do one in the future. We’ll certainly see it again in this post.
It’s an optional property (set to
undefined at instantiation) that stores all of the strings you want for localization (as of this writing,
de). It’s used in conjunction with the previous two properties. That is, using the localization/internationalization module and the session’s current locale, the current string is plucked from
this.t('AFTERNOON_SNACK') could be
peanut butter and jelly or
tea, depending on the locale.
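To show the idea without pulling in i18next, here is a tiny stand-in translator. The locales, the makeTranslator helper, and the fallback of returning the key are my assumptions; the nesting under translation follows the shape i18next expects for its default namespace.

```javascript
// Strings grouped by locale, then by i18next's default "translation" namespace.
const resources = {
  'en-US': { translation: { AFTERNOON_SNACK: 'peanut butter and jelly' } },
  'en-GB': { translation: { AFTERNOON_SNACK: 'tea' } }
};

// Hypothetical stand-in for this.t: look the key up for one locale.
function makeTranslator(locale, allResources) {
  const strings = (allResources[locale] || {}).translation || {};
  // Fall back to the key itself when there is no matching string.
  return (key) => (key in strings ? strings[key] : key);
}

const american = makeTranslator('en-US', resources)('AFTERNOON_SNACK');
const british = makeTranslator('en-GB', resources)('AFTERNOON_SNACK');
const missing = makeTranslator('fr-FR', resources)('AFTERNOON_SNACK');

console.log(american, british, missing);
```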
We’re now back into the properties that can’t be overwritten and the properties that are functions. This one is:
handler is what all of these properties are being added to. The
arguments will always be response handlers—how we craft a response (e.g.
:tellWithCard, etc). I won’t go in-depth on RegisterHandlers here, because I do that below.
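The general shape of a non-overwritable function property that forwards its arguments looks something like this sketch. registerAll is a hypothetical stand-in for RegisterHandlers, the handlers are invented, and the real SDK binds listeners to a richer context than the bare emitter used here.

```javascript
const EventEmitter = require('events');

// Hypothetical stand-in: attach every key of every handler object as a listener.
function registerAll(...handlerObjects) {
  for (const handlerObject of handlerObjects) {
    for (const [eventName, listener] of Object.entries(handlerObject)) {
      this.on(eventName, listener);
    }
  }
}

const handler = new EventEmitter();

// The property can't be overwritten; it just forwards its arguments along.
Object.defineProperty(handler, 'registerHandlers', {
  value: function () {
    registerAll.apply(handler, arguments);
  },
  writable: false
});

const spoken = [];
handler.registerHandlers(
  { ':tell': (speech) => spoken.push(speech) },
  { LaunchRequest() { this.emit(':tell', 'Welcome!'); } }
);

// EventEmitter calls listeners with `this` set to the emitter itself.
handler.emit('LaunchRequest');

console.log(spoken);
```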
This is another function that calls another function:
HandleLambdaEvent does what it says in the name, but that actually means a lot: it verifies that the event came from the correct application (if
appId is set), it saves the session to DynamoDB (if
dynamoDBTableName is set), handles any errors or exceptions, and then emits the event. The takeaway here? All of the above is done before our intent handlers are even invoked.
We’ve already looked at a lot and we’ve really only gotten through the setup. In this next section, we’ll take a look at
HandleLambdaEvent. Click here to read that post.