Dustin Coates in Alexa

Perspectives on Building the Algolia Alexa Adapter

Since we’re quickly coming up on the 1.0 release of the Algolia Alexa Adapter, it’s a good time to discuss why I built it and what the thought process was when making specific choices.

(Really quick, in case you don’t know already: Algolia is a search-as-a-service provider, and—incidentally—my daytime employer. We power the search that powers thousands of websites and apps, from behemoths like Twitch and Periscope to small “mom and pop” e-commerce shops.)

The impetus to build the adapter came about for a couple reasons: market hints and personal desire.

The market hints came about because I speak every day with people who are integrating search into their websites and apps. My colleagues do the same. And we started hearing rumblings that they were starting to think about voice search. They don't know exactly what they want to do yet; here in mid-2017, they're still at that stage. But they know they want to start experimenting with it.

If they were having trouble coming up with a use case, a daunting look at integrating Alexa and Algolia would have scared them off completely. I started thinking about how we could make that simpler.

I built a couple Alexa Skills using Algolia and gave a talk to my colleagues internally (a common pre-happy hour thing in our Paris office). Interest was drummed up.

And, of course, part of the push was my own selfish wants. I really enjoy building with the Alexa Skills Kit and I love building on top of Algolia. What I love most is seeing a finished product and trying things I haven't tried before, not repeating the same thing over again.

The adapter is very much "convention over configuration": it prescribes a specific structure for developers to use when combining Algolia and the Alexa Skills Kit. It's built specifically for AWS Lambda and written in JavaScript. My introduction to programming came via a healthy mix of JavaScript and Rails, so these paradigms were a natural choice.

One of the things I've (and surely you've) noticed while building skills is that there are some things you do every single time. For example, some semblance of this code appears in every single JavaScript-based skill on Lambda:

var Alexa = require('alexa-sdk');

exports.handler = function(event, context, callback) {
  var alexa = Alexa.handler(event, context);
  alexa.appId = appId;
  alexa.dynamoDBTableName = dynamoDBTableName;
  // appId, dynamoDBTableName, and the handler objects are defined elsewhere in the skill.
  alexa.registerHandlers(newSessionHandlers, getRulesHandlers);
  alexa.execute();
};

I loathe typing things over and over. It reminds me that my new keyboard hasn’t yet come in and makes me sad.

Thinking About the Interface

My colleague Vincent, who is a fantastic basketball player and an even better JavaScript developer, suggested doing README-driven development. It's not a new idea (the blog post that introduced it dates back to 2010), but it's not an approach I've taken before, having largely worked on web apps.

From Nicolás Bevacqua:

README-first is a powerful notion. You sit down, you design your library, flesh out an API, write it down, and get to coding. It’s sort of what you would do with TDD, but without the drawback of intensely slowing down your pace, as writing and rewriting your tests is a much slower proposition than rewriting documentation.

So I started by outlining how developers would use the adapter. It’s changed a bit since then. For example, it was simplified significantly for the initial rollout and things like separate configurations for comparisons and facets are likely never coming, having been replaced with a searchParams object.
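To give a rough idea of the current shape (only searchParams and the handlers object are mentioned in this post; the other keys below are illustrative assumptions), Algolia search parameters now ride along as a single object:

var config = {
  indexName: 'products',            // assumption: illustrative key and value
  handlers: { /* intent handlers, covered further down */ },
  // One object of standard Algolia search parameters applied to every query,
  // instead of separate options for comparisons, facets, and so on.
  searchParams: {
    hitsPerPage: 1,
    filters: 'price < 100'
  }
};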

Something interesting to point out is the idea of "templates" that could have been used for the Alexa response. They would have looked something like this:

SearchProductIntent: {
  // The {{…}} placeholders pull attributes from the top Algolia hit.
  speechTemplate: 'The top product is {{name}} from {{brand}}. It costs ${{price}}.',
  emptyTemplate: function(data) {
    return 'There were no products for your ' + data.query + ' search.';
  }
}

This was an idea I took from our InstantSearch.js library. If the developer provides a string, a mustache-like syntax interpolates the data to return something to be displayed (in the case of InstantSearch.js) or spoken (in the case of Alexa). If the dev uses a function, there's a lot more flexibility, but also more that needs to be done in code.

I ultimately decided to drop it because the Alexa responses are generally quite complex (at the very least, you have to choose which kind of response you want, e.g. ask, askWithCard, tell, etc.) and so you need that extra flexibility. I still very much like the idea and wouldn’t be surprised to see myself attempting to add it back in.

The Code

The code itself is quite small. There are a couple parts that I find interesting and worth calling out:

Dependency Dependency Injection

The first thing is that you can inject whichever version of the Algolia JavaScript client or Alexa Skills Kit SDK you want into the adapter. (As of this writing, we use 3.20.3 and 1.0.6, respectively, but I expect that to change shortly.)

We don’t advertise this anywhere and I’d be quite surprised if anyone was taking advantage of it. But it made life a little easier for testing purposes and relieves some pressure to be constantly updating the dependencies.
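As a sketch of the idea (the package and option names here are my assumptions, not the adapter's documented API), the configuration accepts already-required modules instead of requiring them itself:

// Hypothetical sketch: the package and option names below are assumptions.
var algoliasearch = require('algoliasearch'); // e.g. 3.20.3
var alexaSDK = require('alexa-sdk');          // e.g. 1.0.6
var adapter = require('algoliasearch-alexa'); // assumption: package name

var voiceSearch = adapter({
  // ...the rest of the configuration...
  algoliaClient: algoliasearch, // pass in the client factory you want,
  alexaSDK: alexaSDK            // or a stub, which is handy in tests
});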

Intent Capturing

The second thing is, you could say, the raison d’être of the adapter: intents are captured and the query sent to Algolia.

This came with one pretty rigid requirement: any term that the developer wants to use for a full-text search must be named query. It hasn't appeared to be a problem so far.

The capturing gives devs two approaches. If they want a normal intent handler, they can add one as a method directly on the handlers configuration object.

If they want the search to go to Algolia, they instead provide an object with an answerWith key, whose value is a function that accepts a single argument.
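Putting the two approaches side by side (the intent names, slot access, and results shape here are illustrative; only answerWith itself comes from the adapter), the handlers object might look like this:

handlers: {
  // A plain function: passed straight through, handled as any Alexa Skills Kit
  // SDK handler would be.
  LaunchRequest: function() {
    this.emit(':ask', 'What would you like to search for?');
  },
  // An object with answerWith: the adapter runs the Algolia query first
  // (built from the slot named query), then calls this function with the results.
  SearchProductIntent: {
    answerWith: function(data) {
      var hit = data.hits[0]; // assumption: results exposed in Algolia's usual shape
      if (hit) {
        this.emit(':tell', 'The top product is ' + hit.name + '.');
      } else {
        this.emit(':tell', 'I did not find anything for that search.');
      }
    }
  }
}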

The adapter code loops through the intent handlers and checks if each is a function or an object. If a function, fine, nothing is changed. If an object with the answerWith key, then the function is called after the Algolia search results have been returned, with the results merged with the Alexa response.
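Conceptually, and simplified well beyond the real code, that dispatch looks something like this:

// Simplified sketch of the idea, not the adapter's actual source.
function wrapHandlers(handlers, searchAlgolia) {
  var wrapped = {};
  Object.keys(handlers).forEach(function(intentName) {
    var handler = handlers[intentName];
    if (typeof handler === 'function') {
      // Plain handlers pass through untouched.
      wrapped[intentName] = handler;
    } else if (typeof handler.answerWith === 'function') {
      wrapped[intentName] = function() {
        var alexa = this;
        var query = alexa.event.request.intent.slots.query.value;
        // Run the Algolia search, then hand the results to answerWith.
        searchAlgolia(query).then(function(results) {
          handler.answerWith.call(alexa, results);
        });
      };
    }
  });
  return wrapped;
}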

More’s Coming

There’s still more left to do: we don’t currently handle different states, for example. (I’ve got a whiteboarding session on Monday to discuss.)

Already, though, I’m happy with the response. Without any posts about it on our website or social media, people are discovering it and using it. And when customers mention their interest in building an Alexa Skill that uses Algolia, it’s always fun to mention, “Oh, did you know we already have something for that?”

I’m bullish on the future of voice search—and Algolia’s involvement.