Dustin Coates in Vux

Using decaying user data to keep context in conversational and voice experiences

One of the fundamental features of conversational interfaces is that you can’t force a user to ask for something in a specific way. One of the fundamental features of a conversation is that a lot can be communicated in a small number of words. One manner of doing this is by assuming that the other parties know the topic of conversation. This could be because the conversation is about the immediate world (“We’re lucky, huh?” after a meteorite falls on the car next to yours) or a continuation of a recent topic (“How do you think he did it?” doesn’t require the definitions of “he” and “it” if you were just talking about your friend’s broken leg). People are great at keeping this context in mind. A change in topic, and “How do you think he did it?” could be confusing or refer to something new.

Teaching computers to keep context has been a goal since the very early days of natural language understanding. Human-level context tracking isn’t possible right now, especially in third-party skills. There are, however, some steps that developers can take to approximate it. Here, we’ll look at some of them using Actions on Google code from something I’ve built that looks up baseball scores and standings. The principles, however, apply to any voice or conversational application.

Explicit versus implicit requests

Sometimes your users will be very explicit in what they want (“How did the Houston Astros play last night?”) and other times they won’t (“Who do they play next?”). The explicit requests are easy: what they’re looking for will come through in the intent and slot matching (e.g. LastGameIntent and {team: "Houston Astros"}). Take that, and return the information (“The Astros beat the White Sox five to zero.”).

The implicit requests are less clear. Let’s think through it. “Who do they play next?” can follow a response, or it can be the first request in a while, or the first one ever. This is good: we’re starting to think of some conditions that will be useful in our code logic.

First up is the implicit request that follows a response. In that case, the work starts before the implicit request comes in and instead when you’re creating the response. Determine the key subject of this request and consider what a user could ask in a follow-up.

Utterance: “How did the Houston Astros play last night?”
Slots/Entities: {team: "Houston Astros"}
Response: “The Astros beat the White Sox five to zero.”
Subject: Houston Astros
Nouns: Astros, White Sox, five, zero

The safest assumption is that any follow-up would most likely be about the subject, the Houston Astros. In my skill, I can expect this to be true, though that does require some trade-offs. A user could reasonably say “And what is their record?” as a follow-up and refer to either the Astros or their opponent. This is a question that would be ambiguous among humans, so we can’t expect that computers will do better. We will assume that the question is for Houston’s record, and let the user be more explicit if they want the White Sox. This is an assumption that I don’t believe will catch a user off-guard, and so we’re free to do it.

Request time decay

We then need to store the subject in a data store to access later, create utterances both with and without a slot for an explicit team, and check in our fulfillment whether that slot has been filled.

app.intent('last_game', (conv, {team}) => {
  if (team) {
    conv.user.storage.recentTeam = team;
    // return response for explicitly requested team's last game
  } else {
    if (conv.user.storage.recentTeam) {
      // return response for the most recent, explicitly requested team's last game
    } else {
      conv.contexts.set(CONTEXTS.needTeam, CONTEXT_LENGTHS.shortTerm);
      return conv.ask(`What team do you want to know about?`);
    }
  }
});

In this code, the filled slot takes precedence over everything and is stored as the recent request. Otherwise, we look to see if there has been a recent request. If so, provide information for that team. If not, we ask the user for clarification. The last part uses contexts, a similar principle that helps route requests for a certain number of turns.

This creates a situation where the most recent team will always be the context for future requests. This isn’t necessarily wanted. The further in the past a request is, the less it should impact the context until a point where it has no impact. For that, we’ll keep track of both the most recent request and when it happened. If it was longer than two minutes ago, we won’t use it to implicitly set the context. (I’m using the moment.js library for times, but that’s not necessary.)

function recent (userStorage) {
  const decay = 2;
  if (
    userStorage.recentTeam &&
    userStorage.recentTeam.lastAccessed &&
    moment().subtract(decay, 'm').toDate() < moment(userStorage.recentTeam.lastAccessed).toDate()
  ) {
    return userStorage.recentTeam;
  }
}

app.intent('last_game', (conv, {team}) => {
  if (team) {
    conv.user.storage.recentTeam = {name: team, lastAccessed: moment().toDate()};
    // return response for explicitly requested team's last game
  } else {
    if (recent(conv.user.storage)) {
      // return response for the most recent, explicitly requested team's last game
    } else {
      conv.contexts.set(CONTEXTS.needTeam, CONTEXT_LENGTHS.shortTerm);
      return conv.ask(`What team do you want to know about?`);
    }
  }
});

Now we are storing when the request last happened, and only returning it if it’s happened in the past couple of minutes. This prevents a runaway situation where a user comes back after a long time and hears something unexpected—most likely to happen when the NLU hasn’t correctly heard an explicitly requested team.
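Since moment.js isn’t required for this, here is a minimal sketch of the same decay check using only the built-in Date. The name recentTeam, the DECAY_MS constant, and the optional now parameter are my own assumptions for illustration, not part of the skill’s code; the now parameter just makes the check easy to try out with fixed times.

```javascript
// Hypothetical moment-free variant of the `recent` helper above.
const DECAY_MS = 2 * 60 * 1000; // two minutes, the same decay used in the article

function recentTeam(userStorage, now = Date.now()) {
  const entry = userStorage.recentTeam;
  if (!entry || !entry.lastAccessed) {
    return undefined;
  }
  // lastAccessed may come back as a Date or as an ISO string, depending on
  // how the storage layer serializes it; new Date() accepts both.
  const lastAccessed = new Date(entry.lastAccessed).getTime();
  if (Number.isNaN(lastAccessed)) {
    return undefined;
  }
  // Same comparison as before: only use the team if it is younger than the decay.
  return now - lastAccessed < DECAY_MS ? entry : undefined;
}

// Usage with fixed times instead of the real clock
const noon = Date.parse('2018-06-01T12:00:00Z');
const exampleStorage = {
  recentTeam: {name: 'Houston Astros', lastAccessed: '2018-06-01T11:59:00Z'},
};
console.log(recentTeam(exampleStorage, noon)); // the stored entry (one minute old)
console.log(recentTeam(exampleStorage, noon + 5 * 60 * 1000)); // undefined (six minutes old)
```

Passing the current time in as an argument is a design choice for testability only; in a fulfillment handler you would normally call it with just the storage object.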

User favorites

Some users, though, are always going to request the same thing. This is especially true for a voice experience like what we’re looking at. People are fans, and are primarily interested in one favorite team. The time decay ignores that, because it resets all knowledge after a certain amount of time. We should pair it, instead, with knowledge of a user’s preferences, by keeping track of what they request most often and defaulting to that when there are no other signals.

function topFav (favorites) {
  return favorites.sort((a, b) => b.seen - a.seen)[0];
}

function recentOrFav (userStorage) {
  const decay = 2;
  if (
    userStorage.recentTeam &&
    userStorage.recentTeam.lastAccessed &&
    moment().subtract(decay, 'm').toDate() < moment(userStorage.recentTeam.lastAccessed).toDate()
  ) {
    return userStorage.recentTeam;
  }

  if (userStorage.favoriteTeams) {
    return topFav(userStorage.favoriteTeams);
  }
}

function setFav (team, favorites=[]) {
  let teamObjIndex = favorites.findIndex(teamObj => teamObj.name === team);
  let teamObj = favorites[teamObjIndex];

  if (teamObjIndex === -1) {
    teamObj = {name: team, seen: 0};
    teamObjIndex = favorites.length;
  }

  teamObj.seen = teamObj.seen + 1;

  favorites[teamObjIndex] = teamObj;

  return favorites;
}

app.intent('last_game', (conv, {team}) => {
  if (team) {
    conv.user.storage.recentTeam = {name: team, lastAccessed: moment().toDate()};
    conv.user.storage.favoriteTeams = setFav(team, conv.user.storage.favoriteTeams);
    // return response for explicitly requested team's last game
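  } else {
    const impliedTeam = recentOrFav(conv.user.storage);
    if (impliedTeam) {
      // `team` is undefined in this branch, so count the implied team's name instead
      conv.user.storage.favoriteTeams = setFav(impliedTeam.name, conv.user.storage.favoriteTeams);
      // return response for the recent team's last game or, failing that, the favorite team's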
    } else {
      conv.contexts.set(CONTEXTS.needTeam, CONTEXT_LENGTHS.shortTerm);
      return conv.ask(`What team do you want to know about?`);
    }
  }
});

We now have three helper functions: topFav gets the favorite team, setFav records a request for a team, and recentOrFav gets the implicit context. Together they store how many times each team has been requested and assume the most requested team is the favorite. If a team is explicitly requested, we return information about that team. If not, we see if a team has been requested in the past two minutes. If not, we default to the favorite team.
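The order of precedence described above (explicit, then recent, then favorite) can be sketched as a single function that runs outside of a fulfillment handler. This is only an illustration under my own assumptions: resolveTeam, RECENT_WINDOW_MS, and the source label are hypothetical names, and the storage shape mirrors the recentTeam and favoriteTeams objects used in this post.

```javascript
// Hypothetical helper that makes the precedence order explicit.
const RECENT_WINDOW_MS = 2 * 60 * 1000; // two-minute decay

function resolveTeam(explicitTeam, userStorage, now = Date.now()) {
  // 1. A filled slot always wins.
  if (explicitTeam) {
    return {name: explicitTeam, source: 'explicit'};
  }

  // 2. Otherwise, use the most recent team if it has not decayed yet.
  const recent = userStorage.recentTeam;
  if (
    recent &&
    recent.lastAccessed &&
    now - new Date(recent.lastAccessed).getTime() < RECENT_WINDOW_MS
  ) {
    return {name: recent.name, source: 'recent'};
  }

  // 3. Otherwise, fall back to the most requested team, if there is one.
  const favorites = userStorage.favoriteTeams || [];
  if (favorites.length > 0) {
    const top = favorites.reduce((best, team) => (team.seen > best.seen ? team : best));
    return {name: top.name, source: 'favorite'};
  }

  // 4. Nothing to go on: the caller should ask the user which team they mean.
  return undefined;
}

// Usage with fixed times instead of the real clock
const midday = Date.parse('2018-06-01T12:00:00Z');
const sampleStorage = {
  recentTeam: {name: 'Chicago White Sox', lastAccessed: '2018-06-01T11:59:30Z'},
  favoriteTeams: [
    {name: 'Houston Astros', seen: 12},
    {name: 'Chicago White Sox', seen: 1},
  ],
};
console.log(resolveTeam('Boston Red Sox', sampleStorage, midday).source); // explicit
console.log(resolveTeam(undefined, sampleStorage, midday).name); // Chicago White Sox
console.log(resolveTeam(undefined, sampleStorage, midday + 10 * 60 * 1000).name); // Houston Astros
```

Returning the source alongside the name is optional, but it would let a response hedge differently for a guessed favorite than for a team the user named a few seconds ago.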

This approach isn’t limited to a single kind of context, like teams here. You can store multiple contexts and rely on each one differently in different intents. For a response of “an apple scone has 320 calories,” the possible follow-up requests might be “how much fat does it have?” (implied context is “apple scone”) or “what’s half of that?” (implied context is 320). In this situation, it’s useful to store both and let GetNutritionIntent and SimpleMathIntent (or whatever) grab what they need.
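One way to hold several contexts at once is to store each under its own key with its own timestamp. The sketch below assumes a recentContext object in user storage and two helpers, remember and recall; all of these names are hypothetical and only meant to show the idea.

```javascript
// Hypothetical helpers for keeping several kinds of context side by side.
const CONTEXT_WINDOW_MS = 2 * 60 * 1000; // same two-minute decay as for teams

// Store every value from a response under its own key, stamped with the time.
function remember(userStorage, values, now = Date.now()) {
  userStorage.recentContext = userStorage.recentContext || {};
  for (const [kind, value] of Object.entries(values)) {
    userStorage.recentContext[kind] = {value, lastAccessed: now};
  }
}

// Return the value for one kind of context, unless it has decayed.
function recall(userStorage, kind, now = Date.now()) {
  const entry = (userStorage.recentContext || {})[kind];
  if (entry && now - entry.lastAccessed < CONTEXT_WINDOW_MS) {
    return entry.value;
  }
  return undefined;
}

// After responding "an apple scone has 320 calories":
const sconeStorage = {};
const answeredAt = Date.parse('2018-06-01T12:00:00Z');
remember(sconeStorage, {food: 'apple scone', number: 320}, answeredAt);

// "How much fat does it have?" (something like GetNutritionIntent wants the food)
console.log(recall(sconeStorage, 'food', answeredAt + 1000)); // apple scone
// "What's half of that?" (something like SimpleMathIntent wants the number)
console.log(recall(sconeStorage, 'number', answeredAt + 1000) / 2); // 160
// Five minutes later, the context has decayed
console.log(recall(sconeStorage, 'food', answeredAt + 5 * 60 * 1000)); // undefined
```

Each intent only asks for the kind of context it understands, so the two follow-ups never compete for the same slot.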

By combining time decay for recent requests with user preferences, we can keep implicit requests and misunderstood explicit requests from returning nothing to the user, and anticipate their needs.