Some of the 105 Voice First Tabs Left Open on My Mobile Chrome
I’m trying to use my phone less, but I still use it. The result is that I have enough time to open pages I want to read, but not enough to actually read them. Often I’ll read them in another setting, but rarely do I go back and close the tabs on my phone. So here it is, for your enjoyment: a small smattering of the links I’ve opened but didn’t read on mobile. Take this as a tour of news and interesting things from the past however many months.
The links are roughly broken into a few categories. Click on one below to skip to the links inside the category.
- Voice-First Resources and Guides
- Voice-First Platform News
- NLP Resources
- Programming News and Resources
Creative Commons audio recordings. Use these in your voice apps.
Voice-First Resources and Guides
From Justin Jeffress, a list of Alexa-specific resources on dialog management. Experienced developers will want to look at the last three resources in detail, which focus on switching context during a conversation, how to capture “this and that” or “the other thing” slots, and dynamic slot elicitation.
A guide that discusses how to implement Google Sign-In.
Leon Nicholls discusses how to use Dialogflow history to view historical interactions and improve an Action’s conversational design. Watch out for mismatched intents and fulfillment errors.
Promotion, name-free discovery, etc.
Voice-First Platform News
Google Assistant News
Google is now (well, as of September) offering account linking through users’ Google accounts. The best time to use Google Sign-In is for Assistant-only apps or when the other platforms use Google Sign-In for authentication.
Announces support for digital goods and subscription purchasing in Actions. Also announces Google Sign-In for the Assistant (see above).
Paul Cutsinger discusses the different technologies that Alexa is using to improve skill discovery. Alexa is using both aggregate and user-level data to choose which skills to surface when a user makes a request without an invocation name. Specifically, Alexa uses a tool called Shortlister to find the best skills for the request, then ranks the skills using HypRank. HypRank uses “contextual signals” for ranking. These signals include the request domain, user affinities, skill popularity, recent skill usage, and skill quality. This approach leads to 95-96% accuracy in choosing the single best skill for a request.
To improve the likelihood that Alexa will select a skill, developers should add as much (accurate) metadata as possible, have high usage and many reviews, and implement `CanFulfillIntentRequest`.
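The last of those, `CanFulfillIntentRequest`, is a request type a skill answers to tell Alexa whether it could handle a name-free request. A minimal sketch in Python — the handler and intent names here are my own illustration; only the response shape follows the documented `canFulfillIntent` format:

```python
# Sketch of a CanFulfillIntentRequest handler. The intent name and
# handler are illustrative; the response follows Alexa's
# canFulfillIntent format ("YES" / "NO" / "MAYBE").

SUPPORTED_INTENTS = {"GetGameTimeIntent"}  # assumption: the skill's own intents

def handle_can_fulfill(request):
    """Return a canFulfillIntent response for a name-free request."""
    intent = request["request"]["intent"]
    can_fulfill = "YES" if intent["name"] in SUPPORTED_INTENTS else "NO"
    # Report per-slot whether the skill understands and can act on the value.
    slots = {
        name: {"canUnderstand": "YES", "canFulfill": can_fulfill}
        for name in intent.get("slots", {})
    }
    return {
        "version": "1.0",
        "response": {
            "canFulfillIntent": {
                "canFulfill": can_fulfill,
                "slots": slots,
            }
        },
    }
```

Alexa sends these requests speculatively, so the handler should answer quickly and without side effects.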
From Alexa Science, a summary of a paper that discusses how Alexa might (read: probably does, or probably will) use on-screen information to improve ambiguity resolution. The example that Vishal Naik gives is a “play Harry Potter” request. Does this mean the audio book, the movie, or the soundtrack? If an Echo Show has movies on the screen, the likely answer is the movie.
Third-party reminders! I’ll be adding these to my baseball skills to remind people when a game is about to start.
Developers can specify an acceptable list of slot values. If a user provides something outside of that list, Alexa will reprompt the user for an acceptable value. I’d argue that many times, at least for content acquisition, good voice search tools are a better approach. Still, there are situations where slot values need to be enums.
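In the interaction model, this takes the form of a slot validation rule in the dialog model. A sketch of the relevant fragment, written here as a Python dict — the slot, type, and prompt names are placeholders of my own:

```python
# Fragment of an interaction-model dialog slot with validation
# (slot name, type, and prompt id are illustrative). If the user's
# value isn't in the set, Alexa re-elicits the slot with the prompt.
dialog_slot = {
    "name": "DrinkSize",                          # hypothetical slot
    "type": "SIZE_TYPE",                          # hypothetical custom slot type
    "elicitationRequired": True,
    "validations": [
        {
            "type": "isInSet",                    # accept only these values
            "prompt": "Slot.Validation.DrinkSize",  # prompt defined elsewhere
            "values": ["small", "medium", "large"],
        }
    ],
}
```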
This is now out of preview and into beta: a way for developers to build skills without worrying about heading into the depths of AWS. This isn’t a tool most developers are going to use for production skills, but it is invaluable for those building their first skills. Invaluable enough that I rewrote chapter two of my book on building Alexa skills to use hosted skills.
How the Alexa Science team is improving TTS to use different speaking styles in different contexts (e.g. reading the news versus reading a bedtime story). You know this: it’s your “boyfriend voice” or “parent voice” which throws off your friends when they hear you using it.
Amazon rolled out what they call “phrase slots,” or slots where the developer might not know everything a user will say to fill the slot. The first one (and the only one so far, I think) is AMAZON.SearchQuery. Use it to add search to your Alexa skills, or just to open up the possibility of getting (nearly) all of the user utterance.
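A sketch of an intent that uses it, again as a Python dict — the intent and slot names are my own, and note that (as I understand it) sample utterances for AMAZON.SearchQuery need a carrier phrase around the slot:

```python
# Illustrative intent using AMAZON.SearchQuery (names are placeholders).
search_intent = {
    "name": "SearchIntent",
    "slots": [{"name": "query", "type": "AMAZON.SearchQuery"}],
    "samples": [
        # A carrier phrase ("search for", "look up") surrounds the phrase slot.
        "search for {query}",
        "look up {query}",
    ],
}
```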
NLP Resources
An introduction to the “Recurrent Embedding Dialogue Policy.” This approach is able to achieve 100% accuracy on the bAbI dialogue task.
A Hacker News discussion on this blog post from Amazon Science on disambiguation. The most interesting part of the post is that the parsers are domain-general; indeed, the authors aim for models that perform poorly at predicting domains. We can expect to see better collection of list slot values in the future.
Just what it says on the tin.
Programming News and Resources
The best summary is the subtitle: “or, How Google Code Search Worked.” An interesting article for, among many reasons, learning that popular regex implementations don’t use automata, and can be really slow.
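The slowness comes from backtracking: nested quantifiers can force exponential retries on a failed match, which automaton-based engines (like the one behind Google Code Search, and later RE2) avoid by running in time linear in the input. A quick demonstration against Python’s backtracking `re` module, using a standard pathological pattern:

```python
import re
import time

# Classic pathological pattern for backtracking engines: the nested
# quantifiers give exponentially many ways to split a run of "a"s,
# all of which get tried before the match fails.
pattern = re.compile(r"(a+)+b")

def time_match(n):
    """Time a failing search on n 'a' characters (no trailing 'b')."""
    text = "a" * n
    start = time.perf_counter()
    pattern.search(text)
    return time.perf_counter() - start

# Each extra "a" roughly doubles the work here; an automaton-based
# engine would take time linear in the input regardless of the pattern.
```

Try `time_match(28)` at your own risk; an RE2-style engine answers the same query instantly.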