Dustin Coates in Code

Creating search suggestions with ML via t5-base-qa-qg-hl (🤗 Taste Test)


What we’re going to look at today is a model that will help us generate questions. And why might this be useful? I can think of a couple of different reasons. One is perhaps you are studying for a test or you are, let’s say, a sales enablement person at your company. And you have to give a quiz to your colleagues about something that they’ve studied.

You can write out the questions yourself, or, if you have to do this on a regular basis, you can use a model like this to generate the questions for you. Another thing this could be useful for is generating search suggestions. On Google, for example, you see some trending searches right here, but if I start typing “when”…

And you see: when is Mother’s Day, when is Easter, the Super Bowl, when does spring start, when does summer start, when does Loki come out? I think that that’s one of those new Marvel movies. And if you have content, such as news articles or support desk articles, you can use a model to generate these types of questions and help people get to the content that they’re looking for even faster.

The model we’ll be looking at is the T5 base QA QG HL model. And this sounds like a mouthful; it is a mouthful, but really all it means is that it is based on the T5 base model, it is used for question answering and question generation, and it uses highlights to create the questions.

And we can see on here—and indeed, this is one of the nice things about Hugging Face—you can try out a lot of these models on the website itself. We can see right here. Let’s go ahead and click compute. And so we said, generate a question, we’ve highlighted what the answer is to the question that we wanted, sort of Jeopardy style.

We give the answer first and then it comes up with a question. What is the answer to life, the universe, and everything? And the answer of course is 42, which is what we’ve highlighted. So maybe that’s another use case: you can create your own Jeopardy spinoff. This is quite simple to use, especially if we’re using the accelerated inference API.

First, let’s see what this is going to look like in practice. I created just a simple web UI right here, where we’re going to paste some text. We’re going to click on “generate a question” and the model is going to give us a question back. We will take some text from the latest Talking to Computers blog post, the one about the book Building Machine Learning Powered Applications.

We will take this paragraph right here and let’s paste it in there. The way I have this coded up is you select the answer, and it’s going to give you back a question. So let’s select “because you need data up front.” And the question that we’re going to get back is “why is ML not going to power your product?”
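To make the "select the answer" step concrete, here is a minimal sketch of how a selection could be turned into model input. It is not the code behind the web UI in the video. The `<hl>` highlight token is an assumption taken from the model card rather than from this transcript, and `highlight_answer` is a hypothetical helper name.

```python
# Highlight token: assumed from the model card, not shown in this transcript.
HL = "<hl>"


def highlight_answer(context, answer):
    """Wrap the first occurrence of `answer` in `context` with highlight tokens."""
    start = context.find(answer)
    if start == -1:
        raise ValueError("the selected answer must appear in the context")
    end = start + len(answer)
    return f"{context[:start]}{HL} {answer} {HL}{context[end:]}"


# Example sentence paraphrased from the demo above.
text = "ML isn't going to power your product because you need data up front."
print(highlight_answer(text, "because you need data up front"))
```

Only the first occurrence is highlighted, which is the simplest choice; a real UI would more likely use the character offsets of the selection itself.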

And indeed, ML isn’t going to power your product because you need data up front.

Let’s try one that’s maybe a little bit more complex and let’s select “gathering data,” and let’s see what the model is going to give us back. “What is a big part of creating an ML-driven product?” And indeed, Ameisen points out that gathering data is a big part of creating an ML-driven product.

We can also look at this Colab that’s provided by the model creator. If we scroll all the way down, what we’ll see is that, while in that web interface we are only seeing one result, the model can actually give back multiple results as well. So here, this is giving four. This is giving four. This is giving three as well. So you can use this to create those search suggestions quite quickly.

Let’s see what this looks like in terms of code.

And we see that the code is quite simple. It really is simply just calling the API. There are some configurations that you can put in here. But you can see, we are sending inputs. Again, we are generating the question, we are sending text along, and then we’re sending this tag right here to say, “Hey, this is the end of the string.”
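The code on screen is not reproduced in this transcript, so what follows is only a rough sketch of what such a call might look like, not the code from the video. The endpoint URL, the `valhalla/t5-base-qa-qg-hl` model id, the `generate question:` prefix, the `<hl>` tokens, and the `</s>` end-of-string tag are all assumptions based on the model card and the hosted inference API as I understand them; `build_payload` and `generate_question` are hypothetical names.

```python
import json
import urllib.request

# Assumed endpoint and model id; check the model page for the current values.
API_URL = "https://api-inference.huggingface.co/models/valhalla/t5-base-qa-qg-hl"


def build_payload(highlighted_text):
    """Build the JSON body: task prefix, the text, then the end-of-string tag."""
    return {"inputs": f"generate question: {highlighted_text} </s>"}


def generate_question(highlighted_text, api_token):
    """POST the payload to the API and return the parsed JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(highlighted_text)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the payload needs no network access or token.
payload = build_payload("<hl> 42 <hl> is the answer to life, the universe and everything.")
print(json.dumps(payload))
```

Calling `generate_question(...)` with your own API token sends the request. The response is expected to contain the generated text, but its exact shape is not shown in the video, so inspect it before relying on a particular key.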

This is a different text. This is about the trumpet player Al Hirt, from a random Wikipedia article that I took about his song “Java.” And we’ve highlighted the album, Honey in the Horn. Let’s run this and see what we get back.

Generated text: “What was the name of Al Hirt’s 1963 album?” I don’t know about you, but I find this incredibly impressive, because the model needs to determine that Honey in the Horn is an album, that it is Al Hirt’s album, and that it was from 1963 as well.

You don’t necessarily have to have these highlight tags. Let’s remove those and see what kind of question it gives us. Let’s run that again. “What was the name of the song that Al Hirt recorded in 1963?” And now this is coming pretty early in the text, but it’s still quite impressive to show that it’s able to come up with that kind of question, really without any highlighting whatsoever.

And then finally, what this might look like in practice is you have a search record right here. You’ve got the title, you’ve got a synopsis. You might even have the text of the blog post or the help desk article. And you want to go ahead and add those suggestions that the model has provided you: “what is a big part of creating an ML-driven product?”, “why is ML not going to power your product?”, et cetera. You can add those on to give these records added textual relevance. You can also build your own search index, containing only query suggestions, that people search on first as they’re typing. That is another way to do it as well.
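Both options can be sketched in a few lines. This is only an illustration under assumptions: the field names (`title`, `synopsis`, `suggested_questions`), the record contents, and the helper names are hypothetical, and a real search engine would do the matching in its own index rather than with a Python loop.

```python
def add_suggestions(record, questions):
    """Return a copy of the record with de-duplicated questions attached."""
    enriched = dict(record)
    existing = list(record.get("suggested_questions", []))
    for question in questions:
        if question not in existing:
            existing.append(question)
    enriched["suggested_questions"] = existing
    return enriched


def suggest(index, typed, limit=5):
    """Return questions from the index that start with what the user typed."""
    prefix = typed.strip().lower()
    if not prefix:
        return []
    matches = []
    for rec in index:
        for question in rec.get("suggested_questions", []):
            if question.lower().startswith(prefix) and question not in matches:
                matches.append(question)
    return matches[:limit]


# Hypothetical search record; the title and synopsis are placeholders.
record = {
    "title": "Example post about an ML book",
    "synopsis": "Placeholder synopsis for illustration.",
}
enriched = add_suggestions(
    record,
    [
        "What is a big part of creating an ML-driven product?",
        "Why is ML not going to power your product?",
    ],
)
print(suggest([enriched], "why"))
```

The first function covers adding the questions onto the record for extra textual relevance; the second stands in for a separate suggestions-only index that is queried as the user types.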

You have a number of different options, if you want to integrate this in search. But as you’ve seen, it is very, very simple to run this model. You can of course do your fine tuning. You can do all of that, but if you are just a product manager, just a search engineer, this is something that you may want to look at right now, see if you can build it into your search, improve that search experience.

Again, that’s the T5 base QA QG HL model, which you can find on the Hugging Face inference API. Thanks so much for taking the time to watch this video. Please check out some of the other videos that look at other models that you may want to use to improve your product, or even to build a product on top of. Let me know what you think, and I hope to hear from you soon.