CS 124 / LING 180 From Languages to Information, Dan Jurafsky, Winter 2015

Week 5: Group Exercises on QA: Application to Mobile Devices Feb 10, 2015

Your goal today is to explore one of the most common applications of question answering: the personal assistant scenario. You are going to explore the question answering and other conversational features of the personal assistants on your smartphone (Siri, Google Now, Cortana, etc.). If no one in your group has a smartphone, just rearrange the groups a bit. If you have more than one personal assistant in your group, compare and contrast them!

  1. Write a couple of texts or emails by voice. What is the speech recognition word error rate? The word error rate is the edit distance in words between what was recognized and what you intended (i.e. the sum of the substitutions, deletions, and insertions needed to turn the correct word string into the recognized one, divided by the total number of words in the correct string). Can you characterize what's going on with the errors?

  2. Make and cancel some calendar appointments. Again, analyze any errors: did they fail because of speech recognition (the wrong words were recognized) or natural language understanding (the words were right, but the system still didn't understand)? If it was the NLU, what went wrong?

  3. Try to find a business (a restaurant, etc.). Again, analyze any errors: did they fail because of speech recognition or natural language understanding? If it was the NLU, what went wrong?

  4. Does the speech recognition system allow barge-in? Barge-in means you interrupt or talk over the system while it is speaking.

  5. Make a list of the tasks that the dialog manager seems capable of (the dialog manager is the control structure that determines what kinds of tasks the system can do, given the words and the meaning from each sentence). Can you think of any tasks that it couldn't accomplish?

  6. Do some error analysis on the performance of the text-to-speech component. Any problems in pronunciation? If so, is it a wrong phone, incorrect stress, or a problem with the prosody (the rhythm/pitch)?

  7. What does the system do if it seems to be unsure what you said? Does it have a strategy for confirming? If so, what is it? Does it always confirm?

  8. Can the system make use of the context (by which I mean specifically the previous question)? If so, what aspects of the context can the system use?

  9. Think of something you might want to do that requires more than one interaction (i.e. you say something, the system says something, you respond). Assuming the system can't do this now, discuss how you might do it.
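
For question 1, the word error rate can be computed by hand for short messages, but a small script may help with longer ones. The following is a minimal sketch, not part of the original exercise: it computes the word-level edit distance with unit costs for substitutions, deletions, and insertions, then divides by the number of words in the intended string. The function name `word_error_rate` and the choice to lowercase and split on whitespace (without stripping punctuation) are assumptions made for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between hypothesis and reference,
    divided by the number of words in the reference.

    Assumption: words are compared case-insensitively and are split on
    whitespace only; punctuation is left attached to words.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions to reach an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    intended = "meet me at the cafe at noon"
    recognized = "meet me at cafe at new"
    # One deletion ("the") and one substitution ("noon" -> "new")
    # over 7 reference words.
    print(round(word_error_rate(intended, recognized), 3))
```

Because insertions count as errors but the denominator is the length of the intended string, the rate can exceed 1.0 when the recognizer outputs many extra words.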