Final Project of the Language Understanding Systems Course @ University of Trento (2017/2018)
This bot answers questions about movies via text or voice (text input with voice output is also supported). It was developed as the second and final project of the Language Understanding Systems course at UNITN.
Demos:
- A video demo of the bot "in action" using text input and voice output can be seen here
- A shorter video demo of the voice interaction functionality can be seen here
This repository contains the code to run, build and evaluate the bot, and an accompanying report that can be seen here.
For an example conversation with the bot, watch the demo videos above or read the "Example of an actual conversation with the bot" section later on in this readme.
The bot answers questions in the movie domain. Every topic it handles is listed under intents in the movie_domain.yml file; in short, it can answer these kinds of queries:
- Find actor/director of a movie or movies
- For example: all comedy movies, or movies published in 1990, made in a given country, etc
- Find information on a given movie or movies. Questions can be about: language, country, genre, budget, date, subjects (plot keywords), rating/reviews, and revenue. Some examples:
- what is the gross revenue of movies published in Germany
- when was Phantom Menace released
- what genre is Star Wars
- It can count movies in a series:
- how many Star Wars movies exist
It can also answer follow-up questions, since it keeps track of the context. For an example of a conversation where context is used, see the "Example of an actual conversation with the bot" section later on in this readme.
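To make the idea of context tracking concrete, here is a minimal sketch of a slot memory that carries entities from one turn to the next. It is not the bot's actual code: the `SlotMemory` class, the `max_age` parameter and the forgetting rule (drop a slot once it has gone unmentioned for more than `max_age` turns) are all assumptions made for illustration, loosely inspired by the forgetting mechanism of ActionFalloutSlots described in the report.

```python
from dataclasses import dataclass, field


@dataclass
class SlotMemory:
    """Toy dialogue context that remembers entities across turns (hypothetical)."""
    max_age: int = 3                           # assumed: turns a slot survives unmentioned
    slots: dict = field(default_factory=dict)  # slot name -> value
    ages: dict = field(default_factory=dict)   # slot name -> turns since last mention

    def update(self, entities: dict) -> dict:
        """Merge the entities of a new user turn into the context."""
        # Every remembered slot gets one turn older.
        for name in list(self.ages):
            self.ages[name] += 1
        # Entities mentioned in this turn overwrite older values.
        for name, value in entities.items():
            self.slots[name] = value
            self.ages[name] = 0
        # Forget slots that have gone unmentioned for too long.
        for name in [n for n, age in self.ages.items() if age > self.max_age]:
            del self.slots[name]
            del self.ages[name]
        return dict(self.slots)


memory = SlotMemory()
memory.update({"movie": "Clueless"})   # "what year was Clueless released"
context = memory.update({})            # "who directed it" carries no new entity
print(context)                         # the movie is still part of the context
```

With a memory like this, a follow-up such as "who directed it" can be resolved against the movie mentioned in the previous turn.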
The first thing you need to do to be able to run the bot is to install the dependencies:
- For the basic bot functionality:
- If you want to use the speech functionality (TTS and STT):
After these are installed (or only the first 2 if you don't care for the speech functionality) you can check out the Makefile, which defines a help recipe that details the bot's capabilities. The Makefile also provides shortcuts to run all the commands.
To just run the bot, you first need to train the model with:
make train-nlu-aggregated train-dialogue
You can then run the bot with simple text input and output using:
make run-bot
See below for other ways to run it.
Here is a basic rundown of the "recipes" provided in the Makefile and what they do:
- run-bot: run the bot on the command line (text input and output)
- run-voice: run the bot over voice (voice input and output)
- run-voice-only-output: run the bot with normal text input but voice output
- train-nlu: train the NLU module on the training data
- train-nlu-aggregated: train the NLU module on the aggregation of the train and test data
- train-dialogue: train the dialogue module
- train-online: start the bot in interactive training mode using the RegEx interpreter (see this page for more information on how this works)
- train-online-with-nlu: start the bot in interactive training mode using the NLU interpreter
- convert-data: convert the NLSPARQL data to the rasa format (this is the NLU data)
- evaluate-nlu: only evaluate the NLU module against the test data
- evaluate-nlu-crossval: run a cross-validation on the aggregated dataset
- evaluate-dialogue: only evaluate the dialogue module against the test data
- train-and-evaluate: train the NLU and dialogue modules and evaluate both on the test data
The functionality of the bot is split across the following Python modules:
- bot.py: the main interface to the bot. It handles training and running the bot.
- moviedb.py: this module is in charge of communicating with the database, which is an SQLite DB and is located here.
- channels.py: defines the voice communication channels, both input and output.
- actions.py: the largest of all the modules. It includes all the actions that the bot can perform, which determine the kinds of questions the bot can answer satisfactorily. The actions, with a basic definition of each, are the following:
  - ActionSearchPerson: is in charge of searching for actors and directors.
  - ActionSearchPersonInfo: doesn't really do much, since the database doesn't hold "personal information" on actors and directors. For the moment it just redirects users to the person's IMDB page. It was intended as the action that could answer questions like "who played Neo in The Matrix".
  - ActionSearchMovie: is in charge of searching for a movie or set of movies that satisfy a given query (e.g., comedy movies, or those made by a given director).
  - ActionSearchMovieInfo: the largest action. It is in charge of finding all the information related to a movie and answers questions like "what genre is Star Wars".
  - ActionAnswer: is in charge of displaying the answer to the user, or saying that no answer was found.
  - ActionFalloutSlots: a special action that implements a forgetting mechanism, which helps the bot function properly in longer conversations (4 or more turns). For more information on this action see the report.
- policy.py: contains the main dialogue policy for the bot. It is in charge of routing a user request to the appropriate action (e.g., if the user wants to know the director of a movie, the policy gives the request to ActionSearchPerson).
- evaluate_nlu.py: a modified version of Rasa's own evaluate_nlu script. The only difference is that it saves the evaluation information into files so that it can be used later.
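The division of labour between the policy, the actions and the database module can be pictured roughly as follows. This is only a sketch under stated assumptions, not the code in policy.py, actions.py or moviedb.py: the table schema, the intent names and the helper functions are hypothetical, and the real bot builds on Rasa's action and policy classes rather than plain functions.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only (in-memory database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, year INTEGER, director TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [("Clueless", 1995, "Amy Heckerling"), ("Loser", 2000, "Amy Heckerling")],
)


def search_person(slots):
    """Stand-in for ActionSearchPerson: who directed the given movie?"""
    rows = conn.execute(
        "SELECT DISTINCT director FROM movies WHERE title LIKE ?",
        (f"%{slots['movie']}%",),
    ).fetchall()
    return [row[0] for row in rows]


def search_movie(slots):
    """Stand-in for ActionSearchMovie: which movies did a director make?"""
    rows = conn.execute(
        "SELECT title FROM movies WHERE director = ? ORDER BY year",
        (slots["director"],),
    ).fetchall()
    return [row[0] for row in rows]


# Assumed intent names; the real ones live in movie_domain.yml.
ROUTES = {"director": search_person, "movie": search_movie}


def route(intent, slots):
    """Toy policy: pick the handler for an intent, or return no answer."""
    handler = ROUTES.get(intent)
    return handler(slots) if handler else []


print(route("director", {"movie": "Clueless"}))
print(route("movie", {"director": "Amy Heckerling"}))
```

Using parameterized queries (the `?` placeholders) keeps user-provided titles and names out of the SQL string itself.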
Besides the main code of the bot (described above), there are also some other scripts that work with the dataset provided for the project. These scripts can be found in the data/scripts folder.
- convert_to_rasa.py: the dataset for training the NLU module was originally provided in NLSPARQL format (this original data can be found here). This script converts the NLSPARQL data into the JSON format that is recognized by rasa. It is used to convert both the training and testing data.
- create_database.py: a simple script that creates the SQLite database from the original database schema and data that were provided for the project.
- agregate_jsons.py: this script aggregates 2 rasa-formatted JSON files into one. It is used to join the training and testing files generated by convert_to_rasa.py, so that the large aggregated file can be used when training for production.
- analyze_data.py: prints some basic information on the data.
- Graphics.ipynb: generates the graphics for the report.
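As a rough illustration of what the conversion and aggregation steps involve, the sketch below turns one IOB-tagged sentence into a training example in Rasa NLU's legacy JSON layout (`rasa_nlu_data` / `common_examples`) and then merges two such datasets. It is not the code of convert_to_rasa.py or agregate_jsons.py. The input representation (a list of token and tag pairs), the function names, the tag name `movie.name` and the way the intent is passed in are all assumptions.

```python
import json


def iob_to_example(pairs, intent):
    """Turn [(token, IOB tag), ...] into one Rasa-NLU-style training example."""
    text, entities, current = "", [], None
    for token, tag in pairs:
        start = len(text) + (1 if text else 0)  # skip the joining space
        text = f"{text} {token}" if text else token
        if tag.startswith("B-"):
            # A new entity starts at this token.
            current = {"start": start, "end": len(text), "entity": tag[2:]}
            entities.append(current)
        elif tag.startswith("I-") and current is not None:
            current["end"] = len(text)          # extend the open entity
        else:
            current = None                      # an "O" tag closes any open entity
    for entity in entities:
        entity["value"] = text[entity["start"]:entity["end"]]
    return {"text": text, "intent": intent, "entities": entities}


def aggregate(*datasets):
    """Merge several Rasa-NLU-style datasets into a single one."""
    merged = []
    for data in datasets:
        merged.extend(data["rasa_nlu_data"]["common_examples"])
    return {"rasa_nlu_data": {"common_examples": merged}}


sentence = [("who", "O"), ("directed", "O"),
            ("star", "B-movie.name"), ("wars", "I-movie.name")]
train = {"rasa_nlu_data": {"common_examples": [iob_to_example(sentence, "director")]}}
test = {"rasa_nlu_data": {"common_examples": []}}
print(json.dumps(aggregate(train, test), indent=2))
```

The character offsets matter here: each entity's `value` must equal `text[start:end]`, which is why the sketch derives the value from the offsets rather than storing it separately.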
Example of an actual conversation with the bot
User: Hello
Bot: Hi
Bot: How may I help you?
User: when was Phantom Menace released
Bot: It was released in the year 1999
User: who were the actors in The Matrix
Bot: Marcus Chong, Gloria Foster, Nona Gaye, Steve Bastoni, Daniel Bernhardt, Keanu Reeves, Collin Chou, Essie Davis, and Helmut Bakaitis
User: when was it published
Bot: There are multiple answers to your question:
-> The Matrix: 1999
-> The Matrix Revolutions: 2003
-> The Matrix Reloaded: 2003
User: what was the revenue
Bot: There are multiple answers to your question:
-> The Matrix: 171383253
-> The Matrix Revolutions: 139259759
-> The Matrix Reloaded: 281492479
User: what year was Clueless released
Bot: It was released in the year 1995
User: who directed it
Bot: Amy Heckerling
User: which other movies did she direct
Bot: This are the movies I found: Fast Times at Ridgemont High, Clueless, Loser, and Vamps
User: who starred in Harry Potter
Bot: Gary Oldman, Verne Troyer, Daniel Radcliffe, Alfred Enoch, Ralph Ineson, Toby Jones, Dave Legeno, Robert Pattinson, Rupert Grint, Alan Rickman, Emma Watson, and Fiona Shaw
User: show me trailers
Bot: I'm sorry but I don't have information on trailers. You can find the trailer and other media about the movie on its IMDB page:
-> Harry Potter and the Sorcerer's Stone: http://www.imdb.com/title/tt0241527/?ref_=fn_tt_tt_1
-> Harry Potter and the Chamber of Secrets: http://www.imdb.com/title/tt0295297/?ref_=fn_tt_tt_1
-> Harry Potter and the Prisoner of Azkaban: http://www.imdb.com/title/tt0304141/?ref_=fn_tt_tt_1
-> Harry Potter and the Goblet of Fire: http://www.imdb.com/title/tt0330373/?ref_=fn_tt_tt_1
-> Harry Potter and the Order of the Phoenix: http://www.imdb.com/title/tt0373889/?ref_=fn_tt_tt_1
-> Harry Potter and the Half-Blood Prince: http://www.imdb.com/title/tt0417741/?ref_=fn_tt_tt_1
-> Harry Potter and the Deathly Hallows: Part I: http://www.imdb.com/title/tt1571403/?ref_=fn_tt_tt_1
-> Harry Potter and the Deathly Hallows: Part II: http://www.imdb.com/title/tt1680310/?ref_=fn_tt_tt_1
User: thank you
Bot: Any time ;)
User: bye
Bot: Goodbye. Have a nice day!
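The replies in the conversation above follow two visible patterns: lists of names are joined with commas and a final "and", and questions that match several movies get one "-> title: value" line per movie. A minimal sketch of such formatting is shown below. It is not taken from ActionAnswer: the function names are hypothetical, and the "no answer" message is invented for the example.

```python
def join_names(names):
    """Join names as 'A', 'A and B' or 'A, B, and C'."""
    if len(names) <= 1:
        return "".join(names)
    if len(names) == 2:
        return f"{names[0]} and {names[1]}"
    return ", ".join(names[:-1]) + f", and {names[-1]}"


def format_answer(results):
    """Format a list of (movie title, value) pairs as a bot reply."""
    if not results:
        # Hypothetical wording; the bot's real message is not shown in this readme.
        return "Sorry, I could not find an answer."
    if len(results) == 1:
        return str(results[0][1])
    lines = ["There are multiple answers to your question:"]
    lines += [f"-> {title}: {value}" for title, value in results]
    return "\n".join(lines)


print(join_names(["Fast Times at Ridgemont High", "Clueless", "Loser", "Vamps"]))
print(format_answer([("The Matrix", 1999), ("The Matrix Revolutions", 2003)]))
```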