This project implements the dialogue manager module of a minimalistic voice assistant application that is currently in development. Users will be able to communicate with the application in Romanian and issue simple voice commands such as the following (the current iteration covers only the first three):
- querying the weather
- adding events/appointments to a calendar
- querying the aforementioned calendar
- playing music
- turning the lights on/off
[Image: Home Assistant "Add Lunch" scenario]
The dialogue manager is the component responsible for interpreting the given commands and their parameters, deciding whether they are correctly specified, retrieving information from APIs (weather, calendar, etc.), and issuing a meaningful response based on the input.
The above diagram illustrates the system's structure and how it is divided into several interacting modules, each of which is detailed in the following sub-sections of this document.
The dialogue manager is built on two intercommunicating servers: one written in Prolog, which uses finite state machine logic to handle user communication and requests, and one written in Python, which handles the actual API calls and authentication.
The knowledge base is written in Prolog and divided into several files:
- a main file, kb.pl
- additional files, one for each intent: calendar_kb.pl and weather_kb.pl
Intents are our chosen representation of user-given commands; they can be specified using the intent/1 predicate. Intents contain several entities, each corresponding to one of their parameters, which can be specified using the entity/3 predicate. The three arguments are: the intent to which the entity corresponds, the name of the entity, and its value. For example, an add-to-calendar intent may contain entities such as: event title, date, starting time, ending time, etc.
- Some entities have default values, which are stored using the default/4 predicate, such as:
default(calendarAsk, ora_final, [time(23, 59, 59)], 'sfarsitul zilei').
- Most entity values require some transformation, because in their raw form they cannot be handled by the application. For example, a value of tomorrow for the time entity inside a queryWeather intent is easily understood by a human, but it is of no use to our weather module, which expects timestamps. We tackled this by defining a relative/5 predicate that transforms human-readable entity values so that they can be further processed by other modules. For example:
relative(calendarAsk, ora_inceput_relativ, 'dupa-masa', R, Mesaj) :-
    R = [time(12, 0, 0), time(18, 0, 0)],
    Mesaj = 'dup\u0103-mas\u0103'.
- Some entities might be missing because they have not been specified by the user.
Finally, the getEntitiesValues/3 predicate performs transformations using relative predicates for each entity in the list of entities given as a parameter. Additionally, if the third parameter is set to true, it replaces missing entities with their default values. The transformed values are asserted using the finalEntity/4 and missingEntity/2 predicates. In the case of a queryWeather intent with a specified time of tomorrow and a missing location parameter, the flow of execution would look like this:
These files contain definitions for relative predicates in the case of multiple possible values of different entities, such as:
relative(calendarAsk, data, 'maine', R, Mesaj) :-
    date_get(tomorrow, Tomorrow),
    R = [Tomorrow],
    Mesaj = 'm\u00e2ine'.
relative(queryWeather, timp, 'peste o ora', R, Mesaj) :-
    R = ['hourly', 1],
    Mesaj = 'peste o or\u0103'.
These predicates make extensive use of the Prolog date_time library to compute time intervals, date differences, and so on.
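For readers less familiar with Prolog, the same kind of date arithmetic can be sketched with Python's standard datetime module. The function names and table layout below are hypothetical; the two rules mirror the 'maine' and 'dupa-masa' examples from this document and nothing more.

```python
from datetime import date, time, timedelta

# Day offsets for relative date phrases; only 'maine' appears in this document.
DAY_OFFSETS = {"maine": 1}

# Time-of-day intervals; 'dupa-masa' mirrors the relative/5 example.
TIME_INTERVALS = {"dupa-masa": (time(12, 0, 0), time(18, 0, 0))}


def resolve_date(phrase, today):
    """Return the concrete date for a relative phrase, or None if it is unknown."""
    offset = DAY_OFFSETS.get(phrase)
    return None if offset is None else today + timedelta(days=offset)


def in_interval(phrase, moment):
    """Check whether a time of day falls inside the interval named by the phrase."""
    start, end = TIME_INTERVALS[phrase]
    return start <= moment <= end
```

Passing `today` explicitly, instead of calling `date.today()` inside the function, keeps the sketch deterministic and easy to test.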
These additional files define predicates that wrap the HTTP calls made by the Prolog server to the Python service module and handle the received responses.
getWeatherCall(LatLon, Time, R) :-
    % Build URL for call
    getWeatherURL(LatLon, Time, Url),
    % Perform http call and receive JSON response
    setup_call_cleanup(
        http_open(Url, In, [request_header('Accept'='application/json')]),
        json_read_dict(In, WeatherData),
        close(In)
    ),
    % Reformat the response as a string
    format(string(R), ...).
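The two steps wrapped by getWeatherCall (building the URL, then decoding the JSON reply) can be illustrated from the client side with a small Python sketch. The host, port, and query parameter names are assumptions, since the document does not show getWeatherURL; the four response fields are the ones the weather endpoint is shown to return later in this document.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://localhost:5000/api/weather"  # assumed host and port


def get_weather_url(lat, lon, time_spec):
    """Build the request URL; the parameter names are assumptions."""
    query = urlencode({
        "lat": lat,
        "lon": lon,
        "time": ",".join(map(str, time_spec)),
    })
    return f"{BASE_URL}?{query}"


def parse_weather(body):
    """Decode the JSON body and keep only the documented response fields."""
    data = json.loads(body)
    return {k: data[k] for k in ("temp", "feels_like", "humidity", "description")}
```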
The finite state machine is written in Prolog and runs inside a server that handles HTTP requests carrying JSON data with intents and their entities. Predicates such as currentState/1, performState/1, switchState/1 and intentReceived/0 are used to drive the execution of the FSM.
- currentState simply asserts which state the automaton is in at a given moment.
- performState has a separate definition for each state and contains the actions that have to be executed during that particular state. It also switches the current state of the FSM.
- switchState is used to switch between states while logging messages to the console and performing clean-up actions.
- intentReceived is used to switch states according to the newly received intent. Together with intentEndpointHandler/1 (which implements the server endpoint for receiving requests), it forms the backbone of our observer pattern implementation:
intentEndpointHandler(Request) :-
    % Read and handle JSON input
    http_read_json(Request, Dict, [json_object(dict)]),
    atom_string(Intent, Dict.get('intent')),
    Entities = Dict.get('entities'),
    % Remove old intent
    retractall(intent(_)),
    retractall(entity(answer, _, _)),
    retractall(entity(calendarUpdate, _, _)),
    % Setup new intent and persist entities in kb
    New =..[intent, Intent],
    assertz(New),
    persistEntities(Entities),
    reply_json(json([message = 'Received intent'])), nl,
    intentReceived().

% Perform states requiring intent as input.
intentReceived() :- intent(X), not(validIntent(X)), log('Invalid intent'), switchState(idle).
intentReceived() :- performState(idle), !.
intentReceived() :- performState(waitForAnswer), !.
intentReceived() :- performState(waitForUpdate), !.
% If no state can be performed, switch to idle.
intentReceived() :- switchState(idle), !.
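The dispatch logic of intentReceived can be summarised as: reject invalid intents, otherwise run the handler for the current state, and fall back to idle if no handler applies. The Python sketch below illustrates that pattern only. The class, the handler methods, and the concrete transitions are hypothetical; the intent and state names are borrowed from the Prolog snippets in this document.

```python
VALID_INTENTS = {"queryWeather", "calendarAdd", "calendarAsk", "answer", "calendarUpdate"}


class DialogueFSM:
    """Tiny state machine where a handler runs only if it matches the current state."""

    def __init__(self):
        self.state = "idle"
        self.log = []
        # One handler per state that accepts an intent as input.
        self.handlers = {
            "idle": self._on_idle,
            "waitForAnswer": self._on_wait_for_answer,
        }

    def switch_state(self, new_state):
        # Log the transition, then move to the new state.
        self.log.append(f"{self.state} -> {new_state}")
        self.state = new_state

    def intent_received(self, intent):
        if intent not in VALID_INTENTS:
            self.log.append("Invalid intent")
            self.switch_state("idle")
            return
        handler = self.handlers.get(self.state)
        if handler is None or not handler(intent):
            # No state could be performed: fall back to idle.
            self.switch_state("idle")

    def _on_idle(self, intent):
        # Assumed transition: a weather query starts the weather API call.
        if intent == "queryWeather":
            self.switch_state("weatherApiCall")
            return True
        return False

    def _on_wait_for_answer(self, intent):
        # Assumed transition: an answer lets the machine respond.
        if intent == "answer":
            self.switch_state("respond")
            return True
        return False
```

The Prolog version gets the same effect from clause ordering and cuts: each performState clause checks currentState first, so only the clause matching the current state can succeed.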
For example, the actions for a weather call state would look like this:
performState(weatherApiCall) :-
    % Assert the current FSM state
    currentState(weatherApiCall),
    % Check the existence of entities generated using the knowledge base
    finalEntity(queryWeather, loc, LatLon, Location),
    finalEntity(queryWeather, timp, Time, T),
    % Call the weather Prolog module and log messages
    log('Weather api call'),
    getWeatherCall(LatLon, Time, Desc),
    format(string(Message), 'Vremea ~s ~s. ~s.', [T, Location, Desc]),
    assertz(message(Message)),
    % Switch to the next state
    switchState(respond).
We also rely heavily on helper predicates. The following one checks whether the start and end times in the knowledge base are in the correct order:
startEndTimeForAddAreCorrect() :-
    % Get start and end time
    finalEntity(calendarAdd, ora_inceput, StartL, _),
    finalEntity(calendarAdd, ora_final, EndL, _),
    % Get the actual values from lists
    nth0(0, StartL, Start),
    nth0(0, EndL, End),
    % Set message if start and end are not in order
    not(time_compare(Start, <, End)),
    Message =..[message, 'Orele de \u00eenceput \u0219i de final nu sunt date corect. Care dore\u0219ti s\u0103 fie ora de \u00eenceput a evenimentului?'],
    assertz(Message),
    % Set start/end time entities as missing
    M1 =..[missingEntity, calendarAdd, ora_inceput], assertz(M1),
    M2 =..[missingEntity, calendarAdd, ora_final], assertz(M2),
    Modified =..[toBeModified, calendarAdd, ora_inceput],
    assertz(Modified),
    % Remove start/end hour so that the state will be reached once again
    retractall(finalEntity(calendarAdd, ora_inceput, _, _)),
    retractall(finalEntity(calendarAdd, ora_final, _, _)).
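The same validation idea can be shown in Python as a minimal sketch. It assumes the resolved entities live in a dictionary of lists and the missing ones in a set, which is not how the Prolog knowledge base stores them. The function name and the English message are placeholders.

```python
from datetime import time


def check_start_end(final, missing):
    """Return an error message and reset both entities when start is not before end."""
    # The values are stored as lists, so take the first element of each.
    start = final["ora_inceput"][0]
    end = final["ora_final"][0]
    if start < end:
        return None
    # Drop the stored values and mark them missing so the user is asked again.
    del final["ora_inceput"], final["ora_final"]
    missing.update({"ora_inceput", "ora_final"})
    return "Start and end times are not in order."
```

Removing the resolved values, in addition to flagging them as missing, corresponds to the final retractall calls in the Prolog helper: the state that asks for these entities can then be reached again.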
Other helper predicates: persistEntities/1, hasMissingEntity/0.
The Python module runs as a server built with the Flask framework. It handles incoming requests from the Prolog modules and issues calls to the Google API (calendar add/update requests) and the OpenWeatherMap API (weather querying requests). The OAuth standard is used to access the Google API.
In the case of weather API calls, the Python module employs a locally stored cache file that is accessed whenever weather querying requests are received from the Prolog server. Actual calls to the API are made to update the cache data only if it is older than 60 minutes, thus providing quicker responses in the event of consecutive weather querying requests from the user.
def get_weather(lat, lon):
    # ...
    if key in cache:
        print('Found cached data')
        keys = list(cache[key])
        # ...
        if minutes > 60:
            print('Cached data too old')
        else:
            # Return cached data
    # Issue an API request...
    # url = ...
    response = req.get(url)
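The caching rule (reuse data that is at most 60 minutes old, otherwise refetch) can be isolated in a self-contained sketch. It uses an in-memory dictionary rather than the cache file described above, and the function name, the `fetch` callback, and the entry layout are assumptions made for illustration.

```python
import time

CACHE_TTL_SECONDS = 60 * 60  # the 60-minute limit described above


def get_weather_cached(cache, key, fetch, now=None):
    """Return cached data for key, calling fetch() only if the entry is missing or stale."""
    now = time.time() if now is None else now
    entry = cache.get(key)
    if entry is not None and now - entry["fetched_at"] <= CACHE_TTL_SECONDS:
        # Fresh enough: answer from the cache without touching the API.
        return entry["data"]
    # Missing or too old: fetch new data and record when it was fetched.
    data = fetch()
    cache[key] = {"fetched_at": now, "data": data}
    return data
```

Accepting `now` as an optional argument makes the expiry behaviour testable without waiting an hour.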
The Python module sends HTTP responses back to the Prolog server in the form of JSON data.
@app.route('/api/calendar/update/', methods=['POST'])
def calendar_api_update():
    if request.method == 'POST':
        # ...
        return json.dumps(event)
    # ...
# ...
# ...
@app.route('/api/weather', methods=['GET'])
def weather_api():
    if request.method == 'GET':
        # ...
        if time[0] == 'daily':
            # ...
        else:
            # ...
        response['temp'] = data['temp']
        response['feels_like'] = data['feels_like']
        response['humidity'] = data['humidity']
        response['description'] = data['weather'][0]['description']
        return json.dumps(response, indent=2, ensure_ascii=False)
Other modules include the speech-to-text module, which is developed using the Rasa NLU framework. Its training datasets are generated using Chatito. Example content of one of the .chatito files:
%[answer]('training': '90', 'testing': '30')
    @[positive]
    @[negative]

@[positive]
    Da
    Da, este ok
    Sigur
    Desigur
    Așa facem
    Bine
    Continuă tot așa
    Continuă
    Okay
    Ok
    Îmi place

@[negative]
    Nu
    Nu așa
    Nu este ok
    Deloc
    Nu sunt de acord