Overview
The process of creating a character is iterative. The following steps create an initial character, which will probably need to be refined by repeating the steps until it shows the desired behavior.
...
- content: this directory contains the files that define the user and system utterances and their identifying strings.
- dm: this directory contains the dialogue policy.
- nlu: this directory contains the natural language understanding model that the NLU module learns from the data found in the content directory.
Step 1: Authoring the content
User utterances:
Given an utterance, the NLU module returns the most probable identifying strings. It is based on a maximum entropy multiclass classifier, so the user utterance file should list utterances at their natural frequency. The best way to obtain these utterances is to run Wizard of Oz experiments or role plays, and then annotate the data by assigning an identifying string to each utterance said by a user during these experiments. These identifying strings are sometimes called speech acts, or dialogue acts when more domain-specific semantics are attached to the basic speech act. For example, the dialogue act question.age marks all utterances in which the user asks about the age of the addressee.
When we design a dialogue policy for the character using this content, whenever we want to wait for the user to say a certain utterance, we use the string identifier (speech act) associated with that utterance.
System utterances:
These are the utterances the system can say. As with the user utterances, each system utterance has a specific string identifier. When designing the dialogue policy, if we want the system to say a certain utterance, we use the corresponding identifier (here too, we call the identifier a speech act).
File format:
The user and system utterances files use the same Excel spreadsheet format. These files have a number of columns (these are the initial 2 rows of the system utterances file for the character used in the example below):
...
The user utterance file needs to be called user-utterances.xlsx and the system utterance file must be called system-utterances.xlsx (these names can be configured, but the default configuration looks for those names in each character available).
Step 2: Authoring the dialogue policy
Overview
The FLoReS (Forward Looking, Reward Seeking) dialogue manager is an information-state, event-driven dialogue manager. That is, it does nothing unless an event is received. When an event is received, it searches for the best action (i.e. sub-dialogue) that can be executed in the current information state and that achieves the highest expected reward. Once the best action is found, it starts executing it, unless:
...
As mentioned earlier the dialogue manager searches for the best available action every time an event comes in.
Operators (sub-dialogues or actions):
Actions are also called sub-dialogues and define dialogue trees. For example this is one sub-dialogue found in the CakeVendor example below:
...
At any time there is at most 1 active sub-dialogue in the system: the current action. As said above, in some cases there may be no active action. Actions are normally inactive. The exception is an action that was active and was substituted (swapped out) by another action before its natural termination: at some point the system found a better action, changed the state of the current action to dormant, and made the newly found best action active. Not every active action that is swapped out for a new best action can become dormant; some go directly back to the inactive state. To be allowed to become dormant, an action must have special entry paths that allow it to be awoken back to the active state if it becomes the best action again.
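The three action states described above can be pictured as a small state machine. The following Python sketch is only an illustration: the names ActionState, swap_out and wake_up are hypothetical and do not come from the FLoReS code base.

```python
from enum import Enum


class ActionState(Enum):
    INACTIVE = "inactive"
    ACTIVE = "active"
    DORMANT = "dormant"


def swap_out(state: ActionState, has_reentry_path: bool) -> ActionState:
    """State of the current action after a better action replaces it."""
    if state is not ActionState.ACTIVE:
        raise ValueError("only the active action can be swapped out")
    # Only actions with special (re-)entry paths are allowed to become dormant.
    return ActionState.DORMANT if has_reentry_path else ActionState.INACTIVE


def wake_up(state: ActionState) -> ActionState:
    """A dormant action becomes active again when it is selected as best."""
    if state is not ActionState.DORMANT:
        raise ValueError("only a dormant action can be awoken")
    return ActionState.ACTIVE
```

An action that terminates naturally simply returns to the inactive state, so that transition is not modeled here.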
Entry paths:
A sub-dialogue has multiple entry paths. The entry paths have a specific order (decided by the author), and each entry path has a start state and conditions that regulate when it can be taken. When the system considers a certain sub-dialogue during the search for the best available action, it considers all the possible entry paths in the order specified. The first entry path whose conditions are satisfied is taken, and the action starts executing at that path's start state in the sub-dialogue tree.
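The "first satisfied entry path wins" rule can be sketched in a few lines of Python. This is an illustration under stated assumptions: EntryPath and select_entry_path are hypothetical names, and the information state is modeled as a plain dictionary.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

State = Dict[str, Any]


@dataclass
class EntryPath:
    condition: Callable[[State], bool]  # precondition over the information state
    start_state: str                    # node where execution begins if taken


def select_entry_path(paths: List[EntryPath], info: State) -> Optional[EntryPath]:
    """Return the first entry path, in author order, whose condition holds."""
    for path in paths:
        if path.condition(info):
            return path
    return None  # the sub-dialogue cannot be entered in this state
```

Because the list is scanned in order, placing the more specific entry paths first keeps a catch-all path from shadowing them.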
...
We refer to the entry paths with their conditions also as preconditions as that is the name traditionally used by the planning community.
Nodes and edges:
The edges of a sub-dialogue tree are of three types:
- user edges: these edges tell the system to wait for a certain event before traversing them. If a state has one outgoing user edge, then all outgoing edges of that state will be user edges. This property makes the state a user waiting state, which blocks the execution of the action until the user produces one of the events on the outgoing user edges.
- system edges: when traversed, these edges make the system say a particular utterance. System edges take time to traverse: the time needed to play the associated system utterance (this wait can be disabled by configuration, but the default is to wait until the animation associated with the system edge finishes playing).
- condition edges: these edges connect states when we don't want to wait for an event and we don't want the virtual human to say anything.
Effects:
...
In the example sub-dialogue given here, the red nodes are states with effects. These states can be inspected to display the particular effects associated with them. This graphical representation of a sub-dialogue is generated for debugging purposes: it is not used to edit the sub-dialogue, just to check that the intended form is correctly generated from the provided information.
End node:
Each sub-dialogue is terminated when the execution path reaches a node that has no more outgoing edges.
Final sub-dialogue:
Each sub-dialogue can be marked final. That means that when the end node of a final sub-dialogue is reached, the conversation ends. When the conversation ends the DM will ignore all events and the user will not be able to interact with the virtual character anymore.
Execution
Execution of a sub-dialogue consists of taking a certain entry path (the one that leads to the maximum expected reward) and then, at every node, taking the first outgoing edge that can be taken (that is, whose condition is satisfied) until we reach a waiting point: a user state (i.e. a state with outgoing user edges). The order of the edges is from left to right and is specified by the author. At the waiting point the dialogue manager stops the execution and waits for the next event. If the incoming event is one of the expected events (i.e. the events specified on the user edges), the execution continues along the first satisfied user edge. If the final node is reached, the sub-dialogue terminates and becomes inactive, and the system searches for a new optimal action to start executing.
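The traversal described above can be approximated with the following Python sketch. It is not the FLoReS implementation: Edge, step and the "blocked" status are hypothetical, node effects are left out, and the treatment of a conditioned system edge follows the "say this if this is true" reading given later in this document.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple

State = Dict[str, Any]


def always(_: State) -> bool:
    return True


@dataclass
class Edge:
    kind: str                                # "user", "system" or "condition"
    label: str                               # event or utterance identifier
    target: str                              # node reached by traversing the edge
    condition: Callable[[State], bool] = always


def step(edges: Dict[str, List[Edge]], node: str, info: State,
         event: Optional[str] = None) -> Tuple[str, str, List[str]]:
    """Advance from node; return (node, status, utterances said)."""
    said: List[str] = []
    while True:
        outgoing = edges.get(node, [])
        if not outgoing:
            return node, "ended", said           # end node: no outgoing edges
        if outgoing[0].kind == "user":           # user waiting state
            taken = next((e for e in outgoing
                          if e.label == event and e.condition(info)), None)
            if taken is None:
                return node, "waiting", said     # block until an expected event
            event = None                         # the event has been consumed
        else:
            # First edge, left to right, that can be taken. A system edge is
            # always traversed; its condition only decides if the line is said.
            taken = next((e for e in outgoing
                          if e.kind == "system" or e.condition(info)), None)
            if taken is None:
                return node, "blocked", said     # assumption: no edge applies
            if taken.kind == "system" and taken.condition(info):
                said.append(taken.label)
        node = taken.target
```

Calling step once without an event runs the system edges up to the first user waiting state; calling it again with the received event resumes from there.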
The information state is formed by variables and stores the current state of the conversation. Four things can update the information state:
- the dialogue manager (DM): it takes care of updating a set of special variables (e.g. the time since the last user action). These special variables can be found in a file called specialVariables.xml in the dm sub-directory. The file is automatically generated every time the DM starts.
- event listeners: one can associate with certain events automatic updates that are executed every time that event is received. These updates are also called stateless updates because they happen regardless of the current action or the best selected action.
- effects: as described in the Effects section above, a sub-dialogue node can have a specific effect that updates the value of a certain variable.
- forward inference rules: one can specify an ordered list of implications. They are evaluated every time a change is made to the information state. When a rule is found whose antecedent is true, its consequent is executed. For example, given the rule "if A then B else C", if A is true then B is executed, otherwise C is executed. The else part is optional. A is a Boolean expression; B and C are assignments.
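The forward inference rules in the last item can be sketched as follows. This is an illustration only: Rule and update are hypothetical names, the information state is a plain dictionary, and the sketch assumes that every rule is checked once per change and that changes made by the rules themselves do not trigger a further pass.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

State = Dict[str, Any]


@dataclass
class Rule:
    antecedent: Callable[[State], bool]                  # A: a Boolean expression
    consequent: Callable[[State], None]                  # B: an assignment
    otherwise: Optional[Callable[[State], None]] = None  # C: optional else part


def update(state: State, name: str, value: Any, rules: List[Rule]) -> None:
    """Change one variable, then check every rule once, in the listed order."""
    state[name] = value
    for rule in rules:
        if rule.antecedent(state):
            rule.consequent(state)
        elif rule.otherwise is not None:
            rule.otherwise(state)
```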
Special variables:
The dialogue manager has a predefined set of special variables it updates automatically and that can be used in a dialogue policy, if needed.
...
Each line in that file describes a special variable. id is the name to use when referring to the variable in your dialogue policies. value is the initial value given to the variable at start-up. type specifies the type of the variable. The type is only indicative: variables are untyped, so you can change what the variable stores, but whenever the dialogue manager automatically updates the variable it will again write a value of the predefined type given in this description. desc contains a textual description of what the variable contains.
Dialogue policy execution
When an event is received, the dialogue manager (DM) checks whether it is expected by the current action (i.e. the current action is at a user node and one of the outgoing user edges is waiting for the received event). If so, the DM continues the execution of the current action. Otherwise, it executes a forward search to find the best action to execute. The forward search simulates possible future conversations. It is a breadth-first search limited by time and depth, so it always returns quickly even if the search space is huge. Currently the limits are 250ms or 10 levels: the dialogue manager terminates the search for the optimal action after 250ms, or earlier if the search graph that represents the possible future conversations reaches a depth of 10 sub-dialogues (that is, the search had enough time within the 250ms timeout to explore all possible conversations made up of a sequence of 10 sub-dialogues).
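A breadth-first search bounded by both depth and time can be sketched as below. The function name bounded_bfs and the successors callback are assumptions made for illustration; the real search also scores each simulated conversation by its expected reward, which is left out here.

```python
import time
from collections import deque
from typing import Callable, Iterable, List


def bounded_bfs(root: str,
                successors: Callable[[str], Iterable[str]],
                max_depth: int = 10,
                time_limit_s: float = 0.25) -> List[str]:
    """Visit nodes breadth first; stop at max_depth levels or when time runs out."""
    deadline = time.monotonic() + time_limit_s
    visited: List[str] = []
    queue = deque([(root, 0)])
    while queue and time.monotonic() < deadline:
        node, depth = queue.popleft()
        visited.append(node)
        if depth < max_depth:                 # do not expand below the depth limit
            for child in successors(node):
                queue.append((child, depth + 1))
    return visited
```

Because shallower nodes are always visited first, a timeout still leaves the search with a complete picture of the nearest future conversations.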
...
The preconditions are used to limit which sub-dialogues can be executed in a given state. Rewards are used to differentiate among a set of executable sub-dialogues.
The policy format
The dialogue policy is composed of several files. The main file that defines it is called policy.xml (this name can also be configured, but this is the default).
...
Line 4 specifies the discount factor alpha mentioned in the expected reward formula. The line given above defines a discount factor of 0.9.
...
To include multiple files just duplicate line 5 for each different text format file that needs to be included.
The information state initialization file:
The following example shows the format of the information state initialization file:
...
For example, the line <initialize expr="assign(lastNonNullSubdialog,null)"/> defines the variable lastNonNullSubdialog and assigns to it the value null. The syntax of the information state update is described in this section.
Normally all initialization entries are assignments, which give an initial value to a particular variable. However, one can also use another construct called an implication. Instead of defining and initializing a variable, an implication stores a forward inference rule in the knowledge base (as mentioned in the information state section). For example, the implication imply(AND(>(delta-symptom_worried,0), deployed), ++(ptsd_counter, delta-symptom_worried)) creates a forward rule that evaluates the increment ++(ptsd_counter,delta-symptom_worried) every time there is an update to the information state and the condition AND(>(delta-symptom_worried,0), deployed) is satisfied.
A note about values:
Variables in the information state can have various types of values:
- A number, as in assign(timeSinceLastAction,0.3)
- A string, as in assign(name,'John')
- A Boolean, as in assign(notTrue,false)
- No value, as in assign(unknown,null)
- A Java object, which can be assigned only programmatically
Here we describe the syntax used to define conditions and effects. We typically use a prefix syntax, in which the operator or function is specified first, followed by its arguments. For some operators an infix version is also available. When unsure, use the prefix version. Formulas and variable names are case insensitive: internally all names are lowercase. You can still write names like thisIsALongVarName to make them easier to read, but internally it makes no difference, and a variant like thisisalongvarName works just as well.
Conditions
We call a condition any expression whose result is a Boolean value. Conditions are used in the definition of entry paths or as the first argument of an implication. They are also used as the second argument of assignments (to assign a Boolean value to a variable).
...
For example, AND(>(delta-symptom_worried,0), deployed) defines a conjunction of 2 formulas: >(delta-symptom_worried,0), satisfied when the variable delta-symptom_worried is greater than 0, and deployed, satisfied when the variable deployed contains the value true.
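To make the prefix syntax concrete, here is a small Python sketch that parses and evaluates a condition such as the one above. It is not the FLoReS parser: parse and evaluate are hypothetical names, only well-formed prefix input is handled, and only AND, NOT, >, >= and == are covered (no infix forms and no quoted strings).

```python
import operator
import re

COMPARISONS = {">": operator.gt, ">=": operator.ge, "==": operator.eq}
NUMBER = re.compile(r"-?\d+(\.\d+)?")


def parse(text):
    """Parse prefix syntax into atoms and (operator, [arguments]) pairs."""
    pos = 0

    def term():
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        head = text[start:pos].strip().lower()    # names are case insensitive
        if pos < len(text) and text[pos] == "(":
            pos += 1                               # consume "("
            args = [term()]
            while text[pos] == ",":
                pos += 1                           # consume ","
                args.append(term())
            pos += 1                               # consume ")"
            return (head, args)
        return head

    return term()


def evaluate(expr, state):
    """Evaluate a parsed expression against the information state (a dict)."""
    if isinstance(expr, str):
        if expr in ("true", "false"):
            return expr == "true"
        if NUMBER.fullmatch(expr):
            return float(expr)
        return state.get(expr)                     # variable lookup, None if unset
    head, args = expr
    values = [evaluate(arg, state) for arg in args]
    if head == "and":
        return all(values)
    if head == "not":
        return not values[0]
    return COMPARISONS[head](*values)
```

The sketch stores variable names in lowercase, mirroring the case-insensitivity rule stated above, so the keys of the state dictionary must be lowercase as well.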
Effects
By effects here we mean just the information state updates. Effects in sub-dialogues can also be rewards, but here we describe only the syntax of the effects that update the information state. They can be of two types: assignments and implications. Implications are conditional assignments: they execute an assignment only if a particular condition is true (and they can also have an else portion).
Assignments
Assignments are used to update the value of a specified variable. They take 2 arguments: the variable being assigned and a formula returning the value to be assigned to it. For example, assign(lastNonNullSubdialog,null) assigns the value null to the variable lastNonNullSubdialog. The special assign operator is also available in infix form as =. The formula assign(var1,2) is equivalent to var1=2.
...
- Increments: assignments that increment the value of a number variable. These assignments use the ++ operator, which takes 1 or 2 arguments. ++(var1) increments the variable var1 by 1. ++(var1,2) increments var1 by 2. The second argument, if present, can be a variable or a complex expression that is evaluated to provide the increment value. The syntax ++(var1,var2) is equivalent to assign(var1,+(var1,var2)).
- Assertions: these are assignments of a Boolean value to a variable. For example, to make a certain variable true, instead of executing assign(var1,true) we can execute assert(var1). To make a variable false, we execute assert(!var1) or assert(NOT(var1)).
Implications
Implications are used to define conditional assignments. An implication takes 2 or 3 arguments: a condition and 1 or 2 assignments. For example, imply(==(var1,2),assign(var1,3),assign(var2,4)) executes assign(var1,3) if ==(var1,2) is true; otherwise it executes assign(var2,4). The third argument (the else part) is optional and can be omitted.
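The semantics of assignments, increments, assertions and implications can be mimicked over a plain dictionary. The Python sketch below is only an analogy: the function names are hypothetical stand-ins for assign, ++, assert and imply, and conditions and assignments are passed as callables rather than parsed from text.

```python
from typing import Any, Callable, Dict, Optional

State = Dict[str, Any]


def assign(state: State, name: str, value: Any) -> None:
    state[name.lower()] = value                   # names are case insensitive


def increment(state: State, name: str, by: Any = 1) -> None:
    # ++(var1,var2) is equivalent to assign(var1,+(var1,var2))
    assign(state, name, state[name.lower()] + by)


def assert_(state: State, name: str, value: bool = True) -> None:
    # assert(var1) sets var1 to true; assert(!var1) corresponds to value=False
    assign(state, name, value)


def imply(state: State,
          condition: Callable[[State], bool],
          then: Callable[[State], None],
          otherwise: Optional[Callable[[State], None]] = None) -> None:
    if condition(state):
        then(state)
    elif otherwise is not None:                   # the else part is optional
        otherwise(state)
```

For instance, imply(==(var1,2),assign(var1,3),assign(var2,4)) corresponds to calling imply with a condition that tests state["var1"] == 2 and two callables that perform the assignments.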
Special functions (aka Custom functions)
Special functions can be added by implementing the interface edu.usc.ict.nl.kb.cf.CustomFunctionInterface. Special functions are a way to define new functions by associating arbitrary Java code with a certain string. At the moment the following special functions are defined:
- isCurrentTopic(var): returns true if the provided string, or variable containing a string, matches one of the topics of the currently active sub-dialogue.
- known(expr): returns true if the provided expression evaluates to anything but the NULL value.
- isLastNonNullTopic(var): similar to isCurrentTopic, but executes the match on the last non-null topic. That is, if there are currently no active networks, it matches the value of var against the topic of the last active network.
- isQuestion(var): returns true if the provided var evaluates to a string that contains the string "question". This maps to the method edu.usc.ict.nl.io.NLU.isQuestion; to customize it, override that method in your own NLU class or write a new custom function.
- match(var1,var2): maps to the String.matches(regexp) Java method. var1 and var2 must each be a string or evaluate to one. The content of var2 must be a valid Java regular expression.
- random(var): generates a random number from 0 to the value in var minus 1. var doesn't have to be a variable; it can also be a numeric constant.
- follows(var1,var2): var1 is a string constant (or a variable with a string constant as value) and var2 is a Boolean (or a variable with a Boolean value). var2 is optional and defaults to false. The function returns true if the operator named by var1 has already been executed. If var2 is true, the function returns true only if the operator named by var1 has already been completed, that is, a final state of the operator has been executed (as opposed to the operator being swapped out before completion).
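The behavior of a few of these functions can be illustrated in Python. These are analogies rather than the actual Java implementations: the function names are hypothetical, Python regular expression syntax differs from Java's in some details, and random is assumed to return an integer.

```python
import random
import re
from typing import Any


def match(text: str, pattern: str) -> bool:
    """Whole-string match, like Java's String.matches (not a substring search)."""
    return re.fullmatch(pattern, text) is not None


def is_question(speech_act: str) -> bool:
    """True if the string contains the substring "question"."""
    return "question" in speech_act


def known(value: Any) -> bool:
    """True for anything but the null value."""
    return value is not None


def random_below(n: int) -> int:
    """Random integer from 0 to n - 1 (integer result is an assumption)."""
    return random.randrange(n)
```

Note the whole-string semantics of match: the pattern question\..* accepts question.age but rejects a.question.age, because the pattern must cover the entire string.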
Quotation
Delayed evaluation is available using the special operator quote. For example, if we execute the assignment assign(expr1,quote(+(var1,var2,3))), we save in the variable expr1 the expression that computes the sum of var1, var2 and the constant 3. Every time we use the variable expr1, it is as if we used the entire expression it contains. If we later write the condition >=(expr1,34), it is equivalent to the condition >=(+(var1,var2,3),34).
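Delayed evaluation is similar to storing a function instead of a value. The Python sketch below is an analogy only: value_of is a hypothetical helper, and a lambda stands in for the quoted expression.

```python
state = {"var1": 10, "var2": 20}

# Analogue of assign(expr1, quote(+(var1,var2,3))): store the expression itself.
state["expr1"] = lambda s: s["var1"] + s["var2"] + 3


def value_of(info, name):
    """Read a variable; a stored expression is evaluated at the time of use."""
    value = info[name]
    return value(info) if callable(value) else value
```

Because the expression is evaluated on every use, a condition such as value_of(state, "expr1") >= 34 follows later changes to var1 and var2.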
The reward definition file:
```xml
<goals>
  <goal id="simple" desc="the basic reward" value= "10"/>
  <goal id="quick" desc="reward for something more important" value= "30"/>
  ...
</goals>
```
...
For example, the line <goal id="simple" desc="the basic reward" value= "10"/> defines the reward named simple with description "the basic reward" and value 10. This line internally defines a variable named valueFor_simple with value 10. This variable name is used to change the global value associated with a specific reward at run time (i.e. as an effect of a certain action).
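The mapping from goal definitions to valueFor_ variables can be sketched with Python's standard XML parser. This is an illustration, not the FLoReS loader: reward_variables is a hypothetical helper, and the values are read as floating-point numbers by assumption.

```python
import xml.etree.ElementTree as ET

GOALS_XML = """
<goals>
  <goal id="simple" desc="the basic reward" value= "10"/>
  <goal id="quick" desc="reward for something more important" value= "30"/>
</goals>
"""


def reward_variables(xml_text):
    """Map each goal to the variable name valueFor_<id> and its numeric value."""
    root = ET.fromstring(xml_text)
    return {"valueFor_" + goal.get("id"): float(goal.get("value"))
            for goal in root.iter("goal")}
```

An effect that changes valueFor_simple at run time would then change the value the search attributes to every state that achieves the goal simple.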
The text format used by the files that define the sub-dialogues (aka operators or actions):
This section describes the text format. As mentioned before, one can include in the root policy file any number of files in text format containing the definition of sub-dialogues. This feature allows the author to organize the sub-dialogues in some meaningful way.
...
Notice the use of the standard xi:include to include content from external XML files. Use this technique to divide the content of the policy file as you desire.
Flow of execution:
As mentioned earlier a sub-dialogue defines a small conversation as a tree with nodes and edges. Nodes contain effects, while edges are either user, system or condition edges.
...
To this basic tree, the entry paths add ways to define when and where the execution can start. An entry path defines when it can be traversed and at which point of the sub-dialogue the execution starts if it is taken.
Topics:
First we list the topics associated with this sub-dialogue. This is done on line 2, which associates this network with the topic qa. We can associate a network with as many topics as we like by adding more lines like this one.
Topics use a hierarchical structure with "." as the separator. The argument of the topic tag, #topic:, should be a complete path (i.e. from root to leaf) in this hierarchy.
Entrance conditions:
Line 3 defines a user entry path satisfied when the input event is question.what.you-make.
...
An entrance condition should be placed where we want the execution to start if it is traversed. Entrance conditions don't have to be in the initial part of the sub-dialogue, even though that is the most common location. They can be anywhere we could start the execution of the sub-dialogue. In particular, re-entry paths should be located where execution can restart after an interruption. Interruptions can happen anywhere, but the most common places are while waiting for a user input (i.e. at the state before user edges).
System actions:
Line 5 defines a system action that makes the system say the line associated with the identifier answer.what.make.
...
User and system actions can be followed by an optional condition that specifies further restrictions on when that edge can be taken. For system actions the semantics of the optional condition is slightly different: the edge will be traversed no matter what, but the line associated to the edge will not be said if the condition is false. So the system line with condition should be read as "say this if this is true". The user line with condition should be read as "wait for this event and this should also be true".
Effects:
Line 6 attaches the reward effect to the state reached after the system edge is executed. This reward effect says that the reward associated with the defined goal named simple is achieved when the execution reaches that point of the sub-dialogue tree.
Information state updates are defined using a line like #action: state='exit', which sets the value of the variable state to the string 'exit'.
Example with conditional edges:
This sub-dialogue defines a confirmation dialogue tree that says different things depending on the value of the type and flavor information state variables.
...
The ELSE block is optional.
A more complex example with user actions, ORs, DO and SWAPOUT:
```
Network flavorCheese {
  #topic: set.flavor
  #entrance condition: system initiative
  #condition: and(state=='start',type=='cheese',known(sugar))
  #reentrance option: statement.back
  system: question.cake.flavor
  {
    {
      user: statement.flavor.chocolate
      #action: flavor='chocolate'
    }
    OR
    {
      user: statement.flavor.lemon
      #action: flavor='lemon'
    }
    OR
    {
      user: statement.flavor.amaretto
      system: apology.flavor
      #action: clarifyFlavors=true
      #action: swapout
    }
  }
  DO
  #action: state='done'
  #goal: simple
}
```
...
The rest are normal information state updates. The remaining new statement is #action: swapout. When executed, this effect forces the sub-dialogue to become paused. No further content can come after a swapout action, as it would not be executed; one would need to put a re-entry path there to be able to restart execution right after the swapout action. This action can be used to emulate calling another sub-dialogue. For example, the statement just before it, #action: clarifyFlavors=true, sets the variable clarifyFlavors to true, enabling a different network to be executed. Then we swap out the current network. Given how we set the rewards, the other network will be selected, and once it completes (again, if the rewards are set properly) execution will resume in the flavorCheese network.
Comments:
Comments can be inserted using the Java style. Single line comments are // and multi-line comments are /* */.
Debugging:
Syntax checks:
Syntax checks are executed at load time. If problems are encountered, the policy is discarded and a message is printed that says where the problem was encountered.
Other messages are printed that may require your attention. For example, if user or system actions use undefined string identifiers you may need to press ENTER to continue the execution.
Graph conversion:
Also, if one defines very complex sub-dialogues with many different variable updates, a warning may be presented saying that too many possible final conversation states are defined (a conversation state is the set of effects defined along one possible path of the sub-dialogue tree). This makes the search step impossibly slow and consequently the rewards useless. One way to avoid the problem is to add ignore statements, which tell the code that processes the sub-dialogue to find the possible final conversation states not to consider a specific variable. These ignore statements should be added at the beginning of the network. For example, the statement #ignore: var1 tells the code to ignore updates to the variable named var1.
...
Then after the policy is loaded, you'll find a .gdl file named policy_complete_path_to_policy_file.gdl. Open this using the aiSee software.
Dialogue manager logs:
The system generates two log files in the logs directory. For each conversation, it generates a separate XML file with the following name: chat-log-MACHINE-USER-[YEAR_MONTH_DAY]-[HOURS_MINUTES_SECONDS]-sid=999-pid=.xml. This file should be viewed in a browser that supports XSL style files (e.g. Firefox). It contains the record of the conversation with the NLU interpretation and the changes in the information state.
The second log file has the name system-logs-MACHINE-USER-[YEAR_MONTH_DAY]-[HOURS_MINUTES_SECONDS]-sid=999-pid=, with no extension. It contains the system messages generated; its content depends on the log level set for each component in the log4j configuration file, src/log4j.properties.
Event listeners:
As mentioned in the information state section, one can also define event listeners that execute predefined updates to the information state every time a particular event is received, irrespective of the currently active sub-dialogue.
...
The listing order is important as the listeners are evaluated in the order in which they were defined.
Step 3: Train the natural language understanding module
After defining the content and the dialogue policy we are ready for training the NLU. We need to start the FLoReS module. After the interface pops up:
...
Select the NLU menu and, under it, the training entry. The first time the interface is opened the training happens automatically, but if you update the content as described in step 1, you need to select this menu entry manually to update the NLU models.
An example
CakeVendor.zip contains everything required to define a CakeVendor character, which is an extension of the character created in this other tutorial for NPCEditor.
Running a character
The FLoReS module comes with a chat interface that makes it easy to test a dialogue policy without requiring any other module to run.
...