...

Disclaimer

Although this module is not part of the vhtoolkit distribution, it is compatible with it and can be used when more flexibility is desired for natural language understanding, generation, and dialog management. The module is available as open source from this GitHub repository.

Getting started

  1. Clone the jmnl GitHub repository.
  2. Install Eclipse.
  3. Install Java 8 or later.
  4. In Eclipse, import the existing project defined in the GitHub repository.
  5. Run the main method of edu.usc.ict.nl.ui.chat.ChatInterface with the arguments -s chatInterface.xml.

Overview

The process of creating a character is iterative in nature. The following steps create an initial character that will probably need to be refined by repeating the steps as required to obtain the desired behavior.

In general, three actions are required to create the content necessary to drive the natural language component:

...

  1. user edges: these edges tell the system to wait for a certain event before traversing them. If a state has one outgoing edge that is a user edge, then all outgoing edges of that state will be user edges. This property makes the state a user waiting state, which blocks execution until the user produces one of the events on its outgoing user edges.
  2. system edges: these edges, when traversed, make the system say a particular utterance. System edges take time to be traversed: the time taken to play the associated system utterance (this waiting can be disabled by configuration, but the default is to wait for the associated animation to finish playing). System edges can be of three types:
    1. a normal speech act given as a constant string. The DM simply sends a request to the NLG to create surface text for the given speech act.
    2. an interruptible speech act. This is a system line during which we are ready to receive an interruption. That is, if the user says something, we expect to prioritize what the user says and to interrupt the system line, provided an interruption policy is in place that generates an interrupt request.
    3. an evaluation system action: this action takes as its argument an expression to be evaluated; the result must be a string, which is then handled like the normal case above.
  3. condition edges: these edges are used to connect states when we don't want to wait for an event and we don't want the virtual human to say anything.
  4. wait edges: edges that do nothing but wait a specific amount of time.

Nodes contain effects. There are six types of effects:

  1. an information state update, that is, a change to the value of some variable in the information state when the node containing the effect is entered.
    1. for example, an assignment or an assertion.
  2. a reward. A reward can be a numeric constant or an expression returning a number. When the state containing a reward is reached, the system achieves the associated reward. A sub-dialogue can have multiple rewards associated with multiple states.
  3. a swap-out of the current sub-dialogue (forcing the sub-dialogue to go from the ACTIVE to the DORMANT state).
  4. a request to interrupt the current system action.
  5. sending an internal message.
  6. sending a VH protocol message.

In the example sub-dialogue given here, the red nodes are states with effects. These states can be inspected to display the particular effects associated with them. This graphical representation of a sub-dialogue is generated for debugging purposes; it is not used to edit the sub-dialogue, only to check that the intended form is correctly generated from the provided information.

End node:

Each sub-dialogue is terminated when the execution path reaches a node that has no more outgoing edges.

Final sub-dialogue:

Each sub-dialogue can be marked final. That means that when the end node of a final sub-dialogue is reached, the conversation ends. When the conversation ends the DM will ignore all events and the user will not be able to interact with the virtual character anymore.

...

Daemon sub-dialogue:

A sub-dialogue can also be marked as a daemon. Daemon sub-dialogues are extensions of the event listeners: they can have system initiative entrance conditions and can do more complex processing and information state updates.

Execution

Execution of a sub-dialogue consists of taking a certain entry path (the one that leads to the maximum expected reward) and then, at every node, taking the first outgoing edge that can be taken, that is, the first edge whose condition is satisfied (the order is from left to right and is specified by the author). This continues until a waiting point is reached: a user state (i.e. a state with outgoing user edges). At that point the dialogue manager suspends execution and waits for the next event. If the incoming event is one of the expected events (i.e. the events specified in the user edges), execution continues along the first satisfied user edge. If the final node is reached, the sub-dialogue is terminated and becomes inactive, and the system searches for a new optimal action to start executing.
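The traversal rule described above (take the first outgoing edge whose condition holds, stop at a user waiting state or at an end node) can be sketched as follows. This is only an illustrative model: the classes ExecutionSketch, Node and Edge are hypothetical and are not the module's own classes, and entry-path selection by expected reward is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical model of the traversal rule; not the module's real classes. */
final class ExecutionSketch {

    /** An outgoing edge: a guard, an optional system utterance, and a target node. */
    static final class Edge {
        final BooleanSupplier condition;
        final String utterance; // null for a condition edge that says nothing
        final Node target;

        Edge(BooleanSupplier condition, String utterance, Node target) {
            this.condition = condition;
            this.utterance = utterance;
            this.target = target;
        }
    }

    /** A state of the sub-dialogue tree; edges are kept in author order (left to right). */
    static final class Node {
        final boolean waitsForUser; // true when the outgoing edges are user edges
        final List<Edge> edges = new ArrayList<>();

        Node(boolean waitsForUser) {
            this.waitsForUser = waitsForUser;
        }
    }

    /** Follows the first satisfied edge at each node and returns what the system said. */
    static List<String> run(Node start) {
        List<String> said = new ArrayList<>();
        Node current = start;
        while (!current.waitsForUser) {
            Edge next = null;
            for (Edge e : current.edges) {
                if (e.condition.getAsBoolean()) {
                    next = e;
                    break; // later edges are ignored once one is satisfied
                }
            }
            if (next == null) {
                break; // end node, or no edge can currently be taken
            }
            if (next.utterance != null) {
                said.add(next.utterance);
            }
            current = next.target;
        }
        return Collections.unmodifiableList(said);
    }

    /** Builds a three-edge root whose second edge is the first satisfied one. */
    static List<String> demo() {
        Node end = new Node(false);
        Node userState = new Node(true);
        Node root = new Node(false);
        root.edges.add(new Edge(() -> false, "statement.never", end));
        root.edges.add(new Edge(() -> true, "question.cake.flavor", userState));
        root.edges.add(new Edge(() -> true, "statement.shadowed", end));
        return run(root);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [question.cake.flavor]
    }
}
```

In this sketch the third edge is never taken even though its condition holds, which mirrors the left-to-right priority that the author assigns to edges.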

...

The information state is formed by variables and stores the current state of the conversation. Four things can update the information state:

...

<sv id="timeSinceLastUserAction" value="0" type="NUMBER" desc="Time in seconds since the last thing said by the user."/>
<sv id="timeSinceLastSystemAction" value="0" type="NUMBER" desc="Time in seconds since the last thing said by the system."/>
<sv id="consecutiveUnhandledUserActions" value="0" type="NUMBER" desc="Number of consecutive user actions for which the system had no direct response (handler)."/>
<sv id="timeSinceLastAction" value="0" type="NUMBER" desc="Time in seconds since anyone said something (user or system)."/>
<sv id="timeSinceLastResource" value="0" type="NUMBER" desc="Time in seconds since the last resource link/video was given."/>
<sv id="event" value="null" type="TEXT" desc="Name of last speech act said by the user and processed by the system."/>
<sv id="lastNonNullSubdialog" value="null" type="TEXT" desc="Name of last sub-dialog executed by the system."/>
<sv id="systemEvent" value="null" type="TEXT" desc="Name of the speech act last said by the system."/>
<sv id="timerInterval" value="1" type="NUMBER" desc="Time in seconds between 2 consecutive timer events."/>
<sv id="preferForms" value="true" type="BOOLEAN" desc="If true and a form is available for the current system speech act, the form will be selected by the NLG."/>

...


<sv id="tmpEvent" value="null" type="TEXT" desc="Variable used to store the input event that generated one of the internal events (e.g. unhandled, ignore and loop)."/>

Each line in that file declares a special variable. id is the name used to refer to the variable in your dialogue policies. value is the initial value given to the variable at start-up. type specifies the type of the variable; it is only indicative, because variables are untyped and you can store a value of any type in them, but whenever the dialogue manager automatically updates the variable it writes a value of the predefined type given here. desc contains a textual description of what the variable contains.
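To make the role of the four attributes concrete, here is a minimal sketch that reads a few sv declarations with the standard Java DOM parser and builds the initial values. It is not the module's loader: the class name SpecialVariablesSketch, the wrapping root element vars, and the choice to map the literal value "null" to a Java null are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical reader for sv declarations; not the module's real loader. */
final class SpecialVariablesSketch {

    /** Returns id -> initial value, converting the value according to the type attribute. */
    static Map<String, Object> initialValues(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList svs = doc.getElementsByTagName("sv");
            Map<String, Object> values = new LinkedHashMap<>();
            for (int i = 0; i < svs.getLength(); i++) {
                Element sv = (Element) svs.item(i);
                String raw = sv.getAttribute("value");
                Object value;
                switch (sv.getAttribute("type")) {
                    case "NUMBER":
                        value = Double.valueOf(raw);
                        break;
                    case "BOOLEAN":
                        value = Boolean.valueOf(raw);
                        break;
                    default:
                        // Assumption: the text "null" stands for "no value".
                        value = "null".equals(raw) ? null : raw;
                }
                values.put(sv.getAttribute("id"), value);
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not read the variable declarations", e);
        }
    }

    public static void main(String[] args) {
        // The root element <vars> is a placeholder; the real file's root is not shown here.
        String xml = "<vars>"
                + "<sv id=\"timerInterval\" value=\"1\" type=\"NUMBER\" desc=\"Time in seconds between 2 consecutive timer events.\"/>"
                + "<sv id=\"event\" value=\"null\" type=\"TEXT\" desc=\"Name of last speech act said by the user.\"/>"
                + "<sv id=\"preferForms\" value=\"true\" type=\"BOOLEAN\" desc=\"Prefer forms when available.\"/>"
                + "</vars>";
        System.out.println(initialValues(xml));
    }
}
```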

...

policy.xml:
<policy xmlns:xi="http://www.w3.org/2001/XInclude">
 <xi:include href="macros.xml"/>
 <xi:include href="initKB.xml"/>
 <xi:include href="goals.xml"/>
 <stepDiscount value="0.9"/>
 <include href="textFormat/policy.txt"/>
</policy>

Line 2 specifies the file that defines the macros used in effects (events and information state updates).

Line 3 specifies the file that defines and initializes all the variables in the information state.

Line 4 specifies the file that defines the base values of the rewards available in this dialogue policy.

Line 5 specifies the discount factor alpha mentioned in the expected reward formula. The line given above defines a discount factor of 0.9.

Line 6 includes a file that specifies some operators (actions/sub-dialogues) in a particular text format. One could specify the sub-dialogue trees directly in an XML variant, but that is harder, so we prefer to document how to design operators using this special text format. Any number of text format files can be included. When designing complex characters, it is helpful to organize the sub-dialogues in multiple files.

...

  • A number like in assign(timeSinceLastAction,0.3)
  • A string like in assign(name,'John')
  • A Boolean like in assign(notTrue,false)
  • No value like in assign(unknown,null)
  • A hash table, like in assign(tmp,set(answer2questionmap,'answer.question.1','question.1')); this assignment is equivalent to the Java statement tmp=answer2questionmap.put('answer.question.1','question.1')
  • A list, like in assign(tmp,topic(?)); here tmp will contain all arguments x for which topic(x) is true in the information state.
  • A Java object, which can be assigned only programmatically

...

Implications are used to define conditional assignments. An implication takes 2 or 3 arguments: a condition and 1 or 2 assignments. For example, imply(==(var1,2),assign(var1,3),assign(var2,4)) executes assign(var1,3) if ==(var1,2) is true; otherwise it executes assign(var2,4). The third argument (the else part) is optional and can be omitted.
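The behavior of imply in the example above can be mirrored in plain Java as follows. This is only a sketch of the described semantics: the class ImplySketch and the use of a Map as the information state are assumptions, and how the real evaluator treats a condition that evaluates to null is not covered.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Hypothetical illustration of imply(condition, then, else) over a map-based information state. */
final class ImplySketch {

    /** Runs thenPart when the condition holds, otherwise elsePart if one was given. */
    static void imply(Map<String, Object> state,
                      Predicate<Map<String, Object>> condition,
                      Consumer<Map<String, Object>> thenPart,
                      Consumer<Map<String, Object>> elsePart) {
        if (condition.test(state)) {
            thenPart.accept(state);
        } else if (elsePart != null) {
            elsePart.accept(state); // the else part is optional
        }
    }

    /** Mirrors imply(==(var1,2),assign(var1,3),assign(var2,4)) for a given initial var1. */
    static Map<String, Object> demo(int initialVar1) {
        Map<String, Object> state = new HashMap<>();
        state.put("var1", initialVar1);
        imply(state,
              s -> Integer.valueOf(2).equals(s.get("var1")),
              s -> s.put("var1", 3),
              s -> s.put("var2", 4));
        return state;
    }

    public static void main(String[] args) {
        System.out.println(demo(2).get("var1")); // prints 3
        System.out.println(demo(5).get("var2")); // prints 4
    }
}
```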

Special functions (aka Custom functions)

Special functions can be added by implementing the interface edu.usc.ict.nl.kb.cf.CustomFunctionInterface. Special functions are a way to define new functions by associating arbitrary Java code to a certain string. At the moment the following special functions are defined:

  • Hash functions:

      ...

        • newMap(): this function creates a new hash table.
        • clear(var): empties the hash table stored in the variable var.
        • get(var1,var2): returns the value associated with the key var2 in the hash table var1.
        • set(var1,var2,var3): associates the value var3 with the key var2 in the hash table var1.
      • List functions:
        • get(var1,var2): returns the value at index var2 in the list var1 (the index can also be the string "random", in which case the function returns a random element of the list).
        • exists(var1,var2,var3): returns true iff there exists an element of var2 for which var3 is true when substituted for the variable named var1.
        • intersect(var1,var2): computes the intersection of the two given collections.
        • len(var1): returns the length of the given list.
        • removeIf(var1,var2,var3), removeIfNot(var1,var2,var3): returns the list formed by the elements of the list var2 for which the boolean expression var3 is false (true for removeIfNot). var1 is the loop variable.
        • set(var1,var2,var3): sets the value var3 at position var2 in the list var1.
        • subtract(var1,var2): removes all the elements of the list var2 from the list var1.
        • union(var1,var2): computes the union of the two lists.
      • String functions:
        • match(var1,var2): maps to the Java method String.matches(regexp). var1 and var2 must each be a string or evaluate to one. The content of var2 must be a valid Java regular expression.
        • concatenate(var1,...,varn): concatenates the provided strings.
      • Time functions:
        • currentTime(): returns the current time in milliseconds since 1/1/1970.
        • getLastTimeMark(var1): returns the last time (in milliseconds since 1/1/1970) the current operator was in state var1, where var1 can be either "DONE" or "ENTER".
        • getLastTimeMark(var1,var2): returns the last time the current operator said the speech act var2. var1 must be "SAY".
      • Ordering:
        • follows(var1,var2): var1 is a string constant (or a variable with a string constant as value) and var2 is a boolean (or a variable with a boolean value). var2 is optional and is false by default. The function returns true if the operator named by var1 has already been executed. If var2 is true, the function returns true only if the operator named by var1 has already been completed, that is, a final state of the operator has been executed (as opposed to the operator being swapped out before completion).
      • Topic:
        • isCurrentTopic(var): returns true if the provided string, or variable containing a string, matches one of the topics of the currently active sub-dialogue.
        • isLastNonNullTopic(var): similar to isCurrentTopic but executes the match on the last non-null topic. That is, if there are currently no active networks, this matches the value of var against the topic of the last active network.
      • Numbers:
        • min(var1,...,varn), max(var1,...,varn): returns the min/max of the given list of numbers.
        • random(var): generates a random number from 0 to the value of var minus 1. var doesn't have to be a variable; it can also be a numeric constant.
        • round(var): returns the output of java.lang.Math.round applied to the input argument converted to a float value.
      • Debug:
        • trace(var): prints a Java stack trace when var is evaluated.
        • print(var): prints the value of var when the expression is evaluated by the system.
      • Other:
        • if(var1,var2,var3): returns the evaluation of var2 if var1 evaluates to true and the evaluation of var3 if var1 evaluates to false. It returns null if var1 evaluates to null.
        • known(expr): returns true if the provided expression evaluates to anything but the NULL value.
        • numToString(var): returns the string representation of the given number. For example, it returns "twenty three" for 23.
        • hasBeenInterrupted(var): returns true if the current operator has been swapped out by an interruption.
        • isInterruptible(): returns true if the transition currently being executed is interruptible (by the user).
        • isQuestion(var): returns true if the provided var evaluates to a string that contains the string "question". This maps to the method edu.usc.ict.nl.io.NLU.isQuestion; override it in your own NLU class, or write a new custom function, if you want to customize this behavior.

      Quotation

      Delayed evaluation is available using the special operator quote. For example, if we execute the assignment assign(expr1,quote(+(var1,var2,3))), we save in the variable expr1 the expression that computes the sum of var1, var2 and the constant 3. Every time we use the variable expr1, it is as if we had used the entire expression it contains. If we later write the condition >=(expr1,34), it is equivalent to the condition >=(+(var1,var2,3),34).

      Macros

      Macros can be defined to name complex expressions used in conditions and effects. The system also supports templates. For example,

      <formulamacro left="isAvailable(topic)" right="exists(m3,question(topic,?),or(!known(answered('other',m3)),!known(answered('self',m3))))"/>

      defines a template macro isAvailable that accepts one argument. For example, if the argument topic is the variable t, the template generates the expression: exists(m3,question(t,?),or(!known(answered('other',m3)),!known(answered('self',m3))))

      The system also supports event macros, which provide a simple way to define random options for system actions:

      <eventmacro left="OKAY" right="or(AI_alrightA,backchannel.okay_confirm,AI_mhmC,AI_alrightE,AI_mhmE,AI_uhhuhE)"/>

      This macro defines a system speech act called "OKAY" that can be verbalized as any of the 6 speech acts listed.

      The reward definition file:

      <goals>
          <goal id="simple" desc="the basic reward" value= "10"/>
          <goal id="quick" desc="reward for something more important" value= "30"/>
      ...
      </goals>

      Each <goal> element defines a new reward (we also refer to them as goals to stick with the planning terminology, even though they are not really goals).

      For example, the line <goal id="simple" desc="the basic reward" value= "10"/> defines the reward named simple with description "the basic reward" and value 10. This line internally defines a variable named valueFor_simple with value 10. This variable name is used if one wants to change the global value associated with a specific reward at run time (i.e. as an effect of a certain action).

      The Text Format used by the files that define the sub-dialogues (aka operators or actions):

      This section describes the text format. As mentioned before, one can include in the root policy file any number of files in text format containing the definition of sub-dialogues. This feature allows the author to organize the sub-dialogues in some meaningful way.

      All files need to be in the format described here.

      Examples will be used to illustrate the format. This text defines a sub-dialogue named qamake that has a structure typical of question-answering sub-dialogues:

      Network qamake {
          #topic: qa
          #entrance condition: current NLU speech act = question.what.you-make
          
          system: answer.what.make
          #goal: simple
      }

      (Network is yet another name for a sub-dialogue.)

      Notice the use of the standard xi:include to include content from external XML files. Use this technique to divide the content of the policy file as you desire.

      Flow of execution:

      As mentioned earlier, a sub-dialogue defines a small conversation as a tree with nodes and edges. Nodes contain effects, while edges are either user, system or condition edges.

      Without entry paths, the remaining structure found in the sub-dialogue definition creates a tree with a single root. Unless otherwise specified, the flow created is a single linear path of edges. To create a node with more than one outgoing edge, one needs to use a special keyword (OR or IF).

      To this basic tree, entry paths add ways to define when and where execution can start. An entry path defines when it can be traversed and at which point of the sub-dialogue execution starts if it is taken.

      Topics:

      First we list the topics associated with this sub-dialogue. This is done on line 2, which associates this network with the topic qa. We can associate a network with as many topics as we like by adding more lines like this one.

      Topics use a hierarchical structure using the "." as a separator. The argument of the topic tag, #topic:, should be a complete path (i.e. from root to leaf) in this hierarchy.

      Entrance conditions:

      Line 3 defines a user entry path satisfied when the input event is question.what.you-make.

      A system initiative entry path would be defined with a line like: #entrance condition: system initiative

      A re-entry path is defined by a line like #reentrance option: statement.back where statement.back is the line that the system will say when it takes that re-entry path.

      All entry paths can be followed by an optional condition tag that specifies further restrictions, based on the information state, on when the entry path can be taken. For example, the following entrance condition:

      #entrance condition: system initiative
      #condition: and(state=='start',!known(type),known(sugar))

      specifies a system initiative entry path that can be taken when the variable state contains the string 'start', the variable type is null and the variable sugar is not null.

      An entrance condition should be placed where we want execution to start if it is traversed. Entrance conditions don't have to be in the initial part of the sub-dialogue, even though that is the most common location. They can be anywhere we could start the execution of the sub-dialogue. In particular, re-entry paths should be located where execution can restart after an interruption. Interruptions can happen anywhere, but the most common places are while waiting for a user input (i.e. at the state before user edges).

      System actions:

      Line 5 defines a system action that makes the system say the line associated with the identifier answer.what.make.

      That system action is translated into a system edge.

      User and system actions can be followed by an optional condition that specifies further restrictions on when that edge can be taken. For system actions the semantics of the optional condition is slightly different: the edge will be traversed no matter what, but the line associated with the edge will not be said if the condition is false. So a system line with a condition should be read as "say this if this is true". A user line with a condition should be read as "wait for this event and this should also be true".

      Effects:

      Line 6 attaches the reward effect to the state reached after the system edge is executed. This reward effect says that the reward associated with the defined goal named simple will be achieved when execution reaches that point of the sub-dialogue tree.

      Information state updates are defined using a line like #action: state='exit' in which the value of the variable state is set to be the string 'exit'.

      Example with conditional edges:

      This sub-dialogue defines a confirmation dialogue tree that says different things depending on the value of the type and flavor information state variables.

      Network qamake {
          #topic: done
          #entrance condition: system initiative
          #condition: and(state=='done')
          {
              if (and(type=='sponge',flavor=='chocolate'))
                  system: statement.cake.ready.chocolate.sponge
              if (and(type=='cheese',flavor=='chocolate'))
                  system: statement.cake.ready.chocolate.cheese
              if (flavor=='lemon')
                  system: statement.cake.ready.lemon
              if (flavor=='amaretto')
                  system: statement.cake.ready.amaretto
          }
          #action: state='exit'
          #goal: simple
      }

      The automatically generated dialogue tree is:

      [Image: automatically generated dialogue tree]

      In general the IF keyword has the following syntax:

      IF (condition)
      {
      	//system/user actions
      }
      ELSE
      {
      	//system/user actions
      }

      The ELSE block is optional.

      A more complex example with user actions, ORs, DO and SWAPOUT:
      Network flavorCheese {
          #topic: set.flavor
          #entrance condition: system initiative
          #condition: and(state=='start',type=='cheese',known(sugar))
          
          #reentrance option: statement.back
          
          system: question.cake.flavor
          {
              {
                  user: statement.flavor.chocolate
                  #action: flavor='chocolate'
              }
              OR
              {
                  user: statement.flavor.lemon
                  #action: flavor='lemon'
              }
              OR
              {
                  user: statement.flavor.amaretto
                  system: apology.flavor
                  #action: clarifyFlavors=true
                  #action: swapout
              }
          }
          DO
          #action: state='done'
          #goal: simple
      }

      Here we see a re-entry path with no optional condition. Remember that a re-entry path can be taken only if the sub-dialogue containing it is paused, that is, it was started earlier through a normal entry path and so has already passed one level of conditions. It is therefore typical for re-entry paths to have no conditions.

      Line 8 is a system edge that brings us to a node with 3 outgoing edges defined by the 2 ORs. The use of OR should be read as "follow this edge if you can, OR this, OR this...".

      If one needs to specify a complex sub-tree instead of just a simple edge with no effects, one should surround the block of text defining that sub-tree with curly brackets to define its scope unambiguously. You can see the use of curly brackets for scoping at line 9, and then for the three blocks of code that define the three arguments of the ORs.

      Line 26 closes the bracket opened at line 9. This generates empty edges that bring all execution paths open till then (3 in this case) to one single node.

      The DO keyword creates a new node to which we can attach effects if we don't need to use the standard way to create nodes using user or system actions. In this case 2 effects are attached to that node, a reward and an information state update.

      Line 11 shows a user action that defines an edge waiting for the event statement.flavor.chocolate.

      The rest are normal information state updates. The remaining new statement is on line 24: #action: swapout. When executed, this effect forces the sub-dialogue to become paused. No further content can come after a swapout action, as it would not be executed. One would need to put a re-entry path there to be able to restart execution right after the swapout action. This action can be used to emulate calling another sub-dialogue. For example, line 23 sets the variable clarifyFlavors to true, enabling a different network to be executed. Then we swap out the current network. Given how we set the rewards, the other network will be selected, and once it completes (again, if the rewards are set properly) execution of the flavorCheese network will resume.

      Comments:

      Comments can be inserted using the Java style. Single line comments are // and multi-line comments are /* */.

      Debugging:
      Syntax checks:

      Syntax checks are executed at load time. If problems are encountered, the policy is discarded and a message is printed that says where the problem was encountered.

      Other messages are printed that may require your attention. For example, if user or system actions use undefined string identifiers you may need to press ENTER to continue the execution.

      Graph conversion:

      Also, if one defines very complex sub-dialogues with many different variable updates, a warning may be presented saying that too many possible final conversation states are defined (a conversation state is the set of effects defined along one possible path of the sub-dialogue tree). This makes the search step impossibly slow and consequently the rewards useless. One way to avoid the problem is to add ignore statements, which tell the code that processes the sub-dialogue to skip a specific variable when computing the possible final conversation states. These ignore statements should be added at the beginning of the network. For example, the statement #ignore: var1 tells the code to ignore updates to the variable named var1.

      Another useful check is to generate the graphical representation of a policy and verify that it matches the desired design. To generate the graph representation, set the debug level associated with the policy parser to DEBUG by changing, in the file src/log4j.properties, the line:

      log4j.logger.edu.usc.ict.nl.dm.reward.model.RewardPolicy=warn

      to

      log4j.logger.edu.usc.ict.nl.dm.reward.model.RewardPolicy=debug

      Then after the policy is loaded, you'll find a .gdl file named policy_complete_path_to_policy_file.gdl. Open this using the aiSee software.

      Dialogue manager logs:

      The system generates two log files in the logs directory. for each conversation, it generates a separate xml file with the following name: chat-log-MACHINE-USER-[YEAR_MONTH_DAY]-[HOURS_MINUTES_SECONDS]-sid=999-pid=.xml. This file should be seen in a browser that supports xsl style files (e.g. Firefox). It contains the record of the conversation with the NLU interpretation and the changes in the information state.

      The second log file has the name: system-logs-MACHINE-USER-[YEAR_MONTH_DAY]-[HOURS_MINUTES_SECONDS]-sid=999-pid= with no extension. This contains the system messages generated and the content depends on the log level set for each component in the log4j configuration file: src/log4j.properties.

      Event listeners:

      As mentioned in the information state section one can also define event listeners that execute predefined updates to the information state every time a particular event is received irrespectively of the currently active sub-dialogue.

      To do so you can a list of listeners to the policy file using this syntax:

      Code Block
      themeEmacs
      firstline1
      titleListeners
      linenumberstrue
      <listeners>
      	<listen event="internal.timer" update="imply(questionnaire_flag==2, ++(break_timer,timerInterval))"/>
          <listen event="internal.timer" update="assign(smalltalk_pause_lock_auto,isQuestion(systemEvent))"/>
          <listen event="answer.observable.*" update="++(symptom_said)"/>
      </listeners>

      The above defines 3 listeners. The first two fire when the event internal.timer is received. The third fire when any event that is prefixed by "answer.observable." is received (the * has the semantics of .+ in traditional regular expressions).

      the update parameter defines what will be executed when the specified event is received. The imply update will increment the variable break_timer by the value of the variable timerInterval if the variable questionnaire_flag has value 2. Basically this allow to define an event listener that executes the update when a particular event is received and a particular information state condition is satisfied.

      the second update using the assign keyword is a simple assignment (the variable smalltalk_pause_lock_auto is assigned the value returned by the function isQuestion applied to the value of the variable systemEvent.

      The listing order is important as the listeners are evaluated in the order in which they were defined.

      Step 3: Train the natural language understanding module

      After defining the content and the dialogue policy we are ready for training the NLU. We need to start the FLoReS module. After the interface pops up:

      Image Removed

      select the NLU menu and under it the training voice. The first time the interface is opened the training happens automatically but if you update the content as described in step 1, you need to manually select this menu to update the NLU models.

      CakeVendor.zip contains all is required to define a CakeVendor character that is an extension of the character created in this other tutorial for NPCEditor.

      ...

      simple
      }

      (sorry, one more way to call a sub-dialogue: Network)

      Notice the use of the standard xi:include to include content from external xml files. Use this technique to divide the content of the policy file as you desire.

      Flow of execution:

      As mentioned earlier a sub-dialogue defines a small conversation as a tree with nodes and edges. Nodes contain effects, while edges are either user, system or condition edges.

      Without entry paths, the remaining structure found in the sub-dialogue definition creates a tree with a single root. Unless otherwise specified the flow created is a single linear path of edges. To create a node with more than one outgoing edge one needs to use a special keyword (OR or IF).

      To this basic tree, entry paths add ways to define when and where the execution can start. An entry path defines when it can be traversed and the point of the sub-dialogue at which execution starts if it is taken.

      Topics:

      First we list the topics associated with this sub-dialogue. This is done in line 2, which associates this network with the topic qa. We can associate a network with as many topics as we like by adding more lines like this one.

      Topics use a hierarchical structure using the "." as a separator. The argument of the topic tag, #topic:, should be a complete path (i.e. from root to leaf) in this hierarchy.

      Entrance conditions:

      Line 3 defines a user entry path satisfied when the input event is question.what.you-make.

      A system initiative entry path would be defined with a line like: #entrance condition: system initiative

      A re-entry path is defined by a line like #reentrance option: statement.back where statement.back is the line that the system will say when it takes that re-entry path.

      All entry paths can be followed by an optional condition tag that specifies further restrictions, based on the information state, on when the entry path can be taken. For example, the following entrance condition:

      Code block: System entry with condition
      #entrance condition: system initiative
      #condition: and(state=='start',!known(type),known(sugar))

      specifies a system initiative entry path that can be taken when the variable state contains the string 'start', the variable type is null and the variable sugar is not null.
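The condition above can be read as a predicate over the information state. The following is a minimal Python sketch of that reading, assuming the information state is a plain dictionary and that known(x) means "x is set to a non-null value". The function names are illustrative only and are not part of the FLoReS code base.

```python
def known(info_state, name):
    """Illustrative stand-in for known(name): true when the variable has a non-null value."""
    return info_state.get(name) is not None


def can_take_system_entry(info_state):
    """Python reading of: and(state=='start',!known(type),known(sugar))."""
    return (
        info_state.get("state") == "start"
        and not known(info_state, "type")
        and known(info_state, "sugar")
    )


# The entry path is available only while type is still unknown.
print(can_take_system_entry({"state": "start", "sugar": "low"}))
print(can_take_system_entry({"state": "start", "sugar": "low", "type": "cheese"}))
```

The first call prints True and the second prints False, because the second information state already has a value for type.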

      An entrance condition should be placed where we want execution to start if it is traversed. Entrance conditions don't have to be in the initial part of the sub-dialogue, even though that is the most common location. They can be anywhere we could start the execution of the sub-dialogue. In particular, re-entry paths should be located where execution can restart after an interruption. Interruptions can happen anywhere, but the most common places are while waiting for a user input (i.e. at the state before user edges).

      System actions:

      Line 5 defines a system action that makes the system say the line associated to the identifier answer.what.make.

      That system action is translated into a system edge.

      User and system actions can be followed by an optional condition that specifies further restrictions on when that edge can be taken. For system actions the semantics of the optional condition is slightly different: the edge will be traversed no matter what, but the line associated to the edge will not be said if the condition is false. So the system line with condition should be read as "say this if this is true". The user line with condition should be read as "wait for this event and this should also be true".
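The difference between the two readings can be sketched in a few lines of Python. This is only an illustration of the semantics described above, with hypothetical function names; it is not how the dialogue manager is implemented.

```python
def traverse_system_edge(line_id, condition_holds):
    """A system edge is always traversed; its line is said only if the condition holds.

    Returns (traversed, spoken_line), where spoken_line is None when nothing is said.
    """
    spoken_line = line_id if condition_holds else None
    return True, spoken_line


def traverse_user_edge(expected_event, received_event, condition_holds):
    """A user edge is traversed only if the awaited event arrives and the condition holds."""
    return received_event == expected_event and condition_holds


# "say this if this is true": the edge is crossed either way.
print(traverse_system_edge("answer.what.make", condition_holds=False))
# "wait for this event and this should also be true": the edge blocks.
print(traverse_user_edge("statement.flavor.lemon", "statement.flavor.lemon", condition_holds=False))
```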

      Effects:

      Line 6 attaches the reward effect to the state reached after the system edge is executed. This reward effect says that the reward associated with the defined goal named simple will be achieved when the execution reaches that point of the sub-dialogue tree.

      Information state updates are defined using a line like #action: state='exit' in which the value of the variable state is set to be the string 'exit'.

      Example with conditional edges:

      This sub-dialogue defines a confirmation dialogue tree that says different things depending on the value of the type and flavor information state variables.

      Code block (lines numbered from 1): Sub-dialogue with conditional edges
      Network done {
          #topic: done
          #entrance condition: system initiative
          #condition: and(state=='done')
          {
              if (and(type=='sponge',flavor=='chocolate'))
                  system: statement.cake.ready.chocolate.sponge
              if (and(type=='cheese',flavor=='chocolate'))
                  system: statement.cake.ready.chocolate.cheese
              if (flavor=='lemon')
                  system: statement.cake.ready.lemon
              if (flavor=='amaretto')
                  system: statement.cake.ready.amaretto
          }
          #action: state='exit'
          #goal: simple
      }

      The automatically generated dialogue tree is:

      [Image: automatically generated dialogue tree]

      In general the IF keyword has the following syntax:

      Code block: IF syntax
      IF (condition)
      {
      	//system/user actions
      }
      ELSE
      {
      	//system/user actions
      }

      The ELSE block is optional.

      A more complex example with user actions, ORs, DO and SWAPOUT:
      Code block (lines numbered from 1): User actions, ORs, DO, SWAPOUT
      Network flavorCheese {
          #topic: set.flavor
          #entrance condition: system initiative
          #condition: and(state=='start',type=='cheese',known(sugar))
          
          #reentrance option: statement.back
          
          system: question.cake.flavor
          {
              {
                  user: statement.flavor.chocolate
                  #action: flavor='chocolate'
              }
              OR
              {
                  user: statement.flavor.lemon
                  #action: flavor='lemon'
              }
              OR
              {
                  user: statement.flavor.amaretto
                  system: apology.flavor
                  #action: clarifyFlavors=true
                  #action: swapout
              }
          }
          DO
          #action: state='done'
          #goal: simple
      }

      Here we see a re-entry path with no optional condition. Remember that a re-entry path can be taken only if the sub-dialogue containing it is paused, that is, it was started earlier through a normal entry path and so has already passed one level of conditions. This is why re-entry paths typically have no conditions.

      Line 8 is a system edge that brings us to a node with 3 outgoing edges defined by the 2 ORs. The use of OR should be read as "follow this edge if you can, OR this, OR this...". For simple system actions that do not require different information state updates, an event macro would be easier to use.

      If one needs to specify a complex sub-tree instead of just a simple edge with no effects, one should surround the block of text defining that sub-tree with curly brackets to define its scope unambiguously. You can see the use of curly brackets for scoping at line 9, and then for the three blocks of code that define the three arguments of the ORs.

      Line 26 closes the bracket opened at line 9. This generates empty edges that bring all execution paths open till then (3 in this case) to one single node.

      The DO keyword creates a new node to which we can attach effects if we don't need to use the standard way to create nodes using user or system actions. In this case 2 effects are attached to that node, a reward and an information state update.

      Line 11 shows a user action that defines an edge waiting for the event statement.flavor.chocolate.

      The rest are normal information state updates. The remaining new statement is line 24: #action: swapout. When executed, this effect forces the sub-dialogue to become paused. Content placed after a swapout action is not executed, so one needs to put a re-entry path there to be able to restart execution right after the swapout action. This action can be used to emulate calling another sub-dialogue. For example, line 23 sets the variable clarifyFlavors to true, enabling a different network to be executed. Then we swap out the current network. Given how we set the rewards, the other network will be selected and, once it completes (again, if the rewards are set properly), execution will restart the flavorCheese network.
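To make the flow of the flavorCheese example concrete, the following Python sketch walks the network once the system entry path has been taken. It is a hand-written simulation under simplifying assumptions: the information state is a dictionary, the status strings and the goals_reached key are hypothetical, and an event that matches no user edge simply leaves the network waiting.

```python
def run_flavor_cheese(event, info_state):
    """Simulate one pass through the flavorCheese example.

    Returns (status, spoken_lines); status is 'waiting', 'paused' or 'completed'.
    """
    spoken = ["question.cake.flavor"]            # line 8: system edge
    # Lines 9-26: three user edges joined by OR.
    if event == "statement.flavor.chocolate":
        info_state["flavor"] = "chocolate"
    elif event == "statement.flavor.lemon":
        info_state["flavor"] = "lemon"
    elif event == "statement.flavor.amaretto":
        spoken.append("apology.flavor")
        info_state["clarifyFlavors"] = True      # line 23
        return "paused", spoken                  # line 24: swapout, nothing after it runs
    else:
        return "waiting", spoken                 # assumption: no user edge matches
    # Lines 27-29: the DO node, reached by every branch that was not swapped out.
    info_state["state"] = "done"
    info_state["goals_reached"] = ["simple"]     # hypothetical bookkeeping for #goal: simple
    return "completed", spoken


state = {}
print(run_flavor_cheese("statement.flavor.amaretto", state), state)
```

In the amaretto branch the variable state is never set to 'done', which is what leaves room for another network (enabled by clarifyFlavors) to be selected before flavorCheese is resumed.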

      Comments:

      Comments can be inserted using the Java style. Single line comments are // and multi-line comments are /* */.

      Debugging:
      Syntax checks:

      Syntax checks are executed at load time. If problems are encountered, the policy is discarded and a message is printed that says where the problem was encountered.

      Other messages are printed that may require your attention. For example, if user or system actions use undefined string identifiers you may need to press ENTER to continue the execution.

      Graph conversion:

      Also, if one defines very complex sub-dialogues with many different variable updates, a warning may be presented saying that too many possible final conversation states are defined (a conversation state is the set of effects defined along one possible path of the sub-dialogue tree). This makes the search step impossibly slow and consequently the rewards useless. One way to avoid the problem is to add ignore statements, which tell the code that processes the sub-dialogue to find the possible final conversation states not to consider a specific variable. These ignore statements should be added at the beginning of the network. For example, the statement #ignore: var1 tells the code to ignore updates to the variable named var1.
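The effect of an ignore statement on the number of final conversation states can be illustrated with a small Python sketch. It assumes that each path of the sub-dialogue tree is given as a list of (variable, value) effects; the data and the function name are made up for the illustration and do not reflect the actual implementation.

```python
def final_conversation_states(paths, ignored=()):
    """Collect the distinct sets of effects found along the paths of a sub-dialogue tree.

    Effects on variables listed in `ignored` are left out, as #ignore would do.
    """
    states = set()
    for path in paths:
        states.add(frozenset((var, value) for var, value in path if var not in ignored))
    return states


# Three paths that differ mostly in the value given to var1.
paths = [
    [("flavor", "chocolate"), ("var1", 1)],
    [("flavor", "chocolate"), ("var1", 2)],
    [("flavor", "lemon"), ("var1", 3)],
]
print(len(final_conversation_states(paths)))                    # 3
print(len(final_conversation_states(paths, ignored={"var1"})))  # 2
```

Ignoring var1 collapses the two chocolate paths into a single final state, which is the reduction that keeps the search step tractable.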

      Another check to perform on a policy is to generate its graphical representation and verify that it matches the desired design. To generate the graph representation, set the log level associated with the policy parser to DEBUG: in the file src/log4j.properties, change the line:

      log4j.logger.edu.usc.ict.nl.dm.reward.model.RewardPolicy=warn

      to

      log4j.logger.edu.usc.ict.nl.dm.reward.model.RewardPolicy=debug

      Then after the policy is loaded, you'll find a .gdl file named policy_complete_path_to_policy_file.gdl. Open this using the aiSee software.
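If the log level needs to be switched often, the edit to src/log4j.properties can be scripted. The following Python helper is a sketch of one way to do it on the text of the file; the helper itself is hypothetical and only the logger name comes from the lines above.

```python
LOGGER = "log4j.logger.edu.usc.ict.nl.dm.reward.model.RewardPolicy"


def set_logger_level(properties_text, logger, level):
    """Return the properties text with `logger` set to `level`.

    The logger line is replaced if present and appended otherwise.
    """
    lines = []
    found = False
    for line in properties_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key == logger:
            lines.append(f"{logger}={level}")
            found = True
        else:
            lines.append(line)
    if not found:
        lines.append(f"{logger}={level}")
    return "\n".join(lines) + "\n"


example = "log4j.rootLogger=info\n" + LOGGER + "=warn\n"
print(set_logger_level(example, LOGGER, "debug"))
```

To apply it to the real file, read src/log4j.properties, pass its content to set_logger_level and write the result back.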

      Dialogue manager logs:

      The system generates two log files in the logs directory. For each conversation, it generates a separate XML file with the following name: chat-log-MACHINE-USER-[YEAR_MONTH_DAY]-[HOURS_MINUTES_SECONDS]-sid=999-pid=.xml. This file should be viewed in a browser that supports XSL style files (e.g. Firefox). It contains the record of the conversation with the NLU interpretation and the changes in the information state.

      The second log file has the name: system-logs-MACHINE-USER-[YEAR_MONTH_DAY]-[HOURS_MINUTES_SECONDS]-sid=999-pid= with no extension. This contains the system messages generated and the content depends on the log level set for each component in the log4j configuration file: src/log4j.properties.
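When the logs directory fills up, the two kinds of files can be told apart by their name prefix. The Python sketch below relies only on the prefixes and the .xml extension given above; the function name and the sample file names are made up for the example.

```python
def classify_log_files(file_names):
    """Split file names into (chat_logs, system_logs) using the naming scheme described above."""
    chat_logs = [n for n in file_names if n.startswith("chat-log-") and n.endswith(".xml")]
    system_logs = [n for n in file_names if n.startswith("system-logs-")]
    return chat_logs, system_logs


# Hypothetical directory listing.
names = [
    "chat-log-HOST-alice-[2015_03_02]-[10_15_00]-sid=999-pid=.xml",
    "system-logs-HOST-alice-[2015_03_02]-[10_15_00]-sid=999-pid=",
    "notes.txt",
]
chat, system = classify_log_files(names)
print(chat)
print(system)
```

In practice the list of names could come from os.listdir("logs").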

      Event listeners:

      As mentioned in the information state section, one can also define event listeners that execute predefined updates to the information state every time a particular event is received, irrespective of the currently active sub-dialogue.

      To do so, you can add a list of listeners to the policy file using this syntax:

      Code block (lines numbered from 1): Listeners
      <listeners>
      	<listen event="internal.timer" update="imply(questionnaire_flag==2, ++(break_timer,timerInterval))"/>
          <listen event="internal.timer" update="assign(smalltalk_pause_lock_auto,isQuestion(systemEvent))"/>
          <listen event="answer.observable.*" update="++(symptom_said)"/>
      </listeners>

      The above defines 3 listeners. The first two fire when the event internal.timer is received. The third fires when any event prefixed by "answer.observable." is received (the * has the semantics of .+ in traditional regular expressions).

      The update parameter defines what will be executed when the specified event is received. The imply update increments the variable break_timer by the value of the variable timerInterval if the variable questionnaire_flag has value 2. In other words, imply allows one to define an event listener that executes the update only when a particular event is received and a particular information state condition is satisfied.

      The second update, using the assign keyword, is a simple assignment: the variable smalltalk_pause_lock_auto is assigned the value returned by the function isQuestion applied to the value of the variable systemEvent.

      The listing order is important as the listeners are evaluated in the order in which they were defined.
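The matching and ordering rules for listeners can be sketched in Python. The code below parses the listeners definition shown earlier with the standard library, gives * the semantics of .+, and returns the updates that would fire for an event, in definition order. It does not execute the updates and is not the dialogue manager's own code; the function names are illustrative.

```python
import re
import xml.etree.ElementTree as ET

LISTENERS_XML = """
<listeners>
    <listen event="internal.timer" update="imply(questionnaire_flag==2, ++(break_timer,timerInterval))"/>
    <listen event="internal.timer" update="assign(smalltalk_pause_lock_auto,isQuestion(systemEvent))"/>
    <listen event="answer.observable.*" update="++(symptom_said)"/>
</listeners>
"""


def load_listeners(xml_text):
    """Return (compiled event pattern, update string) pairs in definition order."""
    listeners = []
    for node in ET.fromstring(xml_text).iter("listen"):
        # Escape the literal parts of the event name and read '*' as '.+'.
        pattern = ".+".join(re.escape(part) for part in node.get("event").split("*"))
        listeners.append((re.compile(pattern), node.get("update")))
    return listeners


def updates_for(event, listeners):
    """List the updates whose event pattern matches the whole event name."""
    return [update for pattern, update in listeners if pattern.fullmatch(event)]


listeners = load_listeners(LISTENERS_XML)
print(updates_for("internal.timer", listeners))
print(updates_for("answer.observable.fever", listeners))
```

Because .+ requires at least one character, the event name answer.observable. on its own (with nothing after the final dot) does not match the third listener in this sketch.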

      Step 3: Train the natural language understanding module

      After defining the content and the dialogue policy, we are ready to train the NLU. Start the FLoReS module and wait for the interface to appear:

      [Image: FLoReS interface]

      Select the NLU menu and, under it, the training item. The first time the interface is opened, training happens automatically; if you later update the content as described in step 1, you need to select this menu item manually to update the NLU models.

      CakeVendor contains all that is required to define a CakeVendor character that is an extension of the character created in this other tutorial for NPCEditor.

      Running a character

      The FLoReS module comes with a chat interface that makes it easy to test a dialogue policy without running any other module.

      The chat interface allows the user to select a character, train the NLU, see what the system says, send text to the system, momentarily block event processing in the dialogue manager, test all NLG speech acts and inspect the information state.

      If you design multiple characters under resources/characters/, those whose policy was loaded correctly will be available in the Characters menu. Select one to chat with it. The following is a screen capture of a typical Characters menu:

      [Image: Characters menu]

      The NLU menu allows you to retrain the NLU after you have made changes to the user utterances file.

      The DM menu:

      [Image: DM menu]

      allows you to control the dialogue manager:

      • pause the event processing so that, for example, you can check the logs or the information state before the next timer event comes in
      • save the current information state so it can be loaded at a later time to initialize another character
      • reload the entire character content (to pick up fresh changes made to the files on disk)
      • send a login event to simulate a login from a remote user (useful for policies designed to start when a login is received)
      • open an information state inspector, in which one can see the value of all assignments and assertions in the information state and evaluate expressions:
        [Image: information state inspector]
      • enable a mode that displays, for every user input, the possible system replies that were available, sorted by their expected reward

      To send text to the character, type in the bottom part of the interface and press ENTER. If multiple characters are running (an advanced configuration that uses the meta protocol), also select the character to which the text should be sent. The latest messages sent by the system will be highlighted in green.

      Configuration

      The system is completely configured using a set of XML files. The configuration is separated into 4 main modules:

      1. the messaging bus
      2. the NLU module
      3. the DM module
      4. the NLG module

      The messaging bus is common across all characters. The NLU, DM and NLG configurations, instead, can be specific to each individual character. A default configuration must be provided, but it can be overridden by a specific configuration provided in the characters/CharacterName folder.

      Here follows a list of all the current configuration fields for each of the types above:

      Messaging bus configuration

      • AllowNluTraining: enables retraining of the NLU through the chat interface.
      • Character: default character attempted to start at startup.
      • ChatLog: the prefix (including path) of the chat log files.
      • ContentRoot: root directory where all characters are found. Typically it is resources/characters.
      • DisplayNluOutputChat: if true, the system displays the NLU output in the chat interface.
      • LoggingEventsInChatLog: if false, it disables saving the chat log file.
      • Protocols: list of external messaging protocols to enable. Each must be the name of a class extending edu.usc.ict.nl.bus.protocols.Protocol.
      • UseVrExpressOnly: use this if you want the system to listen to others' vrExpress messages and treat them as input text utterances.
      • UseVrSpeakOnly: similar to UseVrExpressOnly. This enables listening to the vrSpoke messages instead of the vrExpress ones.
      • VhComponentId: the string identifier used to respond to the VHToolkit messenger API.
      • VhOtherSpeaker: a string identifying the name of other vh speakers to which we want to listen (for multi agent configurations). Can be "*" to indicate listen to all.
      • VhServer: ip/name of the activemq server to which to connect.
      • VhSpeaker: name of the sender of vh messages, in general this should be automatically set using the current character, but it can be overwritten by this setting.
      • VhTopic: topic for the vh messages.
      • ZoomFactorChat: a float used to configure the font size in the chat window.
      • FileRoot: deprecated.
      • InternalDmClass4VhMsgWrapper: deprecated.
      • IsLoadBalancing: not used in this context.
      • RunningMode: deprecated.
      • ValidatePolicies: deprecated.

      NLU configuration

      • AcceptanceThreshold: if configured (i.e. not null or negative) the NLU will return its 1-best result only if the confidence score associated with it is above this threshold.
      • ChartNluInSingleMode: used by the class edu.usc.ict.nl.nlu.chart.MXChartClassifierNLU to disable the multiple speech acts extraction described in the paper: http://www.aclweb.org/anthology/P11-2017
      • ChartNluMaxLength: used by the class edu.usc.ict.nl.nlu.chart.MXChartClassifierNLU to automatically disable extracting multiple speech acts for longer utterances, see http://www.aclweb.org/anthology/P11-2017
      • EmptyTextEventName: if the user enters no text, then this event is returned by the NLU.
      • ForcedNLUContentRoot: this is the path to the NLU models in case you don't want to use the default location in a character folder.
      • FstInputSymbols: configuration used by the attempts to use Finite state transducers to do NLU for SPS. Read the code for more info.
      • FstOutputSymbols: see FstInputSymbols
      • RunningFstCommand: the actual command to be executed for the FST experiment. See src/NLUConfigs.xml for an example.
      • TrainingFstCommand: see RunningFstCommand
      • HierNluReturnsNonLeaves: boolean, default true. Used in hierarchical NLU models: if true, the model also returns non-leaves (i.e. results from an NLU model that has child NLU models) when the result has a higher probability than the results of its children.
      • HierarchicalNluSeparator: separator used in the labels to recognize hierarchical structure. For example, "." is the hier separator for java packages.
      • InternalNluClass4Chart: internal NLU class used by the chart classifier (multiple speech acts in a single line of text)
      • InternalNluClass4Hier: internal NLU class used in hierarchical NLU models.
      • InternalNluListForMultiNlu: list of NLU beans to run simultaneously in the multi NLU setup (e.g. SPS).
      • LowConfidenceEvent: event sent out if the 1-best NLU result is below the AcceptanceThreshold.
      • MaximumNumberOfLabels: maximum number of labels to be found in the training set. Used to generate an error or warning if it's known that the particular classifier used has this limitation (on the number of labels).
      • MergerForMultiNlu: bean to use to reach a single output from a multi NLU setup. For example, if classifier 1 returns result r1 then run classifier 2 and return result r2 as the global result, otherwise return r1.
      • NluClass: the basic NLU class used (could be a hierarchical, multi, chart or simple classifier). See src/NLUConfigs.xml and resources/characters/*/NLUConfig.xml for examples.
      • NluDir: the name of the directory under which the NLU stores its model: ContentRoot/characterName/NluDir
      • NluExeEnv: setup needed only when running a specific external NLU executable that requires custom environment variables (check the source code, never used).
      • NluExeRoot: see NluExeEnv.
      • NluFeaturesBuilderClass: the class used to build the features from the training data for the NLU class.
      • NluHardLinks: file that contains direct links between surface text and speech acts. If a text matches one of these strings exactly, the NLU is not invoked and the associated label is returned.
      • NluModelFile: the name of the file that stores the NLU model.
      • NluTrainingFile: the name of the file that stores training data used by the NLU classifier to train its model. Usually the data is generated from the user utterances found in xlsx files in the ContentRoot/characterName/content directory. Then that data goes to the features builder class and then it gets dumped in the training data format of the specific NLU class used in this file.
      • NluVhGenerating: if true the NLU generates the vrNLU vh message
      • NluVhListening: if true and the NL bus has VHProtocol enabled, then the system will listen to vrSpeech messages.
      • PreprocessingRunningConfig: the spring bean name of the preprocessing config to be used at runtime.
      • PreprocessingTrainingConfig: the spring bean name of the preprocessing config to use to generate the NLU training data. You need a different one, for example, if you want to run different preprocessing steps to prepare the training data as opposed to those run at runtime.
      • PrintNluErrors: not used
      • Regularization: regularization parameter.
      • SpsMapperModelFile: sps specific, should be moved out to a sps specific class.
      • SpsMapperUsesNluOutput: see SpsMapperModelFile.
      • TrainingDataReader: class used to read the NLU training data format.
      • UseSystemFormsToTrainNLU: the system will extract training data from the forms definition file if it is there (forms define the multiple choice questions).
      • UserUtterances: defines the name of the file that contains the user utterances to be used for training.
      • nBest: defines how many results should be returned by the NLU.
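The interaction between nBest, AcceptanceThreshold and LowConfidenceEvent can be sketched as follows. This Python function is an illustration under stated assumptions, not the NLU code: the classifier output is assumed to be a list of (label, confidence) pairs, a score exactly equal to the threshold is treated as accepted, and the event name used as default is hypothetical.

```python
def select_nlu_output(scored_labels, n_best=1, acceptance_threshold=None,
                      low_confidence_event="low-confidence"):
    """Return up to n_best labels, or the low-confidence event if the best score is too low.

    scored_labels: list of (label, confidence) pairs produced by a classifier.
    """
    ranked = sorted(scored_labels, key=lambda pair: pair[1], reverse=True)
    # The threshold counts as not configured when it is None or negative.
    threshold_set = acceptance_threshold is not None and acceptance_threshold >= 0
    if threshold_set and ranked and ranked[0][1] < acceptance_threshold:
        return [low_confidence_event]
    return [label for label, _ in ranked[:n_best]]


results = [("statement.flavor.lemon", 0.15), ("question.what.you-make", 0.80)]
print(select_nlu_output(results, n_best=2))
print(select_nlu_output(results, acceptance_threshold=0.9))
```

The first call returns both labels with the most confident one first; the second returns only the low-confidence event because the best score, 0.80, is below 0.9.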

      DM configuration

      • ApproximatedForwardSearch: if enabled, the system runs a simplified search that is faster but less accurate.
      • CaseSensitive: if true, variables are case sensitive, otherwise everything is lowercased internally.
      • DmClass: the DM class to be used (e.g. RewardDM)
      • DmVhGenerating: if true the DM generates the vrGenerate message.
      • DmVhListening: if true listens to vrNLU.
      • ForcedIgnoreEventName: name of the event generated when the dm ignores a user event.
      • InitialPolicyFileName: the name of the policy file.
      • LoginEventName: the name of the event generated at login.
      • LoopEventName: the name of the event generated if the DM recognizes that it's stuck in a loop.
      • MaxIterations: maximum number of iterations for a single event. It is a safeguard against bugs in the policy and event handling.
      • MaxSearchLevels: used to stop the search, defines the maximum depth of the visited search space.
      • PreferUserInitiatedActions: if true, it prefers a user initiated action from the possibilities found by the search.
      • SkipUnhandledWhileSystemSpeaking: if true, doesn't generate the unhandled event while the system is speaking.
      • SpecialVariablesFileName: the name of the file used to dump the list of special variables at startup.
      • SpokenFractionForSaid: percentage of a line that needs to be said (before interruption) for a line to be considered said.
      • StaticURLs
      • SystemEventsHaveDuration: if true, the system will track the NLG to wait for a line to be finished before moving on.
      • TimerEvent: the name of the timer event.
      • TimerInterval: the length of time in seconds between timer events. if negative, timer events are disabled.
      • TrivialSystemSpeechActs
      • UnhandledEventName: the name of the event generated when the system doesn't have an executable operator that can handle the current user event.
      • UserAlwaysInterrupts: if true, the user always interrupts the system.
      • ValueTrackers: trackers used to update specific variables with high precision. See src/DMConfigs.xml for examples.
      • VisualizerClass: used to visualize the DM state, mostly deprecated.
      • VisualizerConfig: see VisualizerClass
      • WaitForUserReplyTimeout: number of seconds for which we allow a user to take to reply.
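Two of the fields above, SpokenFractionForSaid and TimerInterval, describe simple numeric rules that can be written down as a short Python sketch. The sketch assumes that SpokenFractionForSaid is a value between 0 and 1 and that a TimerInterval of zero still counts as enabled; neither assumption is confirmed by the description above, and the function names are illustrative.

```python
def line_counts_as_said(spoken_seconds, line_duration_seconds, spoken_fraction_for_said):
    """Decide whether an interrupted system line was spoken long enough to count as said."""
    if line_duration_seconds <= 0:
        # A line with no duration has nothing left to say.
        return True
    return spoken_seconds / line_duration_seconds >= spoken_fraction_for_said


def timer_events_enabled(timer_interval):
    """Timer events are disabled when the interval is negative."""
    return timer_interval >= 0


print(line_counts_as_said(2.0, 4.0, 0.5))   # half of the line was spoken
print(line_counts_as_said(1.0, 4.0, 0.5))   # only a quarter was spoken
print(timer_events_enabled(-1))
```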

      NLG configuration

      • AllowEmptyNLGOutput: if true the NLG can return empty text, otherwise it'll generate an error if empty text is returned.
      • AlwaysPreferForms: global flag used to prefer forms (multiple choice) for a given speech act if forms are defined for it.
      • DefaultDuration: default duration of a speech act. Usually VH messages are used to compute this, or the audio file, or the length of the text. If all custom methods fail, then this default is used.
      • DisplayFormAnswerInNlg: in case multiple choice is used, the nlg will return the full selected answer.
      • IsAsciiNLG: if true, it filters non ASCII characters from the nlg text.
      • IsNormalizeBlanksNLG: if true, normalizes blanks in the nlg text (removes duplicates, clean end of line).
      • IsStrictNLG: if true, it returns errors for each speech act used in the DM policy from which the NLG cannot return text.
      • LfNlgLexiconFile: not used
      • NlgClass: the NLG class used to generate text from speech acts.
      • NlgVhGenerating: not used, typically specific NLGs generate vh messages.
      • NlgVhListening: if true, listens to vrGenerate.
      • Nvbs: file that contains the nvb info.
      • Picker: class that specifies how to pick one text realization from multiple possibilities for a given speech act.
      • SystemForms: name of the file that contains the definition of the multiple choice system lines.
      • SystemResources: name of the file that contains resources (e.g. links)
      • SystemUtterances: name of the file that contains the system lines (mapping between speech acts and surface form).
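The text clean-up controlled by IsAsciiNLG and IsNormalizeBlanksNLG can be illustrated with a short Python sketch. It is one possible reading of the two descriptions above (drop non-ASCII characters, collapse repeated blanks and trim the ends), written with hypothetical parameter names; it is not the NLG implementation.

```python
import re


def clean_nlg_text(text, is_ascii=False, normalize_blanks=False):
    """Apply the two optional clean-up steps to a generated line of text."""
    if is_ascii:
        # Drop every character outside the ASCII range.
        text = text.encode("ascii", "ignore").decode("ascii")
    if normalize_blanks:
        # Collapse runs of spaces and tabs, then trim both ends.
        text = re.sub(r"[ \t]+", " ", text).strip()
    return text


print(clean_nlg_text("Your  cake is   ready!  ", normalize_blanks=True))
print(clean_nlg_text("Crème  brûlée  ", is_ascii=True, normalize_blanks=True))
```

Note that filtering removes accented characters rather than transliterating them, so the second call prints "Crme brle".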