Disclaimer

Although this module is not part of the VHToolkit distribution, it is compatible with it and can be used when more flexibility is needed for natural language understanding, generation, and dialog management. The module is available as open source from this GitHub repository.

Getting started

  1. Clone the jmnl GitHub repository.
  2. Install Eclipse.
  3. Install Java 8 or later.
  4. In Eclipse, import the existing project defined in the GitHub repository.
  5. Run the main method in edu.usc.ict.nl.ui.chat.ChatInterface with the arguments -s chatInterface.xml.

Overview

The process of creating a character is iterative. The following steps create an initial character that will probably need refining: repeat the steps as needed until you obtain the desired behavior.

...

Select the NLU menu and, under it, the training item. The first time the interface is opened, training happens automatically, but if you update the content as described in step 1, you need to select this menu item manually to update the NLU models.

CakeVendor.zip contains everything required to define a CakeVendor character, which is an extension of the character created in this other tutorial for NPCEditor.

...

The interface also lets you send text to the character: type in the bottom part of the interface and press ENTER. If multiple characters are running (an advanced configuration that uses the meta protocol), select the character to which the text should be sent.

Configuration

The system is configured entirely through a set of XML files.

The configuration is separated into 4 main modules:

  1. the messaging bus
  2. the NLU module
  3. the DM module
  4. the NLG module

The messaging bus is common across all characters. The NLU, DM and NLG configurations, instead, can be specific to each individual character. A default configuration must be provided, but it can be overridden by a specific configuration provided in the characters/CharacterName folder.
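The override rule above can be pictured as a simple file lookup. The following is only a sketch under assumptions: the class and method names are hypothetical, and the real system may merge individual fields rather than pick a whole file. It shows the "character folder first, default otherwise" idea and nothing more.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch of a "character-specific file overrides the default" lookup. */
class ConfigOverrideSketch {

    /**
     * Returns contentRoot/characterName/fileName if that file exists,
     * otherwise falls back to the default file.
     */
    static Path resolve(Path contentRoot, String characterName,
                        Path defaultFile, String fileName) {
        Path specific = contentRoot.resolve(characterName).resolve(fileName);
        return Files.isRegularFile(specific) ? specific : defaultFile;
    }

    public static void main(String[] args) throws IOException {
        // Temporary folders stand in for the content root and the default config.
        Path root = Files.createTempDirectory("characters");
        Path defaults = Files.createTempFile("NLUConfig-default", ".xml");

        // Only the CakeVendor folder provides its own NLU configuration.
        Path cake = Files.createDirectories(root.resolve("CakeVendor"));
        Files.createFile(cake.resolve("NLUConfig.xml"));

        System.out.println(resolve(root, "CakeVendor", defaults, "NLUConfig.xml"));
        System.out.println(resolve(root, "Other", defaults, "NLUConfig.xml"));
    }
}
```

The first call prints the path inside the CakeVendor folder; the second prints the default file because the "Other" character has no folder of its own.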

Here follows a list of all the current configuration fields for each of the types above:

Messaging bus configuration

  • AllowNluTraining: enables retraining of the NLU through the chat interface.
  • Character: default character attempted to start at startup.
  • ChatLog: the prefix (including path) of the chat log files.
  • ContentRoot: root directory where all characters are found. Typically it is resources/characters.
  • DisplayNluOutputChat: if true, the system displays the NLU output in the chat interface.
  • LoggingEventsInChatLog: if false, it disables saving the chat log file.
  • Protocols: list of external messaging protocols to enable. Each must be the name of a class extending edu.usc.ict.nl.bus.protocols.Protocol.
  • UseVrExpressOnly: use this if you want the system to listen to others' vrExpress messages and treat them as input text utterances.
  • UseVrSpeakOnly: similar to UseVrExpressOnly, but listens to vrSpoke messages instead of vrExpress ones.
  • VhComponentId: the string identifier used to respond to the VHToolkit messenger API.
  • VhOtherSpeaker: a string identifying the name of the other vh speakers to listen to (for multi-agent configurations). Can be "*" to listen to all of them.
  • VhServer: ip/name of the activemq server to which to connect.
  • VhSpeaker: name of the sender of vh messages. In general this is set automatically from the current character, but it can be overridden by this setting.
  • VhTopic: topic for the vh messages.
  • ZoomFactorChat: a float used to configure the font size in the chat window.
  • FileRoot: deprecated.
  • InternalDmClass4VhMsgWrapper: deprecated.
  • IsLoadBalancing: not used in this context.
  • RunningMode: deprecated.
  • ValidatePolicies: deprecated.
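
The Protocols entry requires each name to be a class extending edu.usc.ict.nl.bus.protocols.Protocol. As an illustration only, a check of that kind could be sketched with reflection as below. The nested Protocol class is a stand-in for the real base class (which is not available to this self-contained example), and DemoProtocol and invalidNames are hypothetical names, not part of the project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: verify that configured class names extend a given base class. */
class ProtocolCheckSketch {

    // Stand-in for edu.usc.ict.nl.bus.protocols.Protocol (assumption).
    static abstract class Protocol {}

    // Hypothetical protocol implementation used only for this demo.
    static class DemoProtocol extends Protocol {}

    /** Returns the names that cannot be loaded or are not assignable to base. */
    static List<String> invalidNames(List<String> classNames, Class<?> base) {
        List<String> invalid = new ArrayList<>();
        for (String name : classNames) {
            try {
                Class<?> candidate = Class.forName(name);
                if (!base.isAssignableFrom(candidate)) {
                    invalid.add(name); // loads, but is not a Protocol
                }
            } catch (ClassNotFoundException e) {
                invalid.add(name); // no such class on the classpath
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        List<String> configured = Arrays.asList(
                DemoProtocol.class.getName(), "java.lang.String", "no.such.Class");
        System.out.println(invalidNames(configured, Protocol.class));
    }
}
```

Running a check like this at startup would report a misspelled or unrelated class name before the bus tries to use it.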

NLU configuration

  • AcceptanceThreshold: if configured (i.e. not null or negative) the NLU will return its 1-best result only if the confidence score associated with it is above this threshold.
  • ChartNluInSingleMode: used by the class edu.usc.ict.nl.nlu.chart.MXChartClassifierNLU to disable the multiple speech acts extraction described in the paper: http://www.aclweb.org/anthology/P11-2017
  • ChartNluMaxLength: used by the class edu.usc.ict.nl.nlu.chart.MXChartClassifierNLU to automatically disable extracting multiple speech acts for longer utterances, see http://www.aclweb.org/anthology/P11-2017
  • EmptyTextEventName: if the user enters no text, then this event is returned by the NLU.
  • ForcedNLUContentRoot: this is the path to the NLU models in case you don't want to use the default location in a character folder.
  • FstInputSymbols: configuration used by the experimental finite-state transducer (FST) NLU for SPS. Read the code for more info.
  • FstOutputSymbols: see FstInputSymbols.
  • RunningFstCommand: the actual command to be executed for the FST experiment. See src/NLUConfigs.xml for an example.
  • TrainingFstCommand: see RunningFstCommand.
  • HierNluReturnsNonLeaves: boolean, default true. Used in hierarchical NLU models: if true, the model also returns non-leaf results (i.e. results from an NLU model that has child NLU models) when such a result has a higher probability than the results of its children.
  • HierarchicalNluSeparator: separator used in the labels to recognize hierarchical structure. For example, "." is the hier separator for java packages.
  • InternalNluClass4Chart: internal NLU class used by the chart classifier (multiple speech acts in a single line of text)
  • InternalNluClass4Hier: internal NLU class used in hierarchical NLU models.
  • InternalNluListForMultiNlu: list of NLU beans to run simultaneously in the multi NLU setup (e.g. SPS).
  • LowConfidenceEvent: event sent out if the 1-best NLU result is below the AcceptanceThreshold.
  • MaximumNumberOfLabels: maximum number of labels to be found in the training set. Used to generate an error or warning if it's known that the particular classifier used has this limitation (on the number of labels).
  • MergerForMultiNlu: bean to use to reach a single output from a multi NLU setup. For example, if classifier 1 returns result r1 then run classifier 2 and return result r2 as the global result, otherwise return r1.
  • NluClass: the basic NLU class used (could be a hierarchical, multi, chart or simple classifier). See src/NLUConfigs.xml and resources/characters/*/NLUConfig.xml for examples.
  • NluDir: the name of the directory under which the nlu stores its model: ContentRoot/characterName/NluDir
  • NluExeEnv: setup needed only when running specific external nlu exe that requires custom environment variables (check the source code, never used).
  • NluExeRoot: see NluExeEnv.
  • NluFeaturesBuilderClass: the class used to build the features from the training data for the NLU class.
  • NluHardLinks: file that contains direct links between surface text and speech acts. If the input text matches one of these strings exactly, the NLU is not invoked and the associated label is returned.
  • NluModelFile: the name of the file that stores the NLU model.
  • NluTrainingFile: the name of the file that stores training data used by the NLU classifier to train its model. Usually the data is generated from the user utterances found in xlsx files in the ContentRoot/characterName/content directory. Then that data goes to the features builder class and then it gets dumped in the training data format of the specific NLU class used in this file.
  • NluVhGenerating: if true, the NLU generates the vrNLU vh message.
  • NluVhListening: if true and the NL bus has VHProtocol enabled, then the system will listen to vrSpeech messages.
  • PreprocessingRunningConfig: the spring bean name of the preprocessing config to use at runtime.
  • PreprocessingTrainingConfig: the spring bean name of the preprocessing config to use to generate the NLU training data. You need a different one, for example, if you want to run different preprocessing steps to prepare the training data as opposed to the steps run at runtime.
  • PrintNluErrors: not used
  • Regularization: regularization parameter.
  • SpsMapperModelFile: sps specific, should be moved out to a sps specific class.
  • SpsMapperUsesNluOutput: see SpsMapperModelFile.
  • TrainingDataReader: class used to read the NLU training data format.
  • UseSystemFormsToTrainNLU: the system extracts training data from the forms definition file, if it is there (forms define the multiple choice questions).
  • UserUtterances: defines the name of the file that contains the user utterances to be used for training.
  • nBest: defines how many results should be returned by the NLU.
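
To make the interplay of NluHardLinks, AcceptanceThreshold, LowConfidenceEvent and nBest more concrete, here is a minimal sketch of one possible decision order. It is not the project's implementation: the Result type, the decide method and the example labels are hypothetical, and treating an empty classifier output as low confidence is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of how hard links, nBest and the acceptance threshold could interact. */
class NluDecisionSketch {

    /** A label with its confidence score (hypothetical type). */
    static final class Result {
        final String label;
        final double score;
        Result(String label, double score) { this.label = label; this.score = score; }
    }

    /**
     * @param hardLinks          exact text to label map (role of NluHardLinks)
     * @param ranked             classifier output, best result first
     * @param threshold          AcceptanceThreshold; null or negative disables it
     * @param lowConfidenceEvent label returned when the 1-best score is too low
     * @param nBest              maximum number of labels to return
     */
    static List<String> decide(String text, Map<String, String> hardLinks,
                               List<Result> ranked, Double threshold,
                               String lowConfidenceEvent, int nBest) {
        List<String> out = new ArrayList<>();
        // An exact hard-link match bypasses the classifier output entirely.
        String linked = hardLinks.get(text);
        if (linked != null) {
            out.add(linked);
            return out;
        }
        boolean thresholdActive = threshold != null && threshold >= 0;
        // Assumption: no result at all is also reported as low confidence.
        if (ranked.isEmpty() || (thresholdActive && ranked.get(0).score < threshold)) {
            out.add(lowConfidenceEvent);
            return out;
        }
        for (int i = 0; i < ranked.size() && i < nBest; i++) {
            out.add(ranked.get(i).label);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> links = new HashMap<>();
        links.put("hi", "greeting");
        List<Result> ranked = Arrays.asList(
                new Result("order.cake", 0.4), new Result("question.price", 0.3));
        System.out.println(decide("hi", links, ranked, 0.5, "low-confidence", 2));
        System.out.println(decide("a cake please", links, ranked, 0.5, "low-confidence", 2));
        System.out.println(decide("a cake please", links, ranked, null, "low-confidence", 2));
    }
}
```

With the threshold set to 0.5 the second input falls back to the low-confidence event, while with the threshold disabled the same input yields the two best labels.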

DM configuration

  • ApproximatedForwardSearch: if enabled, the system runs a simplified search that is faster but less accurate.
  • CaseSensitive: if true, variables are case sensitive, otherwise everything is lowercased internally.
  • DmClass: the DM class to be used (e.g. RewardDM)
  • DmVhGenerating: if true the DM generates the vrGenerate message.
  • DmVhListening: if true listens to vrNLU.
  • ForcedIgnoreEventName: name of the event generated when the dm ignores a user event.
  • InitialPolicyFileName: the name of the policy file.
  • LoginEventName: the name of the event generated at login.
  • LoopEventName: the name of the event generated if the DM recognizes that it's stuck in a loop.
  • MaxIterations: maximum number of iterations for a single event. It is a safeguard against bugs in the policy and event handling.
  • MaxSearchLevels: used to stop the search, defines the maximum depth of the visited search space.
  • PreferUserInitiatedActions: if true, it prefers a user initiated action from the possibilities found by the search.
  • SkipUnhandledWhileSystemSpeaking: if true, doesn't generate the unhandled event while the system is speaking.
  • SpecialVariablesFileName: the name of the file used to dump the list of special variables at startup.
  • SpokenFractionForSaid: percentage of a line that needs to be said (before interruption) for a line to be considered said.
  • StaticURLs
  • SystemEventsHaveDuration: if true, the system will track the NLG to wait for a line to be finished before moving on.
  • TimerEvent: the name of the timer event.
  • TimerInterval: the length of time in seconds between timer events. If negative, timer events are disabled.
  • TrivialSystemSpeechActs
  • UnhandledEventName: the name of the event generated when the system doesn't have an executable operator that can handle the current user event.
  • UserAlwaysInterrupts: if true, the user always interrupts the system.
  • ValueTrackers: trackers used to update specific variables with high precision. See src/DMConfigs.xml for examples.
  • VisualizerClass: used to visualize the DM state, mostly deprecated.
  • VisualizerConfig: see VisualizerClass
  • WaitForUserReplyTimeout: number of seconds the user is allowed to take to reply.
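
Two of the settings above, SpokenFractionForSaid and TimerInterval, reduce to simple numeric checks. The sketch below illustrates them under stated assumptions: the setting is treated as a fraction in the range 0 to 1 (the description says "percentage", so the real unit may differ), progress is measured in seconds, and the class and method names are hypothetical.

```java
/** Sketch of two DM settings: SpokenFractionForSaid and TimerInterval. */
class DmSettingsSketch {

    /**
     * True if enough of a line was spoken before an interruption.
     * Assumes spokenFractionForSaid is a fraction in [0, 1].
     */
    static boolean consideredSaid(double spokenSeconds, double totalSeconds,
                                  double spokenFractionForSaid) {
        if (totalSeconds <= 0) {
            return true; // assumption: an empty line counts as said
        }
        return spokenSeconds / totalSeconds >= spokenFractionForSaid;
    }

    /** Timer events are disabled only when the interval is negative. */
    static boolean timerEnabled(double timerIntervalSeconds) {
        return timerIntervalSeconds >= 0;
    }

    public static void main(String[] args) {
        // 3 of 4 seconds spoken with a required fraction of 0.5.
        System.out.println(consideredSaid(3.0, 4.0, 0.5));
        // 1 of 4 seconds spoken with the same required fraction.
        System.out.println(consideredSaid(1.0, 4.0, 0.5));
        System.out.println(timerEnabled(-1.0));
    }
}
```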

NLG configuration

  • AllowEmptyNLGOutput: if true the NLG can return empty text, otherwise it'll generate an error if empty text is returned.
  • AlwaysPreferForms: global flag used to prefer forms (multiple choice) for a given speech act if forms are defined for it.
  • DefaultDuration: default duration of a speech act. Usually VH messages are used to compute this, or the audio file, or the length of the text. If all custom methods fail, then this default is used.
  • DisplayFormAnswerInNlg: in case multiple choice is used, the nlg will return the full selected answer.
  • IsAsciiNLG: if true, it filters non ASCII characters from the nlg text.
  • IsNormalizeBlanksNLG: if true, normalizes blanks in the nlg text (removes duplicates, clean end of line).
  • IsStrictNLG: if true, it returns errors for each speech act used in the DM policy from which the NLG cannot return text.
  • LfNlgLexiconFile: not used
  • NlgClass: the NLG class used to generate text from speech acts.
  • NlgVhGenerating: not used, typically specific NLGs generate vh messages.
  • NlgVhListening: if true, listens to vrGenerate.
  • Nvbs: file that contains the nvb info.
  • Picker: class that specifies how to pick one text realization from multiple possibilities for a given speech act.
  • SystemForms: name of the file that contains the definition of the multiple choice system lines.
  • SystemResources: name of the file that contains resources (e.g. links).
  • SystemUtterances: name of the file that contains the system lines (mapping between speech acts and surface form).
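
The effect of IsAsciiNLG and IsNormalizeBlanksNLG on the generated text can be illustrated with a short sketch. This is an assumption about the behavior based only on the descriptions above: non-ASCII characters are dropped rather than replaced, and the ASCII filter runs before blank normalization. The class and method names are hypothetical.

```java
/** Sketch of the text cleanup suggested by IsAsciiNLG and IsNormalizeBlanksNLG. */
class NlgTextCleanupSketch {

    /** Drops every character outside the 7-bit ASCII range. */
    static String filterNonAscii(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Collapses runs of whitespace to one space and trims both ends. */
    static String normalizeBlanks(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }

    /** Applies the enabled steps; filtering first avoids leftover double blanks. */
    static String clean(String text, boolean isAscii, boolean isNormalizeBlanks) {
        String out = text;
        if (isAscii) {
            out = filterNonAscii(out);
        }
        if (isNormalizeBlanks) {
            out = normalizeBlanks(out);
        }
        return out;
    }

    public static void main(String[] args) {
        // \u00e9 is a non-ASCII letter and is removed by the ASCII filter.
        System.out.println("[" + clean("caf\u00e9  au lait \n", true, true) + "]");
    }
}
```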