
...

The Virtual Human Toolkit is based on the ICT Virtual Human Architecture. This architecture defines at an abstract level what modules are needed to realize a virtual human, and how these modules interact. The basic functionality of each module, as well as its interface, is well-defined, but its actual implementation falls outside the scope of the architecture. The architecture dictates the implementation of a distributed system where communication is mostly realized using message passing. This allows for multiple implementations of a certain module and simple substitution of one implementation for another at run-time. It also allows for distributed setups, where different modules run on separate computers.
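The decoupling that message passing provides can be sketched as a minimal in-process publish/subscribe bus. This is a hypothetical stand-in for the Toolkit's actual messaging layer, and the topic name and module functions below are invented for illustration:

```python
from collections import defaultdict

class MessageBus:
    """Toy pub/sub bus: modules communicate only through named messages."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        for handler in list(self._subscribers[topic]):
            handler(payload)

# Two interchangeable speech-generation "modules"; the publisher never
# needs to know which implementation is currently subscribed, so one
# can be swapped for the other without touching the rest of the system.
def festival_tts(text):
    return f"[festival] {text}"

def sapi_tts(text):
    return f"[ms-sapi] {text}"

bus = MessageBus()
spoken = []
bus.subscribe("speak", lambda text: spoken.append(festival_tts(text)))
bus.publish("speak", "Hello there")
print(spoken)  # ['[festival] Hello there']
```

Because subscribers are looked up per topic at publish time, modules can also live in separate processes or on separate machines once the bus is backed by a network transport.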

...

Please see below for a high-level view of the Virtual Human Architecture.


Figure 1. The ICT Virtual Human Architecture.

Virtual Human Toolkit Implementation

The Virtual Human Toolkit is a set of components (modules, tools and libraries) that implements one possible instantiation of the Virtual Human Architecture. It has the following main modules:

  • NPCEditor, a statistical text classifier which matches novel input (a question the user asks) to authored output (a line the character speaks).
  • SmartBody (SB), a character animation platform that provides locomotion, steering, object manipulation, lip syncing, gazing and nonverbal behavior in real time through the Behavior Markup Language (BML).
  • NonVerbal Behavior Generator (NVBG), a rule-based system that takes a character utterance as input and produces a nonverbal behavior schedule (gestures, head nods, etc.) in the form of BML as output.
  • MultiSense, a perception framework that enables multiple sensing and understanding modules to inter-operate simultaneously, broadcasting data through the Perception Markup Language (PML). 
  • Unity, a proprietary game engine. The Toolkit only contains the executable, but you can download the free version of Unity or purchase Unity Pro from their website. The Toolkit includes Ogre as an open source example on how to integrate SmartBody with a renderer. 
  • PocketSphinx, an open source speech recognition engine. In the Toolkit, PocketSphinx is the speech server for our AcquireSpeech client.
  • Text-to-speech engines, including Festival and MS SAPI.
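To make the BML hand-off between NVBG and SmartBody concrete, below is an illustrative behavior block of the kind NVBG might emit and SmartBody consume. The element names follow the BML specification, but the character name, ids, and lexeme values here are assumptions (attribute details also vary between BML versions), not actual Toolkit output:

```xml
<!-- Illustrative BML only; character name, ids and lexeme values
     are hypothetical, not generated by the Toolkit. -->
<bml character="Brad">
  <speech id="s1">
    <text>Hello there!</text>
  </speech>
  <head id="h1" lexeme="NOD" start="s1:start"/>
  <gesture id="g1" lexeme="BEAT" stroke="s1:start"/>
</bml>
```

The sync-point references (such as start="s1:start") are what let the animation system align head movements and gesture strokes with the speech timeline.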

For a complete overview of all the modules, tools and libraries, please see the Components section.

The figure below shows how the modules interact, along with the messages they use to communicate with each other.

Figure 2. A high-level overview of the Toolkit modules. Instead of a full-fledged agent, the NPCEditor handles all verbal input and output. Note that since MultiSense (perception) is currently included as a basic proof of concept, it communicates directly with SmartBody. A deeper integration with the system would require MultiSense to communicate with the NPCEditor, a dialogue manager and/or the NVBG instead.


A more detailed overview is depicted below. The bold arrows indicate a direct link between modules, either TCP/IP or an included DLL. All other links show which messages pass between modules; see Virtual Human Messaging for more details.

(diagram: detailed overview of the Toolkit architecture)
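As a rough sketch of this message-passing style, the snippet below dispatches on a message name such as vrSpeak (a message name that appears in the Toolkit). The payload layout and the handler behavior are invented for illustration and are not the Toolkit's actual format:

```python
# Hypothetical dispatcher for whitespace-delimited messages.
# "vrSpeak" is a Toolkit message name, but the payload fields and the
# handler shown here are illustrative assumptions only.
def parse_message(raw):
    """Split a raw message into its name and the remaining payload."""
    name, _, payload = raw.partition(" ")
    return name, payload

handlers = {}

def on(name):
    """Register a handler function for a given message name."""
    def register(fn):
        handlers[name] = fn
        return fn
    return register

@on("vrSpeak")
def handle_speak(payload):
    return f"animating: {payload}"

name, payload = parse_message("vrSpeak Brad ALL utt1 Hello there")
print(handlers[name](payload))  # animating: Brad ALL utt1 Hello there
```

A registry keyed by message name keeps each module ignorant of its peers: adding a new message type only means registering another handler.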

Data Models

Virtual Human Toolkit:

(data model diagram)

Individual modules:

(data model diagrams, one per module)