Voice Interaction System 'MMDAgent' Makes Popular 3D Character Available for Lively Chat

Company: Nagoya Institute of Technology
Published: 25 September, 2010
The Speech Processing Laboratory of the Nagoya Institute of Technology has produced a prototype of "MMDAgent," a software toolkit for building voice interaction systems that run on PCs. A system built with "MMDAgent" lets a PC user chat in a lively way with a digital 3D character shown on the screen, as if the character were a real person. The toolkit combines several component technologies, including voice recognition, speech synthesis, 3D character display and voice interaction control.

Its features include the following:

(1) "MMDAgent" tightly integrates state-of-the-art speech synthesis and voice recognition technologies developed in-house. It combines the speech synthesis toolkit HTS (HMM-based Speech Synthesis System), which the laboratory has developed over many years and made public, with the voice recognition engine Julius, achieving fast, accurate and expressive conversation.

(2) The toolkit has advanced OpenGL-based 3D character rendering. It achieves true 3D rendering using toon rendering and shadow mapping, and uses a physics engine to produce realistic expression.

(3) The toolkit's voice interaction control component lets a user without expert knowledge describe detailed, rich voice dialogue scenarios that respond to various internal states and external events, including voice input.

(4) "MMDAgent" is planned for release as open-source software. Because the formats of its models and other data follow open specifications, users can customize not only 3D character models, motions and voices but also entire dialogue scenarios, or use existing models and data.
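
As a rough illustration of feature (3), a dialogue scenario that reacts to voice input and other events can be pictured as a table of state transitions. The Python sketch below is a hypothetical example only and does not show MMDAgent's actual scenario format or API: the state names, the event and action strings, and the `step` and `run` helpers are all assumptions made for illustration.

```python
# Hypothetical illustration only: not MMDAgent's scenario format or API.
from typing import Dict, List, Tuple

# A transition maps (current state, input event) to (next state, output actions).
Transition = Tuple[str, List[str]]

SCENARIO: Dict[Tuple[str, str], Transition] = {
    # External event: the recognizer heard a greeting.
    ("idle", "heard:hello"): ("greeting", ["say:Hello!", "motion:bow"]),
    # Internal event: the synthesized reply has finished playing.
    ("greeting", "speech_finished"): ("idle", []),
    # Internal event: a timer fired while nothing else was happening.
    ("idle", "timer:30s"): ("idle", ["motion:look_around"]),
}


def step(state: str, event: str) -> Transition:
    """Return the next state and actions; unknown events leave the state unchanged."""
    return SCENARIO.get((state, event), (state, []))


def run(events: List[str], state: str = "idle") -> Tuple[str, List[str]]:
    """Feed a sequence of events through the scenario and collect all actions."""
    actions: List[str] = []
    for event in events:
        state, produced = step(state, event)
        actions.extend(produced)
    return state, actions


if __name__ == "__main__":
    final_state, actions = run(["heard:hello", "speech_finished", "timer:30s"])
    print(final_state, actions)
```

In a table of this kind, changing the character's behaviour means editing entries rather than program logic, which is one way a scenario could be written without expert knowledge.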

"MMDAgent" will be exhibited at CEATEC Japan 2010, which opens on October 5 at Makuhari Messe, as a system built into a large display for life-size digital signage. Visitors to the exhibition will be able to enjoy a lively conversation with a popular 3D character.

The Speech Processing Laboratory is a special project lab set up on the campus of the Nagoya Institute of Technology for research on international voice language processing, with the Tokuda & Lee Laboratory playing a leading role. It has actively released its advanced research results on voice technologies as open-source software.