Animated pronunciation generated from speech for pronunciation training

Yurie Iribe, Silasak Manosavan, Kouichi Katsurada, Tsuneo Nitta

Research output: Chapter in Book/Report/Conference proceeding; Chapter; peer-review

Abstract

Computer-assisted pronunciation training (CAPT) has been introduced into language education in recent years. CAPT uses speech recognition technology to score the learner's pronunciation quality and point out mispronounced phonemes. However, although learners can thus recognize that their speech differs from the teacher's, they still cannot control their articulatory organs to pronounce correctly, and they cannot understand precisely how to correct the wrong articulatory gestures. We show these differences by visualizing both the learner's incorrect pronunciation movements and the correct pronunciation movements with CG animation. We propose a system that generates animated pronunciation by automatically estimating a learner's pronunciation movements from his/her speech. The proposed system uses multi-layer neural networks (MLN) to map speech to the coordinate values needed to generate the animations. We use MRI data to generate smooth animated pronunciations. Additionally, we verify through experimental evaluation whether the vocal tract area and articulatory features are suitable as characteristics of pronunciation movement.

Original language: English
Title of host publication: Intelligent Interactive Multimedia
Subtitle of host publication: Systems and Services: Proceedings of the 5th International Conference on Intelligent Interactive Multimedia Systems and Services (IIMSS 2012)
Editors: Jain Lakhmi, Howlett Robert, Watada Junzo, Watanabe Toyohide, Takahashi Naohisa
Pages: 73-82
Number of pages: 10
Publication status: Published - 2012

Publication series

Name: Smart Innovation, Systems and Technologies
Volume: 14
ISSN (Print): 2190-3018
ISSN (Electronic): 2190-3026

Keywords

  • Animated Pronunciation
  • Articulatory Feature
  • Pronunciation Training
  • Vocal Tract Area
