Comparison of Speaker Adaptation Techniques for
Average-Voice-Based Speech Synthesis

Demonstrations

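六百人のお客さんの人いきれにむし暑くて扇子を使わずにいられない。
(English: "With six hundred guests packed in, the air is so stuffy and humid that I cannot help using my fan.")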

Speakers: MHT, MTK, MMI, FTK
Target speech (Analysis-by-Synthesis)
 
Speaker-Dependent (SD) model
 
Average Voice Model: Gender-Dependent Model (Male), Gender-Independent Model, Gender-Dependent Model (Female)
     
SBR [M. Rahim et al ‘96]
 
AMCC [K. Shinoda et al ‘95]
 
SMAP [K. Shinoda et al ‘01]
 
MLLR [C.J. Leggetter et al ‘95]
 
CMLLR [V. Digalakis et al ‘95] [M.J.F. Gales ‘98]
 
SMAPLR [O. Siohan et al ‘02]
 
CSMAPLR [Y. Nakano et al ‘06]
 
CSMAPLR+MAP [V. Digalakis et al ‘96]
 

Reference:

Yuji Nakano, Makoto Tachibana, Junichi Yamagishi, Takao Kobayashi
"Constrained structural maximum a posteriori linear regression for average-voice-based speech synthesis,"
Proc. 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP, pp. 2286-2289, Pittsburgh, USA (2006.09)

