Synthesis of disordered voices

International Cooperation Program between CNPq (National Research Council of Brazil) and F.R.S.-FNRS (Fonds de la Recherche Scientifique, French-Speaking Community of Belgium), 2012–2014.

Jorge C. Lucero and Jean Schoentgen

The context of this project is the clinical assessment of voice. Auditory and acoustic assessment of voice (and, by extension, speech) is to laryngology and speech therapy what electrocardiography is to cardiology and electroencephalography is to neurology. That is, it reports on the function of the laryngeal oscillator and the adequacy of the vocal timbre it produces, using methods of investigation that are non-invasive and do not obstruct the patient’s speech production. Synthetic speech serves that purpose as a computational tool that facilitates tests and training, and that helps to explore and understand the genesis of abnormal vocal qualities.

[Figure: vocal tract]

A disordered voice is one that is perceived as abnormal with regard to pitch, loudness or timbre, often as the consequence of a laryngeal pathology or some physiological dysfunction. In this project, we develop speech synthesizers that can simulate the timbre of disordered voices with an acceptable level of naturalness. Two main strategies are followed: physics-based synthesis, which uses a dynamical model of the vocal folds coupled to a tube representation of the vocal tract, and formant synthesis, which uses a kinematic description of the glottal source.
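To make the second strategy more concrete, here is a minimal sketch of formant synthesis driven by a prescribed (kinematic) source. It is not the project's synthesizer: the source is reduced to a unit-impulse train with randomly perturbed cycle lengths (jitter), the vocal tract is a cascade of standard two-pole digital resonators, and all function names, formant frequencies, bandwidths and the jitter level are illustrative assumptions.

```python
import math
import random


def jittered_pulse_train(f0, jitter, duration, fs, seed=0):
    """Unit-impulse train whose cycle lengths vary randomly around 1/f0.

    `jitter` is the standard deviation of the cycle length relative to the
    mean period (0.0 gives a perfectly periodic source). Hypothetical helper.
    """
    rng = random.Random(seed)
    n_samples = int(duration * fs)
    source = [0.0] * n_samples
    t = 0.0
    while True:
        index = int(round(t * fs))
        if index >= n_samples:
            break
        source[index] = 1.0
        period = (1.0 + jitter * rng.gauss(0.0, 1.0)) / f0
        t += max(period, 0.5 / f0)  # keep cycles from collapsing
    return source


def resonator(x, freq, bandwidth, fs):
    """Two-pole resonator y[n] = a*x[n] + b*y[n-1] + c*y[n-2], unit gain at DC."""
    r = math.exp(-math.pi * bandwidth / fs)
    c = -r * r
    b = 2.0 * r * math.cos(2.0 * math.pi * freq / fs)
    a = 1.0 - b - c
    y1 = y2 = 0.0
    out = []
    for sample in x:
        y = a * sample + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(f0=120.0, jitter=0.02, duration=0.5, fs=16000,
                     formants=((700.0, 80.0), (1200.0, 90.0), (2600.0, 120.0))):
    """Pass the jittered source through a cascade of formant resonators.

    The (frequency, bandwidth) pairs in Hz are assumed, roughly /a/-like values.
    """
    signal = jittered_pulse_train(f0, jitter, duration, fs)
    for freq, bandwidth in formants:
        signal = resonator(signal, freq, bandwidth, fs)
    return signal


vowel = synthesize_vowel()  # list of float samples at 16 kHz
```

Raising `jitter` makes the cycle lengths more irregular, which is one simple way to move the output away from a perfectly periodic voice. A realistic synthesizer would replace the impulse train with a shaped glottal pulse and add further perturbations such as amplitude variation and aspiration noise.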

Listen to some samples produced by our physics-based synthesizer.

Publications

  1. J. C. Lucero, J. Schoentgen, M. Behlau & G. Madazio. “A vocal fold model for disordered voice synthesis,” 9th International Conference on Voice Physiology and Biomechanics (Salt Lake City, 2014). (Abstract and synthetic voice samples).

  2. J. Schoentgen & J. C. Lucero. “Solving the Riccati-Titze equation of the glottal airflow rate,” 9th International Conference on Voice Physiology and Biomechanics (Salt Lake City, 2014).

  3. J. Schoentgen & J. C. Lucero. “Is formant frequency jitter audible?” 9th International Conference on Voice Physiology and Biomechanics (Salt Lake City, 2014).

  4. J. Schoentgen, S. Fraj & J. C. Lucero. “Testing the reliability of Grade, Roughness and Breathiness scores by means of synthetic speech stimuli,” Logopedics, Phoniatrics, Vocology (in press).

  5. M. Behlau, G. Madazio, J. C. Lucero & J. Schoentgen. “A novo paradigma no ensino da avaliação auditiva da voz – Uso de amostras sintetizadas,” (in Portuguese) XXI Brazilian and II Ibero-American Congress of Phonoaudiology (Porto de Galinhas, 2013). Award for Excellence in Phonoaudiology.

  6. J. C. Lucero, J. Schoentgen & M. Behlau. “Physics-based synthesis of disordered voices,” Interspeech 2013 (Lyon, 2013). (PDF, synthetic samples).

  7. J. Schoentgen & J. C. Lucero. “Is jitter irregularity the missing link between measured speech cycle length jitter and perceived vocal timbre?” 10th Pan European Voice Conference - PEVOC (Prague, 2013).

  8. J. C. Lucero & J. Schoentgen. “Modeling vocal fold asymmetries with coupled van der Pol oscillators,” 21st International Congress on Acoustics (Montreal, 2013). (PDF).

  9. J. Schoentgen & J. C. Lucero. “Synthesis by rule of disordered voices,” in: T. Drugman & T. Dutoit (eds.), Advances in Nonlinear Speech Processing (Lecture Notes in Computer Science), Springer, 120-127 (2013). (PDF).