Developing a new generation of
text-to-speech technology with
unprecedented flexibility

Flexible and expressive TTS

Synfonica is developing a knowledge-based text-to-speech (TTS) system called Synfony that uses formant synthesis for its speech output and can produce American English speech in a number of expressive speech modes and styles. These currently include conversational, reading, sadness, elation, cold anger, and hot anger. Formant synthesis offers a degree of control over the speech output not afforded by any other type of synthesis technology. Synfony capitalizes on this flexibility while at the same time offering substantially improved naturalness over existing formant synthesis systems. (The work on expressive speech was funded in part by Small Business Innovation Research grant R44DC014173 from the National Institute on Deafness and Other Communication Disorders of the National Institutes of Health.)

Synfony's flexibility and naturalness make it well-suited to virtually any application requiring unlimited vocabulary speech output, including embedded applications. Our initial focus is on specialized applications for assistive technology, particularly applications for blind individuals. At the same time, we are developing a software development kit (SDK) for more general speech synthesis applications. You can read more about our approach and its evolution on our Technology page.

Our synthesis models build on linguistic and perceptual speech models developed by Dr. Susan Hertz and her collaborators over more than forty years. You can read about our current team on our About Us page.

Personalized synthetic voices

Together with Nemours Children's Health, we are marrying the knowledge-based technology underlying Synfony with the machine-learning technology underlying ModelTalker, a pioneering voice-banking service developed at Nemours, with the aim of developing improved TTS services for individuals who can't speak. Among other things, Synfony's algorithms for generating natural-sounding prosody in different modes and styles will be integrated into ModelTalker's machine-learning algorithms, creating a hybrid system that embraces the best of both approaches.

The TTS system resulting from this project will require just a few minutes of recorded speech from each voice banker, will accurately capture that person's vocal identity, and will be structured so that any new expressive modes and speech styles available in Synfony can be added without additional recording. (This work has been funded in part by Small Business Technology Transfer grant R41DC020693 from the National Institute on Deafness and Other Communication Disorders of the National Institutes of Health.)

[Image: child using tablet]

Opportunity for Speech-Language Pathologists

We are seeking SLPs interested in testing EnunciAid, a proprietary speech therapy tablet app currently in development.

Our application has unique capabilities that may be of substantial benefit in helping your clients improve their speech. Working with us would give you the opportunity to help shape the product as it develops.

Please contact us if you are interested in learning more about this opportunity.