Since this does not seem to be widely known: the RHVoice speech synthesizer engine lets you train your own voice models by pairing a set of sentences in a text file with their corresponding recordings as WAV files. There is a tutorial about this on the project's wiki at:
github.com/RHVoice/RHVoice/wik….
I haven't tried it myself yet, so I can't answer questions about it, but others have done it, so it does work.
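To give a sense of the data preparation involved, here is a minimal sketch that checks a sentences file against a folder of recordings. The file names, numbering scheme and layout are assumptions for illustration only; the exact convention is described in the wiki tutorial.

```python
# Minimal sketch: sanity-check that every sentence in the prompt file has a
# matching recording. File names and the numbering scheme are illustrative,
# not the exact convention from the RHVoice tutorial.
from pathlib import Path

def check_dataset(sentences_file: str, wav_dir: str) -> None:
    sentences = [line.strip()
                 for line in Path(sentences_file).read_text(encoding="utf-8").splitlines()
                 if line.strip()]
    wavs = sorted(Path(wav_dir).glob("*.wav"))
    print(f"{len(sentences)} sentences, {len(wavs)} recordings")
    if len(sentences) != len(wavs):
        print("Mismatch: every sentence needs exactly one recording.")
    for i, sentence in enumerate(sentences, start=1):
        expected = Path(wav_dir) / f"{i:03d}.wav"  # assumed numbering scheme
        if not expected.exists():
            print(f"Missing recording for sentence {i}: {sentence[:40]}...")

if __name__ == "__main__":
    check_dataset("sentences.txt", "recordings")
```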
Happy hacking, and have fun creating SAPI, NVDA, Android voices or whatever else you imagine! #SpeechSynthesis #Accessibility #NVDASR #Blind

Paweł Masarczyk

@bluespacedragon Yes, recording conditions as close as possible to a recording studio tend to give the best results. A good-quality microphone will certainly boost the quality. The most important thing is to avoid echo, background noise and the like. There are probably other things to watch out for, but I don't have all the specifics at hand at the moment; I can ask, though. The good thing is that you can start with a small subset of sentences, around 50-100, see how the overall result is shaping up, and then either fix what you find annoying or carry on as you are. Also note that training works against the currently supported linguistic models, e.g. English. If you would like a new language to be supported, a new linguistic model has to be developed for it, which takes a lot of work and expertise from what I've heard. Good luck!
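If you want to sanity-check your recordings before training on that small subset, a quick sketch like the one below can catch mixed sample rates and obvious clipping. The directory name and thresholds are assumptions for illustration, not RHVoice requirements.

```python
# Illustrative pre-training check: consistent sample rate and no clipping
# in the recordings. Thresholds and paths are assumptions, not RHVoice rules.
import wave
from pathlib import Path

def inspect_recordings(wav_dir: str) -> None:
    rates = set()
    for path in sorted(Path(wav_dir).glob("*.wav")):
        with wave.open(str(path), "rb") as wf:
            rates.add(wf.getframerate())
            sampwidth = wf.getsampwidth()
            frames = wf.readframes(wf.getnframes())
        if sampwidth == 2 and frames:  # 16-bit PCM
            peak = max(abs(int.from_bytes(frames[i:i + 2], "little", signed=True))
                       for i in range(0, len(frames), 2))
            if peak >= 32767:
                print(f"{path.name}: possible clipping (peak at full scale)")
    if len(rates) > 1:
        print(f"Warning: mixed sample rates {rates}; keep all recordings at one rate.")

if __name__ == "__main__":
    inspect_recordings("recordings")
```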