Since this does not seem to be widely known: the RHVoice speech synthesizer engine lets you train your own voice models by pairing a set of sentences in a text file with their corresponding recordings as WAV files. There is a tutorial about this on the project's wiki at:
github.com/RHVoice/RHVoice/wik….
I haven't tried it myself yet, so I can't answer questions about it, but others have done it, so it does work.
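To give a sense of the data preparation involved, here is a minimal sketch that checks a sentences file against a folder of recordings. The file names, numbering scheme and layout are assumptions for illustration only; the exact convention is described in the wiki tutorial.

```python
# Minimal sketch: sanity-check that every sentence in the prompt file has a
# matching recording. File names and the numbering scheme are illustrative,
# not the exact convention from the RHVoice tutorial.
from pathlib import Path

def check_dataset(sentences_file: str, wav_dir: str) -> None:
    sentences = [line.strip()
                 for line in Path(sentences_file).read_text(encoding="utf-8").splitlines()
                 if line.strip()]
    wavs = sorted(Path(wav_dir).glob("*.wav"))
    print(f"{len(sentences)} sentences, {len(wavs)} recordings")
    if len(sentences) != len(wavs):
        print("Mismatch: every sentence needs exactly one recording.")
    for i, sentence in enumerate(sentences, start=1):
        expected = Path(wav_dir) / f"{i:03d}.wav"  # assumed numbering scheme
        if not expected.exists():
            print(f"Missing recording for sentence {i}: {sentence[:40]}...")

if __name__ == "__main__":
    check_dataset("sentences.txt", "recordings")
```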
Happy hacking, and have fun creating SAPI, NVDA, Android voices or whatever else you imagine! #SpeechSynthesis #Accessibility #NVDASR #Blind

Paweł Masarczyk

@bluespacedragon Yes, recording conditions as close as possible to a recording studio tend to give the best results. A good-quality microphone will certainly boost the quality. The most important thing is to avoid echo, background noise and the like. There are probably other things to watch out for, but I don't have all the specifics at hand at the moment; I can ask, though. The good thing is that you can start with a small subset of sentences, around 50-100, see how the overall result is shaping up, and then either fix what you find annoying or carry on as you are. Also note that training works against the currently supported linguistic models, e.g. English. If you would like a new language to be supported, a new linguistic model has to be developed for it, which takes a lot of work and expertise from what I've heard. Good luck!
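If you want to sanity-check your recordings before training on that small subset, a quick sketch like the one below can catch mixed sample rates and obvious clipping. The directory name and thresholds are assumptions for illustration, not RHVoice requirements.

```python
# Illustrative pre-training check: consistent sample rate and no clipping
# in the recordings. Thresholds and paths are assumptions, not RHVoice rules.
import wave
from pathlib import Path

def inspect_recordings(wav_dir: str) -> None:
    rates = set()
    for path in sorted(Path(wav_dir).glob("*.wav")):
        with wave.open(str(path), "rb") as wf:
            rates.add(wf.getframerate())
            sampwidth = wf.getsampwidth()
            frames = wf.readframes(wf.getnframes())
        if sampwidth == 2 and frames:  # 16-bit PCM
            peak = max(abs(int.from_bytes(frames[i:i + 2], "little", signed=True))
                       for i in range(0, len(frames), 2))
            if peak >= 32767:
                print(f"{path.name}: possible clipping (peak at full scale)")
    if len(rates) > 1:
        print(f"Warning: mixed sample rates {rates}; keep all recordings at one rate.")

if __name__ == "__main__":
    inspect_recordings("recordings")
```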