eurpod.com/synths/nvSpeechPlay…
The big switch is that it is no longer a sawtooth wave. Instead, it now uses an asymmetric cosine glottal-flow pulse (a pitch-synchronous "glottal pulse train"): discrete glottal flow pulses rather than a continuous oscillator shape like triangle, saw, or square. This has allowed us to achieve a much smoother voice with clearer consonants, while keeping the familiar character of the voice people know.
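To make the difference concrete, here is a rough sketch of the two source shapes. It is not the project's actual code: the pulse below is a generic Rosenberg-style raised-cosine pulse, and the names `glottal_pulse`, `sawtooth`, `render`, `open_frac`, and `close_frac`, along with their default values, are assumptions chosen for illustration only.

```python
import math


def glottal_pulse(phase, open_frac=0.4, close_frac=0.16):
    """Glottal flow for one pitch period, with phase in [0, 1).

    Assumed shape (Rosenberg-style): a slow raised-cosine opening,
    a faster cosine closing, then silence while the glottis is closed.
    The asymmetry comes from close_frac being shorter than open_frac.
    """
    if phase < open_frac:
        # Opening phase: flow rises smoothly from 0 to 1.
        return 0.5 * (1.0 - math.cos(math.pi * phase / open_frac))
    if phase < open_frac + close_frac:
        # Closing phase: flow falls from 1 back to 0 more quickly.
        return math.cos(0.5 * math.pi * (phase - open_frac) / close_frac)
    # Closed phase: no flow until the next pulse starts.
    return 0.0


def sawtooth(phase):
    """Classic sawtooth oscillator: a ramp from -1 to 1 with a hard reset."""
    return 2.0 * phase - 1.0


def render(shape, f0_hz, sample_rate, n_samples):
    """Render a pitch-synchronous train by restarting the shape every period."""
    out = []
    phase = 0.0
    step = f0_hz / sample_rate  # fraction of a pitch period per sample
    for _ in range(n_samples):
        out.append(shape(phase))
        phase = (phase + step) % 1.0
    return out


def max_jump(samples):
    """Largest sample-to-sample change, a crude measure of harshness."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


pulses = render(glottal_pulse, 100.0, 16000, 1600)
saw = render(sawtooth, 100.0, 16000, 1600)
```

Comparing `max_jump(pulses)` with `max_jump(saw)` shows the point: the sawtooth has a hard discontinuity at every period, while the pulse train returns to zero smoothly and stays there for the closed part of each cycle.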
🇨🇦Samuel Proulx🇨🇦
in reply to James Scholes • • •
@jscholes @cachondo @Tamasg @ZBennoui The other thing that makes this super, super hard is that there are like nine different systems, and all of them need tuning. And it's impossible to ask people who haven't spent pretty much four days straight thinking exclusively about this for feedback on a particular system, because they all work together to make up the voice, and you can't know where any given issue comes from. There are the rules for going from text to IPA phonemes. Then the rules for determining how IPA phonemes are actually voiced and fit together. And then there's the intonation table. And then there are the two systems that actually make the sound. Right now I'm mostly looking at the system that actually makes the sound, i.e. when you do "aaaaaaaaaaaaa" or "eeeeeeeeeeeeeeeee", because that's still not right. But because it's an entire voice, it's hard even for me to separate my own perceptions and fix anything.
The important thing to remember is that Eloquence began development in 1982, with a team of about a dozen researchers. It wasn't in the state we know it until around 2002. We have existing research to build on, but no funding, fewer people, and no PhD-level speech researchers. So actually doing this, even with the help of AI, is a 20-30 year project before we get close to Eloquence levels. Because we have something that "works" and improves step by step, it's easy to lose sight of the size of the problem we're taking on; it feels like we should be able to get there in a month or two. But that's not realistic.
Tamas G
in reply to 🇨🇦Samuel Proulx🇨🇦 • • •
They're so, so critical, and part of my frustration is this fork-in-the-road feeling about all that. In that sense, the phoneme tuning is in some ways the easier part compared with the linguistics of it all. The blame isn't 100% eSpeak, but it's not 100% phoneme tuning either.
James Scholes
in reply to Cleverson • • •
It really does not mean that. If someone doesn't like your work, it often just means you haven't yet landed on a form they find pleasing or useful.
And that might be fine if, say, you're building a command line app and the only thing a given user would find acceptable is a graphical one. If you have no plans to build a graphical one, they're shit out of luck and you have to move on.
Other times, though, it does point to a real problem in the thing you're building that you should legitimately keep trying to solve. Dismissing everybody who expresses negativity outright only leads to an echo chamber.
@fastfinge @cachondo @Tamasg @ZBennoui
James Scholes
in reply to 🇨🇦Samuel Proulx🇨🇦 • • •
@cachondo @ZBennoui This is all fair and, to me at least, interesting insight. But part of embarking upon such a huge project is dealing with feedback from people on what is ultimately seen as a user-facing, homogeneous blob.
Many such people won't understand or, if we're being honest, care about how the sausage is being made. Particularly if the changes between builds are too subtle for most of them to perceive.
Regardless, I'm glad this is being worked on and deeply appreciative of the effort.