Look y'all, since a few of you have asked about NVDA 2026.1 and my speech synths: I'm not going to start work on this until at least the betas are out. @fastfinge has done phenomenal work getting this started, and his code helps me understand how such a thing would be done, but for one, I don't want the inter-process piece written in C++ but rather Python, sort of how he has it. Once #NVDASR develops their own adapter that developers can reuse for any DLL, just adapting function names or signatures to the workings of each speech synth, that process will become much, much easier. If that host process defined an API shape that existing speech synths just hook into, I'm all for it. Then I'll begin that work, but not until then. Thanks for getting it, or not.
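To make the "one adapter, any DLL" idea concrete, here's a minimal Python sketch of what such an API shape could look like. Everything here is hypothetical — the class name, the canonical operations, and the exported-function names are placeholders, not anything NVDA or #NVDASR actually ships. The point is just that each synth only supplies a name mapping, while the host process code stays identical:

```python
# Hypothetical sketch: a host-process adapter that wraps any speech-synth
# DLL behind one fixed API shape. Names below are illustrative only.
from typing import Any, Callable


class SynthDllAdapter:
    """Exposes one canonical set of operations for every synth DLL.

    Each synth supplies a mapping from the canonical operation names to
    that DLL's exported function names, so the host-process code never
    changes from synth to synth.
    """

    CANONICAL_OPS = ("speak", "stop", "set_rate")

    def __init__(self, dll: Any, name_map: dict[str, str]) -> None:
        # In practice `dll` would be something like ctypes.WinDLL("synth.dll");
        # anything exposing the mapped names as attributes works here.
        self._ops: dict[str, Callable] = {
            op: getattr(dll, name_map[op]) for op in self.CANONICAL_OPS
        }

    def speak(self, text: str) -> None:
        # Most C synth APIs take bytes, so encode before crossing the boundary.
        self._ops["speak"](text.encode("utf-8"))

    def stop(self) -> None:
        self._ops["stop"]()

    def set_rate(self, rate: int) -> None:
        self._ops["set_rate"](rate)
```

So porting, say, a hypothetical Eloquence wrapper would just mean passing `SynthDllAdapter(ctypes.WinDLL("eci.dll"), {"speak": "...", "stop": "...", "set_rate": "..."})` with that DLL's real export names — no per-synth host code.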
in reply to Luis Carlos

@luiscarlosgonzalez LOL, between Brailab, 3 versions of Eloquence, SoftVoice, SMPSoft, and Flexvoice, I'll have my hands busy redoing all of it. Not looking forward to that work; if there's one reason I'd get GPT 5 Pro again, that'll be it, because recoding so many synths by hand... Hopefully the other devs come out to do the work too, but that's never a 100% guarantee, and NVDA can't commit devs to that work, nor are those devs obligated to continue it after the initial release if they can't. That's just the nature of open-source stuff. I myself won't have time to pick it up, but I know @mush42 has been looking for new maintainers for his work. @fastfinge @ZBennoui
in reply to Luis Carlos

If you want to do that with gemini-cli, make sure to give it access to a directory containing the NVDA add-on developer guide and the API documentation for the synth you're coding, and tell it where the NVDA repository is. The prompt should name each file and say what it is. Then Gemini can read the documentation without searching the web, which will save you a lot of time and tokens. If you have any header files or other bindings, give it those too.
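A rough sketch of that setup, assuming a working directory you start the CLI from. The paths and file names are placeholders (`touch` stands in for copying your real documents); the prompt file shows the "name each file and say what it is" part:

```shell
set -eu

# Hypothetical layout: one docs directory the CLI session can read from.
mkdir -p synth-work/docs
# In practice you'd copy your real documents here; touch is a stand-in.
touch synth-work/docs/addon-dev-guide.md      # NVDA add-on developer guide
touch synth-work/docs/synth-api-reference.md  # API docs for your synth
touch synth-work/docs/synth.h                 # header / bindings, if any

# The prompt names each file and says what it is, plus where NVDA lives.
cat > synth-work/PROMPT.md <<'EOF'
Files available to you:
  docs/addon-dev-guide.md     - the NVDA add-on developer guide
  docs/synth-api-reference.md - API documentation for the synth I'm wrapping
  docs/synth.h                - C header for the synth DLL
The NVDA repository checkout lives at ../nvda.
Read these files instead of searching the web.
EOF
```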
in reply to Luis Carlos

Right, but Sherpa-onnx might replace that. Or the various Microsoft AI providers. We don't need to care about cross-platform for NVDA. And since so few people in this community have the expertise to write our own inference framework, I feel strongly that we really need to use something off the shelf. Otherwise, history will repeat itself, and the person working on the framework will get a job or otherwise be forced to drop the project. I can only think of four blind people who even know Rust, and I'm not one of them. And the others all have full-time jobs.
in reply to Luis Carlos

Exactly, that has already happened: I was forced to pay money to strip Camlorn's custom 3D audio library out of Unspoken, because he's not maintaining it anymore and he's the only one in our community with that particular set of skills. If we as blind people want to write software for ourselves that will last, we have to stop depending on custom low-level frameworks and make the things that already exist work for us.