I spent several hours over the last few days implementing WASAPI audio output for NVDA for some reason. As I suspected, I don't think it's really any more responsive, but I'm hoping it might eventually fix some tricky bugs with the old WinMM implementation, though it'll probably introduce a bunch of its own. Still quite some way to go before it's fully featured; e.g. it doesn't support any device other than the default yet, nor can it recover if a device disappears. #NVDASR

in reply to Jamie Teh

Does it help with the crackling issues that occur in synths that send small chunks of audio?
in reply to Mohamed Al-Hajamy 💾

@MutedTrampet I'm not sure. I've only tested with eSpeak and OneCore so far. It probably won't cope with that yet, but it might be possible going forward. That said, I added a buffered option to the existing audio code years ago, which should have fixed this for such synths if they set it correctly.
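
For illustration, opting into that buffering looks roughly like this; the exact WavePlayer parameter names are from memory, so treat them as assumptions rather than verbatim NVDA code:

    # Rough sketch of the old nvwave buffered option (parameter names assumed).
    import nvwave

    player = nvwave.WavePlayer(
        channels=1,
        samplesPerSec=22050,
        bitsPerSample=16,
        buffered=True,  # coalesce small chunks instead of hitting the device per chunk
    )
    player.feed(b"\x00\x00" * 256)  # a tiny chunk; buffering smooths these out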
in reply to Jamie Teh

Even eSpeak does it, I think. Someone filed an issue with sentences and settings that would cause it to crackle.
in reply to Mohamed Al-Hajamy 💾

@MutedTrampet Ah, the crackling at end of lines bug? Yeah, this should fix that. I thought you were referring to a synth which consistently sends small chunks, rather than just at end of lines, etc.
in reply to Jamie Teh

@MutedTrampet The approach for indexing/callbacks is fundamentally different and doesn't require pushing smaller buffers to get reliable timing. However, synths that push really tiny chunks consistently might cause problems. I can probably deal with those, but I don't know of one right now, so I'm not sure if it's worth putting time into.
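
To sketch the difference (the onDone parameter here is an assumption about the new API, not a confirmed signature):

    # With callback-based indexing, a synth can push one large buffer and get a
    # callback when playback actually reaches it, instead of chopping audio into
    # tiny chunks so index timing can be inferred from buffer boundaries.
    import nvwave

    player = nvwave.WavePlayer(channels=1, samplesPerSec=22050, bitsPerSample=16)

    def onIndexReached():
        print("index reached")  # e.g. report progress back to the synth driver

    audio = b"\x00\x00" * 22050  # one second of 16 bit mono silence as stand-in audio
    player.feed(audio, onDone=onIndexReached)  # onDone is an assumed parameter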
in reply to Jamie Teh

@MutedTrampet eloquence_threshold supposedly ran into this. IBMTTS mitigates it by collecting the buffers into a bigger block, but that destroys indexing.
in reply to x0

@x0 @MutedTrampet Okay. No legal/current synth I can try though, which is going to make this difficult to test. :) Might have to simulate it by dropping eSpeak's buffer size or something.
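
Something like this, assuming the standard eSpeak NG C API (the DLL path is a placeholder):

    # espeak_Initialize's second argument is the internal buffer length in
    # milliseconds; shrinking it makes eSpeak deliver audio in much smaller
    # chunks, approximating a synth that consistently streams tiny buffers.
    import ctypes

    AUDIO_OUTPUT_SYNCHRONOUS = 2  # from the espeak_AUDIO_OUTPUT enum
    espeak = ctypes.cdll.LoadLibrary("espeak-ng.dll")  # placeholder path
    espeak.espeak_Initialize(AUDIO_OUTPUT_SYNCHRONOUS, 10, None, 0)  # 10 ms instead of e.g. 300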
in reply to Jamie Teh

Ooh, that's cool. Almost all the NVDA bugs I've seen are related to nvwave in one way or another; is this what you're replacing?
in reply to Quin

@TheQuinbox The internals of nvwave, yes. It'll still be nvwave for compatibility reasons, but the underlying implementation is entirely new. Most of the gruntwork is offloaded to C++ code.
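
Conceptually, the split looks something like this; the DLL and function names are invented for the sketch, not NVDA's actual exports:

    # Illustrative only: a thin Python shim in nvwave delegating the real work
    # (WASAPI setup, the render loop) to native code via ctypes.
    import ctypes

    wasapi = ctypes.windll.LoadLibrary("nvdaWasapi.dll")  # hypothetical DLL name
    wasapi.wasPlay_create.restype = ctypes.c_void_p  # opaque player handle

    class WasapiPlayer:
        def __init__(self, channels, samplesPerSec, bitsPerSample):
            # Hypothetical export: the native side creates the audio client
            # and render thread.
            self._handle = wasapi.wasPlay_create(channels, samplesPerSec, bitsPerSample)

        def feed(self, data):
            # Python just forwards buffers; mixing and timing happen in C++.
            wasapi.wasPlay_feed(self._handle, data, len(data))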
in reply to Jamie Teh

I almost wish I hadn't started this NVDA + WASAPI thing. I've spent many hours on it (713 lines of code so far), and now I probably won't be able to let it go until it's done. It works pretty well now, but there are still edge cases that need fixing; e.g. if you force a non-default device and then disconnect it mid-playback. Ugh.
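
The failure mode is roughly this; AUDCLNT_E_DEVICE_INVALIDATED is a real WASAPI HRESULT, but everything else here (how the error surfaces, the helper names) is assumed for the sketch:

    # Not NVDA's actual recovery code: when the forced device vanishes
    # mid-playback, render calls start failing and the player has to reopen
    # on the current default device.
    AUDCLNT_E_DEVICE_INVALIDATED = 0x88890004  # real constant from audioclient.h

    def feedWithFallback(player, makeDefaultPlayer, chunk):
        try:
            player.feed(chunk)
        except OSError as e:
            if (e.winerror or 0) & 0xFFFFFFFF != AUDCLNT_E_DEVICE_INVALIDATED:
                raise
            player.close()
            player = makeDefaultPlayer()  # fall back to the default device
            player.feed(chunk)
        return player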
in reply to Jamie Teh

I'm also providing an option to allow NVDA synth drivers to pass raw memory pointers for audio data instead of converting to a Python bytes object; the conversion means a lot of unnecessary memory copying and overhead when the ultimate audio buffer is just raw memory (no Python objects) anyway. I've already updated eSpeak and OneCore, and it works quite nicely, though I don't really notice a difference on my system.
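
For example (the pointer-accepting feed signature is an assumption):

    # Feeding a raw pointer plus an explicit size, so audio that already lives
    # in native memory never has to become a Python bytes object first.
    import ctypes
    import nvwave

    player = nvwave.WavePlayer(channels=1, samplesPerSec=22050, bitsPerSample=16)

    nativeBuf = (ctypes.c_char * 4096)()  # stand-in for the synth's own buffer
    player.feed(ctypes.cast(nativeBuf, ctypes.c_void_p), size=4096)  # no copy into bytes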
in reply to Jamie Teh

I love the initiative! I assume it will be API-breaking, or can that be avoided?
in reply to Leonard de Ruijter

@leonardder Currently, it's API-breaking for synths that choose to use it, in that they won't work with the old nvwave. Synths using the old method will still work with the new nvwave, though. However, I realised there's a way to implement the raw pointer thing with the old nvwave: it can just convert to a Python bytes buffer on the fly.
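
That shim is essentially one call; ctypes.string_at is the real Python API for this, and the wrapper function around it is just illustration:

    # ctypes.string_at copies `size` bytes from a raw address into a Python
    # bytes object, letting the old nvwave accept pointer-style arguments.
    import ctypes

    def feedCompat(player, data, size=None):
        if not isinstance(data, bytes):
            data = ctypes.string_at(data, size)  # materialise bytes on the fly
        player.feed(data)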
in reply to Charli Jo

@CharliJo I'm rewriting NVDA's audio output code to use a more modern Windows framework. Hopefully it will improve audio stability a bit, though the advantages probably aren't noticeable for most people.