Hello everyone,
time for another of my crazy ideas!
While attending theoretical mathematics classes, I usually face three problems:
1. I can't write down notes and pay attention at the same time.
2. Sometimes I don't get the context of an explained concept right away. I need a few moments to think it through, or even to look up additional details in my notes or on the Internet. So I either skip the lookup and end up sitting in class unable to understand anything, because that concept was important for later topics, or I do the lookup asynchronously, which means I fall out of sync with the explanation and end up in the same situation, except now I can't do much about it.
3. If the class requires active work, my mind gets submerged in the problem and can't track anything in the physical world, resulting in shattered context and missed information.
Recording classes can fix all of these issues, but at the cost of doubling the processing time for each class: raw recordings hold no information about their content, so they have to be listened to in full to produce good-quality notes.
Semantic audio
SDAM lets you capture recordings with assigned meaning. In the simplest usage, you just start the recording and add a mark whenever something is said that you will want to write down later. When the class is over, you return to those marks and quickly create the notes, and you can be sure you have covered everything important without going through the whole recording again. The marks also serve as reference points: if you need to go back to the part of the class dealing with a particular topic, because you feel you may have missed something or just want to hear it again, you can get to the relevant part in a few clicks.
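To make the idea of marks as reference points concrete, here is a minimal sketch in Python. It is not SDAM's actual data model; the `Recording` class and its method names are hypothetical and only assume that each mark is a timestamp in seconds with an optional label.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Recording:
    """Hypothetical model: a recording annotated with time-stamped marks."""

    # Each mark is (position in seconds, label); the list is kept sorted.
    marks: list = field(default_factory=list)

    def add_mark(self, seconds: float, label: str = "") -> None:
        """Store a mark and keep the list ordered by time."""
        self.marks.append((seconds, label))
        self.marks.sort()

    def previous_mark(self, position: float):
        """Return the latest mark at or before `position`, or None."""
        times = [t for t, _ in self.marks]
        index = bisect_right(times, position)
        return self.marks[index - 1] if index else None
```

Jumping back to a topic then amounts to looking up the nearest mark before the current position and seeking there.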
Time travel
However, SDAM also offers a different operation mode. If you have headphones with active noise cancellation, you can use it to travel in time during the class. After activating this function, the program works in an augmented reality mode, where you can hear what's happening around you. If you don't get something, need to research it, or simply mishear something, there's nothing simpler than pausing time or rewinding it. You get to repeat past events without missing anything that happens in the meantime, because everything is being recorded for you in the background. When you're done, you can simply continue listening to the class as it happened while you were dealing with other things, or even double or triple the playback speed to get back in sync.
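A quick way to reason about getting back in sync: while you replay at a faster speed, the class keeps going at normal speed, so the gap closes at (speed - 1) seconds per second of listening. The sketch below works out that arithmetic; the function name is hypothetical and is not part of SDAM.

```python
def catch_up_seconds(lag: float, speed: float) -> float:
    """Real time needed to close a gap of `lag` seconds at `speed`x playback.

    The live class advances 1 second per second while playback advances
    `speed` seconds per second, so the gap shrinks by (speed - 1) each second.
    """
    if speed <= 1.0:
        raise ValueError("playback must be faster than real time to catch up")
    return lag / (speed - 1.0)
```

So after a 60-second pause, double speed needs another 60 seconds to catch up, while triple speed needs 30.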
The program also has a built-in notepad, which you can use for note-taking, calculations and other text work.
1/2
Musharraf
in reply to Musharraf • • •
Tamas G
in reply to Musharraf • • •
Musharraf
in reply to Tamas G • • •If your existing voices are trained with Piper, then they'll work with this version.
If they fail to work for any reason, you can copy the config from any working voice to your voice and edit the relevant values.
Tamas G
in reply to Musharraf • • •
Devin Prater :blind:
in reply to Tamas G • • •
Musharraf
in reply to Tamas G • • •@Tamasg
If you have the original checkpoint, you can convert it to the new format.
Take a look at this script, which I used to export Piper's checkpoints:
github.com/mush42/piper-rt-mak…
I need to update the docs, and add a section on training voices.
piper-rt-maker/tasks.py at main · mush42/piper-rt-maker
GitHub
Tamas G
in reply to Musharraf • • •
Musharraf
in reply to Tamas G • • •Here are the steps to convert the checkpoint to the fast format:
# Clone piper fork containing export code
git clone https://github.com/mush42/piper
cd ./piper
# Checkout streaming branch
git checkout streaming
# The training code lives under src/python, relative to the repository root
cd ./src/python
pip3 install -r requirements.txt
# Upgrade torch
pip3 install --upgrade torch pytorch-lightning onnx
source ./build_monotonic_align.sh
# Export. Edit paths
python3 -m piper_train.export_onnx_streaming --debug [checkpoint path] [export directory]
GitHub - mush42/piper: A fast, local neural text to speech system
GitHub
Tamas G
in reply to Musharraf • • •
Tamas G
in reply to Musharraf • • •
Musharraf
in reply to Tamas G • • •I didn't edit that script, it came from piper repo.
Anyway, it does not affect the installation. I encountered it myself when exporting voices.
Tamas G
in reply to Musharraf • • •
Tamas G
in reply to Musharraf • • •
Tom Grant
in reply to Tamas G • • •
Musharraf
in reply to Musharraf • • •After installing this version, you will lose all of your installed voices. Please use the voice manager to re-install them.
Aryan
in reply to Musharraf • • •
Nick Giannak III
in reply to Musharraf • • •
Musharraf
in reply to Nick Giannak III • • •A dataset designed specifically for screen reader usage goes a long way toward creating a good-quality voice.
If guidelines are the issue, we can come up with a set based on the openly available Microsoft and Google guidelines.
Nick Giannak III
in reply to Musharraf • • •
Timothy Wynn
in reply to Musharraf • • •AttributeError: 'WinmmWavePlayer' object has no attribute 'setVolume'
Pratik Patel
in reply to Timothy Wynn • • •
Musharraf
in reply to Pratik Patel • • •Which voices?
Custom voices or the ones downloaded from the voice manager?
I'd appreciate it if you can provide NVDA logs.
Pratik Patel
in reply to Musharraf • • •
Pratik Patel
in reply to Musharraf • • •After updating to the latest beta, the issue I reported still exists. I removed all voices, uninstalled the add-on, reinstalled it, and added voices again. Here's a link to the log.
dropbox.com/scl/fi/914e332qia2…
Piper.log
Dropbox
Musharraf
in reply to Pratik Patel • • •It seems like the server is not running.
Are you running NVDA on a 32-bit or ARM64 machine? Sonata only works on 64-bit versions of Windows.
Otherwise, check if the server generated any logs in the following file path:
[NVDA config directory]\sonata\logs\sonata-grpc.log
If not, try running the following binary from a cmd window and report the output:
[NVDA config directory]\addons\sonata_neural_voices\synthDrivers\sonata_neural_voices\bin\sonata-grpc.exe
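The log check described above can also be scripted. The sketch below only assumes the log location given in the post; the function names are hypothetical, and the NVDA config directory has to be supplied by the caller because it differs between installations.

```python
from pathlib import Path


def sonata_log_path(nvda_config_dir):
    """Build the server log path named in the post, under the given config directory."""
    return Path(nvda_config_dir) / "sonata" / "logs" / "sonata-grpc.log"


def read_log_tail(nvda_config_dir, max_lines=20):
    """Return the last `max_lines` lines of the server log, or None if no log exists."""
    log_path = sonata_log_path(nvda_config_dir)
    if not log_path.is_file():
        return None  # No log: the server probably never started
    lines = log_path.read_text(encoding="utf-8", errors="replace").splitlines()
    return lines[-max_lines:]
```

A `None` result corresponds to the "log file is not generated" case, in which running the binary by hand is the next step.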
Pratik Patel
in reply to Musharraf • • •Thanks for trying to troubleshoot this. I'm running this on 64-bit Windows on an Intel machine, not ARM. The log file is not generated. Trying to run sonata-grpc.exe from the bin directory results in the following message:
The term 'sonata-grpc.exe' is not recognized as the name of a cmdlet, function, script file, or operable program.
Pratik Patel
in reply to Musharraf • • •I ran it as "./sonata-grpc.exe" and it gave me
"Starting sonata-grpc serverr at 127.0.0.1:49314"
Musharraf
in reply to Pratik Patel • • •Please send me the NVDA log so I can diagnose why the TTS server isn't running.
Pratik Patel
in reply to Musharraf • • •Here is the most recent log.
dropbox.com/scl/fi/h3bfsprt1q5…
Piper2.log
Dropbox
Pratik Patel
in reply to Musharraf • • •
Peter Vágner
in reply to Musharraf • •Thanks for all the fantastic work you are putting into this.
Musharraf
in reply to Peter Vágner • • •Here's how to build the sonata-grpc binary:
git clone https://github.com/mush42/sonata
cd ./sonata/sonata-grpc
# With Rust installed
cargo build --release
GitHub - mush42/sonata: A cross-platform engine for neural TTS models.
GitHub
Musharraf
in reply to Peter Vágner • • •If you just want to set the eSpeak-ng data directory, you don't need to re-build the binary.
Just set the following environment variable before launching sonata-grpc:
SONATA_ESPEAKNG_DATA_DIRECTORY=[your custom espeak-data directory parent]
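If you start the server from a script, the same variable can be set just for the child process. This is a minimal sketch: the variable name comes from the post above, while the helper names and any paths you pass in are hypothetical placeholders.

```python
import os
import subprocess


def build_env(espeak_data_parent):
    """Copy the current environment and add the eSpeak-ng data directory variable."""
    env = os.environ.copy()
    env["SONATA_ESPEAKNG_DATA_DIRECTORY"] = espeak_data_parent
    return env


def launch_sonata_grpc(binary_path, espeak_data_parent):
    """Start sonata-grpc with the variable set for that process only."""
    return subprocess.Popen([binary_path], env=build_env(espeak_data_parent))
```

Because the environment is copied, the variable is visible to sonata-grpc but the environment of the calling script stays unchanged.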
Peter Vágner
in reply to Musharraf • •
Tom Grant
Unknown parent • • •
Tamas G
Unknown parent • • •
Tamas G
Unknown parent • • •
Tamas G
Unknown parent • • •
Andre Louis
in reply to Tamas G • • •Just downloaded this myself. Have you updated Keynote in recent times to take advantage of this new AddOn? That's one I'm very keen on trying now that my machine can handle them again, after the addon rewrite. Thanks.
@fireborn @TomGrant91 @mush42
Tamas G
in reply to Andre Louis • • •
Andre Louis
in reply to Tamas G • • •
Tamas G
in reply to Andre Louis • • •
JamminJerry
in reply to Tamas G • • •
Andre Louis
in reply to Tamas G • • •
Musharraf
in reply to JamminJerry • • •If you provide the logs, I'll be able to diagnose the issue.
JamminJerry
in reply to Musharraf • • •
JamminJerry
in reply to Musharraf • • •
JamminJerry
in reply to Musharraf • • •
Musharraf
in reply to JamminJerry • • •Just send the NVDA log.
JamminJerry
in reply to Musharraf • • •
JamminJerry
in reply to Musharraf • • •
Musharraf
in reply to JamminJerry • • •An easier way is to press insert+F1, then select all and copy.
You can paste it in a plain text file, save and send it.
JamminJerry
in reply to Musharraf • • •dropbox.com/scl/fi/66sx9tsqvxl…
Andre Louis
in reply to Musharraf • • •