Muslim mindset: “I’m fasting, don’t eat in front of me or I might be tempted.”
Christian: practices self-control and doesn’t make a public show of fasting.
Muslim man: sees a woman who isn’t fully covered and says, “Cover yourself or I’ll be tempted.”
Christian man: sees the same thing and says, “I need to guard my heart and discipline my eyes so I don’t sin.”
Christianity deals with the heart. We emphasise self-discipline and self-control. Islam, on the other hand, tries to control the environment instead, asking others to change because the individual hasn’t learned to master himself.
When the heart is truly transformed, temptation loses its power. Self-control means taking responsibility for your own desires, not placing the burden on others. A disciplined heart governs the flesh, not the other way around.
)
🇨🇦Samuel Proulx🇨🇦
in reply to Aaron • • •I can go into more detail about why all the options are bad if you want. But this is the sort of problem that eats years of your life, requires advanced mathematics (digital signal processing at a minimum), and advanced linguistics, on top of being a good systems-level programmer.
Sam's Stuff - The State of Modern AI Text To Speech Systems for Screen Reader Users
stuff.interfree.ca
Aaron
in reply to 🇨🇦Samuel Proulx🇨🇦 • • •@fastfinge I just so happen to be an (unemployed) machine learning researcher by trade, with advanced mathematics, linguistics, and programming skills. Maybe not systems-level programming, but I could probably find someone who does that and work with them.
Given that the first two responses I've gotten were both about accessibility, there might be more of a market for this than you think, and also, it might make a good way to demo my skills even if it isn't paid work.
🇨🇦Samuel Proulx🇨🇦
in reply to Aaron • • •The reason I say systems-level programming is mostly because for a text to speech system used by a blind power user, you need to keep an eye on performance. If the system crashes and the computer stops talking, the only choice the user has is to hard reset. It would be running and speaking the entire time the computer is in use, so memory leaks and other inefficiencies are going to add up extremely quickly.
From what I can tell, the ideal is some sort of formant-based vocal tract model. Espeak sort of does this, but only for the voiced sounds. Plosives are generated from modeling recorded speech, so they sound weird and overly harsh to most users, and I suspect this is where most of the complaints about espeak come from. A neural network or other sort of machine learning model could be useful to discover the best parameters and run the model, but not for generating audio itself, I don't think. This is because most modern LLM-based neural network models don't allow changing pitch, speed, etc., as all of that comes from the training data.
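To make the point about tunable parameters concrete, here is a minimal sketch of a formant-style source-filter synthesizer: an impulse train standing in for the glottal source, passed through a chain of two-pole resonators. The sample rate, the function names, and the formant values for an "ah"-like vowel are all assumptions for illustration, and this is not how espeak or Eloquence is actually implemented.

```python
import math

SAMPLE_RATE = 16000  # samples per second (assumed for this sketch)

def impulse_train(pitch_hz, duration_ms):
    """Crude glottal source: one unit impulse per pitch period."""
    n = SAMPLE_RATE * duration_ms // 1000
    period = max(1, round(SAMPLE_RATE / pitch_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]

def resonator(signal, freq_hz, bandwidth_hz):
    """Two-pole resonator: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)
    b = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / SAMPLE_RATE)
    c = -r * r
    a = 1.0 - b - c  # normalizes the gain to 1 at 0 Hz
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synth_vowel(pitch_hz, duration_ms, formants):
    """Run the source through one resonator per (frequency, bandwidth) pair."""
    signal = impulse_train(pitch_hz, duration_ms)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz)
    return signal

# Illustrative formant values only, not measured data.
AH_FORMANTS = [(700, 130), (1220, 70), (2600, 160)]

# Pitch and duration are plain arguments, independent of the vowel quality.
low_slow = synth_vowel(100, 300, AH_FORMANTS)
high_fast = synth_vowel(200, 150, AH_FORMANTS)
```

Because pitch, duration, and formants are separate inputs, any of them can be changed on the fly without retraining anything, which is the property the neural systems lack.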
Secondly, the phonemizer needs to be reproducible. What if, say, it mispronounces "Hermione"? With most modern text to speech systems, this is hard to fix; the output is not always the same for any given input. So a correction like "her my oh nee" might work in some circumstances, but not others, because how the model decides to pronounce words and where it puts the emphasis are just a black box. The state of the art here remains Eloquence, but it uses no machine learning at all, just hundreds of thousands of hand-coded rules and formants. And, of course, it's closed source (and as far as anyone can tell the source has actually been lost since the early 2000s), so goodness knows what all those rules are.
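A rough sketch of what "reproducible" means in practice: the phonemizer is a pure function, and a user dictionary entry wins every time the word appears, regardless of context. The respelling notation, the dictionary contents, and the placeholder `fallback_rules` function are hypothetical; a real system would put its letter-to-sound rules there.

```python
import re

# Hypothetical user corrections, keyed by lowercase word.
USER_DICTIONARY = {"hermione": "her-MY-oh-nee"}

def fallback_rules(word):
    """Stand-in for a real letter-to-sound rule set (an assumption here)."""
    return word.lower()

def phonemize(text, user_dictionary=USER_DICTIONARY):
    """Deterministic: the same text always yields the same output."""
    tokens = re.findall(r"[A-Za-z']+", text)
    return [user_dictionary.get(t.lower(), fallback_rules(t)) for t in tokens]

first = phonemize("Hermione said hello")
second = phonemize("Ask HERMIONE.")
```

There is no sampling and no hidden state, so a correction that works once works everywhere.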
Aaron
in reply to 🇨🇦Samuel Proulx🇨🇦 • • •@fastfinge Reading your linked article and this reply, I get the sneaking suspicion that HDC (hyperdimensional computing) or other one- or few-shot learning methods that are designed to factor the model into independent components that can be quickly recomposed in new ways might be appropriate. The idea would be to, as you suggest, learn the values for these components using machine learning, but also the mapping between them and the sounds produced, so that each becomes separately tunable on the fly.
HDC has the added advantage that it is great for working with "fuzzy", human-interpretable rule representations, is typically extremely efficient compared to neural nets, and even meshes well with neural nets and gradient descent-based optimization.
Do you happen to have data of any sort that could be used for training?
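For readers unfamiliar with HDC, here is a toy sketch of the "independent components that can be recomposed" idea, using random bipolar hypervectors with binding (elementwise multiply) and bundling (majority vote). The dimension, the role and value names, and the `query` helper are all assumptions for illustration; nothing here is a speech model.

```python
import random

DIM = 2000               # hypervector dimension (assumed)
rng = random.Random(0)   # fixed seed so the sketch is repeatable

def random_hv():
    return [rng.choice((-1, 1)) for _ in range(DIM)]

def bind(a, b):
    """Associate a role with a value; binding is its own inverse."""
    return [x * y for x, y in zip(a, b)]

def bundle(*vectors):
    """Superimpose several vectors by elementwise majority vote."""
    return [1 if sum(column) >= 0 else -1 for column in zip(*vectors)]

def similarity(a, b):
    return sum(x * y for x, y in zip(a, b)) / DIM

# Hypothetical roles and values for a single sound description.
roles = {name: random_hv() for name in ("pitch", "speed", "voicing")}
values = {name: random_hv() for name in ("low", "high", "slow", "fast", "voiced")}

def describe(pitch, speed):
    return bundle(
        bind(roles["pitch"], values[pitch]),
        bind(roles["speed"], values[speed]),
        bind(roles["voicing"], values["voiced"]),
    )

def query(record, role):
    """Unbind a role and return the name of the closest known value."""
    noisy = bind(record, roles[role])
    return max(values, key=lambda name: similarity(noisy, values[name]))

sound = describe("high", "slow")
retuned = describe("low", "slow")  # swap one component, keep the rest
```

Each component can be read back or replaced on its own, which is the property that would make pitch, speed, and similar parameters separately tunable.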
🇨🇦Samuel Proulx🇨🇦
in reply to Aaron • • •When it comes to open-source speech data, LJSpeech is the best we have, though far from perfect: keithito.com/LJ-Speech-Dataset/
And here's a link to GnuSpeech, the only open-source fully articulatory text to speech system I'm aware of: github.com/mym-br/gnuspeech_sa?tab=readme-ov-file
I'm afraid I don't have any particular data of my own.
GitHub - mym-br/gnuspeech_sa: Articulatory speech synthesizer
GitHub
Aaron
in reply to 🇨🇦Samuel Proulx🇨🇦 • • •@fastfinge thanks! I'll have a look at these.
Were you wanting to collaborate on this?
🇨🇦Samuel Proulx🇨🇦
in reply to Aaron • • •
Aaron
in reply to 🇨🇦Samuel Proulx🇨🇦 • • •
Aaron
in reply to Aaron • • •
🇨🇦Samuel Proulx🇨🇦
in reply to Aaron • • •
🇨🇦Samuel Proulx🇨🇦
in reply to Aaron • • •eloquence_64/eloquence.py at master · fastfinge/eloquence_64
GitHub
Aaron
in reply to 🇨🇦Samuel Proulx🇨🇦 • • •
🇨🇦Samuel Proulx🇨🇦
in reply to Aaron • • •nvda/source/synthDrivers/espeak.py at master · nvaccess/nvda
GitHub
Joe (TBA)
in reply to 🇨🇦Samuel Proulx🇨🇦 • • •
🇨🇦Samuel Proulx🇨🇦
in reply to Joe (TBA) • • •
🇨🇦Samuel Proulx🇨🇦
in reply to Aaron • • •
Aaron
in reply to 🇨🇦Samuel Proulx🇨🇦 • • •@fastfinge Looking for the source, I found this:
github.com/dectalk
It looks like both DECtalk and DECtalkMini are being actively maintained, with commits as recent as 1 to 2 months ago. I was hoping the copyright for the "mini" version would be unencumbered, but no such luck. It would have to be a re-implementation from scratch using this code as a guide. That's a lot easier than implementing a new system out of nothing, though.
DECTalk
GitHub
🇨🇦Samuel Proulx🇨🇦
in reply to Aaron • • •I also have no idea about any associated IP or patents, though. Wouldn't whoever does it need to be able to prove they never saw the original code, just its outputs? Otherwise you're still infringing, aren't you? In this regard, it's probably actually a bad thing that the DECtalk source code is so widely available.
And most of the commits seem to be about just getting it to compile on modern systems with modern toolchains. I dread to think how unsafe closed-source C code written in 1998 is.
🇨🇦Samuel Proulx🇨🇦
in reply to Aaron • • •
🇨🇦Samuel Proulx🇨🇦
in reply to Aaron • • •But eloquence gets the closest, gnuspeech second, espeak third, dectalk fourth, and every AI system I've tried a distant last.
gnuspeech_sa/the_chaos.txt at master · mym-br/gnuspeech_sa
GitHub
Aaron
in reply to 🇨🇦Samuel Proulx🇨🇦 • • •
David Nash
in reply to Aaron • • •@fastfinge (context: both Aaron and I are USAians)
It doesn't help that:
1. it's 150 or so years old, so a few pronunciations have changed a bit
2. the pronunciations and spellings (and hence some of the apparent mismatches) are UK English, not US English.
At a minimum, you'll have to envision skipping "r"s after vowels at the ends of words for many of these to make sense. As for the rest, I recognized a few of those from past experience with older UK English (e.g. "clerk" with an "a" sound), but a couple left me scratching my head saying "that's how people actually said or spelled it then and there?"
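The "skip the r after a vowel at the end of a word" adjustment can be written as a one-line rule. This is a deliberately crude, spelling-level sketch; the function name is made up, and real non-rhotic pronunciation works on phonemes rather than letters.

```python
import re

def drop_final_r(text):
    """Remove a word-final 'r' that directly follows a vowel letter."""
    return re.sub(r"(?<=[aeiou])r\b", "", text, flags=re.IGNORECASE)

example = drop_final_r("The star is far from the clerk")
```

Words like "clerk" are untouched because the "r" is not word-final, so the older "a" vowel there would still need its own dictionary entry.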
🇨🇦Samuel Proulx🇨🇦
in reply to David Nash • • •
🇨🇦Samuel Proulx🇨🇦
in reply to 🇨🇦Samuel Proulx🇨🇦 • • •
🇨🇦Samuel Proulx🇨🇦
in reply to 🇨🇦Samuel Proulx🇨🇦 • • •
Nicks World
in reply to Aaron • • •
Aaron
in reply to Nicks World • • •@NicksWorld
Have you tried LibreOffice? I have read that it is accessible, but I trust real users more.
What specific features do you wish for most?
I have a feeling it's probably a big ask for a single developer, but I could at least take a look at the source for LibreOffice (unlike MS products) and see if I can add the features without retooling the whole codebase.
Nicks World
in reply to Aaron • • •
Aaron
in reply to Nicks World • • •@NicksWorld I would probably need to sit with you to understand the dynamics of the flow and where it gives you trouble. There are a few things to unpack here, on first reading:
* Not sure what misformatting you're finding.
* By getting to the categories, do you mean navigating columns by their headers?
* Do you have specific spreadsheets you are working with regularly? If so, I might be able to come up with a different way to collect and/or present the information that is more naturally suited to blind users, like a Q&A format with predetermined flow.
Spreadsheets are designed specifically with sighted users in mind, so there's an element of inaccessibility baked into them. Organizing the information into a more linear, language-based flow instead of a spreadsheet could potentially make the process much more natural for a screen reader, and the data could then be automatically formatted as, or loaded into, a spreadsheet. I'd be interested to get your thoughts on this.
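As a rough sketch of the Q&A idea, the flow below asks a fixed list of questions one at a time and then writes the answers out as CSV, which any spreadsheet program can open. The field names, the prompts, and the expense-tracking scenario are invented for illustration.

```python
import csv
import io

# Hypothetical predetermined flow: (column name, spoken/typed prompt).
QUESTIONS = [
    ("date", "What is the date of the expense?"),
    ("category", "Which category does it belong to?"),
    ("amount", "How much was it?"),
]

def run_flow(questions, ask):
    """Ask each question in order; `ask` is any prompt -> answer function."""
    return {field: ask(prompt).strip() for field, prompt in questions}

def rows_to_csv(rows, questions):
    """Format collected answers as CSV text with one column per question."""
    buffer = io.StringIO()
    fieldnames = [field for field, _ in questions]
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Canned answers stand in for a person typing or dictating.
canned = iter(["2024-01-05", " groceries ", "42.10"])
row = run_flow(QUESTIONS, lambda prompt: next(canned))
csv_text = rows_to_csv([row], QUESTIONS)
```

For interactive use, passing the built-in `input` as `ask` would read each answer from the terminal, one linear prompt at a time.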
Nicks World
in reply to Aaron • • •
Aaron
in reply to Nicks World • • •@NicksWorld
Sure, I bet it's a bit of a pain for you with text-based discussions! I'm awkward on phones but willing to give it a shot, if you think it's worthwhile and can put up with my spoken awkwardness and fumbling with words. (I communicate so much better when I can write! lol)
Can I suggest, though, that first it might make sense to get familiar with LibreOffice and see if it does a better job with the interface than Excel or other such software? It would be a shame to waste our time and effort on a problem that's already solved. It might also turn out that you have different pain points with the open source software that I can actually modify.
Nicks World
in reply to Aaron • • •
Aaron
in reply to Nicks World • • •@NicksWorld
Here's the main page:
libreoffice.org/
And here's the download page:
libreoffice.org/download/downl…
You will need to select your OS for the download.
Home | LibreOffice - Free and private office suite - Based on OpenOffice - Compatible with Microsoft
www.libreoffice.org
Nicks World
in reply to Aaron • • •
Aaron
in reply to Nicks World • • •
Zach Bennoui
in reply to Aaron • • •
Aaron
in reply to Zach Bennoui • • •