Items tagged with: tts



As I suspected, MLX-Audio has made great progress! It now supports Spark-TTS, Dia TTS, Sesame CSM, OuteTTS, Orpheus, Kokoro, Bark, Parakeet, and Whisper! #TTS #ML #AI @ZBennoui github.com/Blaizzy/mlx-audio


Do you use a screen reader and read Arabic content with it? Have you ever wondered why Arabic TTS literally always sucks, either being super unresponsive or getting most things wrong all the time? I've been wanting to rant about this for ages!
Imagine if English dropped most vowels: "Th ct st n th mt" for "The cat sat on the mat" and expected you to just KNOW which vowels go where. That's basically what Arabic does all day every day! Arabic uses an abjad, not an alphabet. Basically, we mostly write consonants, and the vowels are just... assumed? Like, they are very important in speech but we don't really write them down except in very rare and special cases (children's books, religious texts, etc). No one writes them at all otherwise and that is very acceptable because the language is designed that way.
A proper Arabic TTS needs to analyze the entire sentence, maybe even the whole paragraph, because the exact same word could have different unwritten vowels depending on its position, which actually changes its form and meaning! But for screen readers, you want your TTS to be fast and responsive. And you do that by skipping all of that semantic processing. Instead it's literally just half-assed guesswork that is wrong almost all the time, so we end up hearing everything the wrong way and just cope with it.
It gets worse. What if we give the TTS a single word to read (which is pretty common when you're more closely analyzing something)? Let's apply that logic to English. Imagine you are the TTS engine. You get presented with just 'st', with no surrounding context, and have to figure out the vowels. Is it Sit? Soot? Set? Maybe even stay? You literally don't know, but each of those might be valid, no matter how wildly the meanings differ.
It's EXACTLY like that in Arabic, but much worse because it happens all the time. You highlight a word like 'كتب' (ktb) on its own. What does the TTS say? Does it guess 'kataba' (he wrote)? 'Kutiba' (it was written)? 'Kutub' (books (a freaking NOUN!))? Or maybe even 'kutubi' (my books)? The TTS literally just takes a stab in the dark, and usually defaults to the most basic verb form, 'kataba', even if the context screams 'books'!
So yeah. We're stuck with tools that make us work twice as hard just to understand our own language. You will get used to it over time, but it adds this whole extra layer of cognitive load that speakers of, say, English just don't have to deal with when using their screen readers.
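
The 'ktb' problem above can be made concrete with a small Python sketch. It is only an illustration, not how any real TTS engine works: the fully vowelled spellings are written out as Unicode escapes, and the helper name `strip_harakat` is made up for this example. It shows that once the vowel marks are removed, three different words collapse into the same three letters.

```python
import unicodedata

# Fully vowelled spellings, written as escapes so the vowel marks are visible.
# U+0643 kaf, U+062A ta, U+0628 ba
# U+064E fatha (a), U+064F damma (u), U+0650 kasra (i)
VOWELLED = {
    "kataba (he wrote)": "\u0643\u064E\u062A\u064E\u0628\u064E",
    "kutiba (it was written)": "\u0643\u064F\u062A\u0650\u0628\u064E",
    "kutub (books)": "\u0643\u064F\u062A\u064F\u0628",
}


def strip_harakat(word: str) -> str:
    """Drop combining marks (Unicode category Mn), which covers the Arabic short-vowel signs."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")


# Map each reading to the spelling you would normally see in everyday text.
skeletons = {gloss: strip_harakat(word) for gloss, word in VOWELLED.items()}

for gloss, bare in skeletons.items():
    print(f"{gloss} -> {bare}")

# Number of distinct spellings left once the vowel marks are gone.
print(len(set(skeletons.values())))
```

Going the other way, from the bare letters back to the right vowels, is the part that needs sentence context, and that is exactly the step a fast screen reader voice tends to skip.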

#screenreader #blind #tts


Here's another #AI #TTS service, free for now, called #Zonos: playground.zyphra.com/audio
Or free for self-hosting on GitHub: github.com/Zyphra/Zonos
#AI #tts #zonos


For all you retro text-to-speech nerds out there, here's a recording of Superior Software's "Speech" for the Amstrad CPC computer (1986). This was a small piece of Z80 machine code that played phonemes through the computer's AY-3-8912 sound chip.

It sounds pretty harsh, and there are a number of hard clicks in the output I couldn't easily fix.

I've included what I think is the spoken script as alt text

#RetroComputing #TTS #Speech #SuperiorSoftware #AmstradCPC





Pozdravljena Slovenija! Speech Recognition & Synthesis by Google, AKA Google TTS has been updated and now it supports Slovene, a South Slavic language which is the official language of Slovenia 🇸🇮 Take a look at our list of languages with available TTS engines on Android accessibleandroid.com/list-of-… #Android #TTS


Not sure if I've talked about this here, but there's, like, tons of stuff here. I'm gonna try that Linux distro through Crostini on my old Chromebook that I don't do anything else with anymore. The VMs over the Internet are really cool too! The AT Museum is cool. And I'm curious to see if the Windows 2000/XP on **Android** works. That'd be trippy.

nashcentral.duckdns.org/

#accessibility #blind #AssistiveTechnology #Linux #gaming #TTS #VirtualMachine #foss #Android


A few days ago, thanks to @Zvonimir Stanecic and other volunteers, we released the first Slovak female voice for #rhvoice. The voice also got a nice, unconventional name: Jasietka. Voices are available for #Windows, #nvdasr, #android and #linux. At the same time, overall support for Slovak has been updated, including the previously released Slovak voice Ondro. If you need #tts #textToSpeech for a screen reader, for reading books, or for GPS navigation instructions, please see the details on this simple website.

hlas.ondrosik.sk/


I wrote a Dart package for speech-dispatcher, enabling TTS on Linux. It also works with Flutter for Linux. The package supports most features of speech-dispatcher, with additional support for less commonly used functionality like history coming soon. It still needs broader testing, so feel free to open any issues you encounter. My hope is that one of the TTS packages for Flutter will start using it to add Linux support, as most, if not all, currently don't support Linux.

You can find the package on GitHub at github.com/the-byte-bender/dar… and on pub.dev at pub.dev/packages/dart_speechd.
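
For anyone who hasn't used speech-dispatcher directly, here's a rough idea of what talking to it looks like, sketched in Python rather than Dart. This is not the API of dart_speechd; it just shells out to the `spd-say` command-line client that ships with speech-dispatcher. The helper names `build_spd_say_command` and `speak` are made up for this example, and it assumes `spd-say` is installed and on your PATH.

```python
import shutil
import subprocess


def build_spd_say_command(text, language="en", rate=0, wait=True):
    """Build the argument list for spd-say (rate ranges from -100 to 100)."""
    if not -100 <= rate <= 100:
        raise ValueError("rate must be between -100 and 100")
    cmd = ["spd-say", f"--language={language}", f"--rate={rate}"]
    if wait:
        cmd.append("--wait")  # block until the message has been spoken
    # "--" ends option parsing, so text starting with "-" is not read as a flag
    cmd += ["--", text]
    return cmd


def speak(text, **options):
    """Speak text through speech-dispatcher; return False if spd-say is missing."""
    if shutil.which("spd-say") is None:
        return False
    subprocess.run(build_spd_say_command(text, **options), check=True)
    return True
```

Calling `speak("Hello from speech-dispatcher", rate=20)` should read the sentence a bit faster than the default. A real binding like the package above talks to the speech-dispatcher daemon directly instead of spawning a process per message, which is what makes things like history and finer control possible.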

#dart #flutter #tts #linux


Xin chao and Dumelang! That's how you say Hello in Vietnamese and Setswana. RHVoice Text-to-Speech engine for Android was updated today, adding two new languages: Vietnamese 🇻🇳 and Setswana 🇧🇼. Take a look at our list of languages with available TTS engines on Android accessibleandroid.com/list-of-… #Android #TTS


Great job @eeejay 👏. Even though I don't speak a word of Arabic, the Piper TTS improvement can absolutely be heard.

Btw, @Marco, if you are interested in using Piper or cloning your voice for #tts, maybe one or two of my video tutorials will be helpful:
youtu.be/b_we_jma220
or youtu.be/GGvdq3giiTQ



@WestphalDenn @Talon I don't think it's as catastrophic as this sounds any more. I can't say this never happens, but it's very rare. I am on @GNOME . If I get an unresponsive experience, I can press alt+F2, type in orca --replace, and the screen reader comes back. I can even bind this to a global keyboard shortcut.
When I leave my office, I often close some 15 browser windows, some 10 terminal windows, and about 5 different files open in the text editor.
The most used apps on my desktop include #Firefox, #Thunderbird, a file manager (pcmanfm or nautilus), Gedit, VLC media player, Electron-based apps such as teamsforlinux, losslesscut, and gnome-terminal.
I also use @LibreOffice, a little #Emacs with #speechd-el, and finally some other less frequently used apps.
As for the #TTS or the #audio setup, I am using #RHVoice, speech-dispatcher and @PipeWire Project .
Finally, with @Matt Campbell and @Lukáš Tyrychtr we do have talented visually disabled developers dogfooding or partially dogfooding, so let me finish this post by saying it really is a golden era of Linux #a11y, and we are looking forward to what it brings us in the future.


Today I'd like to point to a very well-made Czech voice for the #TTS engine #RHVoice. The author himself writes about it here: groups.io/g/Blind-android/mess… . You can also read about using it on #Windows, #Linux and #Android on the simple community site hlas.ondrosik.sk/ . The bulk of its users are people with severe visual impairments who cannot imagine their digital life without high-quality speech output and a screen reader, but a voice like this, independent of huge corporations, might come in handy on your computer or smartphone too. You could use it, for example, for reading books or during GPS navigation.



Another interesting #TTS #AI system. I need to look closer into it in order to see if it's a voice cloning approach or something else: github.com/yl4579/StyleTTS2
#AI #tts



Voicify is a tool that automatically recognizes the language of presented text and utilizes the TTS engines and voices installed on the user’s phone to read it. Check out this tutorial on How to Use Voicify, the Automatic Language Switching Tool by Tech-Freedom accessibleandroid.com/how-to-u… #Android #TTS


Over the past year, I've been experimenting with neural text to speech in various forms. I have done hours of experimentation and research, training models and getting varying results along the way. Some of you may have heard of Piper, an open source synthesizer and add-on for NVDA that can be trained by anyone. It is currently in active development, and I have been there from the beginning, testing and evaluating the various versions. For years, I have had a goal to create a high-quality voice that is truly usable by a screen reader user, and yesterday I managed to achieve this. I'm really excited to share Alba, a female Scottish English voice. I'm considering this a beta phase, and I'm looking for feedback to make improvements as needed. Please note that you will most likely get an error upon installation; however, the voice should still show up in NVDA, and I'm working on fixing this as soon as possible.
Link to Piper: github.com/rhasspy/piper/tree/…
Link to addon: github.com/mush42/piper-nvda?r…
Link to Alba: drive.google.com/file/d/1wZHuI… #TTS #AI #ScreenReader #Piper