So I've seen this talked about here a bit, but I wanted to give more context on Kokoro TTS. This model was open-sourced back on December 25 and was trained almost entirely on synthetic data taken from ElevenLabs and OpenAI. Legality aside, the quality speaks for itself. This is an 82-million-parameter model, which is very small by today's standards, but that means it's incredibly fast even on CPU.
The main dev responsible for training seems to know much more than the average open source enthusiast about how to make high-quality TTS, and it shows in the results. The model is under very active development and still quite young; more data is currently being collected, and a new version will likely be trained and released in the coming months. Their Discord is quite active, and I'm over there as well if you'd like to join. I think this has the potential to be a great option for blind screen reader users who may not be able to afford something like Vocalizer on Windows, but we're not quite there yet in terms of performance.
Here is a demo of one of the voices reading about Android.
Link to model card on Huggingface: huggingface.co/hexgrad/Kokoro-…
Link to Discord: discord.gg/QuGxSWBfQy
Živé.sk has uncovered disturbing information about the breach of the servers of the Geodesy, Cartography and Cadastre Authority of the Slovak Republic. (Živé.sk editorial staff, Živé.sk)
@WofWca Looking closer, it seems you are working on some amazing accessibility-related improvements. While I was testing a few weeks ago I had issues navigating the lists. Hmm, perhaps I need to figure out how to build with this PR and see for myself whether it is related. github.com/deltachat/deltachat…
Edit: oh, there are more PRs resolving keyboard navigation and accessibility-related issues. I have discovered this thing at just about the perfect time.
Thanks, and please keep up the great work.
What is Delta Chat? Delta Chat is a reliable, decentralized and secure messaging app, available for mobile and desktop platforms. Delta Chat feels like WhatsApp or Telegram but you can also use and... (delta.chat)
There is one important difference you need to know:
* Chatmail account: you can reach all Delta Chat users
* Regular email account: you can reach Delta Chat users + classic email users
So if you use chatmail, you cannot reach people who are not using Delta Chat.
Another point:
If you do NOT use chatmail, you will not have push notifications if you are an iOS user.
It is possible for #chatmail users to communicate with classic email users who have published their public key.
You just have to do manual chatmail registration, save your login details and private key securely, and use it with something that supports #pgp like #Thunderbird or #Mailvelope.
Very nice, glad to hear a confirmation that it's working from a real user!
We might need to investigate automatic focus mode switching.
Hello folks, I need your help. After a lot of effort, I am on the final steps of making #ArcaneChat available in #GooglePlay
Now they ask me to create a "closed testing release", where only invited testers can participate via their Google account email address and install the app, before they allow a public release.
Please write to me in private with your Google account's email address (Gmail, I guess) to join as a beta tester. Thanks a lot in advance! ♥️🙏
TTS with kokoro and onnx runtime: thewh1teagle/kokoro-onnx (GitHub)
@IzzyOnDroid has been doing an amazing job getting our repo to over 30% of apps being reproducible. Maintaining a rebuilder takes a lot of constant work. Thank you!
As I've written before:
[...] the ecosystem is constantly moving: old toolchain and dependency bugs get fixed, but new ones keep popping up. [...] Reproducible Builds are not just an item on a checklist [...] It's an ongoing process involving not just upstream app developers, but also maintainers of repositories, clients, and rebuilders; those involved in outreach and writing documentation; developers and maintainers of tooling, toolchains, and dependencies. And often requires a lot of collaborative debugging :)
See also our "Review of 2024 and Outlook for 2025: Reproducible Builds, Security Measures and more":
android.izzysoft.de/articles/n…
#IzzyOnDroid #ReproducibleBuilds
2024 is waving goodbye and 2025 is knocking on the door: what did we achieve in 2024, and what are our plans and hopes for 2025? Join us in looking back at the security measures we introduced and at the progress on Reproducible Builds … (IzzyOnDroid)
⭐️ Principles Of Web Accessibility
A set of high-level guiding principles for approaching design and remediation for an accessible web.
By @heydon
github.com/Heydon/principles-o…
Get ready to have your tech world rocked every week with Steven Scott and Shaun Preece on Double Tap! These guys are the ultimate duo - mixing humor, passion, and top-notch expertise to keep you in the loop with the latest in assistive tech for blind… (YouTube)
After a short break over the holidays, KDE developer Nate Graham is back with his 'This Week in Plasma' series to highlight the interesting KDE Plasma desktop changes made each week. (lxer.com)
TIL #Catima is apparently mirrored onto #RuStore through #Aptoide, or so someone reported: github.com/CatimaLoyalty/Andro…
Few problems with that: I have no control over the RuStore listing, nor do I control the Aptoide listing. Both may very well be malware.
It should go without saying: don't download apps from sketchy unofficial "app stores" and other APK download sites.
catima.app/ and github.com/CatimaLoyalty/Andro… link you to all the safe and supported download sources :)
As part of the 20th anniversary of the BBS Documentary's release, I've ripped the 3 DVDs that were included in the project and have them hosted at Internet Archive. These ISO files can be played in the VLC player like DVDs, and include all bonus features, subtitles, director's commentary, etc.
archive.org/details/BBS_Docume…
The Bulletin Board System: it brought life online. Long before the Internet connected the planet and changed nearly everything, there was a brave and... (Internet Archive)
Hello Fediverse! 👋
This is the official account of the Czech #PeerTube instance 📼 VHSky.cz. We will point you to new creators and interesting videos here. And we would love to hear your feedback on what you like and what you would like to see on VHSky.cz.
The #38C3 presentation about the massive #Volkswagen data leak is over and the leak was as bad as it sounded. 100,000s of cars could be located down to cm precision. Also a lot of metadata that is NOT supposed to be public. Disturbing.
The main takeaway was that the real problem here wasn't the leak itself, but that this data was collected in the first place.
I agree. It's not a good idea to create mountains of very personal data and then pray it never leaks.
Today is the day. Welcome to THE charger!
USB-C is officially the common standard for charging electronic devices in the EU.
This means:
🔌The same charger for all new phones, tablets and cameras
⚡ Harmonised fast-charging technology
🔄 Reduced e-waste
🛑 No more “Sorry, I don’t have the right cable”
One charger to rule them all.
In case you missed it, the hackers who reverse-engineered DRM on Polish trains got sued by the train manufacturer…
…multiple times.
You can donate to their defense fund:
ccc.de/en/updates/2024/das-ist…
Context:
Their original talk from last year
media.ccc.de/v/37c3-12142-brea…
My piece about the first lawsuit against them
rys.io/en/175.html
We've all been there: the trains you're servicing for a customer suddenly brick themselves and the manufacturer claims that's because you... (media.ccc.de)
does anyone know of a
blog-like cms with features:
- self-hostable
- subscribe to posts with email
- rss
- activityPub
- supports images upload
- posts in markdown in webUI
- smolweb/minimalist
- no nodeJS
- no SSG
plz boost 
Gajim’s port to GTK4 is almost finished. 🥁 Currently we’re testing thoroughly to make the switch as smooth as possible.
We went ahead and made lots of small improvements, e.g. writing messages while offline, better styling for image previews, improved chat filters and more.
Gajim also improved its spam fighting toolkit: The next release will allow you to moderate all messages of a spammer at once. 🤖
If you'd like to support Gajim, please consider making a donation: liberapay.com/Gajim
Having spent a year with the DotPad and now six months and counting with the Monarch, here are what I see as the Monarch's advantages over the DotPad, besides the fact that the Monarch is the only multi-line tactile display still available in Australia.
Can be used as a standalone device,
20 hours battery life,
On/off, USB-C, USB-A, HDMI, 3.5 mm earphone jack, and volume up/down buttons,
Users used to the BrailleNote, Brailliant, and Mantis will pick up the same concepts used on the Monarch (it's KeySoft),
Plug in a QWERTY keyboard and/or portable HDMI display,
Braille input keyboard,
10 line by 32 cell tactile display,
Removable metal template to replace membrane over the pins,
Speech output,
Navigation and Zoom keys, Point and click to move cursor or Zoom tactile graphics,
Button to refresh tactile array at any time,
Tactile horizontal and vertical scroll bars to let the user know where they can scroll,
Easily jump around a tactile graphic with keyboard shortcuts,
Increase line spacing when reading Braille,
Easy to identify cursor in menus, and place the focussed line that contains the cursor at the top of the display,
Access various file formats,
Various applications including web browser (beta), email (beta), Tactile Library (APH), Maths graphing calculator, Word Processor, Braille Editor, Book Reader (Victor Reader),
2025 WingIT app for iOS drawing, and Terminal screen reader support.
Inspired by Google's move to remove @organicmaps from the Playstore without warning, I finally decided to move my > 3,000 Google Maps saved places to Organic Maps. To facilitate doing this for others' benefit, I made a quick webpage to convert your Google Maps GeoJSON data to GPX and KMZ files that render well in Organic Maps.
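For anyone curious what such a conversion involves, here is a minimal Python sketch. It is not the code behind the webpage mentioned above, only an illustration under assumptions: it handles GeoJSON `Point` features only, it guesses that the place name lives in a `name` property (a real Google Maps export may use a different key), and the `creator` string is made up. It produces GPX waypoints and does not attempt KMZ.

```python
import json
from xml.etree import ElementTree as ET


def geojson_to_gpx(geojson_text):
    """Convert GeoJSON Point features into a GPX 1.1 waypoint document."""
    data = json.loads(geojson_text)
    gpx = ET.Element(
        "gpx",
        {
            "version": "1.1",
            "creator": "geojson-to-gpx-sketch",  # hypothetical creator name
            "xmlns": "http://www.topografix.com/GPX/1/1",
        },
    )
    for feature in data.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # this sketch only handles saved places (points)
        lon, lat = geometry["coordinates"][:2]  # GeoJSON order is lon, lat
        wpt = ET.SubElement(gpx, "wpt", {"lat": str(lat), "lon": str(lon)})
        name = (feature.get("properties") or {}).get("name")  # assumed key
        if name:
            ET.SubElement(wpt, "name").text = name
    return ET.tostring(gpx, encoding="unicode")


if __name__ == "__main__":
    sample = {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": [17.1077, 48.1486]},
                "properties": {"name": "Bratislava"},
            }
        ],
    }
    print(geojson_to_gpx(json.dumps(sample)))
```

The one detail that is easy to get wrong is coordinate order: GeoJSON stores longitude first, while GPX waypoints carry separate `lat` and `lon` attributes.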
Did you know that you can configure custom notification sounds per contact or group chat in #Conversations_im?
Apparently not many people knew that, so the next version will make what is essentially a native Android feature easier to access via the overflow menu of the contact or group chat details.
gultsch.video/w/8wZSkoad1bv4VH…
Conversations 2.17.7 makes it easier to configure custom notifications for contacts and channels (PeerTube)
I finally turned off GitHub Copilot yesterday. I’ve been using it for about a year on the ‘free for open-source maintainers’ tier. I was skeptical but didn’t want to dismiss it without a fair trial.
It has cost me more time than it has saved. It lets me type faster, which has been useful when writing tests where I’m testing a variety of permutations of an API to check error handling for all of the conditions.
I can recall three places where it has introduced bugs that took me more time to debug than the total time saving:
The first was something that initially impressed me. I pasted the prose description of how to communicate with an Ethernet MAC into a comment and then wrote some method prototypes. It autocompleted the bodies. All very plausible looking. Only it managed to flip a bit in the MDIO read and write register commands. MDIO is basically a multiplexing system. You have two device registers exposed, one sets the command (read or write a specific internal register) and the other is the value. It got the read and write the wrong way around, so when I thought I was writing a value, I was actually reading. When I thought I was reading, I was actually seeing the value in the last register I thought I had written. It took two of us over a day to debug this. The fix was simple, but the bug was in the middle of correct-looking code. If I’d manually transcribed the command from the data sheet, I would not have got this wrong because I’d have triple checked it.
In another case it inverted the condition in an if statement inside an error-handling path. The error handling was a rare case and was asymmetric. Hitting the if case when you wanted the else case was okay but the converse was not. Lots of debugging. I learned from this to read the generated code more carefully, but that increased cognitive load and eliminated most of the benefit. Typing code is not the bottleneck and if I have to think about what I want and then read carefully to check it really is what I want, I am slower.
Most recently, I was writing a simple binary search and insertion-deletion operations for a sorted array. I assumed that this was something that had hundreds of examples in the training data and so would be fine. It had all sorts of corner-case bugs. I eventually gave up fixing them and rewrote the code from scratch.
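To make the corner cases in that last example concrete, here is a minimal Python sketch of a sorted array with search, insertion and deletion. It is not the code from the post (which was not shown, and was presumably not Python); the class and method names are invented for illustration, and it deliberately leans on the standard library's `bisect` module rather than a hand-written search loop.

```python
from bisect import bisect_left, insort


class SortedArray:
    """A list kept in ascending order (hypothetical helper for illustration)."""

    def __init__(self, items=()):
        self._items = sorted(items)

    def index_of(self, value):
        """Return the index of the first occurrence of value, or -1."""
        i = bisect_left(self._items, value)
        # bisect_left may return len(items), so check bounds before comparing.
        if i < len(self._items) and self._items[i] == value:
            return i
        return -1

    def insert(self, value):
        """Insert value while keeping the list sorted (duplicates allowed)."""
        insort(self._items, value)

    def remove(self, value):
        """Remove one occurrence of value; return True if one was removed."""
        i = self.index_of(value)
        if i == -1:
            return False
        del self._items[i]
        return True

    def as_list(self):
        return list(self._items)
```

The cases worth testing by hand are the ones that tend to break: an empty array, a value smaller or larger than every element, and duplicates.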
Last week I did some work on a remote machine where I hadn’t set up Copilot and I felt much more productive. Autocomplete was either correct or not present, so I was spending more time thinking about what to write. I don’t entirely trust this kind of subjective judgement, but it was a data point. Around the same time I wrote some code without clangd set up and that really hurt. It turns out I really rely on AST-aware completion to explore APIs. I had to look up more things in the documentation. Copilot was never good for this because it would just bullshit APIs, so something showing up in autocomplete didn’t mean it was real. This would be improved by using a feedback system to require autocomplete outputs to type check, but then they would take much longer to create (probably at least a 10x increase in LLM compute time) and wouldn’t complete fragments, so I don’t see a good path to being able to do this without tight coupling to the LSP server and possibly not even then.
Yesterday I was writing bits of the CHERIoT Programmers’ Guide and it kept autocompleting text in a different writing style, some of which was obviously plagiarised (when I’m describing precisely how to implement a specific, and not very common, lock type with a futex and the autocomplete is a paragraph of text with a lot of detail, I’m confident you don’t have more than one or two examples of that in the training set). It was distracting and annoying. I wrote much faster after turning it off.
So, after giving it a fair try, I have concluded that it is both a net decrease in productivity and probably an increase in legal liability.
Discussions I am not interested in having:
The one place Copilot was vaguely useful was hinting at missing abstractions (if it can autocomplete big chunks then my APIs required too much boilerplate and needed better abstractions). The place I thought it might be useful was spotting inconsistent API names and parameter orders but it was actually very bad at this (presumably because of the way it tokenises identifiers?). With a load of examples with consistent names, it would suggest things that didn't match the convention. After using three APIs that all passed the same parameters in the same order, it would suggest flipping the order for the fourth.
that very much matches my own experience. (I've not specifically used copilot but the jetbrains built-in local thing 🤷)
It helped with boilerplate code and then introduced subtle bugs that took multiples of the time saved to find.
I've been thinking about this for ages, but never had the time to craft the words around it.
People keep saying that "Maths should be fun" ... and I push back with "It should be engaging ... 'fun' is a different thing."
So @rakhichawla has posted pretty much exactly this, but better than I ever could.
I'm copying it here with permission.
Please read this, then as it says at the end ... let's have a deeper conversation about this ...
1/n
(PS: I'd love this to get boosted to get outside my bubble ... you're all amazing, but there will be other opinions, and other thoughts that could be helpful or valuable)
Hashtags: #MathEd #MathsEd #MathEdChat #MathsEdChat #MathChat #MathsChat #MTBoS #TMWYK
Continuing a decade-long tradition #Conversations_im is currently available for free on Google Play.
play.google.com/store/apps/det…
Merry Christmas 🎄 Happy Holidays ☃️ and have fun at #38C3
An encrypted, user friendly XMPP instant messaging client optimized for mobile (play.google.com)
Yesterday, I made a silly production for a group of friends. We like to sometimes welcome each other to the start of whatever week number of the year it happens to be.
In 2023, week 52 started on Christmas Day, so I made a little "Merry Weeksmas!" production using ElevenLabs, featuring the Santa voice, and a guy called Lars, who likes gongs.
Lots of inside jokes that will be somewhat familiar to those who listened to my show, Things and stuff, quite a while back, on a now-defunct internet radio station called TBRN.
I had every intention of doing something short and sweet, but what actually came out was an 18 minute, 25 second long pretty much full-on audio drama thing.
Santa ends up in a car accident and misses Christmas. Lars has a fight with his wife. Is Christmas ruined, or are you just being welcomed to week 52 in a ridiculous way?
Again, lots of inside jokes, including amateur radio references, but hopefully you'll enjoy even if you don't get them, or know that they are there to be gotten. I'll admit, I had way too much fun putting this stupid thing together. I don't get to do creative production much anymore.
Featuring a few tracks from @Onj. My Ableton Move even gets a short cameo. So does my old Yamaha Motif Classic.
I should also mention that I don't speak German. Some things may not make sense. I'm OK with that, personally, because this is a silly production.
If you're bored, click here and hear what this is all about, I guess...
borris.me/audio/w52.mp3
I looked up the enterprise management features of the Firefox and Chrome browsers and wrote down how to turn off the advertising features once by installing the right config files in the right places. Not hard and affects all accounts and profiles, so a big time saver
blog.zgp.org/turning-off-brows…
Linux paths so far, but I know there are some Mac OS command line users on here -- anyone have this set up on your system? If so and you have instructions posted anywhere I'll link to them
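As a rough idea of what "installing the right config files in the right places" can look like on Linux, here is a small Python sketch. It is not taken from the linked post. The paths are the usual system-wide policy locations for Firefox and Google Chrome on Linux, the policy names are my own selection and should be checked against each browser's policy documentation, and the file name `ads-off.json` is made up.

```python
import json
from pathlib import Path

# Assumed policy selection; verify names against the browsers' policy docs.
FIREFOX_POLICIES = {
    "policies": {
        "FirefoxHome": {"SponsoredTopSites": False, "SponsoredPocket": False},
        "Preferences": {
            "dom.private-attribution.submission.enabled": {
                "Value": False,
                "Status": "locked",
            }
        },
    }
}

CHROME_POLICIES = {
    "PrivacySandboxPromptEnabled": False,
    "PrivacySandboxAdTopicsEnabled": False,
    "PrivacySandboxSiteEnabledAdsEnabled": False,
    "PrivacySandboxAdMeasurementEnabled": False,
}


def write_policies(root):
    """Write both policy files below root and return the paths written."""
    targets = {
        root / "etc/firefox/policies/policies.json": FIREFOX_POLICIES,
        # "ads-off.json" is a hypothetical file name.
        root / "etc/opt/chrome/policies/managed/ads-off.json": CHROME_POLICIES,
    }
    for path, policies in targets.items():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(policies, indent=2) + "\n", encoding="utf-8")
    return list(targets)


if __name__ == "__main__":
    import tempfile

    # Write into a scratch directory first; pass Path("/") (as root) to install.
    for written in write_policies(Path(tempfile.mkdtemp())):
        print(written)
```

Writing to a scratch directory first makes it easy to inspect the JSON before copying it into place. Both browsers list the policies they actually picked up, Firefox on `about:policies` and Chrome on `chrome://policy`.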
# smartctl -l selftest /dev/sdb
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.4-arch1-1] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     29550         3432497913
# 2  Conveyance offline  Completed: read failure       70%     29550         3432490825
Rebuilding much of my infrastructure in NixOS again because I'm sick of shit breaking and being hard to fix. In the process I've been revisiting services I run to ask whether they're worth maintaining or whether a simpler alternative would do.
Started poking Nextcloud today and oh wow is its app ecosystem extensive. I thought it was just a nerdy contact/calendar/file-syncing app but no, you can really build out your own household-specific portal with it.
Right now for instance I'm playing with the Cookbook app. So far I've pasted in a few recipe URLs and have gotten back nicely-accessible, no-fluff copies of those recipes, saved locally, linking back to the originals, with spinners that seem to scale them by serving size. That's really neat.
So I don't think I'll be replacing it with Radicale quite yet. It can apparently somehow parse emails and extract travel itinerary data into calendar events with kitinerary but I'm not sure how. That's one more proprietary service down.
We have prepared a small Christmas present for you. We have uploaded the videos from the conference to PeerTube, so you can use the holidays to watch them without ads.
vhsky.cz/c/openalt_konference/
You can also follow them directly via Mastodon: @openalt@vhsky.cz
A channel with talks from the OpenAlt conference, which is traditionally held in November in Brno and focuses on open software, data, IT security, DIY and IoT. (VHSky)
Autumn went by faster than I had planned, but that is not the main thing I wanted to say. I spent quite a while thinking about how to approach the continuation of my previous blog post, and so what lesson 2 should look like.
In the end I decided not to carry coals to Newcastle, so I am offering only a translation of the eff.org article How to: Understand and Circumvent Network Censorship, licensed under the open CC-BY licence. What you are about to read is therefore not my own work, only my attempt at a translation.
herrman.sk/home/ako-sa-priprav…
#eff #cenzura #blog #vladamrzkejluzy #dns #vpn #proxy #tor
Information security concerns everyone! (www.herrman.sk)
Tamas G (in reply to Zach Bennoui):
StyleTTS2 Training from Scratch Notebooks · yl4579 StyleTTS2 · Discussion #144 (GitHub)
Peter Vágner (in reply to Tamas G):
Do I understand correctly that Kokoro is an adaptation of StyleTTS2 specifically for English?
Peter Vágner (in reply to 💙):
@Winter blue tardis🇧🇬🇭🇺 I hope you don't mind me being so curious. I am learning slowly and working on these things even more slowly, but I have already helped to train a Slovak Piper voice. The result was not that awful, however it was not great either. So for a few months now I have been working on improving the eSpeak phonemization rules for the Slovak language, trying to make sure all the language-specific features are respected as much as possible even before training. I know at least Piper and OptiSpeech use eSpeak for phonemization under the hood. These days another Piper training run of the Slovak voice with these improvements is running over here, and it turns out it sounds much better in terms of pronunciation. However, only time will tell whether people will like it. All this work is based on previous work we did with friends while making Slovak voices for RHVoice TTS. So we have text prompts and high-quality voice recordings for Slovak.
And now those questions:
Apart from the robotic-sounding voice, how do you like the Bulgarian eSpeak pronunciation?
Is it similar in complexity to Russian? I can't speak or understand Russian, however I do know the Russian eSpeak rules include a huge list of exceptions, and Russian-speaking people still don't like it that much.
Do you have high-quality Bulgarian recordings of a single speaker, or do you know of a public dataset that may include such recordings?
Peter Vágner (in reply to 💙):
Perhaps I am misremembering, but I guess you might also be able to understand Hungarian, and there are some recordings for Hungarian you might be able to experiment with if you are not going to learn with English data.
Peter Vágner (in reply to 💙):
So while I can't really understand or speak other languages, I might be able to try answering some of your questions, either alone or in cooperation with others helping within this team, and if you are passionate enough I think it's very likely you will be able to achieve a great result.
Again, let me repeat: @Zvonimir Stanecic is the number one language expert, leading teams working on Czech, Hungarian, Croatian, Serbian, Slovak and other language support for RHVoice. I think @Cleverson you might find this story of mine and my friends inspiring too.
Peter Vágner (in reply to Cleverson has moved):
Each engine I've worked with so far performs at least two stages on the text it's asked to speak.
First it transforms all the written letters into its internal representation of individual sounds, aka phonemes.
Within this part none of the audio data is involved at all, and it does not matter whether we have formant synthesis similar to eSpeak or HTK-based synthesis similar to RHVoice; we have just disassembled the text phrases into sounds and written that down as a code.
The engine then uses this data for producing speech according to the trained model.
So when I say we need eSpeak while training Piper or OptiSpeech, I mean we will be using its phonemizer regardless of the audio data we use for training.
Real linguists are able to apply the knowledge they have acquired from the phonology and morphology of the language. It's predictable, or at least widely known, so even though we are not linguists, if we are motivated enough we can gradually improve this part and keep tweaking the phonemizer until we like the pronunciation.
So the engine won't be learning this part while training.
Programming the actual TTS signal processing is a much more involved task; I think we can't do it on our own, so we defer to the language model. It trains itself to inherit characteristics like sound, intonation, inflection and loads of other properties from the audio recordings we use for training our chosen engine.
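The two stages described above can be sketched in a few lines of Python. This is a toy illustration only: the rule table, the exception list and the phoneme symbols are invented for the example and have nothing to do with eSpeak's real rule files, and `synthesize` is a placeholder for whatever trained model the engine uses.

```python
import re

# Toy data invented for this sketch; real eSpeak rule files are far richer.
EXCEPTIONS = {"one": ["w", "V", "n"]}
RULES = {"sh": "S", "ch": "tS", "th": "T", "c": "k"}


def phonemize(text):
    """Stage 1: turn written text into a flat list of phoneme symbols."""
    phonemes = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in EXCEPTIONS:  # whole-word exceptions win over rules
            phonemes.extend(EXCEPTIONS[word])
            continue
        i = 0
        while i < len(word):
            for size in (2, 1):  # try the longer spelling first
                chunk = word[i:i + size]
                if chunk in RULES:
                    phonemes.append(RULES[chunk])
                    i += len(chunk)
                    break
            else:
                phonemes.append(word[i])  # no rule: the letter stands for itself
                i += 1
    return phonemes


def synthesize(phonemes, model):
    """Stage 2: hand the phonemes to a trained model (placeholder callable)."""
    return model(phonemes)
```

Note that stage 1 never touches audio: improving pronunciation means editing `RULES` and `EXCEPTIONS`, while the recordings only influence whatever `model` has learned.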
miki (in reply to Zach Bennoui):
This confirms my hypothesis that the primary reason open source neural TTS is so bad is lack of good datasets.
For some reason, there are many companies willing to open-source their LLMs, even though they're trained on books3 and other content scraped from the internet, but that isn't happening for TTS.
Peter Vágner (in reply to Cleverson has moved):
I am training on my laptop, although it takes much more time than doing it on a high-performance GPU better suited for that task. Other people, including @Zach Bennoui and @Tamas G, are training in the cloud as described here: github.com/ZachB100/Piper-Trai…