Items tagged with: Transcription



The enshittification of AI has led people to groan at VLC's decision to use AI. I even saw a post cross my feed from someone looking for a replacement for VLC.

VLC is working on on-device realtime captioning. This has nothing to do with generating images or video using AI. This has nothing to do with LLMs.

(edit: There are claims that VLC is using a local LLM. It will use whisper.cpp, and will not be using OpenAI's models. I don't know which models they will be using. I cannot find any reference to VLC using an LLM.)

While it would be preferred to use human generated captions for better accuracy, this is not always possible. This means a lot of video media is inaccessible to those with hearing impairment.

What VLC is doing is something that will contribute to accessibility in a big way.

AI transcription is still not perfect. It has its problems. But this is one of those things that we should be hoping to advance.

I'm not looking to replace humans in creating captions. I think we're very far from ever being able to do this correctly without humans. But as I said, there's a ton of video content that simply does not have captions available, human generated or not.

So long as they're not trying to manipulate the transcription using GenAI means, this is the wrong one to demonize.

#AI #Transcription #VLC #HearingImpaired #Deaf #Accessibility



We’re a disabled-led #transcription and #ClosedCaptioning company, dedicated to creating work for #disabled, #ChronicallyIll and #neurodivergent freelancers.

By offering discounts for junior, under-funded, independent, & marginalized #researchers & #ContentCreators, we make our services accessible so you can make your content more accessible.

Thanks @SusanJonesArts!

Get a quote today at academicaudiotranscription.com….

#DisabilityPrideMonth #a11y #accessibility



Aiko iOS app: totally free, high-quality transcription, all performed on your device for privacy. This app was mentioned in the latest AppleVis Newsletter, so it is totally accessible. Just know that the app is kind of large: it is 2 GB in size. That is probably because it does everything on the device and includes all of the languages it supports.

Description from the app store followed by the link.
High-quality on-device transcription. Easily convert speech to text from meetings, lectures, and more.
The transcription is powered by OpenAI's Whisper running locally on your device. The audio never leaves your device. You can export the transcription as subtitles too. Aiko favors accuracy over speed.
Supports 100 different languages.
The app was made possible thanks to Whisper by OpenAI and whisper.cpp by Georgi Gerganov.
■ FAQ
‣ Can I edit the text in the app? I don't plan to support any editing. Export the transcription and edit it in a proper text editor.
‣ Why is the app so large? The app delivers the highest quality transcription on the market for 100 different languages. Rather than asking why it's so large, the real question is how is it so small.
‣ Why is the transcription so slow? The app favors accuracy over speed. However, performance is expected to improve in the coming months.
‣ Can I delete some of the languages to save space? This is unfortunately not possible. The model has all the languages stored together in a way that makes it impossible to remove just some languages.
More FAQs on the website.
■ Technical details
The app uses the Whisper medium or small model depending on available memory.
■ Support
You can contact me through the feedback button in the app or at sindresorhus@gmail.com

apps.apple.com/us/app/aiko/id1…

#Aiko #iOS #Quality #Transcription #Free #Blind #Accessible


Mindblowing 🤯

#Whisper is an #openSource #speechRecognition model written in #Python by #OpenAI. I’ve just seen it in action. Extract an #mp3 from a video, run it through Whisper, and it turns every spoken word into text. It even does a very decent job in #Danish. Perfect for subtitling #TV and #video. I am very impressed.

github.com/openai/whisper
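
For anyone who wants to try the "extract the audio, run it through Whisper, get subtitles" workflow, here is a rough sketch in Python. It assumes the openai-whisper package and the ffmpeg command-line tool are installed. The function names (srt_timestamp, segments_to_srt, transcribe_video) and the default "base" model are my own choices for illustration, not anything prescribed by the project, and the SRT formatting is hand-rolled rather than taken from Whisper itself.

```python
import subprocess
from pathlib import Path


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Turn Whisper-style segments (dicts with start, end, text) into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)


def transcribe_video(video_path: str, model_name: str = "base") -> str:
    """Hypothetical helper: extract audio with ffmpeg, then transcribe it."""
    import whisper  # assumption: installed via `pip install openai-whisper`

    audio_path = str(Path(video_path).with_suffix(".mp3"))
    # -vn drops the video stream; -y overwrites an existing output file.
    subprocess.run(
        ["ffmpeg", "-y", "-i", video_path, "-vn", audio_path],
        check=True,
    )
    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return segments_to_srt(result["segments"])
```

Calling transcribe_video("talk.mp4") (a made-up filename) would return subtitle text that can be saved to a .srt file. Larger models are generally more accurate but slower, so the model name is left as a parameter.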

#ai #language #transcription #speechToText