ondrosik reshared this.

The AirPods Pro 3 flight problem

Link: basicappleguy.com/basicapplebl…
Discussion: news.ycombinator.com/item?id=4…


Today I finally experienced Samsung’s HIYA spam blocking in action. It works perfectly — except that my food delivery courier couldn’t reach me. Clever guy though, he called again from a hidden number. I later found his repeated calls under “Other calls.” No idea why he ended up in the spam filter; it doesn’t seem like there’s a way to unblock him. Maybe adding him to my contacts would help…

This project looks promising. It looks like an amazing replacement for StationPlaylist. It supports playlists, automation, and looping. I sometimes play music at family events using SPL; maybe I will try this at the next one. It seems my next procrastination moment will be spent with this. github.com/michaldziwisz/sara. Latest release at github.com/michaldziwisz/sara/…

I really want to test this


For a few years I've been aware of this website that purports to be able to unlock shopping cart wheels using the speaker on your phone, but I finally had an excuse to try it, and I remembered in the moment.

A woman was outside the grocery store struggling to move her shopping cart, which was stuck because two of the wheels were locked.

I remembered the website! So I put my phone near the wheels and played the sound. The wheels unlocked like magic. She was very happy. So cool.

begaydocrime.com/



One UI 8.0 Bluetooth Call Bug Silencing Screen Readers: What It Is and Current Workarounds accessibleandroid.com/one-ui-8…


I need to reinstall Windows on my Surface tablet, and I wanted to see if I could use AI to get into the boot menu and change the boot order to try USB first. I have a ChatGPT subscription, so I used voice mode with the video camera on, gave it some information about what I was doing, and let it be my eyes for a while.
I was delighted when the assistant actually knew how to get into the boot menu. It told me I could navigate the menu with my volume keys and select an option with the power key, much like navigating an Android phone’s recovery menu. I had to remind it I was totally blind a couple of times, but eventually, it helped me select USB as the first boot device and choose the “Exit and Restart” option. This was quite a few steps, and I was genuinely impressed at how easily I was able to fix this problem without interacting with a real person.
I eagerly waited for the Windows installer to come up. It never did.
So I used OCR on the screen, and discovered that not a single thing ChatGPT claimed had happened … had actually happened.
It didn’t hallucinate just one thing. It hallucinated an entire multi-step interaction with the firmware of my tablet. It basically experienced a break from reality for about two minutes and started describing what it thought should happen, with no regard for what was actually happening.
Last week, the same app helped me learn the control panel of my heated mattress pad. It does work sometimes. But today, it led me on the wildest goose chase I’ve ever been on. I was actually trying to boot from the SD card, and as it turns out, that’s not even an option in the boot menu. But I made the mistake of telling ChatGPT exactly what I was trying to do, so it had all the material it needed to hallucinate a complex interaction convincingly.
Never let yourself forget the all important “A” in “AI”. That intelligence is not artificial as in “synthetic”, it’s artificial as in “pretend”. No LLM has the slightest idea of what it’s doing or saying. The companies that create these models have the all-important task of trying to make their intelligence more convincing than every other company.
That means they work most of the time, but the rest of the time, they will confidently lie. And that lie might be a missing digit, or it might be a whole entire interaction with a device.
I called Aira and got it sorted. I actually had to touch the arrows on the touchscreen to rearrange the boot order. Yes, I had a keyboard connected. No, there is no documented way to rearrange boot devices on a Surface using the keyboard. Yes, everything about this is moronic. But it’s done, and once I get Windows on my USB device, I’m pretty sure it will boot because an actual human told me so.
I can’t even begin to enumerate the possible clusterfucks that could arise from AI weaving such complex webs of lies. Do not use this shit for anything mission-critical. Ever. Even if it told the truth the last 99 times. Eventually, it will lie. When it does, you’ll have no idea.

in reply to Simon Jaeger

You know the cool part about that? For some of them, both their data and their prompts tell them to do that. Can't use your web search tool? Fine, just tell them something, anything, just don't tell them you can't do that. I've seen that both in a prompt and in practice. Of course, when you call it out, the next predicted response is something like "Haha, you got me there! You're right, I can't do that right now. Let's try again, carefully this time." ... And then it does, getting it even more laughably wrong.
in reply to Derek Roberts

@tcikoritys It's such a perfect storm of convoluted tech that tries so hard to pretend to be conversational, when in reality it's literally making things up as it goes. You can't even ask how it came up with its nonsense answer because it doesn't know. You'll get a different answer every time. You can feed the same context to a different model and it will be forced to accept the fact that the conversation happened the way you claim it did. In a vacuum they're a lot of fun to play with. They are surprisingly helpful sometimes. In the real world, people don't understand anything about how they work, companies don't facilitate that understanding because it doesn't benefit them, and the world conflates fancy autocomplete with the robot uprising.
in reply to ondrosik

@ondrosik I think this is a difference in how #friendica generates the home timeline feed compared to Mastodon and other common #fediverse servers. Friendica also includes replies in the home timeline feed. Some apps can filter out replies client-side; however, I think TW Blue may not be able to do that.

Did you know that you can use Subtitle Edit to transcribe audio? It has a relatively accessible GUI, and you can use Purfview's Faster Whisper XXL, CPP, CPP cuBLAS, or Const-me as the engine. A longer post on how to use it follows:

Installing Subtitle Edit


Download the program from the developer’s website. Navigate to the level 2 heading labeled “Files.”
If you want to install Subtitle Edit normally, download the first file, labeled setup.zip.
There is also a portable version available, labeled SE_version_number.zip.

If you decide to use the portable version, extract it and move on to the next section of this article. The installation itself is standard and straightforward.

A Note on Accessibility


NVDA does not automatically announce the focused item in lists.
To find out which item in the list is currently selected, press the Down Arrow key to change the item, then press NVDA+Tab to hear which one is focused.

Initial Setup


  • In the menu bar, go to Video and activate Audio to text (Whisper).
  • When using this feature for the first time, the program may ask whether you want to download FFMPEG. This library allows Subtitle Edit to open many audio and video files, so confirm the download by pressing Yes.
  • Subtitle Edit will confirm that FFMPEG has been downloaded and then ask whether you want to download Purfview’s Faster Whisper – XXL. This is the interface for the Whisper model that we’ll use for transcription, so again confirm by pressing Yes.
  • The download will take a little while.
  • Once it’s complete, you’ll see the settings window. Press Tab until you reach the Languages and models section. In the list, select the language of your recording.
  • Press Tab to move to the Select model option, and then again to an unlabeled button.
  • After activating it, choose which model you want to use. Several models are available:
    • Small models require less processing power but are less accurate.
    • Large models take longer to transcribe, need more performance and disk space, but are more accurate.
      I recommend choosing Large-V3 at this step.


  • Wait again for the model to finish downloading.


Transcribing Your First Recording


  • Navigate to the Add button and press Space to activate it.
  • A standard file selection dialog will open. Change the file type to Audio files, find your audio file on the disk, and confirm.
  • Activate the Generate button.
  • Now, simply wait. The Subtitle Edit window doesn’t provide much feedback, but you can tell it’s working by the slower performance of your computer—or, if you’re on a laptop, by the increased fan noise.
  • When the transcription is done, Subtitle Edit will display a new window with an OK button.


We Got Subtitles, So One More Step


In the folder containing your original file, you’ll now find a new file with the .srt extension.
This is a subtitle file—it contains both the text and the timing information. Since we usually don’t need timestamps for transcription, we’ll remove them in Subtitle Edit as follows:

  • Press Ctrl+O (or go to File → Open) to bring up the standard open file dialog. Select the .srt file you just got.
  • In the menu bar, open File → Export → Plain text.
  • Choose Merge all lines, and leave Show line numbers and Show timecode unchecked.
  • Press Save as and save the file normally.

If you’re transcribing multiple recordings, it’s a good idea to close the current subtitle file by starting a new project using Ctrl+N or by choosing File → New.
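If you have many .srt files, the same cleanup can also be scripted. The following is only a minimal sketch of the idea, not what Subtitle Edit does internally: it assumes a well-formed SRT file, drops the cue numbers and timing lines, and merges the remaining text into one line. The function name and the sample text are made up for illustration.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}")


def srt_to_text(srt: str) -> str:
    """Return the spoken text of an SRT document merged into a single line."""
    kept = []
    lines = srt.lstrip("\ufeff").splitlines()  # drop a byte order mark if present
    for i, line in enumerate(lines):
        line = line.strip()
        if not line or TIMING.match(line):
            continue  # skip blank separators and timing lines
        # A bare number directly followed by a timing line is a cue index.
        if line.isdigit() and i + 1 < len(lines) and TIMING.match(lines[i + 1].strip()):
            continue
        kept.append(line)
    return " ".join(kept)


# Hypothetical sample, only to show the shape of the input.
sample = """1
00:00:00,000 --> 00:00:02,500
Hello and welcome.

2
00:00:02,500 --> 00:00:05,000
Today we talk about
transcription.
"""

print(srt_to_text(sample))
```

The export dialog in Subtitle Edit remains the simpler route for a single file; a script like this mainly helps when processing a whole folder.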

Conclusion


Downloaded models can, of course, be reused, so future transcriptions will go faster.
In this example, I used Purfview’s Faster Whisper. If you want to use a different model, you can select it from the model list, and Subtitle Edit will automatically ask whether you’d like to download it.

Peter Vágner reshared this.


Announcing AudioCapture. A win32 application to capture audio from a process and save it to an audio file. Full disclosure: This was written with Claude Code. Why? Because I'm not an experienced C++ programmer, however I saw an idea for an app and no one else was going to write it, so I did it myself this way. The full code is available, so if you wish to contribute, feel free. github.com/masonasons/AudioCap…


in reply to Andre Louis

Revisiting this thread. For both of you on Windows 10, could you try the latest release of AudioCapture and see if process capture works for you? I've discovered a bug with Windows 10 where the process loopback capture doesn't release properly, and unfortunately this means that the process can only be captured one time. If you uncapture the process, you either have to restart the process, or AudioCapture itself. github.com/masonasons/AudioCap…
in reply to Andre Louis

Is the executable signed? I have a command I run on the executable before I release it so Defender apparently doesn’t flag it. I don’t know if it works, though. In case someone needs it, here is the command; install the Windows SDK first. In PowerShell:
& 'C:\Program Files (x86)\Windows Kits\10\bin\10.0.26100.0\x64\signtool.exe' sign /fd SHA256 /tr timestamp.digicert.com /td SHA256 /a '.\iptvclient.exe'
If you're in CMD:
"C:\Program Files (x86)\Windows Kits\10\bin\10.0.26100.0\x64\signtool.exe" sign /fd SHA256 /tr timestamp.digicert.com /td SHA256 /a ".\iptvclient.exe"

For the last 3 months I have been using VDO Ninja for all my remote interview and podcast recordings. Here is my article about it from a blind user's perspective, focused on accessibility and audio.

Have You Ever Wanted to Record an Interview or Podcast Online? You’ve probably faced a few challenges:
How to transmit audio in the highest possible quality?
How to connect in a way that doesn’t burden your guest with installing software?
And how to record everything, ideally into separate tracks?

The solution to these problems is offered by the open-source tool VDO Ninja.

What Is VDO Ninja


It’s an open-source web application that uses WebRTC technology. It allows you to create a P2P connection between participants in an audio or video call and gives you control over various transmission parameters.
You can decide whether the room will include video, what and when will be recorded, and much more.

In terms of accessibility, the interface is fairly easy to get used to — and all parameters can be adjusted directly in the URL address when joining.
All you need is a web browser, either on a computer or smartphone.

Getting Started


The basic principle is similar to using MS Teams, Google Meet, and similar services.
All participants join the same room via a link.
However, VDO Ninja distinguishes between two main types of participants: Guests and the Director.
While the guest has limited control, the director can, for example, change the guest’s input audio device (the change still must be confirmed by the guest).

A Few Words About Browsers


VDO Ninja works in most browsers, but I’ve found Google Chrome to be the most reliable.
Firefox, for some reason, doesn’t display all available audio devices, and when recording multiple tracks, it refuses to download several files simultaneously.

Let’s Record a Podcast


Let’s imagine we’re going to record our podcast, for example, Blindrevue.
We can connect using a link like this:

https://vdo.ninja/?director=Blindrevue&novideo=1&proaudio=1&label=Ondro&autostart=1&videomute=1&showdirector=1&autorecord&sm=0&beep

Looking at the URL more closely, we can see that it contains some useful instructions:
  • director – Defines that we are the director of the room, giving us more control. The value after the equals sign is the room name.
  • novideo – Prevents video from being transmitted from participants. This parameter is optional but useful when recording podcasts to save bandwidth.
  • proaudio – Disables effects like noise reduction, echo cancellation, automatic gain control, compression, etc., and enables stereo transmission.
    Be aware that with this setting, you should use headphones, as echo cancellation is disabled, and otherwise, participants will hear themselves.
  • label=Ondro – Automatically assigns me the nickname “Ondro.”
  • autostart – Starts streaming immediately after joining, skipping the initial setup dialog.
  • videomute – Automatically disables the webcam.
  • showdirector – Displays our own input control panel (useful if we want to record ourselves).
  • autorecord – Automatically starts recording for each participant as they join.
  • sm=0 – Ensures that we automatically hear every new participant without manually unmuting them.
  • beep – Plays a sound and sends a system notification when new participants join (requires notification permissions).
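To double-check a link before sending it around, you can split it into its parameters. This is a small sketch using only Python's standard library; the helper name list_params is my own, and the URL is the director link from above.

```python
from urllib.parse import urlsplit, parse_qsl

DIRECTOR_URL = (
    "https://vdo.ninja/?director=Blindrevue&novideo=1&proaudio=1&label=Ondro"
    "&autostart=1&videomute=1&showdirector=1&autorecord&sm=0&beep"
)


def list_params(url: str) -> list[tuple[str, str]]:
    """Return (name, value) pairs from the query string, in order."""
    # keep_blank_values=True keeps bare flags such as "autorecord" and "beep",
    # which would otherwise be dropped because they have no value.
    return parse_qsl(urlsplit(url).query, keep_blank_values=True)


for name, value in list_params(DIRECTOR_URL):
    print(f"{name}: {value or '(flag, no value)'}")
```

Parameters written without an equals sign come back with an empty value, which makes it easy to tell flags apart from settings.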

For guests, we can send a link like this:

https://vdo.ninja/?room=Blindrevue&novideo=1&proaudio=1&label&autostart=1&videomute=1&webcam

Notice the differences:
  • We replaced director with room. The value must remain the same, otherwise the guest will end up in a different room.
  • We left label empty — this makes VDO Ninja ask the guest for a nickname upon joining.
    Alternatively, you can send personalized links, e.g., label=Peter or label=Marek.
  • The webcam parameter tells VDO Ninja to immediately stream audio from the guest’s microphone; otherwise, they’d need to click “Start streaming” or “Share screen.”
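If you invite guests often, generating the links can be automated. The sketch below assumes only the parameters described in this article; the function names build_link and guest_link are hypothetical. A value of None is written as a bare flag, which is how the empty label and the webcam parameter appear in the guest link above.

```python
from typing import Optional
from urllib.parse import quote


def build_link(params: dict, base: str = "https://vdo.ninja/") -> str:
    """Join parameters into a link; None values become bare flags."""
    parts = []
    for name, value in params.items():
        if value is None:
            parts.append(quote(name))  # bare flag such as "webcam"
        else:
            parts.append(f"{quote(name)}={quote(str(value))}")
    return base + "?" + "&".join(parts)


def guest_link(room: str, label: Optional[str] = None) -> str:
    """Build a guest link for the given room, optionally with a preset nickname."""
    return build_link({
        "room": room,
        "novideo": 1,
        "proaudio": 1,
        "label": label,  # None -> bare "label", so the guest is asked for a name
        "autostart": 1,
        "videomute": 1,
        "webcam": None,
    })


print(guest_link("Blindrevue"))
for name in ("Peter", "Marek"):
    print(guest_link("Blindrevue", name))
```

Quoting the values means a nickname with spaces or accented characters still produces a valid link.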


How to Join


Simply open the link in a browser.
In our case, the director automatically streams audio to everyone else.
Participants also join by opening their link in a browser.
If a nickname was predefined, they’ll only be asked for permission to access their microphone and camera.
Otherwise, they’ll also be prompted to enter their name.

Usually, the browser will display a permission warning.
Press F6 to focus on it, then Tab through available options and allow access.

Controls


The page contains several useful buttons:

  • Text chat – Toggles the text chat panel, also allows sending files.
  • Mute speaker output – Mutes local playback (others can still hear you).
  • Mute microphone – Mutes your mic.
  • Mute camera – Turns off your camera (already muted in our example because of the videomute parameter).
  • Share screen / Share website – Allows screen or site sharing.
  • Room settings menu (director only) – Shows room configuration options.
  • Settings menu – Lets you configure input/output devices.
  • Stop publishing audio and video (director only) – Stops sending audio/video but still receives others.


Adjusting Input and Output Devices


To change your audio devices:

  1. Activate the Settings menu.
  2. Press C to jump to the camera list — skip this for audio-only.
  3. Open Audio sources to pick a microphone.
  4. In Audio output destination, select your playback device. Press the Test button to test it.
  5. Close settings when done.


Director Options


Each guest appears as a separate landmark on the page.
You can navigate between them quickly (e.g., using D with NVDA).

Useful controls include:

  • Volume slider – Adjusts how loud each participant sounds (locally only).
  • Mute – Silences a guest for everyone.
  • Hangup – Disconnects a participant.
  • Audio settings – Adjusts their audio input/output remotely.


Adjusting Guest Audio


Under Audio settings, you can:

  • Enable/disable filters (noise gate, compressor, auto-gain, etc.).
  • View and change the guest’s input device — if you change it, a Request button appears, prompting the guest to confirm the change.
  • Change the output device, useful for switching between speaker and earpiece on mobile devices.


Recording


Our URL parameters define automatic recording for all participants.
Recordings are saved in your Downloads folder, and progress can be checked with Ctrl+J.

Each participant’s recording is a separate file.
For editing, import them into separate tracks in your DAW and synchronize them manually.
VDO Ninja doesn’t support single-track recording, but you can use Reaper or App2Clap with a virtual audio device.

To simplify synchronization:

  1. Join as director, but remove autorecord.
  2. Wait for everyone to join and check audio.
  3. When ready, press Alt+D to edit the address bar.
  4. Add &autorecord, reload the page, and confirm rejoining.
  5. Recording now starts simultaneously for everyone.
  6. Verify this in your downloads.


Manual Recording


To start recording manually:

  1. Open the Room settings menu.
  2. Go to the Room settings heading.
  3. Click Local record – start all.
  4. Check PCM recording (saves uncompressed WAV).
  5. Check Audio only (records sound without video).
  6. Click Start recording.


Important Recording Notes


  • Always verify that all guest streams are recording.
  • To end recordings safely, click Hangup for each guest or let them leave.
  • You can also toggle recording for each guest under More options → Record.
  • Files are saved as WEBM containers. If your editor doesn’t support it, you can convert them using the official converter.
  • Reaper can open WEBM files but may have editing issues — I prefer importing the OPUS audio file instead.
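As an alternative to the official converter, the audio can usually be pulled out of a WEBM file with ffmpeg. This is a sketch under assumptions: it presumes ffmpeg is installed and on your PATH, that the recording contains an Opus audio stream (it will not work this way for PCM recordings), and the file name guest.webm is only a placeholder.

```python
import shutil
import subprocess
from pathlib import Path


def extract_audio_cmd(webm: str) -> list[str]:
    """Build an ffmpeg command that copies the audio stream into an .opus file."""
    out = str(Path(webm).with_suffix(".opus"))
    # -vn drops any video stream; -c:a copy copies the audio without re-encoding.
    return ["ffmpeg", "-i", webm, "-vn", "-c:a", "copy", out]


if __name__ == "__main__":
    cmd = extract_audio_cmd("guest.webm")  # placeholder file name
    print(" ".join(cmd))
    # Only run the command when ffmpeg and the input file are actually present.
    if shutil.which("ffmpeg") and Path("guest.webm").exists():
        subprocess.run(cmd, check=True)
```

Because the stream is copied rather than re-encoded, there is no quality loss and the conversion takes only a moment.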


Recommended Reading


In this article, I’ve covered only a few features and URL parameters.
For more details, check the VDO Ninja Documentation.


I've just released App2Clap, a CLAP plug-in that captures audio from a specific Windows application. It can be used, for example, to record audio from another application into a DAW like #REAPER. Windows 11 is required. It's very early days, but it seems to work here. Check it out at app2clap.jantrid.net/


in reply to Jamie Teh

I forgot to mention that the App2Clap project now includes In2Clap, which captures audio from a specific Windows audio input device. This can be used, for example, to capture input from a different audio device, since DAWs generally only support input from a single device. Doing that obviously isn't ideal due to latency, etc., but there are obscure use cases. app2clap.jantrid.net/
