🇨🇦Samuel Proulx🇨🇦

2 months ago

🇨🇦Samuel Proulx🇨🇦
2 months ago

Final update: The developer is now on Mastodon via @andrew_guide.

Update: The developer has removed the ability to download Guide until the security issues mentioned in the linked thread are fixed.

Update: this product contains some code flaws that are concerning from a security perspective, beyond just giving control of your computer to an LLM. You might want to read this thread before installing the product: toot.cafe/@matt/114258349401221651

Update: I've exchanged some long emails with Andrew, the lead developer. He's open to dialogue, and moving the project in the right direction: well-scoped single tasks, more granular controls and permissions, etc. He doesn't strike me as an #AI maximalist can and should do everything all the time kind of guy. He's also investigating deeper screen reader interaction, to let AI just do the things we can't do that it's best at. I stand by my thoughts that the project isn't yet ready for prime time. But as someone else in the thread said, I don't think it should be written off entirely as yet another "AI will save us from inaccessibility" hype train. There is, in fact, something here if it gets polished and scoped a bit more.

Just tried guide for fun. It's supposed to be an app to use #AI to help #blind folks get things done. I asked "Where are the best liver and onions in Ottawa?" It:
1. Decided it needed to search the web.
2. Thought that the "stardew access" icon on my desktop was a kind of web browser, so clicked it.
3. Imagined an "accept cookies" dialogue it needed to accept.
4. Decided that didn't work, so looked for Google Chrome (I don't have chrome installed on that machine)
5. Finally opened edge from the start menu. By the way, it just...left Stardew open and running. Because apparently having Stardew Valley running in the background is a vital part of finding liver and onions in Ottawa.
6. Opened a random extension from my edge toolbar (goodlinks).
7. Clicked the address bar and loaded google.com, instead of just doing the search right from the address bar.
8. Got blocked because it couldn't sign into my Google account, even though it could have also searched from the Google homepage.

To be fair to AI, that was the kind of open-ended task AI is terrible at. If I had asked it to check an inaccessible checkbox, or read a screenshot, or something, I'm sure it would have been fine.

Anyway, I'm still better at using a computer than an AI. So is my 87 year old grandfather, for that matter. www.guideinteraction.com

Guide - AI assistant for people with low vision or blindness

With Guide, inaccessible doesn't have to mean undoable. Guide is a Windows AI assistant that helps people with low vision or blindness navigate the digital world.

^Guide

Matt Campbell

2025-03-31 18:00:52

There's a new product that has been gaining some buzz in the blind community, a Windows app called Guide that uses AI to perform tasks on your computer. It's pitched as a way to get around web accessibility problems in particular. I won't link to the thing itself, because I don't want to give it that validation, but I'll link to a previous discussion thread about it: fed.interfree.ca/notes/a5wf4ys…
I've spent some time taking this app apart. The level of shoddy work here is deeply disgusting. 1/?

This entry was edited (2 months ago)

Tissman reshared this.

in reply to 🇨🇦Samuel Proulx🇨🇦

🇨🇦Samuel Proulx🇨🇦

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

So I wanted to try guide on a real accessibility issue. However, it seems that #codeberg has finally fixed their #inaccessible#captcha. Now, if you tab into the #captcha field, you're told what you need to type to get past it. Good job codeberg! #a11y

#a11y #codeberg #captcha #inaccessible

Pitermach reshared this.

in reply to 🇨🇦Samuel Proulx🇨🇦

They Might Be Stupid

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

is codeberg staying online consistently for you? Last I heard, they were still combatting AI training models, or something AI-related, that was sapping up all of their resources.

Made me revisit how things in Forgejo were progressing toward federating pull requests without resorting to emailing patches.

in reply to They Might Be Stupid

🇨🇦Samuel Proulx🇨🇦

in reply to They Might Be Stupid 2 months ago

@thelonelyghost It seems to be. But I’ve only had the account for a day, and all I’ve done is migrate some GitHub repos.

@They Might Be Stupid

in reply to 🇨🇦Samuel Proulx🇨🇦

They Might Be Stupid

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

I've had an account to username squat mainly, and had lofty goals of migrating off of GitHub, but hadn't put in the time to test the CICD system. Didn't want to tax the community resource if BigCorp was willing to foot the bill elsewhere. Then again, Copilot training. Ugh.

Maybe I'll revisit it again this weekend.

in reply to They Might Be Stupid

🇨🇦Samuel Proulx🇨🇦

in reply to They Might Be Stupid 2 months ago

@thelonelyghost Let me know how it goes if you do!

@They Might Be Stupid

in reply to 🇨🇦Samuel Proulx🇨🇦

The Evil Chocolate Cookie

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

the residence of pelican town will be interested to know that their universe is now a web browser

in reply to The Evil Chocolate Cookie

🇨🇦Samuel Proulx🇨🇦

in reply to The Evil Chocolate Cookie 2 months ago

@evilcookies98 But do they have liver and onions?

@The Evil Chocolate Cookie

in reply to 🇨🇦Samuel Proulx🇨🇦

Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

This seems vastly overkill for what I want, a captcha solver, but do you have any suggestions on how to make this solve image only captchas? Or if there's a better tool just for that task I'd love it!

in reply to Robert Kingett

🇨🇦Samuel Proulx🇨🇦

in reply to Robert Kingett 2 months ago

@WeirdWriter Anything I would suggest relies on lllms. And I know you think those are evil and won’t use them even for accessibility.

@Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦

Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

I guess you didn't notice me using the AI feature in the be My Eyes app for Windows to read handwritten scanned PDF's, but putting aside your attempt at snark, would love suggestions in this area. You can criticize tech culture and still use tech, you know.

in reply to Robert Kingett

🇨🇦Samuel Proulx🇨🇦

in reply to Robert Kingett 2 months ago

@WeirdWriter Your criticism is most likely to harm the access that even you apparently hypocritically rely on. So I’m unlikely to recommend you something I rely on for my living, lest you insight a backlash against it. It’s a matter of trust.

@Robert Kingett

in reply to Robert Kingett

🇨🇦Samuel Proulx🇨🇦

in reply to Robert Kingett 2 months ago

@WeirdWriter I don’t post about the AI that actually works on the fediverse. The anti AI crowd here is likely to get upset that it can solve captchas, describe adult images, or make infographics for me, and attempt to shut it down or take it away.

@Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦

Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

I can assure you none of us have that kind of power. If anything, it will be a lack of long term financial investments for these companies that causes them to go under, which is outside the critiques of the Fediverse.

in reply to Robert Kingett

🇨🇦Samuel Proulx🇨🇦

in reply to Robert Kingett 2 months ago

@WeirdWriter Maybe not. But I’m not suffering under the delusion that the Luddite’s and anti AI crowd are my friends anymore than the gross capitalist growth at all costs tech bros are. As blind folks we have to take what we can from both. Because both will screw us if it gets them an inch closer to their goals. So I keep the AI that actually works off the fediverse.

@Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦

Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

Shame because the luddites are making some pretty cool scripts and software. But while I believe the worst that can happen is you'll just get scolded by anti-AI folks, I actually fully support doing what you gotta do to survive under this world.

in reply to Robert Kingett

🇨🇦Samuel Proulx🇨🇦

in reply to Robert Kingett 2 months ago

@WeirdWriter Yes, I use some of them. But I’m not gonna tell them about the AI that can solve captchas they’re using to block AI crawlers.

@Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦

Tara Owton

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

Anyone who is anti AI has never actually used it properly, they probably used it ones, it’s made a mistake or hallucination, and then they think it’s total rubbish.

in reply to Tara Owton

🇨🇦Samuel Proulx🇨🇦

in reply to Tara Owton 2 months ago

@oasisfan@WeirdWriter I wouldn't go that far. But most anti AI folks are fanatics, and thus dangerous to accessibility, as they will happily throw the baby out with the bathwater.

@Tara Owton @Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦

Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

I will just leave this here, from a blind developer toot.cafe/@matt/11425836163340…

Matt Campbell

2025-03-31 18:03:58

First, it's an Electron+Python monstrosity. Specifically, the Python backend runs as a web server on the local machine, and the Electron frontend connects to that local web server. Along with the size of Electron itself, the frontend app is about 27 MB, mostly a node_modules tree with no hint of tree-shaking / dead code elimination. The front-end JavaScript code is not minified at all, so once you extract the .asar file, it's easy to look at it. 2/?

in reply to Robert Kingett

🇨🇦Samuel Proulx🇨🇦

in reply to Robert Kingett 2 months ago

@WeirdWriter@oasisfan If you had checked, you would have found I updated my original post.

@Tara Owton @Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦

Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

Except that I’m not Tyler. You got the wrong person.

in reply to Robert Kingett

🇨🇦Samuel Proulx🇨🇦

in reply to Robert Kingett 2 months ago

@WeirdWriter@oasisfan Uh what?

@Tara Owton @Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦

Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

Sorry about that. Mona moved focus from a different conversation

in reply to Robert Kingett

🇨🇦Samuel Proulx🇨🇦

in reply to Robert Kingett 2 months ago

@WeirdWriter@oasisfan So what you're saying is that you got the wrong person. LOL

@Tara Owton @Robert Kingett

in reply to Robert Kingett

Tara Owton

in reply to Robert Kingett 2 months ago

Yep, it can solve those CAPTCHAs where you have to choose the right picture, but the cloud vision NVDA add-on can do this, it just won’t click on things for you, but it can tell you what the pictures are.

in reply to Tara Owton

🇨🇦Samuel Proulx🇨🇦

in reply to Tara Owton 2 months ago

@oasisfan@WeirdWriter I don't trust cloud vision. It's an addon written in Russian, that sends all data to a Russian domain. Including your Be My Eyes username and password if you give it that. If it was communicating with the services directly, fine. But if you check the sourcecode, it's not.

@Tara Owton @Robert Kingett

in reply to 🇨🇦Samuel Proulx🇨🇦

Jage

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

Yup for now it's definitely better with specific directions rather than blanket tasks.

in reply to Jage

🇨🇦Samuel Proulx🇨🇦

in reply to Jage 2 months ago

@Jage Also, the only way to start a new chat is to open the program and close it again?

@Jage

in reply to 🇨🇦Samuel Proulx🇨🇦

Jage

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

Or put another way, I'm not willing to write off the idea.

in reply to Jage

🇨🇦Samuel Proulx🇨🇦

in reply to Jage 2 months ago

@Jage Nope. It's far, far better at Balatro than me. And now I'm upset and I hate it forever. ROFL JK

@Jage

in reply to Jage

James Scholes

in reply to Jage 2 months ago

I think the lack of a privacy/"what I'll do with your data" section of the website is a big miss for a product that will have access to do whatever it feels like on my computer. @fastfinge

@🇨🇦Samuel Proulx🇨🇦

This entry was edited (2 months ago)

in reply to James Scholes

🇨🇦Samuel Proulx🇨🇦

in reply to James Scholes 2 months ago

@jscholes@Jage On a more serious note, I think the interface as presented is just way, way too generalist, and its freedom too unrestricted. Things I'd like to see:
1. it gets prompted with the name of the currently focused app, and the mouse is disallowed from leaving that app window.
2. Bringing it up with alt+ctrl+g gives prompts of what tasks it thinks it could perform inside the current app. Instead of the current general "ask me to do anything!"
3. It should have access to the DOM for browsers so it's not taking screenshots and acting only on that all the time if it's being asked to do something on a webpage.
4. It really, really needs training on NVDA focus mode. It says to turn off speech, but that doesn't solve for it trying to type into an edit field when NVDA isn't in focus mode. It does this constantly.

I still don't think it'd be ready for prime time, but it would be closer.

@James Scholes @Jage

in reply to James Scholes

🇨🇦Samuel Proulx🇨🇦

in reply to James Scholes 2 months ago

@jscholes@Jage It's also a security thing. Sure, it was pretty funny that it opened Stardoo Valley. But if I was using it on my real machine, I can think of a lot of apps I wouldn't want it to unexpectedly launch.

@James Scholes @Jage

in reply to 🇨🇦Samuel Proulx🇨🇦

starkraving666

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

holy shit. and then it sent all your personal data to palantir. UGH we have to stop using this shit!!!

in reply to starkraving666

🇨🇦Samuel Proulx🇨🇦

in reply to starkraving666 2 months ago

@starkraving666 It's useful for a lot of things. Narrowly defined tasks like "describe this image" or "What is this control" or "what does this say" especially. But we're trying to create giant mega-models with all the data instead of tiny specialized models with the right data. I don't need the same AI that gives me alt-text for images I can't see to also write me python code and draw me a unicorn and tell me a story. If I want those things (and sometimes I do), they should be different specialized AI's that just do that one thing. It's okay if the Python-writing AI can't read me the menu when I go out to eat.

@starkraving666

in reply to 🇨🇦Samuel Proulx🇨🇦

aburka 🫣

in reply to 🇨🇦Samuel Proulx🇨🇦 2 months ago

this kind of thing, where you just give an LLM full access to fuck around on your computer, terrifies me. Even more so that it's designed to be used by people who literally can't see what it's doing. How can anyone think this is safe?

in reply to aburka 🫣

🇨🇦Samuel Proulx🇨🇦

in reply to aburka 🫣 2 months ago

We can hear exactly what it’s doing with our screen readers. And there is a stop hot key. I’m not concerned about safety. I’m more concerned about privacy, and correctly setting expectations. I work in tech, and I spend at least a couple of hours a day struggling with inaccessible apps and websites, and have to have a co-worker do certain tasks for me. Anything that can make this better is of interest to me. Sure, everyone should build accessible apps. But for the most part, nobody bothers. But I still have to work. However, solutions like this need to be strictly sandboxed to the app I actually need help with, not given access to the entire computer. Not only is this more secure, private, and safer, it also works better. The AI doesn’t need to get distracted by the fact that I have BBalatro installed, or that a windows notification popped up. Just give it the current window.Also, the examples for use need to be more like “Ask me to check a keyboard inaccessible checkbox for you” or “I can help you read an image without alt text”. The current examples are more like “Ask me to book a flight for you”. In its current form, AI can’t do that. The developer has strongly committed to fixing both of these things.

This entry was edited (2 months ago)

⇧

🇨🇦Samuel Proulx🇨🇦

🇨🇦Samuel Proulx🇨🇦 2 months ago • •

🇨🇦Samuel Proulx🇨🇦
2 months ago