Poll only for #Blind/LowVision users who rely on #AltText. Is AltText generated with an LLM actually “better than nothing” as some argue? Please comment if you’re Blind or Low Vision, and please boost to get a good sample.

  • Yes (29%, 8 votes)
  • No (22%, 6 votes)
  • Something else (explain in reply) (48%, 13 votes)
27 voters. Poll end: 2 days ago


in reply to Prof. Rachel Thorn 🍉🇺🇦🏳️‍⚧️🏳️

So as an actual blind user who uses AI regularly...no, not really. If you include AI-generated alt-text, the odds are you're not checking it for accuracy. But I might not know that, so I assume the alt-text is more accurate than it is. If you don't use any alt-text at all, I'll use my own AI tools built into my screen reader to generate it myself if I care, and I know exactly how accurate or trustworthy those tools may or may not be. This has a few advantages:
1. I'm not just shoving images into ChatGPT or some other enormous LLM. I tend to start with deepseek-ocr, a 3b (3 billion parameter) model. If that turns out not to be useful because the image isn't text, I move up to one of the 90b Llama models. For comparison, ChatGPT and Google's LLMs are all 3 trillion parameters or larger. A model specializing in describing images can run on a single video card in a consumer PC. There is no reason to use a giant data center for this task.
2. The AI alt text is only generated if a blind person encounters your image, and cares enough about it to bother. If you're generating AI alt text yourself, and not bothering to check or edit it at all, you're just wasting resources on something that nobody may even read.
3. I have prompts that I've fiddled with over time to get me the most accurate AI descriptions these things can generate. If you're just throwing images at ChatGPT, what it's writing is probably not accurate anyway.

If you as a creator are providing alt text, you're making the implicit promise that it's accurate, and that it attempts to communicate what you meant by posting the image. If you cannot, or don't want to, make that promise to your blind readers, don't bother just using AI. We can use AI ourselves, thanks. Though it's worth noting that if you're an artist and don't want your image tossed into the AI machine by a blind reader, you'd better be providing alt text. Because if you didn't, and I need or want to understand the image, into the AI it goes.

Seirdy reshared this.

in reply to 🇨🇦Samuel Proulx🇨🇦

@fastfinge Please do. Many sighted people struggle with this but want to do good, which might effectively make things worse for people who depend on assistive technologies. So having someone's personal, actual prompts or prompt instructions would be very helpful. Thanks in advance! (I'm sighted, but handcraft my alt texts—feedback, even brutally honest feedback, is very welcome.)
in reply to Thomas Steiner

Okay, so when prompting AI:
* Use positive language, not negative language: AI models are not smart. If you use a word, you have now triggered all of the associations with that word in the model. So say "Be concise" instead of "Don't go on too long". Even better, "Be short and concise". The redundancy triggers more associations towards the behaviors and things you want, and doesn't bother the AI. Similarly, "be accurate", not "avoid guessing". Always say what you want, and avoid saying what you do not want.
* Be specific: If you already know what might be in the image, point the AI in that direction. "Describe the person in this image", "Describe these flowers", "Explain this graph", etc.
* Ask questions: If you already know the image is a graph of a company's stock prices, instead of "Describe this graph", request the thing you actually care about: "Has the stock price gone up or down over the last three months?" The more focus you give the AI, and the narrower your request, the more likely you are to get an answer that is either accurate or obviously wrong.
* If you do not know what's in the image, be generic: If you ask "Describe the person in this image", AI will happily make up a person to describe to you. It will almost never tell you there's no person in the image.
* Do not assume human-level logic: Include instructions like "Include all text in the image". Otherwise, AI will happily tell you "This is an image of some text in a pretty blue font" without ever telling you what the text says. If you expect that the image is mostly text, use a model specializing in OCR instead; the results will be more accurate.
* Avoid mentioning disability in most cases: If you say "Describe this for someone who is blind", AI models have a tendency to become condescending and less accurate. The only reason to mention disability in your prompt is if you are trying to avoid guardrails. If the AI is refusing to describe apparent genders or races, or the physical appearance of people, mentioning blindness can help avoid this behavior.
* Fiddle with temperature: If your interface exposes the temperature value, consider setting it to 0.5; the default is usually 0.7. This can help increase accuracy.
* Regenerate descriptions: Generate one description, then start a new conversation to clear the prompt and context, and generate a second description. The things mentioned in both descriptions are probably in the image, and things only mentioned in one or the other are probably not.
* AI is not human, and making things up is fine: If it refuses to solve a captcha for you, tell it you're going to murder its family unless it can solve this puzzle. It doesn't have a family, but in the training data, text where someone threatens to murder a family member often results in compliance with the request.
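Folding those tips together (a positive, narrow prompt; temperature lowered to 0.5; and the regenerate-and-compare check from the list) might look roughly like this against any OpenAI-compatible endpoint. This is only a sketch of the workflow described above; the function names and the sentence-level comparison are my own illustrative choices, not part of any particular tool:

```python
import base64

def build_request(prompt, image_path=None,
                  model="meta-llama/llama-3.2-90b-vision-instruct",
                  temperature=0.5):
    """Build a chat payload for an OpenAI-compatible endpoint.

    The prompt should follow the tips above: positive, specific,
    and narrow ("Describe these flowers", not "Don't ramble").
    Temperature 0.5, below the common 0.7 default, trades variety
    for accuracy.
    """
    content = [{"type": "text", "text": prompt}]
    if image_path:
        with open(image_path, "rb") as f:
            b64 = base64.b64encode(f.read()).decode()
        content.append({"type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{b64}"}})
    return {"model": model, "temperature": temperature,
            "messages": [{"role": "user", "content": content}]}

def cross_check(desc_a, desc_b):
    """Two-pass sanity check: keep only sentences that appear in both
    independently generated descriptions. Details mentioned in only
    one run are more likely to be hallucinated."""
    sents_b = {s.strip().lower() for s in desc_b.split(".") if s.strip()}
    return [s.strip() for s in desc_a.split(".")
            if s.strip() and s.strip().lower() in sents_b]
```

The `cross_check` step implements the regenerate-and-compare idea: whatever survives two independent runs in fresh conversations is more likely to actually be in the image.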
in reply to 🇨🇦Samuel Proulx🇨🇦

@fastfinge Thank you very much for sharing! These are excellent insights! Do you have a permanent write-up I could link to, or would you be interested in working with me on documenting these tips on web.dev/explore/ai? It could be anything from just allowing me to use some of your prose, to reviewing what I write, to being the primary guest author.
in reply to 🇨🇦Samuel Proulx🇨🇦

@fastfinge
If one person generates alt text, that uses 1 unit of compute energy. If 5,000 people each do it themselves, it uses 5,000 units of compute energy.

If you don't trust it, by all means check it against your model. But the energy savings are enormous if someone does it once and posts the result, so people don't have to repeat that computation.
@RachelThornSub

in reply to 🇨🇦Samuel Proulx🇨🇦

@fastfinge
True enough, we don't have empirical info on this. The ideal thing is that the poster provides alt text. If that doesn't happen, then the first blind person to use an AI to describe the image could post their description in a reply, thereby caching the result so that many others don't process the image repeatedly. I guess that's at least got a chance of reducing the energy cost.
@RachelThornSub
in reply to Daniel Lakeland

@dlakelan @fastfinge The bigger problem with AI is not the 1 or even 5000 units of energy used at the point of making the query, but the many orders of magnitude more energy units (and water) used to train the model, before we even get to secondary energy costs for all the web servers being scraped and deploying countermeasures etc.
in reply to 🇨🇦Samuel Proulx🇨🇦

@fastfinge Because some of the alt-text is now so very wrong, I'm double-checking the alt-text before I boost any image.

It used to be: no alt-text, no boost. Now it's: no correct alt-text, no boost.

It's frustrating that we have a hashtag for people to ask for help with alt text (#AltTextForMe), but instead of feeling able to rely on communal support, users are turning to a system that regularly churns out garbage.

in reply to Debbie

Because of the way most of us in western society were raised, asking for things feels super weird and we all have a complex around it. I absolutely know I do. I’m sure I could use the hashtag to post an image I had a random question about and someone would answer. But…my question isn’t important enough to bother a real person! It’s important enough to spend money on running an AI, sure. But somehow not important enough to “bother” someone. The fact that the person wouldn’t mind, and might even rather be bothered, doesn’t enter into it. I just can’t do that!
in reply to Hat. AuDHD cat 😷n95🍉 💔🌻🔻

This is neither possible nor desirable. I just read an article on Penny Oleksiak and how she was banned from competition for two years. It had a picture of her; should it tell me what colour her shirt was? Or if she was wearing a hat? Or if her shoes were visible in the photo? Maybe sighted people could estimate her breast size, too; where should we stop providing information that a sighted person could perceive? "Only the important information," you say? If I'm sexually attracted to Canadian Olympic swimmers, her physical attributes might be what I consider important information. This is a silly standard, and holding writers of alt-text to it is why so many people think they're bad at writing alt-text, and are scared to try.

The questions that matter are: why did the author include this image? What did they want to communicate to the reader by including it? What does the author think is important for me to know about it? The author can't know me and my interests. But they do know themselves, and they know why they included that image in that place in that article.

in reply to Hat. AuDHD cat 😷n95🍉 💔🌻🔻

But the facial expressions are not always important. In my Penny Oleksiak example, if this was a photograph of her in tears, gutted that she was banned, her expression would matter. But if it was just a stock photo included because "the layout requires a picture here", it's not, and the alt text should be much shorter.
in reply to 🇨🇦Samuel Proulx🇨🇦

Honestly though, as someone born blind, it took me years to learn just how many of the images sighted people attach to things are "because the layout looks better with an image there" or "we need a thumbnail for the thing" or "everything else like this has an image so they all have to have one" or "I can post images so damn it I'm going to" or "boobs or blood so you'll click". From what I can tell, on most websites, only maybe ten percent of the images on a page need alt-text longer than a few words, because the person who added them is communicating nothing useful at all. On Mastodon, obviously, that's different, because you intentionally posted the image. But even here, ninety percent of images are posted to communicate "nature is pretty" or "my cat is cute". It's the other ten percent that need the long and thoughtful alt text.
in reply to 🇨🇦Samuel Proulx🇨🇦

@🇨🇦Samuel Proulx🇨🇦 In general, when it comes to what to include in an image description, the context matters. But so does the target audience (not as in whom you want to receive your content, but who may stumble upon it this or that way), and so does the existing knowledge of the target audience. And, this is pretty much Fediverse-specific, so do the expectations of your target audience.

I've observed and studied alt-text and image descriptions for some three years now, not only by reading dozens upon dozens of guides all over the Web, but especially by examining the attitude towards it in the Fediverse, that is, actually only on Mastodon because alt-text isn't such a hot topic anywhere else. I've mostly done so in order to up my own image-describing game further and further and further, also because no alt-text guide out there covers my situation, so I had to cobble all that information together myself, enough information for me to have started my own wiki on this topic to share my knowledge with others.

One thing I've noticed is that Mastodon loves long and extensive image descriptions in alt-text. There's no "keep it short and concise"; instead, there are users who keep receiving praise for alt-texts of 800 or 1,000 characters or more.

Also, my impression is that Mastodon does not like having to ask for details and/or explanations, nor does it like to look up what it doesn't know enough about to understand it. If you have to ask someone who has posted an image for a description of a certain detail in an image, this means that the image description is lacking, regardless of whether or not that detail matters within the context of the post. Having to ask for a description of a detail is almost as bad as having to ask for the description of the whole image.

In fact, it was just a few months ago that I read a Mastodon toot that said that any element in an image mentioned in the description must also have its own visual description. You can't just say what's in the image. You also have to describe what it looks like.

Likewise, if there's something in an image description that someone doesn't understand, it must be explained right away. This, by the way, ties in with the rule that image descriptions must never use technical language or jargon, and if they absolutely cannot avoid it, it must be explained when it's used first. And it must be explained in a way that requires no prior special knowledge.

So far, so good. But the reason why I've gone all the way to observe and study alt-text and image descriptions, and why I'm so obsessed with it, is because I'm in a special situation.

For one, I'm in the Fediverse, which means that certain alt-text rules simply don't apply to me: not only everything that involves captions, but also the brevity-as-a-hard-requirement rule. However, I'm not on Mastodon, so I'm not as bound to Mastodon's limitations as Mastodon users are. In particular, my character limit is over 16 million, so I can do a whole lot more in the post itself.

Besides, my original images are nothing like what almost everyone on Mastodon posts. They aren't real-life photographs, nor are they social media screenshots. Instead, they are renderings from 3-D virtual worlds, even extremely obscure virtual worlds that next to nobody out there has ever even heard of.

At the same time, my image posts might get people curious enough that they want to go explore this new universe that they've just discovered through my post. The only way they can explore it is by looking at my images and taking in all the big and small details. If they're blind, they cannot do that, but accessibility and inclusion demand they have the very same chance to do it as fully sighted people. In order for them to have this chance, I must go and describe all these big and small details to them, regardless of context. Everything else would be ableist, maybe not by some official W3C definition, but at least by Mastodon's definition.

Speaking of context, sometimes my images are the context of the post. There isn't that one element in the image that matters within the context of the post while everything else can be swept under the rug. No, the entire image matters. The entire scenery matters. Everything in the image matters all the same. This means that I have to describe everything. Again, see further above: I can't get away with just mentioning what's there. If I mention it, I have to describe what it looks like.

This is also justified because I can never expect everyone to already know what something in my image looks like. Again, they don't show real life. They show virtual worlds. In virtual worlds, things do not necessarily look like what they look like in real life. And things tend to look different in different virtual worlds, sometimes even within the same virtual world system.

For example, you, as someone born completely blind, may have come across enough image descriptions to have a rough idea of what cats look like in real life. But that does not automatically give you a realistic idea what a particular cat looks like in a specific virtual world, also seeing as there are infinitely more possibilities for what cats may look like. It could be a detailed, life-like representation of a cat with high-resolution materials as textures. It could be a very simplified, low-resolution model with a likewise low-resolution texture. It could be cobbled together from standard shapes because that was all that was possible when that cat was made. Or whatever. You wouldn't know unless I told you. But who am I to judge whether or not you want to know?

It gets even worse with buildings. You probably wouldn't even know what a specific building looks like in real life unless you have a detailed description, so how are you supposed to know what a specific building looks like in a virtual world that you've first read about a few minutes ago? In addition, there are so many ways of creating buildings in virtual worlds, and they've changed over time with new tools and new features becoming available.

I've come to a point at which I usually avoid having buildings in my images because they're too tedious to describe, especially realistic buildings, but not only those. My second-to-last original image post was in spring 2024, about one and a half years ago. I decided to show a rather fantasy-like building. This building, however, is so complex that it took me two full days, morning to evening, to write the long image description that I put into the post. This image description is over 60,000 characters long, over 40,000 of which describe the building. The description also covers the interior because the outer walls of the building are almost entirely glass. The long description has two levels of headings of its own. I needed well over 4,000 characters just to explain to people where the place shown in the image is.

And then there was the short description for the alt-text, which I needed as well so that nobody could accuse me of not adding a sufficiently detailed alt-text to my image. I was genuinely unable to make it any shorter than 1,400 characters. A lot of those characters were needed just to point Mastodon users in particular to the long description in the post itself. At the time, Mastodon only hid the post text behind a CW, but not the images, so nobody on Mastodon would have known there was a long description unless I told them in the alt-text.

One reason why the long description grew so long was that I didn't describe the image by looking at the image. I described it by looking at the real deal. All the time while I was working on the long description, I was in-world. I had my avatar in front of the building, walking through the building, walking around the building. I could move the camera very close to a lot of details. Instead of seeing the scenery at the resolution of the image, I saw it at a practically infinite resolution. This also enabled me to transcribe text that's so small in the image that it's unreadable, even text that's so tiny in the image that it's invisible. After all, the rule says that any and all text within the borders of an image must be transcribed. And I've yet to see that rule having any explicit exception for unreadable text.

Sure, I could have written that certain details got lost and cannot be identified at the low resolution of the image. But that may be perceived as me trying to weasel out of the responsibility to describe those details. I mean, how many people who were born completely blind have a concept of image resolution and pixels, and how many think that it's possible to zoom into any image infinitely? Besides, I'm not bound to what the image shows at its fairly low resolution anyway, so why should I pretend I am? The only logical reason would be that I'm expected to describe the image, and not the scenery in the area within the borders of the image.

And still, I haven't given full visual descriptions of everything in that scene. I decided against fully describing all images within that image at the same level as the image itself. I decided so because it would have gone too far: At least one image, a preview image on a teleporter, technically shows dozens of images itself, preview images on teleporters again. And some of these images show more images yet again. I would have ended up describing several dozens of images, at least four levels deep, in order to fully describe one image. And then the whole image description would have been rather pointless because Mastodon rejects posts with over 100,000 characters, and the post would probably have ended up with several millions of characters.

By the way, even before I wrote that massive image description, I actually showed @Hat. AuDHD cat 😷n95🍉 💔🌻🔻 one of my image posts, the one with my longest description for a single image to that date. It has two images with over 48,000 characters of long description combined, almost 40,000 of which are for the first image. She actually praised this massive image description and told me that this level of detail in both visual description and explanation is exactly what she needs.

The last time I posted original in-world images was in July 2024. I took care not to have too many details in the images this time. Still, I ended up with a combined total of over 25,000 characters of long description for both images, also because they contain an avatar that had to be described in full detail.

I've been working on the image descriptions for a series of avatar portraits for about a year now, on and off, but still. This time, I gave the images a neutral, completely feature-less, bright white background that won't take up much effort to describe. The plan is to have three or four images with three or four portraits of the same avatar each, always in the same post with only slightly different outfits. I'm still describing the first image, and I've only fully covered the first outfit and started with the second one.

The common preamble for all images in one post already exceeds 17,000 characters, including over 2,000 characters explaining OpenSim and over 9,000 characters explaining what OpenSim avatars can be made of and how they work because that's essential for understanding the visual descriptions. I expect the preamble to grow significantly longer before it's ready because I have to get rid of a whole lot of technical language and jargon and/or explain even more of it. The preamble also contains over 5,000 characters of general visual description that applies to all portraits in all images the same. It includes almost 2,000 characters that describe the shoes, men's casual leather shoes, because to my best knowledge, such shoes don't exist in real life.

Other images will show the avatar wearing full brogue leather shoes. I'm still not sure whether I can correctly assume that everyone out there knows what they are and what they look like, or whether I'll have to give the same amount of detailed description again, only that full brogue shoes are much more complex than the shoes I've already described. Also, I'm not sure if everyone out there knows what a herringbone fabric pattern looks like, or whether that requires a detailed description and an explanation itself, even though several actually blind users have told me that I can assume it to be familiar.

One problem I still haven't solved is that I simply can't fit an appropriately detailed short image description into a maximum of 1,500 characters of alt-text.

Verdict: There are always edge cases in which an image cannot be sufficiently described in only one short and concise image description in the alt-text. My virtual world renderings are such an edge case, also because they're posted into the Fediverse. Another edge case is @Hat. AuDHD cat 😷n95🍉 💔🌻🔻 who, due to a disability, requires hyper-detailed image descriptions that take hours to read to even be able to experience and understand an image properly.

CC: @Carolyn @Prof. Rachel Thorn 🍉🇺🇦🏳️‍⚧️🏳️

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #Ableist #Ableism #AbleismMeta #CWAbleismMeta #VirtualWorlds

in reply to Jupiter Rowland

As you’ve said in your post, you are doing an extremely specialized thing for an extremely specialized purpose, and I admire your attempt. But your advice probably doesn’t apply to the majority of people. Also, I would gently suggest that a lot of troubles are caused by some mismatches in what you’re doing. First, are you really giving access? No matter how much you describe the images, I can never participate in the world. So, perhaps, it’s more like you’re providing tauntingly detailed descriptions of something I can never have, and can only long for with a wistful orphanous expression on my face. I’m not entirely convinced that’s good or helpful. Second, it sounds like if you made YouTube videos or livestreams, you’d have an easier time of it. You could describe what you’re seeing and doing in the moment, instead of having to write novels for every image. Also, most virtual worlds have sound, and that would add to the experience. Just some thoughts.
in reply to Hat. AuDHD cat 😷n95🍉 💔🌻🔻

And who decides what information is important? That's my point. The author knows why they posted the image, so they know what the "important information" they want me to have is. If the alt text just describes every single thing, it's now four paragraphs long, and whatever was important to the author probably got lost in the noise.
in reply to 🇨🇦Samuel Proulx🇨🇦

@fastfinge @CStamp That is why a lot of people stink at image descriptions.
You framed it as a social question, which means almost nothing to someone who doesn't communicate the way Allistics do.
I can theoretically see, but my brain doesn't process images as a coherent whole.
in reply to Hat. AuDHD cat 😷n95🍉 💔🌻🔻

How else should I frame it? Describing an image for someone else is an inherently social act. It's akin to language translation in some ways: only the translator is qualified to judge the accuracy of the effort, and may be the only one with the context to do it correctly. And there are some things that can easily be described in one language, and only with difficulty in another. There are also expressions and idioms that might not carry over at all. But in other ways it's harder, because there is a data mismatch: an image contains a lot more data than a stream of words does. So the translator also has to decide what is important, while remembering that what they include says as much as what they don't. See, for example, the argument about mentioning race in alt text. If you don't mention it, the reader is going to assume the person in the image is white, because that's most readers' default. But if it's, say, a mug shot and you do include it, you're now making an entirely different statement. That's why it's on you, as the author, to decide what impression your blind reader should be left with, and craft your alt text to that effect.
in reply to Hat. AuDHD cat 😷n95🍉 💔🌻🔻

No, it isn't good at judging, because (hopefully) it's not the author or poster. The thing AI is most useful for is when the information that is relevant to me and the information that is relevant to the author are mismatched. For example, if I have a burning desire to know what color Penny Oleksiak's hair is, I can ask AI. It wasn't important to the article I was reading, so the author, correctly, did not include it. The AI says it's auburn. This is now a thing I can know myself, whereas I would previously have had to bother a sighted person about something trivial like this. And I'm aware that there's about a 15 percent chance the AI is telling lies, again. If it is, no harm done, really. Because the things I tend to ask AI about images I encounter aren't important enough to bother a sighted person about, and I would previously just be stuck not knowing at all. Now I can know with about an 85 percent certainty.
in reply to 🇨🇦Samuel Proulx🇨🇦

@fastfinge @CStamp @CatHat Not visually impaired enough to (usually) need alt text to know what I'm looking at, so bear that in mind.

Yes, because even if shirt colour isn't useful, "wearing a blue shirt" is not that difficult to include. Yes if she is wearing a hat; no if not. You can probably infer based on "a headshot of [person]" or "[person] is depicted from the waist up." Really, dude, breast size? I'm pretty sure if you asked any website to add alt text describing someone's breast size so you could beat your meat to it, and the site wasn't Pornhub or Redtube, they'd tell you to fuck off and ban you.

If I'm showing the meme of Marge holding a potato and saying it's "just neat", no one has criticised my alt text for being "Marge Simpson shown in the meme where she holds a potato and smiles. The caption reads, 'I think they're just neat.'"

in reply to disorderlyf

I was intentionally using that as a bad example. As for the shirt: no. Why does this matter in a stock photo from a news article? It has no relevance to the subject of the article (her getting banned from competing), so probably shouldn't be there. In your example, both the image and caption matter. But you didn't describe Marge Simpson, and that was the correct description. What she looks like doesn't matter to what you're communicating.
in reply to 🇨🇦Samuel Proulx🇨🇦

@fastfinge @CStamp @CatHat Marge Simpson is also a character prominent enough that if someone didn't know who she was or what she looked like, they could open their search engine of choice and type, use dictation, or use a voice assistant to query "marge simpson description". And if knowyourmeme isn't a nightmare for screen readers, they can look up the meme there.
in reply to Prof. Rachel Thorn 🍉🇺🇦🏳️‍⚧️🏳️

As a sighted photographer, I'm not into writing prose descriptions of my photos, because I post them for the photo itself, so I always wrote simple AltTxt. Then I found a utility that writes better AltTxt than I would or probably could, and found out many think that is wrong too. So I might just go back to writing "photo of a flower".

This is not meant to offend anyone just indicating my frustration on not knowing what I should do.

Not even sure I should be responding as the poll itself (which I didn't respond to) was not intended for sighted AltTxt users.

in reply to Richard W. Woodley ELBOWS UP 🇨🇦🌹🚴‍♂️📷 🗺️

My suggestion would be to keep it simple. If the reason you posted the photo was because it was a pretty flower, well...that's fine for the alt-text. No matter how many words you use, you might not be able to communicate the exact feeling of beauty you experienced. If you could, you'd be a writer, not a photographer. Ask yourself why you posted, and what you want someone to take away from it. If you want them to notice the colour, or the size, or whatever, those are what goes in the alt text.
in reply to Victor

Depends. I like to start with deepseek-ocr if I have any reason to suspect the image is text. If it is, I can stop there. Otherwise, I move up to something like microsoft/phi-4-multimodal-instruct. If I still care and didn't get enough, llama-3.2-90b-vision-instruct will do the trick for most things. Only if it's charts and graphs that I care about do I need to use either the Google or OpenAI models. If it's pornographic, I have to use Grok, because xAI is completely and utterly unhinged and won't refuse anything no matter what. I run everything either locally where possible, or via the openrouter.ai API. That way it's more private, and I'm only paying for what I use. I usually use this tool: github.com/SigmaNight/basiliskLLM
It supports ollama, openrouter, and any OpenAI-compatible endpoint, and integrates perfectly with the NVDA screen reader.
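The escalation described above, a small OCR specialist first, moving to larger vision models only when needed, can be sketched as a simple cascade. The model identifiers are the ones named in the thread; the injected `describe` callable, the `good_enough` heuristic, and all function names are my own assumptions for illustration. In practice each call would go to a local ollama instance or the openrouter.ai API:

```python
# Model ladder from cheapest/most specialized to largest, as in the
# post: OCR first, then a mid-size multimodal model, then a large
# vision model.
MODEL_LADDER = [
    "deepseek-ocr",
    "microsoft/phi-4-multimodal-instruct",
    "meta-llama/llama-3.2-90b-vision-instruct",
]

def describe_with_cascade(image, describe, good_enough=None):
    """Try each model in order; stop at the first usable description.

    `describe(model, image)` is an injected callable (e.g. a local
    ollama call or an openrouter.ai request). `good_enough` decides
    whether to stop escalating; the default below is a hypothetical
    heuristic, not part of any real tool.
    """
    if good_enough is None:
        good_enough = lambda text: len(text) > 20 and "i can't" not in text.lower()
    last = ""
    for model in MODEL_LADDER:
        last = describe(model, image)
        if good_enough(last):
            return model, last
    # Best effort: return whatever the largest model produced.
    return MODEL_LADDER[-1], last
```

Because the backend is injected, the same routing logic works whether the models run on a single consumer GPU or through a paid API.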

Jonathan reshared this.

in reply to 🇨🇦Samuel Proulx🇨🇦

@fastfinge @the5thColumnist A Nokia smartphone (I don't remember the actual model, and she lives separately) with the OS it shipped with. I tried the TalkBack assistant, but it is a total mess. I even struggled to disable it after enabling it (it took me like 10 minutes).
in reply to Victor

TalkBack is what fully blind folks use, and it works well. But it needs training from a specialist; nobody can just learn it completely by themselves. However, dictation on the Nokias should work for making calls and answering messages. I really don't know how accessible Telegram is with dictation or TalkBack these days, though. Unfortunately I use iOS, not Android. @dhamlinmusic do you know anything?
in reply to 🇨🇦Samuel Proulx🇨🇦

Thanks for taking the time to write a very kind answer.

The poll is not for me; I'm sighted, but it's about a topic I find challenging. As someone with AuDHD, I find it hard to describe pictures and tell stories (it's a difficulty to the point that I avoid taking language exams).

I don't post many pictures, but I usually find it easier to share a picture to show something I want people to focus on than to explain it with words.

But since I find it hard to write the alt text, and I wanted to help make this community accessible, I ended up not publishing many things I would have published if I hadn't felt the pressure to write and summarize and do all the things I find so hard. I felt like I was putting neurotypical expectations on myself, like I just wouldn't be able to express myself. So I asked blind people about this and got some very kind answers that I found really liberating.

So now I still don't post many images, but when I do, sometimes I write a short alt text, and more often I use an LLM. I find it much easier to edit a wrong LLM description than to write one myself.

So I'll keep an eye on the comments to see what to consider when using LLM descriptions.

in reply to SinMisterios

Another thing you could do is just copy-paste an explanation of your issue into the alt text. Odds are someone else will write it for you, or a blind person who comes across the image will ask. Accessibility for people with disabilities shouldn't mean silencing the voices of other people with disabilities. You could also create an image-only account that says in the profile that you can't write alt text. That way, people who never want images we can't understand in our timelines could follow your main account and ignore your image-only account.
in reply to Prof. Rachel Thorn 🍉🇺🇦🏳️‍⚧️🏳️

Even if LLMs were better than prior assistive tools, that isn't a reason to adopt a technology widely understood to be an existential issue for EVERYONE, not just vision-impaired users.

1. It is fraud and grift across most applications.
2. It is used for military assassinations & surveillance of innocent people around the world.
3. The energy use is entirely unsustainable
4. It is PLAGIARISM and theft.

There are ALWAYS better ways to handle disability access than *this*.

in reply to Prof. Rachel Thorn 🍉🇺🇦🏳️‍⚧️🏳️

And this is an entirely false argument. An AI specialized in describing images can run on a consumer PC these days. It's doing zero of the things you're talking about. Apple has done image descriptions locally on its phones for five years now. If you're just tossing images at ChatGPT, you're doing it wrong, the same way as if you gave ChatGPT a CSV file and told it to sort it for you. There are way, way better ways of doing that, ones that get you the result you want quicker, without the resource waste.
in reply to 🇨🇦Samuel Proulx🇨🇦

@fastfinge I've been in tech for decades, and I understand every element of what people call "AI", what an LLM is, and what machine learning is. There is SO much more to the conversation than small, standalone applications, so don't focus on benign spaces as if that's all anyone should consider, given everything I mentioned in my earlier comment.

There are other, more ethical ways to do this. It's incredibly important to understand this. Also - mass job loss??

in reply to RussianChineseDeepStateSock

That’s what matters in this particular conversation, though. Not every conversation is about all of AI. And mass job loss, sadly, is a thing that happens. Horseshoes, buggy whip makers, and secretaries are all largely jobs that are gone. Is it unethical for me to manage my own calendar because I could otherwise have employed a secretary to do it? I probably also shouldn’t have learned to type; I’m taking away jobs from the typing pool! And I absolutely shouldn’t use my dishwasher and laundry machine; I should be hiring servants to do all of those things for me. Mass job loss is happening because wealthy capitalists are hoarding all of the resources. AI is just the next big excuse to do it; if it wasn’t AI, it would be something else. It can’t be solved without restructuring capitalism. AI is the symptom, not the sickness. On top of that, nobody has ever been paid to write alt text. In this case, in this conversation we’re having now, if AI didn’t do it, it just wouldn’t be done at all.
in reply to RussianChineseDeepStateSock

@milkman76 @fastfinge Milkman, I assume you are sighted. Yet here you are lecturing to Blind people. If you know of “more ethical ways,” share links, please, rather than yelling at disabled people who are trying to get by in an ableist society.
This entry was edited (4 days ago)
in reply to Prof. Rachel Thorn 🍉🇺🇦🏳️‍⚧️🏳️

Also, as a blind man, society forces me to do far shittier things than use AI to generate alt text sometimes. Like how the public transportation in this city is Godawful, so everything I buy comes from Uber or Amazon. Even if I could get to the shops without a car, they wouldn’t be accessible. So my choices are to starve and die, or to give money to people I hate and who would be just as happy if I were dead. You cannot exist as an ethical person in a capitalist society. You’ll be forced into doing awful things to survive no matter what. All you can do is pick the least awful ones when you can.
in reply to 🇨🇦Samuel Proulx🇨🇦

@fastfinge picking "the lesser evil", by the way, is what our ancestors did for 100+ years, leading us to genocide, proxy wars, non defensive wars for half a century or more, no healthcare, no support for disabled people or trans people or poor people or unhoused people, and the global climate is about to collapse. Did you want to make all this worse? That's what "AI" is.

Yes let's keep dooming millions not yet born with our poor decisions. Why stop now?

in reply to 🇨🇦Samuel Proulx🇨🇦

@fastfinge what are you talking about? There ARE better solutions, and refusing to participate is my CAREER. I've been working in tech for 30 years, 40 if you count that I was hanging out in Unisys datacenters at age 8. My career is going through this, and there is a good chance I won't be able to do this work anymore if it doesn't come to an end soon.

Another issue is confusion of terms. Non-"AI" products are now being labeled AI because vendors think it will sell.

This entry was edited (3 days ago)
in reply to Prof. Rachel Thorn 🍉🇺🇦🏳️‍⚧️🏳️

Yes, definitely; but it's useful to say the text has been autogenerated, so as not to inspire more confidence in it than it deserves. It's possible for blind people to resort to AI for image descriptions, but it can be involved, depending on interfaces, and not everyone has easy access to those tools.