The single best thing about the reversibility of multimodal models like GPT-4 is that with the ability to generate Alt Text, #Blind people can now judge books by their covers! 📚👁️‍🗨️💡 #Accessibility #AI #Innovation #TechForGood 🚀🌐
in reply to Charlotte Joanne

I've started running LLaVA locally to describe images that people upload to Discord servers. If they won't provide alt text, I just open the image in a browser, save it, and then proceed to punish my GPU for a while. I get a good description at the end of it, and I understand a lot better what is going on.
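
For anyone who wants to try something similar, here is a rough sketch of the "saved image in, description out" step. The post doesn't say which runner is used, so this assumes an Ollama server on its default local port with a `llava` model already pulled; the prompt wording and the function names `build_request` and `describe_image` are made up for illustration.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is serving locally on its default port with a LLaVA model pulled.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(image_bytes: bytes, model: str = "llava") -> dict:
    """Build the JSON payload for one non-streaming image description request."""
    return {
        "model": model,
        # Hypothetical prompt; adjust to taste.
        "prompt": "Describe this image in detail for a blind reader.",
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_image(path: str) -> str:
    """Send a saved image to the local model and return its text description."""
    with open(path, "rb") as f:
        payload = build_request(f.read())
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `describe_image("saved_image.png")` on the file saved from the browser should return the description as plain text, which a screen reader can then read out. How long it takes depends on the GPU.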