This generative model lets you sketch out a scene in a few words. It then uses an LLM to flesh out the details, with the ultimate goal of feeding those details to a downstream image generation model.
It is almost, but not quite, entirely the inverse of image captioning models.
This offers the closest experience to an image generation tool that's usable by people with visual impairments.
huggingface.co/spaces/lllyasvi…
Omost - a Hugging Face Space by lllyasviel
Luis Carlos
in reply to Musharraf • • •
Andre Louis
in reply to Musharraf • • •
Musharraf
in reply to Andre Louis • • •Yes, it can generate actual images. Just click that button and wait for a little while.
It seems to load a separate model on demand when you click the button, which takes time.
Musharraf
in reply to Andre Louis • • •The image is added to the chat history with the role of button, and a standard label.
Andre Louis
in reply to Musharraf • • •
Musharraf
in reply to Andre Louis • • •I was able to save the generated image by focusing the entry containing the image, moving the mouse cursor to it, performing a right-click, and clicking "save image as".
Andre Louis
in reply to Musharraf • • •
Erion
in reply to Musharraf • • •Huge thanks for making this!
I'm wondering, how is this different compared to generating an image via Stable Diffusion for example? Are the images more predictable, i.e. closer to what your prompt is?
Musharraf
in reply to Erion • • •Sadly, I didn't make this. This is way beyond my resources.
But thanks for the thought 😁
Musharraf
in reply to Erion • • •The main difference is that this provides a detailed textual description. It doesn't generate the actual image.
You give it a simple prompt describing what you want, and it draws the scene textually in full detail.
Erion
in reply to Musharraf • • •