Musharraf

1 year ago

This generative model lets you sketch out a scene in a few words. It then leverages an LLM to flesh out the details, with the ultimate goal of feeding those details to a downstream image generation model.
It is almost, but not quite, entirely the inverse of image captioning models.
This offers the closest experience to an image generation tool that's usable by people with visual impairments.

huggingface.co/spaces/lllyasvi…

#a11y #genai #llm

Omost - a Hugging Face Space by lllyasviel

Luis Carlos

in reply to Musharraf 1 year ago

@datajake1999 Wow, this is something

Andre Louis

in reply to Musharraf 1 year ago

Can this itself give you a downloadable image? I'm trying to work that out, and I found the 'Render image' button but I don't know that it does anything noticeable.

Musharraf

in reply to Andre Louis 1 year ago

@FreakyFwoof
Yes, it can generate actual images. Just click that button and wait for a little while.
It seems to load a separate model on demand when you click the button, which takes time.

Musharraf

in reply to Andre Louis 1 year ago

@FreakyFwoof
The image is added to the chat history as an element with the role of button and a standard label.

Andre Louis

in reply to Musharraf 1 year ago

I got it, but couldn't save it even when I right-clicked, which was sad.

Musharraf

in reply to Andre Louis 1 year ago

@FreakyFwoof
I was able to save the generated image by focusing the entry containing the image, moving the mouse cursor to it, performing a right-click, and clicking "save image as".

Andre Louis

in reply to Musharraf 1 year ago

I'll try again. I saved the output of the conversation to a file, so maybe I can throw that back in, hit render, and try again.

Erion

in reply to Musharraf 1 year ago

Huge thanks for making this!

I'm wondering, how is this different compared to generating an image via Stable Diffusion for example? Are the images more predictable, i.e. closer to what your prompt is?

Musharraf

in reply to Erion 1 year ago

@erion
Sadly, I didn't make this. This is way beyond my resources.
But thanks for the thought 😁

Musharraf

in reply to Erion 1 year ago

@erion
The main difference is that this provides a detailed textual description. It doesn't generate the actual image.
You give it a simple prompt describing what you want, and it draws the scene textually in full detail.

Erion

in reply to Musharraf 1 year ago

Ah, this clears things up, thanks a lot!
