This is the LLaVA 34B LLM (local large language model) running on my #MacBook Pro describing a #screenshot from #BBC News. To me, this is as good as the info #BeMyAI would provide using #GPT4, so it goes to show that we can do this on-device and get some really meaningful results. Screenshot attached; #AltText contains the description.
Lately, I've taken to using this to describe images instead of GPT, even if results take a little longer to come back. I consider this quite impressive.
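For anyone curious how to wire this up themselves: Ollama exposes a local REST API, and multimodal models like LLaVA accept base64-encoded images alongside the text prompt. A minimal sketch, assuming a local Ollama server on the default port 11434 and an image file on disk (the prompt text and helper names are just illustrative):

```python
import base64
import json
import urllib.request


def build_describe_request(image_bytes: bytes, model: str = "llava:34b") -> bytes:
    """Build the JSON body for Ollama's /api/generate endpoint.

    Multimodal models such as LLaVA take images as base64 strings
    in the "images" field, alongside a normal text prompt.
    """
    payload = {
        "model": model,
        "prompt": "Describe this image in detail.",
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }
    return json.dumps(payload).encode("utf-8")


def describe_image(path: str) -> str:
    """Send an image to the local Ollama server and return the description."""
    with open(path, "rb") as f:
        body = build_describe_request(f.read())
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

From the terminal, the equivalent is simply `ollama run llava:34b` and including the image's file path in the prompt.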
in reply to Andre Louis

Wow, I wish my hardware was anywhere near that good. I'm lucky enough to run the 7B and 13B models. And this isn't even the *best* LLaVA 34B can do right now! Ollama will be updated so LLaVA can take in an even higher-quality screenshot than it already does!
in reply to Devin Prater :blind:

It's great. I'd say this is better than GPT-3.5 was back when it was the big thing. It's only going to get better from here, as you say, so if this impresses us, where will we be in six months?
in reply to Andre Louis

Hopefully we'll either be running a screen reader powered by one of these, or wearing headsets that can answer questions about what is in front of us. Probably have to wait a year or two on that last one, but I believe it will happen, and I'm excited. The interactive image describers are just such an amazing advance. I still encounter lots of blind people who don't know they exist yet.
in reply to victor tsaran

@vick21 @bryansmart @pixelate So, update: LLaVA 1.6 support in llama.cpp is complete, but it only works in the command-line interface for now. The llama.cpp server still needs to be fixed, and then Ollama needs to upgrade. From my testing, it's much better, and it can process images at four times higher resolution (672×672, 336×1344, 1344×336). Right now Ollama uses the LLaVA 1.6 weights, but the model only processes images at the smaller, older 336×336 resolution.
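Side note on those resolutions: LLaVA 1.6 supports several grid shapes so that tall or wide images (like phone screenshots) aren't squashed into a square. A toy sketch of picking the best-fit grid by aspect ratio, using the candidate list from the post above (the actual selection logic in llama.cpp may differ):

```python
# Candidate grids mentioned above for LLaVA 1.6, as (width, height).
CANDIDATES = [(672, 672), (336, 1344), (1344, 336)]


def pick_grid(width: int, height: int) -> tuple[int, int]:
    """Pick the candidate resolution whose aspect ratio is closest
    to the input image's, minimizing cropping or padding."""
    target = width / height
    return min(CANDIDATES, key=lambda wh: abs(wh[0] / wh[1] - target))
```

So a tall phone screenshot would map to 336×1344, a wide desktop capture to 1344×336, and anything roughly square to 672×672.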
in reply to Chi Kim

@chikim @bryansmart @pixelate Wow, that was amazing. I gave it a picture of a building in New Orleans and asked it to guess where the surroundings came from, and it got very, very close! This is good!
in reply to victor tsaran

@vick21 @bryansmart @pixelate Yeah it is good. LLaVA-34B even outperforms Gemini Pro on some benchmarks. That's just crazy!
in reply to victor tsaran

@vick21 @bryansmart @pixelate Haha, yeah, but it's still crazy that an open-source model can beat a giant model from Google that's only a few months old!