This is the llava:34b LLLM (local large language model) running on my #MacBook Pro, describing a #screenshot from #BBC News. To me, this is info as good as #BeMyAI would provide using #GPT4, so it goes to show that we can do this on-device and get some really meaningful results. Screenshot attached; the #AltText contains the description.
Lately, I've taken to using this to describe images instead of using GPT, even if it takes a little longer for results to come back. I consider this to be quite impressive.
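For anyone who wants to try the same workflow, here's a minimal sketch of how you can query the model programmatically. It assumes Ollama is running locally on its default port (11434); the image file name is a placeholder you'd swap for your own screenshot:

```python
import base64
import requests

# Assumptions: Ollama is serving on its default port (11434) and
# "screenshot.png" is a placeholder for your own image file.
with open("screenshot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Ollama's /api/generate endpoint accepts base64-encoded images
# for multimodal models like llava.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llava:34b",
        "prompt": "Describe this image in detail for a blind user.",
        "images": [image_b64],
        "stream": False,
    },
)
print(response.json()["response"])
```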
in reply to victor tsaran

@vick21 @bryansmart @pixelate So update: LLaVA 1.6 support in llama.cpp is complete, but for now it only works through the command-line interface. The llama.cpp server still needs to be fixed, and then Ollama needs to upgrade. From my testing it's much better, and it can process images at four times the resolution (672x672, 336x1344, or 1344x336). Right now Ollama uses the LLaVA 1.6 weights, but the model only processes images at the older, smaller 336x336 resolution.
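In case it helps anyone experiment before the server support lands, here's a rough sketch of driving llama.cpp's llava example from Python. The binary name (llava-cli) and flags follow the llava example in the llama.cpp repo; the model and projector paths are placeholders, so adjust for your own build and downloads:

```python
import subprocess

# Assumptions: you've built llama.cpp's llava example (binary named
# "llava-cli" here), and downloaded LLaVA 1.6 GGUF weights plus the
# matching multimodal projector. All paths below are placeholders.
cmd = [
    "./llava-cli",
    "-m", "models/llava-v1.6-34b.Q4_K_M.gguf",        # language model weights
    "--mmproj", "models/mmproj-llava-v1.6-34b.gguf",  # vision projector
    "--image", "screenshot.png",
    "-p", "Describe this image in detail.",
]
result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout)
```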