

Ollama seems to support Llava V1.6 34B! Best open source multimodal vision-language model I've tried so far! #LLM #ML #AI @freakyfwoof @vick21 @tristan @KyleBorah @Bri
in reply to Chi Kim

Downloading as we speak… I saw the announcement about Llava 1.6 but didn’t realize Ollama had a 34b model ready for us. To be super-honest with you though, GPT4V will still be faster and more responsive no matter how you slice it. :)
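For anyone else who wants to grab it, this is roughly all it takes; a minimal sketch assuming the Ollama CLI is installed and on your PATH:

```python
# Pull llava:34b and give it a quick text-only prompt through the Ollama CLI.
import subprocess

subprocess.run(["ollama", "pull", "llava:34b"], check=True)
subprocess.run(["ollama", "run", "llava:34b", "Say hello in one sentence."], check=True)
```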
in reply to victor tsaran

OpenAI is everyone's target. Llava 1.6 was even trained on a synthetic dataset generated with GPT-4V. Not sure about accuracy, but it talks very much like GPT-4V now, with a similar tone. lol
in reply to Chi Kim

What do you run yours on? I spawned the model and, oh my gosh, it took like forever to load! :)
in reply to victor tsaran

Yeah, it does take a while. I'm on the last Intel Mac, so if you're on an Apple Silicon chip, yours should be faster! lol
in reply to Chi Kim

Yeah, I am here with an M2 Pro and this thing really takes its sweet time! :) Wow!
in reply to victor tsaran

There's a rumor that Llama-3 is going to be 150B parameters, with possibly a 300B variant as well. You might want to practice waiting. lol
in reply to Chi Kim

I hope quantization folks are getting ready for it though! :)
in reply to victor tsaran

Yeah, llava:34b on Ollama is quantized. Imagine five times slower with 150B. lol
in reply to Chi Kim

Sorry, I was responding to your post about Llama-3. As for Llava:34b, I take back what I said. I was trying to load it from the console. However, it loads fairly fast if I do it from Python or the Ollama UI extension.
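In case it's useful to anyone following along, this is roughly what the Python route looks like; a minimal sketch against Ollama's local REST API on its default port 11434, with a placeholder image path:

```python
# Describe an image with llava:34b through Ollama's local REST API.
import base64
import json
import urllib.request

with open("screenshot.png", "rb") as f:  # placeholder image file
    image_b64 = base64.b64encode(f.read()).decode()

payload = json.dumps({
    "model": "llava:34b",
    "prompt": "Describe this image for a screen reader user.",
    "images": [image_b64],
    "stream": False,
}).encode()

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```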
in reply to victor tsaran

Talking about big models, have you tried miqu-1-70b? Mistral's CEO confirmed that someone leaked their testing model on 4chan. It's one of the top models on the HF leaderboard now. People say the quality is as good as or better than GPT-3.5, close to GPT-4. However, it's pretty painful to wait for responses. haha The model is still up on HF, and you can use it with Llama.cpp or create a Modelfile to use it with Ollama! https://huggingface.co/miqudev/miqu-1-70b
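If anyone wants to try the Ollama route, it boils down to something like this; a rough sketch where the GGUF filename and the [INST] prompt template are my assumptions, so check the HF repo and model card for the real ones:

```python
# Register a downloaded miqu GGUF with Ollama: write a minimal Modelfile that
# points at the weights, create the model, then run a quick prompt.
import subprocess
from pathlib import Path

# Filename and template are assumptions; adjust to the file you downloaded.
modelfile = '''FROM ./miqu-1-70b.q4_k_m.gguf
TEMPLATE """[INST] {{ .Prompt }} [/INST]"""
PARAMETER temperature 0.7
'''

Path("Modelfile").write_text(modelfile)
subprocess.run(["ollama", "create", "miqu", "-f", "Modelfile"], check=True)
subprocess.run(["ollama", "run", "miqu", "Summarize the plot of Hamlet in two sentences."], check=True)
```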
in reply to Chi Kim

Reading the headlines, it seems like every new model is now approaching GPT-3.5 or even GPT-4. Seems like a trend! :) No, I haven't tried it! :)
in reply to victor tsaran

Yeah, OpenAI is running with a target on their back, so everyone wants to knock them down. lol Honestly though, they're getting close, but not quite there yet. On a related note, it's crazy how a couple of university folks beat Gemini Pro with Llava V1.6. Training their 34B model took only 30 hours with 32 A100s.
in reply to Chi Kim

Chi, I keep forgetting to ask you. I remember that when scanning the VO cursor in past versions, I was able to get the OCR of the whole screen, even the elements outside of the current window. With VOCR 2 I mostly only get the stuff inside the current window, regardless of whether I OCR the VO cursor or the window. Is this intended or a bug?
in reply to victor tsaran

You should only see the ones inside the VO cursor. However, sometimes the VO cursor can focus on an element that's not visible and return coordinates outside the window. Then VOCR will grab stuff outside the window. Basically, it takes a screenshot of whatever coordinates VO reports for the VO cursor.
in reply to victor tsaran

Can you try going to System Settings, choosing General from the sidebar, moving your VO cursor to the General scroll area (don't interact with it), and scanning the VO cursor? Then you'll only see what's inside the General scroll area, whereas if you scan the window, you'll see the stuff in the sidebar as well. Let me know if that's not how it works on your end.
in reply to Chi Kim

Yep, it works just like you described! I just remember VO cursor scanning working differently in the previous version.
in reply to victor tsaran

Yeah, in the previous version I used a VO AppleScript command and asked VO to capture a screenshot under the VO cursor and save it to a file. Apparently that sometimes worked and sometimes didn't. VOCR V2 now asks VO for the VO cursor bounds, and VOCR captures the screenshot itself instead. I couldn't do it in JXA, so I had to figure it out in regular AppleScript. Fortunately, ChatGPT wrote it for me. haha
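For the curious, the idea boils down to something like this; a rough Python sketch rather than the actual VOCR code, and the AppleScript line plus the bounds format are my guesses at VoiceOver's scripting dictionary:

```python
# Rough sketch: ask VoiceOver for the VO cursor bounds via AppleScript, then
# capture just that rectangle with macOS's screencapture tool. The AppleScript
# phrasing and the "{x1, y1, x2, y2}" bounds format are assumptions.
import subprocess

APPLESCRIPT = 'tell application "VoiceOver" to return bounds of vo cursor'

out = subprocess.run(
    ["osascript", "-e", APPLESCRIPT],
    capture_output=True, text=True, check=True,
).stdout.strip()

x1, y1, x2, y2 = (int(float(n)) for n in out.split(", "))

# screencapture -R takes x,y,width,height; -x suppresses the shutter sound.
subprocess.run(
    ["screencapture", "-x", f"-R{x1},{y1},{x2 - x1},{y2 - y1}", "vocursor.png"],
    check=True,
)
```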