Interesting, Apple released Ferret, an open-source multimodal model! It's based on LLaVA and Vicuna. #AI #LLM #ML https://github.com/apple/ml-ferret/
in reply to Chi Kim

Yep, was just reading up on it. Wow, what a cryptic language that is! I wonder if it was used to build a screen recognition model in iOS? If it was, this LLM would be a very abstract version. I may be totally misreading / misunderstanding the paper though. I wish I had enough time to train the model and see what sort of input / output you’d get out of it.
This entry was edited (4 months ago)
in reply to victor tsaran

@vick21 I don't think it was used for iOS screen recognition, because iOS screen recognition came out long before this whole LLM trend took off. Also, LLaVA is based on Vicuna, which is based on LLaMA.
in reply to Chi Kim

I was sort of alluding to this, in that Ferret may be some sort of abstraction of that earlier work, or perhaps was inspired by it. But yeah, I am totally speculating, and perhaps even hallucinating, here! :)