Following xAI Grok-1 314B, Databricks DBRX 132B, and Cohere Command R+ 104B, another big model drop, this time from Mistral: Mistral 8x22B! #LLM #AI #ML twitter.com/mistralai/status/1…
victor tsaran: I find that we rarely talk about user value with these multi-billion-parameter machines, and more about parameters and specs. What do these numbers mean for an average user? It reminds me of the CPU-speed and memory-size show-off conversations we had in the early 2000s! :) What's your take? :)
Chi Kim: @vick21 For the average user, probably not much. It used to be that a higher parameter count meant better model quality, but that's no longer the case. Also, the leaderboards/benchmarks have become useless because people started cheating by fine-tuning models on the benchmark datasets. lol