

Following xAI Grok-1 314B, Databricks DBRX 132B, and Cohere Command R+ 104B, another big model drop, this time from Mistral! Mistral 8x22B! #LLM #AI #ML https://twitter.com/mistralai/status/1777869263778291896
in reply to Chi Kim

I find that we rarely talk about the user value of those multi-billion-parameter machines and talk more about parameters and specs. What do these numbers mean for an average user? It reminds me of the show-off conversations about CPU speeds and memory sizes we had at the beginning of the 2000s! :) What’s your take? :)
in reply to victor tsaran

@vick21 For the average user, probably not much. It used to be that the higher the parameter count, the better the model quality, but not anymore. Also, all the leaderboards and benchmarks became useless because people started cheating by fine-tuning their models on the benchmark datasets. lol