Following xAI Grok-1 314B, Databricks DBRX 132B, and Cohere Command R+ 104B, another big model drop, this time from Mistral: Mistral 8x22B! #LLM #AI #ML twitter.com/mistralai/status/1…
victor tsaran: I find that we rarely talk about user value with these multi-billion-parameter machines, and more about parameters and specs. What do these numbers mean for an average user? It reminds me of the CPU-speed and memory-size show-off conversations we had in the early 2000s! :) What's your take? :)
Chi Kim: @vick21 For the average user, probably not much. It used to be that a higher parameter count meant better model quality, but that's no longer the case. Also, the leaderboards/benchmarks have become useless because people started cheating by fine-tuning models on the benchmark datasets. lol