Remember seeing something about GPT-4 doing well on standardized tests? It turns out it may have memorized the answers.
aisnakeoil.substack.com/p/gpt-…
#gpt4 #AIHype #ThisIsWhyWeDontTestOnTheTrainingData
GPT-4 and professional benchmarks: the wrong answer to the wrong question
OpenAI may have tested on the training data. Besides, human benchmarks are meaningless for bots.
Arvind Narayanan (AI Snake Oil)