Mikołaj Hołysz
in reply to Lars Marowsky-Brée 😷

Spelling is a known weakness of LLMs, and it doesn't reflect on their performance on other problems.
LLMs don't see language the way we do: all input is first tokenized (i.e. split into subwords) before being passed to the model.
Most common words are a single token; long and unusual words are two or three. Common letter combinations like "ing", "able", and "anti" also form tokens, so a word like "antiwiral" might be passed as ["anti", "wir", "al"].
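To make this concrete, here is a minimal sketch using OpenAI's open-source tiktoken library (my choice for illustration; the thread doesn't name a tokenizer, and each model family ships its own vocabulary, so the exact splits will differ):

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the vocabulary used by GPT-3.5/GPT-4-era models;
# other models use different vocabularies and split words differently.
enc = tiktoken.get_encoding("cl100k_base")

for word in ["the", "unbelievable", "antiwiral"]:
    ids = enc.encode(word)
    # Map each token id back to the subword text it represents.
    pieces = [enc.decode_single_token_bytes(i).decode("utf-8", errors="replace")
              for i in ids]
    print(f"{word!r} -> {pieces} ({len(ids)} tokens)")
```

The point isn't the specific splits, which vary by tokenizer, but that the model only ever receives opaque token ids rather than individual letters, which is why letter-level tasks like spelling are hard for it.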
LLMs have never really seen letters; you can imagine them as extremely intellectually sophisticated people who enjoy all their content through audiobooks and have never actually seen written text.
Lars Marowsky-Brée 😷
in reply to Mikołaj Hołysz

@miki I'm aware, yet I feel the public discourse fAIls at explaining their limits and constraints on most tasks, while overhyping their capabilities and applications. That's the direction I was heading in.