Mikołaj Hołysz
in reply to Lars Marowsky-Brée 😷

Spelling is a known weakness of LLMs, and it doesn't reflect on their performance on other problems.
LLMs don't see language the way we do: all input is first tokenized (i.e. split into subwords) before being passed to the model.
Most common words are a single token; long and unusual words are two or three. Common letter combinations like "ing", "able", and "anti" also form tokens, so a word like "antiwiral" might be passed as ["anti", "wir", "al"].
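To make this concrete, here is a minimal sketch using OpenAI's open-source tiktoken library (my choice for illustration; the thread doesn't name a tokenizer, and each model family ships its own vocabulary, so the exact splits will differ):

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the vocabulary used by GPT-3.5/GPT-4-era models;
# other models use different vocabularies and split words differently.
enc = tiktoken.get_encoding("cl100k_base")

for word in ["the", "unbelievable", "antiwiral"]:
    ids = enc.encode(word)
    # Map each token id back to the subword text it represents.
    pieces = [enc.decode_single_token_bytes(i).decode("utf-8", errors="replace")
              for i in ids]
    print(f"{word!r} -> {pieces} ({len(ids)} tokens)")
```

The point isn't the specific splits, which vary by tokenizer, but that the model only ever receives opaque token ids rather than individual letters, which is why letter-level tasks like spelling are hard for it.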
LLMs have never really seen letters; you can imagine them as extremely intellectually sophisticated people who enjoy all their content through audiobooks and have never actually seen written text.
Lars Marowsky-Brée 😷
in reply to Mikołaj Hołysz

@miki I'm aware, yet I feel the public discourse fAIls at explaining their limits and constraints on most tasks, while overhyping their capabilities and applications. That's the direction I was heading in.