Far from being “stochastic parrots,” the biggest large language models seem to learn enough skills to understand the words they’re processing. […]
A trained and tested LLM, when presented with a new text prompt, will generate the most likely next word, append it to the prompt, generate another next word, and continue in this manner, producing a seemingly coherent reply. Nothing in the training process suggests that bigger LLMs, built using more parameters and training data, should also improve at tasks that require reasoning to answer.
But they do. Big enough LLMs demonstrate abilities — from solving elementary math problems to answering questions about the goings-on in others’ minds — that smaller models don’t have, even though they are all trained in similar ways.
“Where did that [ability] emerge from?” Arora wondered. “And can that emerge from just next-word prediction?” —Quanta Magazine
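The generation loop the excerpt describes (pick the most likely next word, append it, repeat) can be sketched in a few lines. This is a toy illustration, not a real LLM: the hypothetical `next_word` function here is a tiny hand-written bigram lookup standing in for a trained model's next-token distribution.

```python
def next_word(words):
    """Hypothetical stand-in for a trained model: return the most
    likely continuation of the last word via a tiny bigram table."""
    bigrams = {"the": "cat", "cat": "sat", "sat": "down"}
    return bigrams.get(words[-1], "<end>")

def generate(prompt, max_words=10):
    """Greedy autoregressive decoding: generate the most likely next
    word, append it to the prompt, and continue until done."""
    words = prompt.split()
    for _ in range(max_words):
        w = next_word(words)   # most likely next word given context
        if w == "<end>":       # stop when the model has nothing to add
            break
        words.append(w)        # append and feed back in
    return " ".join(words)

print(generate("the"))  # builds a reply one word at a time
```

Nothing in this loop changes between a small model and a large one; only the quality of `next_word` does, which is what makes the emergence of reasoning abilities in bigger models surprising.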
Post was last modified on 26 Jan 2024 10:26 am