The article goes on to explain that the model routed cars over Central Park to solve the puzzle, among other embarrassing decisions.
All of which suggests that, for now, these models are just "intelligent search" with human-devised rules for organizing thoughts. That's still a big step forward, but they're not "thinking machines" yet.
‘Bag of heuristics’
New techniques for probing large language models—part of a growing field known as “mechanistic interpretability”—show researchers how these AIs do mathematics, learn to play games or navigate through environments. In a series of recent essays, Mitchell argued that a growing body of work suggests models develop gigantic “bags of heuristics,” rather than building more efficient mental models of situations and reasoning through the tasks at hand. (“Heuristic” is a fancy word for a problem-solving shortcut.)
When Keyon Vafa, an AI researcher at Harvard University, first heard the “bag of heuristics” theory, something clicked. “I feel like it unlocked something for me,” he says. “This is exactly the thing that we’re trying to describe.”
Vafa’s own research was an effort to see what kind of mental map an AI builds when it’s trained on millions of turn-by-turn directions like those you would see on Google Maps. Vafa and his colleagues used Manhattan’s dense network of streets and avenues as their source material.
Thinking or memorizing?
Other research looks at the peculiarities that arise when large language models try to do math, something they’re historically bad at but are getting better at. Some studies show that models learn one set of rules for multiplying numbers in a certain range—say, from 200 to 210—and a different set of rules for multiplying numbers in some other range. If you think that’s a less than ideal way to do math, you’re right.
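To make the idea concrete, here is a toy sketch of what a “bag of heuristics” multiplier might look like. This is purely illustrative—the function, branches and ranges are invented for this example, not extracted from any actual model—but it captures the pattern the studies describe: different ad-hoc rules for different input ranges, with nothing tying them together into one general algorithm.

```python
# Hypothetical illustration of a "bag of heuristics": each branch handles
# its own slice of inputs with a memorized shortcut, instead of one
# general multiplication procedure covering all cases.

def heuristic_multiply(a, b):
    if 200 <= a <= 210:
        # Shortcut learned for this range: split a as 200 + r,
        # so a * b = 200*b + r*b.
        return 200 * b + (a - 200) * b
    if a < 10 and b < 10:
        # Small products recalled from a memorized times table.
        times_table = {(x, y): x * y for x in range(10) for y in range(10)}
        return times_table[(a, b)]
    # Fallback heuristic for everything else: repeated addition.
    return sum(a for _ in range(b))
```

Each branch produces correct answers on its own slice of inputs, but the system never converges on the single, efficient algorithm a human would learn—which is exactly the inefficiency the research points to.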
All of this work suggests that under the hood, today’s AIs are overly complicated, patched-together Rube Goldberg machines full of ad-hoc solutions for answering our prompts. Understanding that these systems are long lists of cobbled-together rules of thumb could go a long way toward explaining why they struggle when asked to do things even a little bit outside their training, says Vafa. When his team blocked just 1% of the virtual Manhattan’s roads, forcing the AI to navigate around detours, its performance plummeted.
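The detour result can be illustrated with a toy sketch. The map, route and names below are invented for this example—this is not Vafa’s actual experimental setup—but it shows why memorized turn-by-turn directions are brittle in a way that a genuine map-based search is not: close a single road, and the cached route breaks while a general search simply replans.

```python
# Hypothetical illustration: a memorized route fails under a detour,
# while a general search over the map adapts.
from collections import deque

roads = {  # a tiny invented grid of intersections
    "A": ["B", "D"], "B": ["A", "C"], "C": ["B", "F"],
    "D": ["A", "E"], "E": ["D", "F"], "F": ["C", "E"],
}
memorized_route = ["A", "B", "C", "F"]  # a cached turn-by-turn direction

def route_is_open(route, roads):
    # A memorized route only works if every cached turn still exists.
    return all(b in roads[a] for a, b in zip(route, route[1:]))

def bfs_route(start, goal, roads):
    # A general search rebuilds a path from the map itself.
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in roads[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Close one road -- the equivalent of blocking a sliver of the map.
roads["B"].remove("C")
roads["C"].remove("B")
```

After the closure, the memorized route is dead (`route_is_open` returns `False`), but the search still finds the detour A→D→E→F. An AI that had built a true map of Manhattan should behave like the search; one holding a bag of cached routes behaves like the memorizer.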