I think what this highlights is that we're really missing a clear definition of "AGI." By 2020 standards, we probably have AGI, but the bar keeps moving.
Is AGI just a frontier model that is continually running rather than triggered by human interaction?
Is AGI a model that can evolve itself and incorporate new knowledge?
My eternally grumpy point is that if we are on the cusp of human-level AGI, then surely we can make frog-level and fish-level AI right now. But ANN-powered robots (or AI agents in physics-based video games) seem extremely slow and stupid compared to a frog or a fish. And instead of asking difficult scientific questions about what intelligence really means, AI researchers keep shoving human facts and heuristics into LLMs to give an illusion of human-level intelligence. This is precisely the same mistake yesteryear's AI researchers made when developing Lisp expert systems: ignore any tricky science, focus on trivia, tricks, and (most importantly) vibes.[1]
I am not even convinced transformer ANNs are as smart as spiders, and I don't mean the cleverer jumping spiders. Here is my thought experiment: let's say you trained a transformer-powered robo-spider on one gazillion examples of spiderwebs in nature (between rocks, bushes, etc.) and verified that in such natural environments you had "superspider" performance (whatever that means). Now test the robo-spider in a pantry, attic, garage, etc. Will the robo-spider be able to reliably spin a functional indoor web as well as a real spider? I doubt it.
I could be wrong of course! Maybe transformers can figure out the underlying geometric/physical principles in a reliable way. But zooming out a bit: despite the success of Be My Eyes / etc, I don't think any of us will live long enough to see an AI replacement for a seeing-eye dog. ("We are very sorry about your mother, there was an edge case where the AI didn't realize that green trucks were dangerous.")
[1] More people should seriously consider that LLMs are similar to Lisp expert systems with an easier user interface, trading reliability for breadth and ease of development. I use Scheme all the time; clearly Lisp expert systems are useful, as are LLMs. But it is also clear that Lisp will never be a model for human (or any other) intelligence. See also Drew McDermott's classic paper, "Artificial Intelligence Meets Natural Stupidity": https://dl.acm.org/doi/10.1145/1045339.1045340
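To make the comparison concrete, here is roughly what I mean by a "Lisp expert system": a pile of hand-written if/then rules. This is my own toy sketch, not code from any real system; it is perfectly reliable inside its rule set and silent outside it, which is the trade I'm claiming LLMs make in the opposite direction.

    ;; Toy rule base: each rule is (premises . conclusion).
    (define rules
      '(((has-fever has-cough)  . flu)
        ((has-fever has-rash)   . measles)
        ((sneezing itchy-eyes)  . allergy)))

    ;; True if every element of xs appears in ys.
    (define (subset? xs ys)
      (cond ((null? xs) #t)
            ((memq (car xs) ys) (subset? (cdr xs) ys))
            (else #f)))

    ;; Return the conclusion of the first rule whose premises all hold,
    ;; or 'unknown -- outside its rules the system has nothing to say.
    (define (diagnose facts)
      (let loop ((rs rules))
        (cond ((null? rs) 'unknown)
              ((subset? (caar rs) facts) (cdar rs))
              (else (loop (cdr rs))))))

    (display (diagnose '(has-fever has-cough)))  ; => flu
    (newline)
    (display (diagnose '(has-fever headache)))   ; => unknown
    (newline)

Swap the hand-written rules for weights fit to most of the internet and you get breadth instead of reliability, but the basic character of the thing (trivia in, trivia out) doesn't obviously change.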
I don't think the bar has ever moved. The following is how AGI was defined on Wikipedia in 2013:
> a hypothetical artificial intelligence that matches or exceeds human intelligence — the intelligence of a machine that could successfully perform any intellectual task that a human being can.
Have we achieved this? I don't think so.
Rather, many people nowadays seem to imagine AGI as "a machine that can perform any intellectual task that most human beings can", which is a significantly lower bar: if the AI fails at something a human can do, you merely need to establish that some humans would also fail at it, and the AI still qualifies as AGI by that definition.
It is fantasy (or more specifically, science fiction). And it's astonishing how uncritically it gets taken at face value across so much of the industry.
It's religion.
There is a kind of generalized reasoning that current LLMs still miss. It's hard to put a finger on it. Things like hallucinations show that there isn't a self-awareness of thought. "Thinking" models are getting closer.
Among most of my network trying to build products on top of LLMs, the biggest hurdles (cost aside) are hallucinations and seemingly "nonsensical" reasoning or communication: subtle choices that just "feel" not quite right, particularly when the LLM is being constrained for some activity.
Open-ended chat doesn't show these flaws as often.
Yeah, I agree that something just feels missing, but I can't put a finger on it.
Maybe you're right that it's self-awareness. The current models seem to have no metacognition, and even the "reasoning" hack isn't quite the same.