I spent a while this weekend thinking about what might be the right approach for testing the intelligence of early-stage AGI systems that are aimed at human-level, roughly human-like general intelligence (either as an end goal or an intermediate developmental milestone).
Some of my thoughts are summed up in an essay I posted at
I’ll quote the first few paragraphs here:
One of the many difficult issues arising in the course of research on human-level AGI is that of “evaluation and metrics” – i.e., AGI intelligence testing.
It’s not so hard to tell when you’ve achieved human-level AGI, though there is some subtlety here, which I’ll discuss below. However, assessing the quality of incremental progress toward human-level AGI is a much subtler matter. In this essay I’ll present some thoughts on this issue, culminating in a couple of specific proposals:
1) Online School Tests, in which AGIs are tested via their ability to succeed in existing online educational fora
2) Of more immediate interest, a series of tests called the AGI Preschool Tests (AIP Tests, for short, pronounced “ape tests”), based on the notion of “multiple intelligences” and also on some novel ideas regarding learning-based intelligence testing.
The AIP Tests suggested here are specifically intended for AGI systems that control agents embodied in 3D worlds resembling the everyday human world, via either physical robots or virtually embodied agents. Very differently embodied AGI systems (e.g. systems to be initially taught purely via text without any simulated human-like or animal-like body) would potentially need qualitatively different testing methodologies.