Are You Smarter Than a Sixth-Generation Computer? Tests for Measuring Nonhuman Intelligence Are Needed in Order to Track Development


Despite the amazing achievements of supercomputers such as IBM's Jeopardy! champion DeepQA (aka Watson), we do not yet call them "intelligent." But what about the next generation of supercomputers, or the ones that come after that?

Before we can make forecasts about machine intelligence, we will need a gauge that goes beyond simple metrics such as petaflops and data-transfer rates: a standard measure of machine intelligence itself.

The idea of testing artificial intelligence goes back to Alan Turing and the eponymous Turing test. Essentially, the Turing test engages unseen human and machine participants in a text-based conversation with human judges. If the judges are unable to reliably identify which participant is the machine, the machine is said to have passed. Pass or fail: an all-or-nothing result.

Unfortunately, while this test is potentially useful for determining human equivalence, it's generally agreed that human-like intelligence isn't the only form of intelligence. A dolphin or a chimpanzee could never pass such a test, yet both exhibit considerable intelligence; the nature and level of their intellects simply differ from ours.

The same could be said of machine intelligence. Just because human equivalence hasn't yet been achieved in silico doesn't mean that rudiments of intelligence don't exist. Moreover, decades from now, an artificial general intelligence, or AGI, may be too dissimilar from the human mind to pass the Turing test, even though it might be vastly superior to us in most other ways.

For more than a century, psychometric tests have existed for people. While some may argue about the merits of assigning a numerical value to the intelligence of individuals, the fact remains that these tests have resulted in considerable knowledge about the distribution of intelligence in our species. Of course, such tests can't be applied to nonhumans. So how does one develop a test suitable for machines?

Over the years, a number of tests of machine intelligence have been proposed. Several, such as Linguistic Complexity and Psychometric AI, suffer from the same shortcoming as the Turing test in that they test for human equivalence. Many of the others lack mathematical rigor. To test a nonhuman and rate it on any meaningful scale, we must be able to accurately assess the complexity of the question or challenge set before it.

Recently, a framework for creating mathematically rigorous challenges has been conceived. It is described in the paper "Measuring Universal Intelligence: Towards an Anytime Intelligence Test" by José Hernández-Orallo and David L. Dowe, published in Artificial Intelligence. This test of "universal intelligence" draws on algorithmic information theory and computational complexity theory to structure its challenges. More specifically, it uses Levin's Kt complexity, a modification of Kolmogorov complexity, to assign a mathematical value to the challenge put before an intelligence. (Kolmogorov complexity, the length of the shortest program that produces a given object, isn't computable; Levin's Kt complexity adds a penalty for a program's running time, which makes the measure computable in principle and approximable in practice.)
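For reference, Levin's Kt complexity is commonly written as follows (this is the standard textbook formulation, not necessarily the exact notation used by Hernández-Orallo and Dowe):

\[
Kt(x) \;=\; \min_{p \,:\, U(p)=x} \bigl\{\, \ell(p) + \log \mathrm{time}(U, p, x) \,\bigr\}
\]

where U is a fixed universal machine, \ell(p) is the length of program p in bits, and time(U, p, x) is the number of steps U takes to produce x from p. Plain Kolmogorov complexity, K(x), is the same minimization without the logarithmic time penalty.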

The test is referred to as an "anytime test" because it isn't tied to a fixed amount of time: a rough score can be derived from minimal interaction, and the estimate becomes more accurate as more time is allowed and more challenges are undertaken.
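To make the "anytime" property concrete, the following Python sketch shows one way such an evaluation loop could be organized. It illustrates the general idea only, not the algorithm from the paper: the agent interface (agent.act), the stand-in bit-prediction challenge, and the complexity-weighted scoring are all assumptions made for the example.

import random

def anytime_intelligence_test(agent, time_budget_steps, seed=0):
    """Schematic anytime evaluation loop (illustrative only)."""
    rng = random.Random(seed)
    total_weighted_reward = 0.0
    total_weight = 0.0
    score = 0.0

    for _ in range(time_budget_steps):
        # Pick a challenge complexity adaptively: harder if the agent is
        # doing well so far, easier if it is struggling.
        complexity = max(1, int(10 * score) + rng.randint(1, 5))

        # Stand-in challenge: predict the final bit of a random pattern
        # whose length grows with the chosen complexity.
        pattern = [rng.randint(0, 1) for _ in range(complexity)]
        prediction = agent.act(pattern[:-1])
        reward = 1.0 if prediction == pattern[-1] else 0.0

        # Weight rewards by challenge complexity so that solving harder
        # challenges counts for more.
        total_weighted_reward += complexity * reward
        total_weight += complexity
        score = total_weighted_reward / total_weight
        # `score` is defined after every iteration, so the test can be
        # stopped at any point -- the "anytime" property.

    return score

class RandomGuesser:
    """Trivial agent used only to exercise the loop."""
    def act(self, observation):
        return random.randint(0, 1)

print(anytime_intelligence_test(RandomGuesser(), time_budget_steps=100))

The key point is that the running score is available after the very first challenge and simply becomes more reliable the longer the test is allowed to run.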

This approach allows us to tailor challenges to the level of an intelligence--be it animal, machine, or even, in theory, an alien--and assign a meaningful value to the result. …