
Wednesday, June 15, 2011

Why is evaluating partial progress toward human-level AGI so hard?

This post co-authored by Ben Goertzel and Jared Wigmore

Here we sketch a possible explanation for the well-known difficulty of measuring intermediate progress toward human-level AGI, via extending the notion of cognitive synergy to a more refined notion of "tricky cognitive synergy."

The Puzzle: Why Is It So Hard to Measure Partial Progress Toward Human-Level AGI?

A recurrent problem in the AGI field is the difficulty of creating a good test for intermediate progress toward the goal of human-level AGI.

It’s not entirely straightforward to create tests to measure the final achievement of human-level AGI, but there are some fairly obvious candidates here. There’s the Turing Test (fooling judges into believing you’re human, in a text chat), the video Turing Test, the Robot College Student test (passing university, while being judged exactly the same way a human student would), etc. There’s certainly no agreement on which is the most meaningful such goal to strive for, but there’s broad agreement that a number of goals of this nature basically make sense.

On the other hand, how does one measure whether one is, say, 50 percent of the way to human-level AGI? Or, say, 75 or 25 percent?

It’s possible to pose many "practical tests" of incremental progress toward human-level AGI, with the property that IF a proto-AGI system passes the test using a certain sort of architecture and/or dynamics, THEN this implies a certain amount of progress toward human-level AGI, based on particular theoretical assumptions about AGI. However, in each case of such a practical test, it seems intuitively likely to a significant percentage of AGI researchers that there is some way to "game" the test by designing a system specifically oriented toward passing that test, which doesn’t constitute dramatic progress toward AGI.

Some examples of practical tests of this nature would be:

  • The Wozniak "coffee test": go into an average American house and figure out how to make coffee, including identifying the coffee machine, figuring out what the buttons do, finding the coffee in the cabinet, etc.
  • Story understanding – reading a story, or watching it on video, and then answering questions about what happened (including questions at various levels of abstraction)
  • Passing the elementary school reading curriculum (which involves reading and answering questions about some picture books as well as purely textual ones)
  • Learning to play an arbitrary video game based on experience only, or based on experience plus reading instructions

One interesting point about tests like this is that each of them seems to some AGI researchers to encapsulate the crux of the AGI problem, and be unsolvable by any system not far along the path to human-level AGI – yet seems to other AGI researchers, with different conceptual perspectives, to be something probably game-able by narrow-AI methods. And of course, given the current state of science, there’s no way to tell which of these practical tests really can be solved via a narrow-AI approach, except by having a lot of people try really hard over a long period of time.

A question raised by these observations is whether there is some fundamental reason why it’s hard to make an objective, theory-independent measure of intermediate progress toward advanced AGI. Is it just that we haven’t been smart enough to figure out the right test – or is there some conceptual reason why the very notion of such a test is problematic?

We don’t claim to know for sure – but in this brief note we’ll outline one possible reason why the latter might be the case.

Is General Intelligence Tricky?

The crux of our proposed explanation has to do with the sensitive dependence of the behavior of many complex systems on the particulars of their construction. Oftentimes, changing a seemingly small aspect of a system’s underlying structures or dynamics can dramatically affect the resulting high-level behaviors. Lacking a recognized technical term to use here, we will refer to any high-level emergent system property whose existence depends sensitively on the particulars of the underlying system as tricky. Formulating the notion of trickiness in a mathematically precise way would be a worthwhile pursuit, but this is a qualitative essay, so we won’t go in that direction here.

Thus, the crux of our explanation of the difficulty of creating good tests for incremental progress toward AGI is the hypothesis that general intelligence, under limited computational resources, is tricky.

Now, there are many reasons that general intelligence might be tricky in the sense we’ve defined here, and we won’t try to cover all of them here. Rather, we’ll focus on one particular phenomenon that we feel contributes a significant degree of trickiness to general intelligence.

Is Cognitive Synergy Tricky?

One of the trickier aspects of general intelligence under limited resources, we suggest, is the phenomenon of cognitive synergy.

The cognitive synergy hypothesis, in its simplest form, states that human-level AGI intrinsically depends on the synergetic interaction of multiple components (for instance, as in the OpenCog design, multiple memory systems each supplied with its own learning process). In this hypothesis, for instance, it might be that there are 10 critical components required for a human-level AGI system. Having all 10 of them in place results in human-level AGI, but having only 8 of them in place results in having a dramatically impaired system – and maybe having only 6 or 7 of them in place results in a system that can hardly do anything at all.

Of course, the reality is almost surely not as strict as the simplified example in the above paragraph suggests. No AGI theorist has really posited a list of 10 crisply defined subsystems and claimed them necessary and sufficient for AGI. We suspect there are many different routes to AGI, involving integration of different sorts of subsystems. However, if the cognitive synergy hypothesis is correct, then human-level AGI behaves roughly as the simplistic example in the prior paragraph suggests. Perhaps instead of the 10 components, you could achieve human-level AGI with 7 components, but having only 5 of those 7 would yield drastically impaired functionality, and so forth. Or the same phenomenon could be articulated for systems without any distinguishable component parts, only continuously varying underlying quantities. Mathematically formalizing the cognitive synergy hypothesis in a general way becomes complex, but here we’re only aiming for a qualitative argument, so for illustrative purposes we’ll stick with the "10 components" example.

Next, let’s suppose that for any given task, there is some way to achieve that task using a system much simpler than any 6-component subset of the 10 components needed for human-level AGI, and that this simpler system works much better for the task than the 6-component subset does (assuming the latter is used as a set of only 6 components, without the other 4).
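For concreteness, here is a minimal numerical sketch of this scenario. All numbers, function names, and the steepness exponent are invented purely for illustration; nothing here is derived from any actual AGI architecture or theory:

```python
# Toy illustration (hypothetical numbers): a "synergetic" system's general
# capability rises sharply with the fraction of its interacting components
# present, while a simple narrow system beats any small subset on one task.

def synergetic_capability(n_components, n_total=10):
    """Crude model: capability depends superlinearly on the fraction of
    the n_total mutually-dependent components that are present."""
    fraction = n_components / n_total
    return fraction ** 4  # steep dropoff when components are missing

def narrow_system_task_score():
    """A hand-built narrow system tuned to one specific task (made-up score)."""
    return 0.5

full = synergetic_capability(10)   # 1.0    -- all components in place
eight = synergetic_capability(8)   # ~0.41  -- dramatically impaired
six = synergetic_capability(6)     # ~0.13  -- can hardly do anything

print(full, eight, six, narrow_system_task_score())
```

Under these made-up numbers, any test calibrated to a capability level between the narrow system's score (0.5) and the full system's (1.0) would be passed first, and more cheaply, by the narrow system, which is exactly the measurement problem described above.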

Note that this supposition is a good bit stronger than mere cognitive synergy. For lack of a better name, we’ll call it tricky cognitive synergy. The tricky cognitive synergy hypothesis would be true if, for example, the following possibilities were true:

  • creating components to serve as parts of a synergetic AGI is harder than creating components intended to serve as parts of simpler AI systems without synergetic dynamics
  • components capable of serving as parts of a synergetic AGI are necessarily more complicated than components intended to serve as parts of simpler AI systems.

These certainly seem reasonable possibilities, since to serve as a component of a synergetic AGI system, a component must have the internal flexibility to usefully handle interactions with many other components, as well as to solve the problems that come its way. In terms of our concrete work on the OpenCog integrative proto-AGI system, these possibilities ring true: tailoring an AI process for tight integration with other AI processes within OpenCog tends to require more work than preparing a conceptually similar AI process for use on its own, or in a more task-specific narrow-AI system.

It seems fairly obvious that, if tricky cognitive synergy really holds up as a property of human-level general intelligence, the difficulty of formulating tests for intermediate progress toward human-level AGI follows as a consequence: according to the tricky cognitive synergy hypothesis, any such test is going to be more easily passed by some simpler narrow-AI system than by a partially complete human-level AGI system.


We haven’t proved anything here, only made some qualitative arguments. However, these arguments do seem to give a plausible explanation for the empirical observation that positing tests for intermediate progress toward human-level AGI is a very difficult prospect. If the theoretical notions sketched here are correct, then this difficulty is not due to incompetence or lack of imagination on the part of the AGI community, nor due to the primitive state of the AGI field, but is rather intrinsic to the subject matter. And if these notions are correct, then quite likely the future rigorous science of AGI will contain formal theorems echoing and improving the qualitative observations and conjectures we’ve made here.

If the ideas sketched here are true, then the practical consequence for AGI development is, very simply, that one shouldn’t worry all that much about producing compelling intermediary results. Just as 2/3 of a human brain may not be much use, similarly, 2/3 of an AGI system may not be much use. Lack of impressive intermediary results may not imply one is on a wrong development path; and comparison with narrow AI systems on specific tasks may be badly misleading as a gauge of incremental progress toward human-level AGI.

Hopefully it’s clear that the motivation behind the line of thinking presented here is a desire to understand the nature of general intelligence and its pursuit – not a desire to avoid testing our AGI software! Truly, as AGI engineers, we would love to have a sensible rigorous way to test our intermediary progress toward AGI, so as to be able to pose convincing arguments to skeptics, funding sources, potential collaborators and so forth -- as well as just for our own edification. We really, really like producing exciting intermediary results, on projects where that makes sense. Such results, when they come, are extremely informative and inspiring to the researchers as well as the rest of the world! Our motivation here is not a desire to avoid having the intermediate progress of our efforts measured, but rather a desire to explain the frustrating (but by now rather well-established) difficulty of creating such intermediate goals for human-level AGI in a meaningful way.

If we or someone else figures out a compelling way to measure partial progress toward AGI, we will celebrate the occasion. But it seems worth seriously considering the possibility that the difficulty in finding such a measure reflects fundamental properties of the subject matter – such as the trickiness of cognitive synergy and other aspects of general intelligence.

Is Software Improving Exponentially?

In a discussion on the AGI email discussion list recently, some folks were arguing that Moore's Law and associated exponential accelerations may be of limited value in pushing the world toward Singularity, because software is not advancing exponentially.

For instance, Matt Mahoney pointed out "the roughly linear rate of progress in data compression as measured over the last 14 years on the Calgary corpus."

Ray Kurzweil's qualitative argument in favor of the dramatic acceleration of software progress in recent decades is given in slides 104-111 of his presentation here.

I think software progress is harder to quantify than hardware progress, thus less often pointed to in arguments regarding technology acceleration.

However, qualitatively, there seems little doubt that the software tools available to the programmer have been improving damn dramatically....

Sheesh, compare game programming as I did it on the Atari 400 or Commodore 64 back in the 80s ... versus how it's done now, with so many amazing rendering libraries, 3D modeling engines, etc. etc. With the same amount of effort, today one can make incredibly more complex and advanced games.

Back then we had to code our own algorithms and data structures; now we have libraries like the STL, so novice programmers can use advanced structures and algorithms without understanding them.
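To illustrate the point with a small sketch (using Python's standard library rather than the C++ STL, but the point is the same): a novice today gets a correct, efficient priority queue in a few lines, with zero knowledge of how binary heaps actually work.

```python
import heapq

# In the 80s you'd have hand-coded a binary heap to get an efficient
# priority queue; now the standard library provides one ready-made.
tasks = [(3, "render frame"), (1, "read input"), (2, "update physics")]
heapq.heapify(tasks)  # O(n) heap construction, internals hidden from the user

# Items pop out in priority order, lowest number first.
order = [heapq.heappop(tasks)[1] for _ in range(3)]
print(order)  # ['read input', 'update physics', 'render frame']
```

The user never touches sift-up or sift-down logic; the algorithmic sophistication is packaged behind two function calls.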

In general, the capability of programmers without deep technical knowledge or ability to create useful working code has increased *incredibly* in the last couple decades…. Programming used to be only for really hard-core math and science geeks, now it's a practical career possibility for a fairly large percentage of the population.

When I started using Haskell in the mid-90s it was a fun, wonderfully elegant toy language but not practical for real projects. Now its clever handling of concurrency makes it viable for large-scale projects... and I'm hoping in the next couple years it will become possible to use Haskell within OpenCog (Joel Pitt just made the modifications needed to enable OpenCog AI processes to be coded in Python as well as the usual C++).

I could go on a long time with similar examples, but the point should be clear. Software tools have improved dramatically in functionality and usability. The difficulty of quantifying this progress in a clean way doesn't mean it isn't there...

Another relevant point is that, due to the particular nature of software development, software productivity generally decreases for large teams. (This is why I wouldn't want an AGI team with more than, say, 20 people on it. 10-15 may be the optimal size for the core team of an AGI software project, with additional people for things like robotics hardware, simulation world engineering, software testing, etc.) However, the size of projects achievable by small teams has dramatically increased over time, due to the availability of powerful software libraries.

Thus, in the case of software (as in so many other cases), the gradual improvement of technology has led to qualitative increases in what is pragmatically possible (i.e. what is achievable via small teams), not just quantitative betterment of software that previously existed.

It's true that word processors and spreadsheets have not advanced exponentially (at least not with any dramatically interesting exponent), just as forks and chairs and automobiles have not. However, other varieties of software clearly have done so, for instance video gaming and scientific computation.

Regarding the latter two domains, just look at what one can do with Nvidia GPU hardware on a laptop now, compared to what was possible for similar cost just a decade ago! Right now, my colleague Michel Drenthe in Xiamen is doing CUDA-based vision processing on the Nvidia GPU in his laptop, using Itamar Arel's DeSTIN algorithm, with a goal toward providing OpenCog with intelligent visual perception -- this is directly relevant to AGI, and it's leveraging recent hardware advances coupled with recent software advances (CUDA and its nice libraries, which make SIMD parallel scientific computing reasonably tractable, within the grasp of a smart undergrad like Michel doing a 6 month internship). Coupled acceleration in hardware and software for parallel scientific computing is moving along, and this is quite relevant to AGI, whereas the relative stagnation in word processors and forks really doesn't matter.

Let us not forget that the exponential acceleration of various quantitative metrics (like Moore's Law) is not really the key point regarding Singularity, it's just an indicator of the underlying progress that is the key point.... While it's nice that progress in some areas is cleanly quantifiable, that doesn't necessarily mean these are the most important areas....

To really understand progress toward Singularity, one has to look at the specific technologies that most likely need to improve a lot to enable the Singularity. Word processing, not. Text compression, not really. Video games, no. Scientific computing, yes. Efficient, easily usable libraries containing complex algorithms and data structures, yes. Scalable functional programming, maybe. It seems to me that by and large the aspects of software whose accelerating progress would be really, really helpful to achieving AGI, are in fact accelerating dramatically.

In fact, I believe we could have a Singularity with no further hardware improvements, just via software improvements. This might dramatically increase the financial cost of the first AGIs, due to making them necessitate huge server farms ... which would impact the route to and the nature of the Singularity, but not prevent it.

Wednesday, May 18, 2011

The Serf versus the Entrepreneur?

This is a bit of a deviation from my usual topics, but I've been thinking a bit about economic development in various countries around the world (sort of a natural topic for me, in that I travel a lot, have lived in several countries, and have done business and work in a lot of different places including the US, Europe, Brazil, Hong Kong, Japan, China, Korea, Australia and NZ, etc.)

The hypothesis I'm going to put forth here is that the difference between development-prone and development-resistant countries is related to whether the corresponding cultures tend to metaphorically view the individual as a serf or as an entrepreneur.

Of course, this is a very rough and high-level approximative perspective, but it seems to me to have some conceptual explanatory power.

Development-Prone versus Development-Resistant Cultures

The book "Culture Matters", which I borrowed from my dad (a sociologist) recently, contains a chapter by Mariano Grondona called "A Cultural Typology of Economic Development", which proposes a list of properties distinguishing development-prone cultures from development-resistant cultures. Put very crudely, the list goes something like this:

  • (each item below gives the development-resistant trait vs. the development-prone trait)
  • Justice: present-focused vs. future-focused
  • Work: not respected vs. respected
  • Heresy: reviled vs. tolerated
  • Education: brainwashing vs. more autonomy-focused
  • Utilitarianism: no vs. yes
  • Lesser virtues (valuing a job well done, tidiness, punctuality, courtesy): no vs. yes
  • Time focus: past and spiritual far future vs. practical moderately near future
  • Rationality: not a focus vs. strongly valued
  • Governance: rule of man vs. rule of law
  • Nexus of action: large group vs. individual
  • Agency: determinism vs. free will
  • Salvation: in the world (immanence) vs. from the world (transcendence)
  • Utopianism: focus on utopian visions not rationally achievable vs. focus on distant utopias that are more likely progressively achievable via rational action
  • Optimism: about the actions of the "powers that be" vs. about personal action
  • Political structure: absolutism vs. compromise
A more thorough version of the list is given in this file, "Typology of Progress-Prone and Progress-Resistant Cultures", which is Chapter 2 of the book "The Central Liberal Truth: How Politics Can Change a Culture and Save It From Itself" by Lawrence Harrison. The title of Harrison's book (which I didn't read; I just read that chapter) presumably refers to the famous quote from Daniel Patrick Moynihan:

"The central conservative truth is that it is culture, not politics, that determines the success of a society. The central liberal truth is that politics can change a culture and save it from itself."

Harrison adds some other points to Grondona's list, such as

  • wealth: zero-sum vs. positive-sum
  • knowledge: theory vs. empirics
  • low risk tolerance (w/ occasional adventures) vs. moderate risk tolerance
  • advancement: social connections based vs. merit based
  • radius of trust: narrow vs. wide
  • entrepreneurship: rent-seeking vs. innovation

and presents it in a more nicely formatted and well-explained way than this blog post! I encourage you to click the above link and read the chapter for yourself.

Now, I find all this pretty interesting, but also in a way unsatisfying. A theory that centrally consists of a long list of bullet points always gives me the feeling of not getting to the essence of things.

Harrison attempts to sum up the core ideas of the typology as follows:

At the heart of the typology are two fundamental questions: (1) does the culture encourage the belief that people can influence their destinies? And (2) does the culture promote the Golden Rule. If people believe that they can influence their destinies, they are likely to focus on the future; see the world in positive-sum terms; attach a high priority to education; believe in the work ethic; save; become entrepreneurial; and so forth. If the Golden Rule has real meaning for them, they are likely to live by a reasonably rigorous ethical code; honor the lesser virtues; abide by the laws; identify with the broader society; form social capital; and so forth.

But this abstraction doesn't seem to me to sum up the essence of the typology all that well.

Lakoff's Analysis of the Metaphors Underlying Politics

When reading the above material, I was reminded of cognitive scientist George Lakoff's book "Moral Politics" whose core argument is summarized here.

Lakoff argues that much of liberal vs. conservative politics is based on the metaphor of the nation as a family, and that liberal politics tends to metaphorically view the government as a nurturing mother, whereas conservative politics tends to metaphorically view the government as a strict father.

While I don't agree with all Lakoff's views by any means (and I found his later cognitive/political writings generally less compelling than Moral Politics), I think his basic insight in that book is fairly interesting and significant. It seems to unify what otherwise appears a grab-bag of political beliefs.

For instance, the US Republican party is, at first sight, an odd combination of big-business advocacy with Christian moral strictness. To an extent this represents an opportunistic alliance between two interest groups that otherwise would be too small to gain power... but Lakoff's analysis suggests it's more than this. As he points out, the "strict father" archetype binds together both moral strictness and the free-for-all, rough-and-tumble competitiveness advocated by the pro-big-business sector. And the "nurturant mother" archetype binds together the inclusiveness aspect of the US Democratic party with the latter's focus on social programs to help the disadvantaged. Of course these archetypes don't have universal explanatory power, but they do seem to me to capture some of the unconscious patterns underlying contemporary politics.

So I started wondering whether there's some similar, significantly (though of course not completely) explanatory metaphorical/archetypal story one could use to explain comparative economic development. Such a story would then provide an explanation underlying the "laundry list" of cultural differences described above.

The Serf versus the Entrepreneur?

Getting to the point, finally … it seems to me that the culture of development-resistant countries, as described above, is rather well aligned with the metaphor of the "serf and lord". If individuals view themselves as serfs, and the state and government as the lord, then they will arrive at a fair approximation of the progress-resistant world-view described in the above lists. So maybe we can say that progress-resistant nations tend to have a view of the individual/state relationship that is based on a "feudal" metaphor, in some sense.

On the other hand, what is the metaphor corresponding to progress-friendly countries? One thing I see is a fairly close alignment with an entrepreneurial metaphor. Viewing the individual as an entrepreneur -- and the state as a sort of "social contract" between interacting, coopeting entrepreneurs -- seems to neatly wrap up a considerable majority of the bullet points associated with the progress-friendly countries, on the above list.

Note that this hypothetical analysis in terms of metaphors is not intended as a replacement for Lakoff's -- rather, it's intended as complementary. We understand the things in our world using a variety of different metaphors (as well as other means besides metaphor, a point Lakoff sometimes seems not to concede), and may match a single entity like a government to multiple metaphorical frames.

Finally... what value is this kind of analysis? Obviously, if we know the metaphorical frames underlying peoples' thinking, this may help us to better work with them, to encourage them to achieve their goals and fulfill themselves more thoroughly. If you know the metaphors underlying your OWN unconscious thinking, this can help you avoid being excessively controlled by these metaphors, taking more of your thinking and attitude under conscious control….

One way to empirically explore this sort of hypothesis would be to statistically study the language used in various cultures to describe the individual and the state and their relationship. However, this would require a lot of care due to the multiple languages involved, and certainly would be a large project, which I have no intention to personally pursue!

But nevertheless, in spite of the slipperiness and difficulty of validation of this sort of thinking, I find it interesting personally, as part of my quest to better understand the various cultures I come into contact with as I go about my various trans-continental doings....