The Multiverse According to Ben: June 2011

Friday, June 24, 2011

Unraveling Modha & Singh's Map of the Macaque Monkey Brain

(... plus some semi-related AGI musings at the end!)

On July 27 2010, PNAS published a paper entitled "Network architecture of the long-distance pathways in the macaque brain" by Dharmendra Modha and Raghavendra Singh from IBM, which is briefly described here and available in full here. The highlight of the paper is a connectivity diagram of all the regions of the macaque (monkey) brain, reproduced in low res right here:

See here for a hi-res version.

The diagram portrays "a unique network incorporating 410 anatomical tracing studies of the macaque brain from the Collation of Connectivity data on the Macaque brain (CoCoMac) neuroinformatic database. Our network consists of 383 hierarchically organized regions spanning cortex, thalamus, and basal ganglia; models the presence of 6,602 directed long-distance connections; is three times larger than any previously derived brain network; and contains subnetworks corresponding to classic corticocortical, corticosubcortical, and subcortico-subcortical fiber systems."

However, I found that the diagram can be somewhat confusing to browse, if one wants to look at specific brain regions and what they connect to. So my Novamente LLC co-conspirator Eddie Monroe and I went back to the original data files, given in the online supplementary information for the paper, and used this to make a textual version of the information in the diagram, which you can find here.
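
The gist of such a conversion can be sketched in a few lines of Python. This is an illustration only: it assumes the connections are available as (source, target) pairs, the function name `build_listing` is hypothetical, and the region labels in the example are made-up placeholders rather than entries from the supplementary files.

```python
from collections import defaultdict


def build_listing(edges):
    """Group directed (source, target) pairs into a per-region text listing."""
    outgoing = defaultdict(set)
    incoming = defaultdict(set)
    for source, target in edges:
        outgoing[source].add(target)
        incoming[target].add(source)

    # Every region that appears as a source or a target gets its own entry.
    regions = sorted(set(outgoing) | set(incoming))
    entries = []
    for region in regions:
        outs = ", ".join(sorted(outgoing[region])) or "(none)"
        ins = ", ".join(sorted(incoming[region])) or "(none)"
        entries.append(
            f"{region}\n  projects to: {outs}\n  receives from: {ins}"
        )
    return "\n".join(entries)


# Placeholder edges for illustration only; not taken from the paper's data.
example_edges = [("A", "B"), ("A", "C"), ("C", "A")]
print(build_listing(example_edges))
```

With a listing like this, looking up one region and everything it connects to becomes a plain text search.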

Our goal in looking at this wiring diagram is as a guide to understanding the interactions between certain human brain regions we're studying (human and monkey brains being rather similar in many respects). But I think it's worth carefully perusing for anyone who's thinking about neuroscience from any aspect, and for anyone who's thinking about AGI from a brain-simulation perspective.

Semi-Related AGI Musings

Complexity such as that revealed in Modha and Singh's diagrams always comes to my mind when I read about someone's "brain inspired" AGI architecture -- say, Hierarchical Temporal Memory architectures (like Numenta or DeSTIN, etc.) that consist of a hierarchy of layers of nodes, passing information up and down in a manner vaguely reminiscent of visual or auditory cortex. Such architectures may be quite valuable and interesting, but each of them captures a teensy weensy fraction of the architectural and dynamical complexity in the brain. Each of the brain regions in Modha and Singh's diagram is its own separate story, with its own separate and important functions and structures and complex dynamics; and each one interacts with a host of others in specially configured ways, to achieve emergent intelligence. In my view, if one wants to make a brain-like AGI, one's going to need to emulate the sort of complexity that the actual brain has -- not just take some brain components (e.g. neurons) and roughly simulate them and wire the simulations together in some clever way; and not just emulate the architecture and dynamics of one little region of the brain and proclaim it to embody the universal principles of brain function.

And of course this is the reason I'm not pursuing brain-like AGI at the moment. If you pick 100 random links from Modha and Singh's diagram, and then search the neuroscience literature for information about the dynamical and informational interactions ensuing from that link, you'll find that in the majority of cases the extant knowledge is mighty sketchy. This is an indicator of how little we still know about the brain.

But can we still learn something from the brain, toward the goal of making loosely brain-inspired but non-brain-like AGI systems? Absolutely. I'm currently interested in understanding how the brain interfaces perceptual and conceptual knowledge -- but not with a goal of emulating how the brain works in any detailed sense (e.g. my AGI approach involves no formal neurons or other elementary brainlike components, and no modules similar in function to specific brain regions), rather just with a goal of seeing what interesting principles can be abstracted therefrom, that may be helpful in designing the interface between OpenCog and DeSTIN (a hierarchical temporal memory designed by Itamar Arel, that we're intending to use for OpenCog's sensorimotor processing).

And so it goes... ;-)

Wednesday, June 15, 2011

Why is evaluating partial progress toward human-level AGI so hard?

This post co-authored by Ben Goertzel and Jared Wigmore

Here we sketch a possible explanation for the well-known difficulty of measuring intermediate progress toward human-level AGI, via extending the notion of cognitive synergy to a more refined notion of “tricky cognitive synergy.”

The Puzzle: Why Is It So Hard to Measure Partial Progress Toward Human-Level AGI?

A recurrent difficulty in the AGI field is the difficulty of creating a good test for intermediate progress toward the goal of human-level AGI.

It’s not entirely straightforward to create tests to measure the final achievement of human-level AGI, but there are some fairly obvious candidates here. There’s the Turing Test (fooling judges into believing you’re human, in a text chat), the video Turing Test, the Robot College Student test (passing university, via being judged exactly the same way a human student would), etc. There’s certainly no agreement on which is the most meaningful such goal to strive for, but there’s broad agreement that a number of goals of this nature basically make sense.

On the other hand, how does one measure whether one is, say, 50 percent of the way to human-level AGI? Or, say, 75 or 25 percent?

It’s possible to pose many “practical tests” of incremental progress toward human-level AGI, with the property that IF a proto-AGI system passes the test using a certain sort of architecture and/or dynamics, then this implies a certain amount of progress toward human-level AGI based on particular theoretical assumptions about AGI. However, in each case of such a practical test, it seems intuitively likely to a significant percentage of AGI researchers that there is some way to “game” the test via designing a system specifically oriented toward passing that test, and which doesn’t constitute dramatic progress toward AGI.

Some examples of practical tests of this nature would be:

The Wozniak “coffee test”: go into an average American house and figure out how to make coffee, including identifying the coffee machine, figuring out what the buttons do, finding the coffee in the cabinet, etc.
Story understanding – reading a story, or watching it on video, and then answering questions about what happened (including questions at various levels of abstraction)
Passing the elementary school reading curriculum (which involves reading and answering questions about some picture books as well as purely textual ones)
Learning to play an arbitrary video game based on experience only, or based on experience plus reading instructions

One interesting point about tests like this is that each of them seems to some AGI researchers to encapsulate the crux of the AGI problem, and be unsolvable by any system not far along the path to human-level AGI – yet seems to other AGI researchers, with different conceptual perspectives, to be something probably game-able by narrow-AI methods. And of course, given the current state of science, there’s no way to tell which of these practical tests really can be solved via a narrow-AI approach, except by having a lot of people try really hard over a long period of time.

A question raised by these observations is whether there is some fundamental reason why it’s hard to make an objective, theory-independent measure of intermediate progress toward advanced AGI. Is it just that we haven’t been smart enough to figure out the right test – or is there some conceptual reason why the very notion of such a test is problematic?

We don’t claim to know for sure – but in this brief note we’ll outline one possible reason why the latter might be the case.

Is General Intelligence Tricky?

The crux of our proposed explanation has to do with the sensitive dependence of the behavior of many complex systems on the particulars of their construction. Oftentimes, changing a seemingly small aspect of a system’s underlying structures or dynamics can dramatically affect the resulting high-level behaviors. Lacking a recognized technical term to use here, we will refer to any high-level emergent system property whose existence depends sensitively on the particulars of the underlying system as tricky. Formulating the notion of trickiness in a mathematically precise way is a worthwhile pursuit, but this is a qualitative essay so we won’t go in that direction here.

Thus, the crux of our explanation of the difficulty of creating good tests for incremental progress toward AGI is the hypothesis that general intelligence, under limited computational resources, is tricky.

Now, there are many reasons that general intelligence might be tricky in the sense we’ve defined here, and we won’t try to cover all of them here. Rather, we’ll focus on one particular phenomenon that we feel contributes a significant degree of trickiness to general intelligence.

Is Cognitive Synergy Tricky?

One of the trickier aspects of general intelligence under limited resources, we suggest, is the phenomenon of cognitive synergy.

The cognitive synergy hypothesis, in its simplest form, states that human-level AGI intrinsically depends on the synergetic interaction of multiple components (for instance, as in the OpenCog design, multiple memory systems each supplied with its own learning process). In this hypothesis, for instance, it might be that there are 10 critical components required for a human-level AGI system. Having all 10 of them in place results in human-level AGI, but having only 8 of them in place results in having a dramatically impaired system – and maybe having only 6 or 7 of them in place results in a system that can hardly do anything at all.

Of course, the reality is almost surely not as strict as the simplified example in the above paragraph suggests. No AGI theorist has really posited a list of 10 crisply-defined subsystems and claimed them necessary and sufficient for AGI. We suspect there are many different routes to AGI, involving integration of different sorts of subsystems. However, if the cognitive synergy hypothesis is correct, then human-level AGI behaves roughly as the simplistic example in the prior paragraph suggests. Perhaps instead of using the 10 components, you could achieve human-level AGI with 7 components, but having only 5 of these 7 would yield drastically impaired functionality – etc. Or the same phenomenon could be articulated in the context of systems without any distinguishable component parts, but only continuously varying underlying quantities. To mathematically formalize the cognitive synergy hypothesis in a general way becomes complex, but here we’re only aiming for a qualitative argument. So for illustrative purposes, we’ll stick with the “10 components” example, just for communicative simplicity.
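
To make the “10 components” picture concrete, here is a toy Python sketch of a capability curve that stays near zero until almost every component is present. The power-law form, the function name `synergy_capability` and the `sharpness` value are assumptions chosen purely for illustration; nothing in the hypothesis itself fixes them.

```python
def synergy_capability(present, total=10, sharpness=10):
    """Toy capability score in [0, 1] for a system with `present` of `total` components.

    The power law is an assumed stand-in for "sensitive dependence":
    a higher `sharpness` means a steeper collapse as components go missing.
    """
    if not 0 <= present <= total:
        raise ValueError("present must be between 0 and total")
    return (present / total) ** sharpness


# Print the toy score for the component counts discussed in the text.
for count in (10, 8, 7, 6):
    print(count, round(synergy_capability(count), 3))
```

Under these assumed numbers, a system with 8 of 10 components scores roughly a tenth of the full system, and one with 6 scores under one percent, which is the qualitative shape the example describes.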

Next, let’s suppose that for any given task, there are ways to achieve this task using a system that is much simpler than any subset of size 6 drawn from the set of 10 components needed for human-level AGI, but that works much better for the task than this subset of 6 components (assuming the latter are used as a set of only 6 components, without the other 4 components).

Note that this supposition is a good bit stronger than mere cognitive synergy. For lack of a better name, we’ll call it tricky cognitive synergy. The tricky cognitive synergy hypothesis would be true if, for example, the following possibilities were true:

creating components to serve as parts of a synergetic AGI is harder than creating components intended to serve as parts of simpler AI systems without synergetic dynamics
components capable of serving as parts of a synergetic AGI are necessarily more complicated than components intended to serve as parts of simpler AI systems.

These certainly seem reasonable possibilities, since to serve as a component of a synergetic AGI system, a component must have the internal flexibility to usefully handle interactions with a lot of other components as well as to solve the problems that come its way. In terms of our concrete work on the OpenCog integrative proto-AGI system, these possibilities ring true, in the sense that tailoring an AI process for tight integration with other AI processes within OpenCog tends to require more work than preparing a conceptually similar AI process for use on its own or in a more task-specific narrow AI system.

It seems fairly obvious that, if tricky cognitive synergy really holds up as a property of human-level general intelligence, the difficulty of formulating tests for intermediate progress toward human-level AGI follows as a consequence. According to the tricky cognitive synergy hypothesis, any test is going to be more easily solved by some simpler narrow AI process than by a partially complete human-level AGI system.
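
The argument can be played out in a toy Python model. Everything numeric here is hypothetical: the steep power-law score for a partial AGI and the fixed score of 0.6 for a task-specific narrow system are assumptions picked only to show the shape of the problem, and the function names are made up for this sketch.

```python
def partial_agi_score(present, total=10, sharpness=10):
    """Assumed toy score for an AGI design with only `present` components built."""
    return (present / total) ** sharpness


def winner_on_test(present, narrow_score=0.6):
    """Report which system scores higher on a single practical test."""
    if partial_agi_score(present) > narrow_score:
        return "AGI"
    return "narrow AI"


# Walk through increasingly complete versions of the AGI design.
for count in range(6, 11):
    print(count, winner_on_test(count))
```

In this toy setting the narrow system wins the test at every stage short of completion, so the test scores say nothing useful about how close the partial system is to the full one.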

Conclusion

We haven’t proved anything here, only made some qualitative arguments. However, these arguments do seem to give a plausible explanation for the empirical observation that positing tests for intermediate progress toward human-level AGI is a very difficult prospect. If the theoretical notions sketched here are correct, then this difficulty is not due to incompetence or lack of imagination on the part of the AGI community, nor due to the primitive state of the AGI field, but is rather intrinsic to the subject matter. And if these notions are correct, then quite likely the future rigorous science of AGI will contain formal theorems echoing and improving the qualitative observations and conjectures we’ve made here.

If the ideas sketched here are true, then the practical consequence for AGI development is, very simply, that one shouldn’t worry all that much about producing compelling intermediary results. Just as 2/3 of a human brain may not be much use, similarly, 2/3 of an AGI system may not be much use. Lack of impressive intermediary results may not imply one is on a wrong development path; and comparison with narrow AI systems on specific tasks may be badly misleading as a gauge of incremental progress toward human-level AGI.

Hopefully it’s clear that the motivation behind the line of thinking presented here is a desire to understand the nature of general intelligence and its pursuit – not a desire to avoid testing our AGI software! Truly, as AGI engineers, we would love to have a sensible rigorous way to test our intermediary progress toward AGI, so as to be able to pose convincing arguments to skeptics, funding sources, potential collaborators and so forth -- as well as just for our own edification. We really, really like producing exciting intermediary results, on projects where that makes sense. Such results, when they come, are extremely informative and inspiring to the researchers as well as the rest of the world! Our motivation here is not a desire to avoid having the intermediate progress of our efforts measured, but rather a desire to explain the frustrating (but by now rather well-established) difficulty of creating such intermediate goals for human-level AGI in a meaningful way.

If we or someone else figures out a compelling way to measure partial progress toward AGI, we will celebrate the occasion. But it seems worth seriously considering the possibility that the difficulty in finding such a measure reflects fundamental properties of the subject matter – such as the trickiness of cognitive synergy and other aspects of general intelligence.

Is Software Improving Exponentially?

In a discussion on the AGI email discussion list recently, some folks were arguing that Moore's Law and associated exponential accelerations may be of limited value in pushing the world toward Singularity, because software is not advancing exponentially.

For instance, Matt Mahoney pointed out "the roughly linear rate of progress in data compression as measured over the last 14 years on the Calgary corpus, http://www.mailcom.com/challenge/ "

Ray Kurzweil's qualitative argument in favor of the dramatic acceleration of software progress in recent decades is given in slides 104-111 of his presentation here.

I think software progress is harder to quantify than hardware progress, thus less often pointed to in arguments regarding technology acceleration.
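
One crude way to put numbers on the "linear or exponential?" question for any single metric is to fit a straight line to the raw values and to their logarithms, and see which fit is tighter. The Python sketch below does this with R-squared. The two series in it are synthetic, generated purely for illustration; they are not the Calgary corpus results or any other measured data. Comparing R-squared across a raw and a log scale is a heuristic, not a rigorous model-selection test.

```python
import math


def r_squared(xs, ys):
    """R^2 of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def better_fit(years, values):
    """Heuristic: is the series closer to a line, or to a line on a log scale?"""
    linear = r_squared(years, values)
    exponential = r_squared(years, [math.log(v) for v in values])
    return "exponential" if exponential > linear else "linear"


# Synthetic series, for illustration only (values must be positive).
years = list(range(14))
steady_series = [100 + 5 * t for t in years]       # constant yearly increment
compounding_series = [100 * 1.5 ** t for t in years]  # constant yearly ratio

print(better_fit(years, steady_series))
print(better_fit(years, compounding_series))
```

The harder part, of course, is choosing a metric that actually captures what matters about software, which no curve fit can do for you.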

However, qualitatively, there seems little doubt that the software tools available to the programmer have been improving damn dramatically....

Sheesh, compare game programming as I did it on the Atari 400 or Commodore 64 back in the 80s ... versus how it's done now, with so many amazing rendering libraries, 3D modeling engines, etc. etc. With the same amount of effort, today one can make incredibly more complex and advanced games.

Back then we had to code our own algorithms and data structures; now we have libraries like STL, so novice programmers can use advanced structures and algorithms without understanding them.
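
The same point is easy to show in Python rather than C++ (a substitution made here just for brevity). The sketch below uses the standard library's `heapq` and `bisect` modules to get heap-based selection and binary-search insertion without writing either algorithm; the score values are arbitrary sample numbers.

```python
import heapq
from bisect import insort

# Arbitrary sample values, for illustration only.
scores = [42, 7, 19, 88, 3, 61]

# Heap-based selection of the three largest values, no heap code required.
top_three = heapq.nlargest(3, scores)

# Keep a list sorted as items arrive; insort finds each slot by binary search.
ranked = []
for score in scores:
    insort(ranked, score)

print(top_three)
print(ranked)
```

A beginner can use both calls correctly after reading one line of documentation each, which is exactly the kind of leverage that did not exist on an Atari 400.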

In general, programmers without deep technical knowledge or ability have become *incredibly* more capable of creating useful working code in the last couple decades. Programming used to be only for really hard-core math and science geeks; now it's a practical career possibility for a fairly large percentage of the population.

When I started using Haskell in the mid-90s it was a fun, wonderfully elegant toy language but not practical for real projects. Now its clever handling of concurrency makes it viable for large-scale projects... and I'm hoping in the next couple years it will become possible to use Haskell within OpenCog (Joel Pitt just made the modifications needed to enable OpenCog AI processes to be coded in Python as well as the usual C++).

I could go on a long time with similar examples, but the point should be clear. Software tools have improved dramatically in functionality and usability. The difficulty of quantifying this progress in a clean way doesn't mean it isn't there...

Another relevant point is that, due to the particular nature of software development, software productivity generally decreases for large teams. (This is why I wouldn't want an AGI team with more than, say, 20 people on it. 10-15 may be the optimal size for the core team of an AGI software project, with additional people for things like robotics hardware, simulation world engineering, software testing, etc.) However, the size of projects achievable by small teams has dramatically increased over time, due to the availability of powerful software libraries.

Thus, in the case of software (as in so many other cases), the gradual improvement of technology has led to qualitative increases in what is pragmatically possible (i.e. what is achievable via small teams), not just quantitative betterment of software that previously existed.

It's true that word processors and spreadsheets have not advanced exponentially (at least not with any dramatically interesting exponent), just as forks and chairs and automobiles have not. However, other varieties of software clearly have done so, for instance video gaming and scientific computation.

Regarding the latter two domains, just look at what one can do with Nvidia GPU hardware on a laptop now, compared to what was possible for similar cost just a decade ago! Right now, my colleague Michel Drenthe in Xiamen is doing CUDA-based vision processing on the Nvidia GPU in his laptop, using Itamar Arel's DeSTIN algorithm, with a goal toward providing OpenCog with intelligent visual perception -- this is directly relevant to AGI, and it's leveraging recent hardware advances coupled with recent software advances (CUDA and its nice libraries, which make SIMD parallel scientific computing reasonably tractable, within the grasp of a smart undergrad like Michel doing a 6 month internship). Coupled acceleration in hardware and software for parallel scientific computing is moving along, and this is quite relevant to AGI, whereas the relative stagnation in word processors and forks really doesn't matter.

Let us not forget that the exponential acceleration of various quantitative metrics (like Moore's Law) is not really the key point regarding Singularity, it's just an indicator of the underlying progress that is the key point.... While it's nice that progress in some areas is cleanly quantifiable, that doesn't necessarily mean these are the most important areas....

To really understand progress toward Singularity, one has to look at the specific technologies that most likely need to improve a lot to enable the Singularity. Word processing, not. Text compression, not really. Video games, no. Scientific computing, yes. Efficient, easily usable libraries containing complex algorithms and data structures, yes. Scalable functional programming, maybe. It seems to me that by and large the aspects of software whose accelerating progress would be really, really helpful to achieving AGI, are in fact accelerating dramatically.

In fact, I believe we could have a Singularity with no further hardware improvements, just via software improvements. This might dramatically increase the financial cost of the first AGIs, due to making them necessitate huge server farms ... which would impact the route to and the nature of the Singularity, but not prevent it.