Saturday, March 19, 2011

Toward a General Theory of Feasible General Intelligence

Along with practical work on the OpenCog design (and a host of other research projects!), during the past few years I've written a series of brief papers sketching ideas about the theory of general intelligence ... the goal being to move toward a solid conceptual and formal understanding of general intelligence in real-world environments under conditions of feasible computational resources. My quest for such an understanding certainly isn't done yet, but I think I've made significant progress.

This page links to the five papers in this series, and also gives their abstracts. Three of the papers have been published in conference proceedings before, but two appear here for the first time (Three Hypotheses About the Geometry of Mind and Self-Adaptable Learning). All of this material will eventually appear in Building Better Minds, in slightly modified and extended form.

These theoretical ideas have played a significant, though largely informal, role in guiding my work on the OpenCog design. My feeling is that once practical R&D work is a bit further along, so that we're experimenting in a serious way with sophisticated proto-AGI systems, theory and practice will start developing in a closely coupled way, and a good theory of general intelligence will probably arrive in lock-step with the first reasonably good AGI systems. (See some more comments on the relation between these theory papers and OpenCog at the end of this blog post.)

A brief note on math: There is a fair bit of mathematical formalism here, but no deep, interesting theorems are proven. I don't think this is because no such theorems exist in this material; I just haven't taken the time to explore these ideas with full mathematical rigor. That would be fun, but I've prioritized other sorts of work. So far I've mainly been seeking conceptual clarity with these ideas rather than full rigor, and I've used mathematical formalism here and there because that is the easiest way for me to make my ideas relatively precise. (Being trained in math rather than formal philosophy, I find the former a much more convenient way to express my ideas when I want to be more precise than everyday language permits.) My hope is that, if I never find the time, others will come along and turn some of these ideas into theorems!

Toward a Formal Characterization of Real-World General Intelligence
Presented at AGI-10, in Lugano

Two new formal definitions of intelligence are presented, the "pragmatic general intelligence" and "efficient pragmatic general intelligence." Largely inspired by Legg and Hutter's formal definition of "universal intelligence," the goal of these definitions is to capture a notion of general intelligence that more closely models that possessed by humans and practical AI systems, which combine an element of universality with a certain degree of specialization to particular environments and goals. Pragmatic general intelligence measures the capability of an agent to achieve goals in environments, relative to prior distributions over goal and environment space. Efficient pragmatic general intelligence measures this same capability, but normalized by the amount of computational resources utilized in the course of goal-achievement. A methodology is described for estimating these theoretical quantities based on observations of a real biological or artificial system operating in a real environment. Finally, a measure of the "degree of generality" of an intelligent system is presented, allowing a rigorous distinction between "general AI" and "narrow AI."
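
To give the flavor of these definitions in symbols: this is a rough sketch only, where the prior notation (nu, gamma) and the resource term B are my shorthand for this post, not the paper's exact formulation. Legg and Hutter define universal intelligence as a simplicity-weighted sum of expected rewards over environments; the pragmatic versions swap the universal prior for empirically motivated priors over environments and goals, and the efficient version further charges for computation.

```latex
% Legg-Hutter universal intelligence: environments mu weighted by
% algorithmic simplicity via Kolmogorov complexity K
\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V_\mu^\pi

% Pragmatic general intelligence (sketch): \nu is a prior over
% environments, \gamma a conditional prior over goals, and
% V_{\mu,g}^\pi the expected degree to which agent \pi achieves
% goal g in environment \mu
\Pi(\pi) \;=\; \sum_{\mu,\, g} \nu(\mu)\, \gamma(g \mid \mu)\, V_{\mu,g}^\pi

% Efficient pragmatic general intelligence (sketch): the same sum,
% but each term is normalized by B(\pi,\mu,g), the computational
% resources \pi consumes while pursuing g in \mu
\Pi_{\mathrm{eff}}(\pi) \;=\; \sum_{\mu,\, g} \nu(\mu)\, \gamma(g \mid \mu)\, \frac{V_{\mu,g}^\pi}{B(\pi,\mu,g)}
```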

The Embodied Communication Prior: A Characterization of General Intelligence in the Context of Embodied Social Interaction
Presented at ICCI-09, in Hong Kong

We outline a general conceptual definition of real-world general intelligence that avoids the twin pitfalls of excessive mathematical generality and excessive anthropomorphism. Drawing on prior literature, a definition of general intelligence is given, which defines the latter by reference to an assumed measure of the simplicity of goals and environments. The novel contribution presented is to gauge the simplicity of an entity in terms of the ease of communicating it within a community of embodied agents (the so-called Embodied Communication Prior or ECP). Augmented by some further assumptions about the statistical structure of communicated knowledge, this choice is seen to lead to a model of intelligence in terms of distinct but interacting memory and cognitive subsystems dealing with procedural, declarative, sensory/episodic, attentional and intentional knowledge.
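
One minimal way to render the core ECP idea in symbols (again my shorthand, not the paper's exact notation; the communication-cost function c is assumed rather than derived):

```latex
% Embodied Communication Prior (sketch): the prior probability of an
% entity x is higher the cheaper it is to communicate x between
% embodied agents in a community C; c_C(x) is the expected cost
% (e.g., in bits or effort) of such communication
P_{\mathrm{ECP}}(x) \;\propto\; 2^{-c_C(x)}
```

This plays the same structural role that the universal prior 2^{-K(x)} plays in the Legg-Hutter framework, but with communicability among embodied agents replacing program length as the measure of simplicity.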

Cognitive Synergy: A Universal Principle for General Intelligence?
Presented at ICCI-09, in Hong Kong

Do there exist general principles that any system must obey in order to achieve advanced general intelligence using feasible computational resources? Here we propose one candidate: cognitive synergy, the principle that general intelligences must contain different knowledge creation mechanisms corresponding to different sorts of memory (declarative, procedural, sensory/episodic, attentional, intentional), and that these different mechanisms must be interconnected in such a way as to aid each other in overcoming memory-type-specific combinatorial explosions.
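
To make the principle a bit more tangible, here is a toy simulation sketch. It is purely illustrative: the mechanism names, probabilities, and "solved" criterion are invented for this example and have nothing to do with actual OpenCog code. Each mechanism makes slow progress alone, but partial results shared across mechanisms raise everyone's odds of progress, so the coupled system escapes its combinatorial bottleneck much faster:

```python
import random

# Toy sketch of cognitive synergy: several memory-type-specific learning
# mechanisms work on a problem; each one's partial results serve as hints
# that boost the others' chances of progress.

class Mechanism:
    def __init__(self, name, base_success_prob):
        self.name = name
        self.base = base_success_prob  # chance of progress per step, alone

    def step(self, hints):
        # Hints from other mechanisms crudely model cross-mechanism help
        # in pruning a combinatorially explosive search.
        p = min(1.0, self.base + 0.05 * len(hints))
        return f"hint from {self.name}" if random.random() < p else None

def steps_to_solve(mechanisms, hints_needed=6, max_steps=10_000):
    hints = []
    for step in range(1, max_steps + 1):
        for m in mechanisms:
            result = m.step(hints)
            if result:
                hints.append(result)
        if len(hints) >= hints_needed:  # arbitrary "solved" criterion
            return step
    return max_steps

random.seed(2011)
coupled = [Mechanism(n, 0.02) for n in
           ("declarative", "procedural", "episodic", "attentional")]
alone = [Mechanism("declarative", 0.02)]

print("coupled mechanisms:", steps_to_solve(coupled), "steps")
print("single mechanism: ", steps_to_solve(alone), "steps")
```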

Three Hypotheses About the Geometry of Mind (with Matthew Iklé)
Presented for the first time right here!

What set of concepts and formalizations might one use to make a practically useful, theoretically rigorous theory of generally intelligent systems? We present a novel perspective motivated by the OpenCog AGI architecture, but intended to have a much broader scope. Types of memory are viewed as categories, and mappings between memory types as functors. Memory items are modeled using probability distributions, and memory subsystems are conceived as “mindspaces” – geometric spaces corresponding to different memory categories. Two different metrics on mindspaces are considered: one based on algorithmic information theory, and another based on traditional (Fisher-information-based) “information geometry”. Three hypotheses regarding the geometry of mind are then posited: 1) a syntax-semantics correlation principle, stating that in a successful AGI system, these two metrics should be roughly correlated; 2) a cognitive geometrodynamics principle, stating that on the whole intelligent minds tend to follow geodesics in mindspace; 3) a cognitive synergy principle, stating that shorter paths may be found through the composite mindspace formed by considering multiple memory types together than by following the geodesics in the mindspaces corresponding to individual memory types.
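
For concreteness, the two metrics are of the following standard flavors; these are the textbook forms from the literature, and the paper's contribution is to apply this sort of structure to probability distributions representing memory items:

```latex
% Algorithmic-information metric (normalized information distance),
% with K(x|y) the conditional Kolmogorov complexity:
d_{\mathrm{NID}}(x, y) \;=\;
  \frac{\max\{K(x \mid y),\, K(y \mid x)\}}{\max\{K(x),\, K(y)\}}

% Fisher information metric on a parametric family p_\theta of
% distributions, the basis of classical information geometry:
g_{ij}(\theta) \;=\; \mathbb{E}_{p_\theta}\!\left[
  \frac{\partial \log p_\theta(x)}{\partial \theta_i}\,
  \frac{\partial \log p_\theta(x)}{\partial \theta_j} \right]
```

The syntax-semantics correlation hypothesis then says, roughly, that in a well-designed AGI system distances under the first (syntactic, program-length) metric should approximately track distances under the second (semantic, distributional) one.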


Self-Adaptable Learning
Presented for the first time right here!

The term "higher-level learning" may be used to refer to learning how to learn, learning how to learn how to learn, and so on. If an agent is good at ordinary everyday learning, and also at learning which learning strategies are most amenable to higher-level learning, and does both in a way that is itself amenable to higher-level learning, then it may be said to possess self-adaptable learning. Goals and environments in which higher-level learning is a good strategy for intelligence may be called adaptationally hierarchical, a property that everyday human environments are postulated to possess. These notions are carefully articulated and formalized; and a concept of cognitive continuity is also introduced, which is argued to militate in favor of self-adaptability in a learning system.
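
As a cartoon of the lowest rung of this hierarchy (base-level learning, plus learning which learning strategy to use), consider the following sketch. Everything here (the guessing task, the two strategies, the credit scheme) is invented for illustration; it is only meant to show base-level and meta-level learning operating together:

```python
import random

# Toy sketch of learning plus learning-how-to-learn: the agent guesses a
# hidden digit (base level), and also learns which of its two guessing
# strategies tends to succeed faster (meta level).

def strategy_random(history):
    # Ignores experience entirely.
    return random.randint(0, 9)

def strategy_elimination(history):
    # Uses experience: never repeats a failed guess.
    failed = {guess for guess, correct in history if not correct}
    return random.choice([x for x in range(10) if x not in failed])

strategies = [strategy_random, strategy_elimination]
scores = [1.0, 1.0]  # meta-level knowledge: each strategy's track record

random.seed(2011)
for episode in range(300):
    # Meta-level learning: choose a strategy in proportion to past credit.
    s = random.choices(range(len(strategies)), weights=scores)[0]
    target, history = random.randint(0, 9), []
    for attempt in range(10):  # base-level learning within one episode
        guess = strategies[s](history)
        history.append((guess, guess == target))
        if guess == target:
            scores[s] += 1.0 / (attempt + 1)  # faster success, more credit
            break

print("credit earned by (random, elimination):",
      [round(x, 1) for x in scores])
```

A genuinely self-adaptable learner would go further: the meta-level credit scheme itself would be subject to learning, and so on up the hierarchy.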

P.S. A Comment on the Relation of All This Theory to OpenCog

I think there is a lot of work required to transform the abstractions in those theory papers of mine into a mathematical theory that is DIRECTLY USEFUL, rather than merely INSPIRATIONAL, for concrete AGI design.

So the OpenCog design, for instance, is not derived from the abstract math and ideas in the above-linked papers ... it was created independently, based on many of the same quasi-formal intuitions that underlie those papers.

You could say I'm approaching the problem from two directions at once, and hoping I can get the two approaches to intersect...

One direction is OpenCog: designing and building a concrete proto-AGI system, and iteratively updating the design based on practical experience.

The other is abstract theory, as represented in those papers.

If all goes well, eventually the two ends will meet, and the abstract theory will tell us concretely useful things about how to improve the OpenCog design. That is only rather weakly true right now.

I have the sense (maybe wrongly) that I could make the ends meet very convincingly in about one year of concentrated work on the theory side. However, I currently spend only maybe 5% of my time on that sort of theory. But hopefully I will be able to make it happen in less than 20 years via appropriate collaborations...
