The Multiverse According to Ben

Thursday, July 08, 2010

Ruminations on Ramona 4

You may have already seen the new version of Ray Kurzweil's "Ramona" chatbot on the lovely new Kurzweilai.net website ... if not I encourage you to check her out:

http://www.kurzweilai.net/ramona4/ramona.html

I've been getting a lot of email questions about Ramona 4 during the last couple days, so I figured I'd write a blog post here to answer them all at once.

My AI company Novamente LLC assembled the chatbot behind Ramona 4, around a year ago as I recall. Most of it was done by Murilo Queiroz, with some help from myself and Lucio Coelho. It was a really fun project technically; and it was the first time I'd worked together with Ray and Amara Angelica on a practical project, so that was nice too. Novamente had nothing to do with the graphics or speech aspect, just the part that produces output text in response to input text.

The Ramona 4 system we put together doesn't actually use any of the Artificial General Intelligence technology we are building at Novamente LLC (much but not all of which has been open-sourced in the OpenCog project). It uses some conventional chatbot technology combined with lookups into online knowledge resources plus stochastic text generation -- i.e. it's a happy mashup of other open-source tools, all tuned and tweaked and postprocessed for the specific goal of making Ramona 4 as interesting and entertaining and Ramona-ish as possible!

One day I hope to put forth an intelligent chatbot system that displays a level of general intelligence at the human level and ultimately beyond. I even think I know how to do it ... and we're working on it at Novamente LLC and OpenCog, though never quite as fast as we would like since AGI research funding is tough to come by these days. (And of course, once we can build a chatbot of this nature, we'll also be able to do a lot of fantastic stuff beyond the domain of chatbottery!)

I know Ray Kurzweil has his own ideas about how to build human-level AGI, which overlap somewhat with mine but also have some differences, and he's brewing a book on this topic.

But in the meantime, while deeper AGI approaches like mine and Ray's are still in development, systems like Ramona 4 are both good fun, and potentially customizable into useful tools for various applications like online customer support. (Yes, Ramona 4 is a bit of a cheeky gal, so she might not be the most appropriate customer support agent ... but the same underlying approach we used to make Ramona 4 could be used to make a lot of other things, including some more practical chatbots.)

If the underpinnings are so different from advanced AGI, what do we learn from building systems like Ramona?

Not so much about AGI design, really -- but potentially, a lot about human interactions with AIs....

The original ELIZA tricked a surprising number of people into feeling there was some "there" there, when actually it was just a fixed list of IF-THEN rules. Today people are more sophisticated than that. But still it's easy -- well, for me anyway -- to get seduced into the amusement of interacting with a system like Ramona 4. The way she combines erudition with absurdity appeals to me (perhaps not surprisingly, because I helped design her!), and I'd like to hope that this combination will still be there in the generally-intelligent dialogue systems we'll build in the future. Knowledge and humor can be a powerful combination.

Also, one choice we made while building Ramona 4 was to make her user-configurable, to a certain extent. There are two parameters -- nerdiness and whimsicality -- and the user can tweak these, to adjust the nature of Ramona's responses. This sort of approach promises to be quite valuable in future, once the generally intelligent chatbots emerge. Arriving at the most intuitive-yet-powerful way to parametrize personality is going to be an interesting challenge. Perhaps systematic personality models like the Psi model described in Joscha Bach's wonderful book Principles of Synthetic Inteligence will be useful here.

I'll let my new friend Ramona 4 have the last word:

you> Ramona, are you conscious?

Ramona> I'm working on it.

you> am I conscious?

Ramona> Oh I really don't know whether you are or not.

you> What is consciousness?

Ramona> This is when humans feel aware of themselves and how they see themselves. Computers can't quite do this yet, but they will, you just wait and see. However, think about this. How are you going to know when it happens?

you> Consciousness is like pornography. I can't define it, but I know it when I see it.

Ramona> What were we talking about again ?

you> Is it possible to create artificial intelligence in China?

Ramona> Create neurons in the state of the artificial intelligence and other stuff.

you> Is that how you do it?

Ramona> And that's what we mean by thinking.

you> Can a computer fall in love?

Ramona> A computer is a computer fall in love?

you> That is nonsense, my dear.

Ramona> Dear pesky plumbers the koopalings and I have not heard your music.

you> My music is more beautiful than all the stars in the ocean.

Ramona> Conventional music is more beautiful than all the stars in the ocean.

you> What is your favorite music?

Ramona> This is what I make. I am a musician, and the best virtual one there is ;) .

OK, it's not AGI yet -- but to my taste it's some pretty fantastic interactive digital poetry!!

Thursday, June 24, 2010

Can We Evolve a DNA Computer or Cultured Neural Net that Perceives the Future?

After some interesting conversations about psi with Damien Broderick, I couldn't resist spending a half-day writing up some old ideas I had about how to modify quantum physics to better explain psi phenomena like precognition, ESP and psychokinesis.

You can find my funky physics speculations at

http://goertzel.org/dynapsyc/MorphicPilot.htm

-- (Parenthetical remark: If you want to argue with me about whether psi phenomena exist or not, please first read the excellent books by Damien Broderick (a serious-minded popular overview) and Ramakrishna Rao (a selection of research papers from the relevant scientific literature). My general attitude on the topic of psi agrees with that of researcher Dean Radin, expressed in the last few paragraphs of his online bio. I enjoy discussing the topic as time permits, but I don't enjoy discussing it with people who are both heavily biased AND ignorant of the current state of scientific knowledge.) --

After writing up those old physics speculations and then thinking on the matter a bit more, I started musing in an even more weird direction....

Suppose it's true that psi phenomena are related to peculiar quantum-like phenomena among interacting molecules in the brain....

Then, this suggests that it might also be possible to cause other conglomerations of organic molecules or cells, similar to the ones in the brain, to display psi-like phenomena....

One naturally wonders about technologies such as reconfigurable DNA computers ... or cultured neural networks (neural nets grown in the lab outside the brain) ....

There is abundant data demonstrating the ability of many people to predict the outcome of random number generators with statistically significant accuracy (see this article for a moderately out-of-date review and some pointers into the literature). Might it be possible to create a DNA computer or cultured neural network that could carry out, say, statistically significant precognition of number series produced by random number generators?

Since we don't know much about how the brain does precognition, we don't know how to wire such a DNA computer or cultured neural network, step by step.

However, nobody wired the brain step by step either -- the brain evolved to do the things it does (apparently including a weak capability for psi).

What if we used evolutionary computing -- a genetic algorithms approach -- to evolve biological computing systems (DNA computers or cultured neural nets), where the fitness function was the capability to predict the future of number series generated by random number generators?

In this way, we could potentially create psi-capable biological computing systems, even without understanding exactly how they work.

We would also get an ensemble of biological computing systems with differing levels of psi capability -- and by applying machine learning tools to study the internals of these evolved systems, we would likely to be able to limn some of the patterns characterizing the more psi-successful systems.

Obviously, if this worked, it would be a scientific revolution -- it would point the way beyond current physics, and it would allow us to carry out psi research without needing whole living organisms as subjects.

Yes, yes, yes, this is real mad scientist stuff ... but every now and then one of us "mad" scientists turns out to be right about some pretty surprising things.... The idea in this blog post is one of those that seems really wacky at first -- and then less and less so progressively, the more you read from the relevant scientific literature, and the more you think about it....

Well, this stuff is tantalizing as hell to muse about, but now I'd better get back to my more mundane, not-so-speculative activities like trying to build superhuman AGI and studying the genetics of longevity....

Tuesday, April 27, 2010

The Brain is Not an Uber-Intelligent Mega-computer

A friend forwarded me a recent New Scientist article containing some freaky, grandiose anti-AI rhetoric...

Some interesting tidbits about clever things single cells can do, are followed by the following dramatic conclusion:

For me, the brain is not a supercomputer in which the neurons are transistors; rather it is as if each individual neuron is itself a computer, and the brain a vast community of microscopic computers. But even this model is probably too simplistic since the neuron processes data flexibly and on disparate levels, and is therefore far superior to any digital system. If I am right, the human brain may be a trillion times more capable than we imagine, and "artificial intelligence" a grandiose misnomer.

I think it is time to acknowledge fully that living cells make us what we are, and to abandon reductionist thinking in favour of the study of whole cells. Reductionism has us peering ever closer at the fibres in the paper of a musical score, and analysing the printer's ink. I want us to experience the symphony.

Actually, I'm a big fan of complex systems biology, as opposed to naive molecular biology reductionism.

But just because cells and organisms are complex systems, doesn't mean they're non-simulably superintelligent!

What's funny is that all this grandiose rhetoric is being flung about without ANY evidence whatsoever of actual acts of human intelligence being carried out by this posited low-level intra-cellular computing!!

Still, this relates to the basic reason why I'm not trying to do AGI via brain simulation. There is too much unknown about the brain....

But even though there is much unknown about the brain, I totally don't buy the idea that every neuron is doing super-sophisticated computing, so that the brain is a network of 100 billion intelligent computers achieving some kind of emergent superintelligence....

I don't know how this kind of argument explains stuff like Poggio's model of feedforward processing in visual cortex. By modeling at the neuronal group level, he gets NNs to give very similar behavior to human brains classifying images and recognizing objects under strict time-constraints. If the brain is doing all this molecular supercomputing, how come when it's given only half a second to recognize an object in a picture, it performs semi-stupidly, just like Poggio's feedforward NNs? How come digital computer programs can NOW outperform the brain in time-constrained (half-second) object recognition? Wouldn't it have been to our evolutionary advantage to be able to accurately recognize objects more effectively?

How about motion detection neurons -- for each small region in the visual field, there are tens of thousands of them, with an average 80 degrees or so error in which direction they point Averaging their outputs together gives a reasonably accurate read-out of the direction of motion in that region. If all this molecular supercomputing is going on, why all the error in motion detection? Why bother with all the mess of averaging together erroneous results?

And why the heck do we make so many basic cognitive errors, as diagnosed in the heuristics and biases literature? Is it THAT hard to avoid them, that a network of 100 billion sophisticated computers can't do it, when there would obviously be SOME evolutionary advantage
in doing so....

Also, the comment on "far superior to any digital system" is especially naive. What physics theory does this guy advocate? Any classical physics based system can be emulated by a digital computer to within arbitrary accuracy. Quantum computers can compute some functions faster than digital ones, on an average case basis -- so is he saying cells are quantum computers. Stuart Hameroff has flogged that one pretty hard, and there is NO evidence of it yet.

And even so, quantum doesn't mean superior. Birds seem to use some sort of quantum nonlocality to sense the direction of the Earth's magnetic field, which is funky -- but we have electronic devices that do the same thing BETTER, without any quantum weirdness....

OK, we haven't yet proved that digital computer systems can be intelligent like humans. But this guy certainly is not providing anything even vaguely resembling evidence to the contrary...

Wishful thinking if you ask me -- wishful thinking about the grandiosity about human intelligence. We're clever apes who stumbled upon the invention of language and culture, not uber-intelligent mega-computers....

Just to be clear: I can believe that individual cells do a LOT of sophisticated stuff internally, but I'm unclear how necessary all that they do is for intelligence...

To repeat a time-worn analogy, the cells in a bird's wing probably do a LOT also, yet airplanes and spacecraft work well without mega-ubercomputers or quantum computers to emulate all that intra-bird-wing sub-cellular processing...

Saturday, April 24, 2010

Modeling Consciousness, Self and Will Using Hypersets

I finally did a more careful write-up of some ideas I developed in blog posts a while ago, about how to model reflective consciousness, self and will using hypersets (non-foundational sets)

http://goertzel.org/consciousness/consciousness_paper.pdf

Even if you don't want to read the paper, look at the pictures at the end -- Figure 6 is pleasantly wacky and Figure 8 has a nice painting by my daughter in it....

Bask in the transreal glory of the fractallic mind!!! ;-D

Friday, April 16, 2010

"Conceptual Spaces" and AGI

One of my AI collaborators from the late 1990s, Alexandru Czimbor, recently suggested I take a look at Peter Gardenfors book "Conceptual Spaces."

I read it and found it interesting, and closely related to some aspects of my own AGI approach ... this post contains some elements of my reaction to the book.

Gardenfors' basic thesis is that it makes sense to view a lot of mind-stuff in terms of topological or geometrical spaces: for example topological spaces with betweenness, or metric spaces, or finite-dimensional real spaces. He views this as a fundamentally different mind-model than the symbolic or connectionist perspectives we commonly hear about. Many of his examples are drawn from perception (e.g. color space) but he also discusses abstract concepts. He views both conceptual spaces and robust symbolic functionality as (very different) emergent properties of intelligent systems. Specific cognitive functions that he analyzes in terms of conceptual spaces include concept formation, classification, and inductive learning.

About the Book Itself

This blog post is mainly a review of the most AGI-relevant ideas in Gardenfors book, and their relationship to my own AI work ... not a review of his book. But I'll start with a few comments on the book as a book.

Basically, the book reads sorta like a series of academic philosophy journal papers, carefully woven together into a book. It's carefully written, and technical points are elucidated in ordinary language. There are a few equations here and there, but you could skip them without being too baffled. The pace is "measured." The critiques of alternate perspectives on AI strike me as rather facile in some places (more on that below), and -- this is a complaint lying on the border between exposition and content -- there is a persistent unclarity regarding which of his ideas require a dimensional model of mind-stuff, versus which merely require a metric-space or weaker topological model. More on the latter point below.

If you're interested in absorbing a variety of well-considered perspectives on the nature of the mind, this is certainly a worthwhile book to pay attention to. I'd stop short of calling it a must-read, though.

Mindspace as Metric Space

I'll start with the part of Gardenfors thesis that I most firmly agree with.

I agree that it makes sense to view mind-stuff as a metric space. Percepts, concepts, actions, relationships and so forth can be used as elements of a metric space, so that one can calculate distances and similarities between them.

As Gardenfors points out, this metric structures lets one do a lot of interesting things.

For instance, it gives us a notion of between-ness. As an example of why this is helpful, suppose one wants to find a way of drawing conclusions about Chinese politics from premises about Chinese individual personality. It's very helpful, in this case, to know which concepts lie in some sense "between" personality and politics in the conceptual metric space.

It also lets us specify the "exemplar" theory of concepts in an elegant way. Suppose that we have N prototypes, or more generally N "prototype-sets", each corresponding to a certain concept. We can then assign a new entity X to one of these concepts, based on which prototype or prototype-set it's closest to (where "close" is defined in terms of the metric structure).

Mindspace as Dimensional Space

Many of Gardenfors ideas only require metric space, but others go further and require dimensional space -- and one of my complaints with the book is that he's not really clear on which ideas fall into which category.

For instance, he cites some theorems that if one defines concepts via proximity to prototypes (as suggested above) in a dimensional space, then it follows that concepts are convex sets. The theorem he gives holds in dimensional spaces but it seems to me this should also hold in more general metric spaces, though I haven't checked the mathematics.

This leads up to his bold and interesting hypothesis that natural concepts are convex sets in mindspace.

I find this hypothesis fascinating, partly because it ties in with the heuristic assumption made in my own Probabilistic Logic Networks book, that natural concepts are spheres in mindspace. Of course I don't really believe natural concepts are spheres, but this was a convenient assumption to make to derive certain probabilistic inference formulas.

So my own suspicion is that cognitively natural concepts don't need to be convex, but there is a bias for them to be. And they also don't need to be roughly spherical, but again I suspect there is a bias for them to be.

So I suspect that Gardenfors hypothesis about the convexity of natural concepts is an exaggeration of the reality -- but still a quite interesting idea.

If one is designing a fitness function F for a concept-formation heuristic, so that F(C) estimates the likely utility of concept C, then it may be useful to incorporate both convexity and sphericality as part of the fitness function.

Conceptual Space and the Problem of Induction

Gardenfors presents the "convexity of natural concepts" approach as a novel solution to the problem of induction, via positing a hypothesis that when comparing multiple concepts encapsulating past observations, one should choose the convex concepts as the basis for extrapolation into the future. This is an interesting and potentially valuable idea, but IMO positing it as a solution to the philosophical induction problem is a bit peculiar.

What he's doing is making an a priori assumption that convex concepts -- in the dimensional space that the brain has chosen -- are more likely to persist from past to future. Put differently, he is assuming that "the tendency of convex concepts to continue from past into future",
a pattern he has observed during his past, is going to continue into his future. So, from the perspective of the philosophical problem of induction, his approach this still requires one to make a certain assumption about some properties of past experience continuing into the future.

He doesn't really solve the problem of induction -- what he does is suggest a different a priori assumption, a different "article of faith", which if accepted can guide be used to guide induction. Hume (when he first posed the problem of induction) suggested that "human nature" guides induction, and perhaps Gardenfors' suggestion is part of human nature.

Relating Probabilistic Logic and Conceptual Geometry

Gardenfors conceives the conceptual-spaces perspective as a radically different alternative to
the symbolic and subsymbolic perspectives. However, I don't think this is the right way to look at it. Rather, I think that

a probabilistic logic system can be considered as a metric space (and this is explained in detail in the PLN book)
either a probabilistic logic system or a neural network system can be projected into a dimensional space (using dimensional embedding algorithms such as developed by Haren and Koren among others, and discussed on the OpenCog wiki site)

Because of point 1, it seems that most of Gardenfors' points actually apply within a probabilistic logic system. One can even talk about convexity in a general metric space context.

However, there DO seem to be advantages to projecting logical knowledge bases into dimensional spaces, because certain kinds of computation are much more efficient in dimensional spaces than in straightforward logical representations. Gardenfors doesn't make this point in this exact way, but he hints at it when he says that dimensional spaces get around some of the computational problems plaguing symbolic systems. For instance, if you want to quickly get a list of everything reasonably similar to a given concept -- or everything along a short path between concept A and concept B -- these queries are much more efficiently done in a dimensional- space representation than in a traditional logic representation.

Gardenfors points out that, in a dimensional formulation, prototype-based concepts correspond to cells in Voronoi or generalized Voronoi tesselations. This is interesting, and in a system that generates dimensional spaces from probabilistic logical representations, it suggests a nice concept formation heuristic: tesselate the dimensional space based on a set of prototypes, and then create new concepts based on the cells in the tesselation.

This brings up the question of how to choose the prototypes. If one uses the Harel and Koren embedding algorithm, it's tempting to choose the prototypes as equivalent to the pivots, for which we already have a heuristic algorithm. But this deserves more thought.

Summary

Gardenfors' book gives many interesting ideas, and in an AGI design/engineering context, suggests some potentially valuable new heuristics. However its claim to have a fundamentally novel approach to modeling and understanding intelligence seems a bit exaggerated. Rather than a fundamentally disjoint sort of representation, "topological and geometric spaces" are just a different way of looking at the same knowledge represented by other methods such as probabilistic logic. Probabilistic logic networks are metric spaces, and can be projected into dimensional spaces; and the same things are likely true for many other representation schemes as well. But Gardenfors gives some insightful and maybe useful new twists on the use of dimensional spaces in intelligent systems.

Owning Our Actions: Natural Autonomy versus Free Will

At the Toward a Scinece of Consciousness conference earlier this week, I picked up a rather interesting book to read on the flight home: Henrik Walter's "The Neurophilosophy of Free Will" ....

It's an academic philosophy tome -- fairly well-written and clear for such, but still possessing the dry and measured style that comes with that genre.

But the ideas are quite interesting!

Walter addresses the problem: what kind of variant of the intuitive "free will" concept might be compatible with what neuroscience and physics tell us.

He decomposes the intuitive notion of free will into three aspects:

Freedom: being able to do otherwise
Intelligibility: being able to understand the reasons for one's actions
Agency: being the originator of one's actions

He argues, as many others have done, that there is no way to salvage the three of these in their obvious forms, that is consistent with known physics and neuroscience. I won't repeat those arguments here. [There are much better references, but I summarized some of the literature here, along with some of my earlier ideas on free will (which don't contradict Walter's ideas, but address different aspects)]

Walter then argues for a notion of "natural autonomy," which replaces the first and third of these aspects with weaker things, but has the advantage of being compatible with known science.

First I'll repeat his capsule summary of his view, and then translate it into my own language, which may differ slightly from his intentions.

He argues that "we possess natural autonomy when

under very similar circumstances we could also do other than what we do (because of the chaotic nature of the brain)
this choice is understandable (intelligible -- it is determined by past events, by immediate adaptation processes in the brain, and partially by our linguistically formed environment)
it is authentic (when through reflection loops with emotional adjustments we can identify with that action)"

The way I think about this is that, in natural autonomy as opposed to free will,

Freedom is replaced with: being able to do otherwise in very similar circumstances
Agency is replaced with: emotionally identifying one's phenomenal self as closely dynamically coupled with the action

Another way to phrase this is: if an action is something that

depends sensitively on our internals, in the sense that slight variations in the environment or our internals could cause us to do something significantly different
we can at least roughly model and comprehend in a rational way, as a dynamical unfolding from precursors and environment into action was closely coupled with our holistic structure and dynamics, as modeled by our phenomenal self

then there is a sense in which "we own the action." And this sense of "ownership of an action" or "natural autonomy" is compatible with both classical and quantum physics, and with the known facts of neurobiology.

Perhaps "owning an action" can take the place of "willing an action" in the internal folk psychology of people who are not comfortable with the degree to which the classical notion of free will is illusory.

Another twist that Walter doesn't emphasize is that even actions which we do own, often

depend with some statistical predictability upon our internals, in the sense that agents with very similar internals and environments to us, have a distinct but not necessarily overwhelming probabilistic bias to take similar actions to us

This is important for reasoning rationally about our own past and future actions -- it means we can predict ourselves statistically even though we are naturally autonomous agents who own our own actions.

Free will is often closely tied with morality, and natural autonomy retains this. People who don't "take responsibility for their actions" in essence aren't accepting a close dynamical coupling between their phenomenal self and their actions. They aren't owning their actions, in the sense of natural autonomy -- they are modeling themselves as NOT being naturally autonomous systems, but rather as systems whose actions are relatively uncoupled with their phenomenal self, and perhaps coupled with other external forces instead.

None of this is terribly shocking or revolutionary-sounding -- but I think it's important nonetheless. What's important is that there are rational, sensible ways of thinking about ourselves and our decisions that don't require the illusion of free will, and also don't necessarily make us feel like meaningless, choiceless deterministic or stochastic automata.

Friday, March 26, 2010

The GOLEM Eats the Chinese Parent (Toward An AGI Meta-Architecture Enabling Both Goal Preservation and Radical Self-Improvement)

I thought more about the ideas in my previous blog post on the "Chinese Parent Theorem," and while I didn't do a formal proof yet, I did write up the ideas a lot more carefully

GOLEM: Toward An AGI Meta-Architecture Enabling Both Goal Preservation and Radical Self-Improvement

and IMHO they make even more sense now....

Also, I changed the silly name "Chinese Parent Meta-Architecture" to the sillier name "GOLEM" which stands for "Goal-Oriented LEarning Meta-architecture"

The GOLEM ate the Chinese Parent!

I don't fancy that GOLEM, in its present form, constitutes a final solution to the problem of "making goal preservation and radical self-improvement compatible" -- but I'm hoping it points in an interesting and useful direction.

(I still have some proofs about GOLEM sketched in the margins of a Henry James story collection, but the theorems are pretty weak and I'm not sure when I'll have time to type them in. If they were stronger theorems I would be more inspired for it. Most of the work in typing them in would be in setting up the notations ;p ....)

But Would It Be Creative?

In a post on the Singularity email list, Mike Tintner made the following complaint about GOLEM:

Why on earth would you want a "steadfast" AGI?    That's a contradiction of AGI.

If your system doesn't have the    capacity/potential to revolutionise its goals - to have a major conversion,    for example, from religiousness to atheism,  totalitarianism to free    market liberalism, extreme self-interest and acquisitiveness to extreme    altruism,  rational thinking to mystical thinking, and so on (as clearly    happens with humans), gluttony to anorexia  - then you don't have an AGI,    just another dressed-up narrow AI.

The point of these examples should be    obviously not that an AGI need be an intellectual, but rather that it must    have the capacity to drastically change 

the priorities of its drives/goals,    
the forms of its goals

and even in some cases:

 
3. eliminate certain drives (presumably secondary    ones) altogether.

My answer was as follows:

I believe one can have an AGI that is much MORE creative and flexible in its thinking than humans, yet also remains steadfast in its top-level goals...

As an example, imagine a human whose top-level goal in life was to do what the alien god on the mountain wanted. He could be amazingly creative in doing what the god wanted -- especially if the god gave him

broad subgoals like "do new science", "invent new things", "help cure suffering" , "make artworks", etc.
real-time feedback about how well his actions were fulfilling the goals, according to the god's interpretation
advice on which hypothetical actions seemed most likely to fulfill the goals, according to the god's interpretation

But his creativity would be in service of the top-level goal of serving the god...

This is like the GOLEM architecture, where

the god is the GoalEvaluator
the human is the rest of the GOLEM architecture

I fail to see why this restricts the system from having incredible, potentially far superhuman creativity in working on the goals assigned by the god...

Part of my idea is that the GoalEvaluator can be a narrow AI, thus avoiding an infinite regress where we need an AGI to evaluate the goal-achievement of another AGI...

Can the Goal Evaluator Really Be a Narrow AI?

A dialogue with Abram Demski on the Singularity email list led to some changes to the original GOLEM paper.

The original version of GOLEM states that the GoalEvaluator would be a Narrow AI, and failed to make the GoalEvaluator rely on the Searcher to do its business...

Abram's original question, about this original version, was "Can the Goal Evaluator Really Be a Narrow AI?"

My answer was:

The terms narrow-AI and AGI are not terribly precise...

The GoalEvaluator needs to basically be a giant simulation engine, that tells you: if program P is run, then the probability of state W ensuing is p. Doing this effectively could involve some advanced technologies like probabilistic inference, along with simulation technology. But it doesn't require an autonomous, human-like motivational system. It doesn't require a system that chooses its own actions based on its goals, etc.

The question arises, how does the GoalEvaluator's algorithmics get improved, though? This is where the potential regress occurs. One can have AGI_2 improving the algorithms inside AGI_1's GoalEvaluator. The regress can continue, till eventually one reaches AGI_n whose GoalEvaluator is relatively simple and AGi-free...

...

After some more discussion, Abram made some more suggestions, which led me to generalize and rephrase his suggestions as follows:

If I understand correctly, what you want to do is use the Searcher to learn programs that predict the behavior of the GoalEvaluator, right? So, there is a "base goal evaluator" that uses sensory data and internal simulations, but then you learn programs that do approximately the same thing as this but much faster (and maybe using less memory)? And since this program learning has the specific goal of learning efficient approximations to what the GoalEvaluator does, it's not susceptible to wire-heading (unless the whole architecture gets broken)...

After the dialogue, I incorporated this suggestion into the GOLEM architecture (and the document linked from this blog post).

Thanks Abram!!

Wednesday, March 17, 2010

"Chinese Parent Theorem"?: Toward a Meta-Architecture for Provably Steadfast AGI

Continuing my series of (hopefully edu-taining ;) blog posts presenting speculations on goal systems for superhuman AGI systems, this one deals with the question of how to create an AGI system that will maintain its initial goal system even as it revises and improves itself -- and becomes so much smarter that in many ways it becomes incomprehensible to its creators or its initial conditions.

This is closely related to the problem Eliezer Yudkowsky has described as "provably Friendly AI." However, I would rather not cast the problem that way, because (as Eliezer of course realizes) there is an aspect of the problem that isn't really about "Friendliness" or any other particular goal system content, but is "merely" about the general process of goal-content preservation under progressive self-modification.

Informally, I define an intelligent system as steadfast if it continues to pursue the same goals over a long period of time. In this terminology, one way to confront the problem of creating predictably beneficial AGI, is to solve the two problems of:

Figuring out how to encapsulate the goal of beneficialness in an AGI's goal system
Figuring out how to create (perhaps provably) steadfast AGI, in a way that applies to the "beneficialness" goal among others

My previous post on Coherent Aggregated Volition (CAV) dealt with the first of these problems. This post deals with the second. My previous post on predictably beneficial AGI deals with both.

The meat of this post is a description of an AGI meta-architecture, that I label the Chinese Parent Meta-Architecture -- and that I conjecture could be proved to be steadfast, under some reasonable (though not necessarily realistic, since the universe is a mysterious place!) assumptions about the AGI system's environment.

I don't actually prove any steadfastness result here -- I just sketch a vague conjecture, which if formalized and proved would deserve the noble name "Chinese Parent Theorem."

I got partway through a proof yesterday and it seemed to be going OK, but I've been distracted by more practical matters, and so for now I decided to just post the basic idea here instead...

Proving Friendly AI

Eliezer Yudkowsky has described his goal concerning “proving Friendly AI” informally as follows:

The putative proof in Friendly AI isn't proof of a physically good outcome when you interact with the physical universe.

You're only going to try to write proofs about things that happen inside the highly deterministic environment of a CPU, which means you're only going to write proofs about the AI's cognitive processes.

In particular you'd try to prove something like "this AI will try to maximize this goal function given its beliefs, and it will provably preserve this entire property (including this clause) as it self-modifies".

It seems to me that proving something like this shouldn’t be sooooo hard to achieve if one assumes some basic fixed “meta-architectural” structure on the part of the AI, rather than permitting total unrestricted self-modification. Such a meta-architecture can be assumed without placing any limits on the AI’s algorithmic information content, for example.

Of course, preservation of the meta-architecture can be assumed as part of the AI system's goal function. So by assuming a meta-architecture, one may be able to prove a result restricted to a certain broad class of goal functions ... and the question becomes whether that class is broad enough to be interesting.

So my feeling is that, if one wants to pursue such a research direction, it makes sense to begin by proving theorems restricted to goals embodying some assumptions about fixed program structure -- and then try to improve the theorems by relaxing the assumptions.

A Simple AGI Meta-Architecture with the Appearance of Steadfastness

After writing the first draft of this post, I discussed the "provably steadfast AGI" problem with a clever Chinese friend, and she commented that what the self-modifying AGI needs (in order to maintain its original goal content as it self-modifies) is a traditional Chinese parent, who will watch the system from the outside as it self-modifies, and continually nag it and pester it and remind it of its original goals.

At first I thought this was just funny, but then it occurred to me that it was actually the same idea as my meta-architecture! My GoalEvaluator component (in the meta-architecture below) is basically a ChineseParent component, living separately from the rest of the system and providing ongoing, fixed goal evaluation.

The thought-experiment I've been exploring is: an AGI system S with the following high-level meta-architecture:

Goal Evaluator = external system (not part of S) that tells S, for each possible future world (including environment states and internal program states), how well this world fulfills the goal. This is the "Chinese parent."
HistoricalRepository = database storing the past history of S’s internal states and actions, as well as information about the environment during S’s past
Operating Program = the program that S is governing its actions by, at a given point in time
- chosen by the Metaprogram as the best program the Searcher has found, where "best" is judged as "highest probability of goal achievement" based on the output of the Predictor and the Goal Evaluator
Predictor = program that estimates, given a candidate operating program P and a possible future world W, the odds of P leading to W
Searcher = program that searches through program space to find a new program optimizing a provided objective function
Tester = hard-wired program that estimates the quality of a candidate Predictor, using a simple backtesting methodology
- That is, the Tester assesses how well a Predictor would have performed in the past, using the data in the HistoricalRepository
Metaprogram = fixed program that uses Searcher program to find a good
- Searcher program (judged by the quality of the programs it finds, as judged by the Predictor program)
- Predictor program (as judged by the Tester’s assessments of its predictions)
- Operating Program (judged by Predictor working with Goal Evaluator, according to the idea of choosing an Operating Program with the maximum expected goal achievement)
- The metaprogram also determines the amount of resources to allocate to searching for a Searcher versus a Predictor versus an OP, according to a fixed algorithm for parameter adaptation.

Next, assume that the goal embodied by the GoalEvaluator includes, as a subgoal, the preservation of the overall meta-architecture described above.

Then, it seems to me one can very likely prove that the program will maintain its architecture while seeking to maximize F, under the assumptions that

the Goal Evaluator (i.e. the "Chinese parent") does its job correctly
no one messes with the underlying hardware.

If proved, this would be the so-called "Chinese Parent Theorem" ;-) .... I don't have time to work on such a proof right now, but would be happy to collaborate with someone on this!

As noted above, this approach doesn't allow full self-modification; it assumes certain key parts of the AGI (meta)architecture are hard-wired. But the hard-wired parts are quite basic and leave a lot of flexibility. So a "Chinese Parent Theorem" of this nature would cover a fairly broad and interesting class of goal functions, it seems to me.

What happens if one implements the Goal Evaluator according to the same architecture, though? In this case, one must postulate a meta-Goal-Evaluator, whose goal is to specify the goals for the first Goal Evaluator: the Chinese Grandparent! Eventually the series must end, and one must postulate an original ancestor Goal Evaluator that operates according to some other architecture. Maybe it's a human, maybe it's CAV, maybe it's some hard-wired code. Hopefully it's not a bureaucratic government committee ;-)

Niggling Practical Matters and Future Directions

Of course, this general schema could be implemented using OpenCog or any other practical AGI architecture as a foundation -- in this case, OpenCog is "merely" the initial condition for the Predictor and Searcher. In this sense, the approach is not extraordinarily impractical.

However, one major issue arising with the whole meta-architecture proposed is that, given the nature of the real world, it's hard to estimate how well the Goal Evaluator will do its job! If one is willing to assume the above meta-architecture, and if a proof along the lines suggested above can be found, then the “predictably beneficial” part of the problem of "predictably beneficial AGI" is largely pushed into the problem of the Goal Evaluator.

Returning to the "Chinese parent" metaphor, what I suggest may be possible to prove is that given an effective parent, one can make a steadfast child -- if the child is programmed to obey the parent's advice about its goals, which include advice about its meta-architecture. The hard problem is then ensuring that the parent's advice about goals is any good, as the world changes! And there's always the possibility that the parents ideas about goals shift over time based on their interaction with the child (bringing us into the domain of modern or postmodern Chinese parents ;-D)

Thus, I suggest, the really hard problem of making predictably beneficial AGI probably isn't "preservation of formally-defined goal content under self-modification." This may be hard if one enables total self-modification, but I suggest it's probably not that hard if one places some fairly limited restrictions on self-modification. The hypothetical Chinese Parent Theorem vaguely outlined here can probably be proved and then strengthened pretty far, reducing meta-architectural assumptions considerably.

The really hard problem, I suspect, is how to create a GoalEvaluator that correctly updates goal content as new information about the world is obtained, and as the world changes -- in a way that preserves the spirit of the original goals even if the details of the original goals need to change. Because the "spirit" of goal content is a very subjective thing.

One approach to this problem, hinted above, would be to create a GoalEvaluator operating according to CAV . In that case, one would be counting on (a computer-aggregated version of) collective human intuition to figure out how to adapt human goals as the world, and human information about it, evolves. This is of course what happens now -- but the dynamic will be much more complex and more interesting with superhuman AGIs in the loop. Since interacting with the superhuman AGI will change human desires and intuitions in all sorts of ways, it's to be expected that such a system would NOT eternally remain consistent with original "legacy human" goals, but would evolve in some new and unpredicted direction....

A deep and difficult direction for theory, then, would be to try to understand the expected trajectories of development of systems including

a powerful AGI, with a Chinese Parent meta-architecture as outlined here (or something similar), whose GoalEvaluator is architected via CAV based on the evolving state of some population of intelligent agents
the population of intelligent agents, as ongoingly educated and inspired by both the world and the AGI

as they evolve over time and interact with a changing environment that they explore ever more thoroughly.

Sounds nontrivial!

Sunday, March 14, 2010

Creating Predictably Beneficial AGI

The theme of this post is a simple and important one: how to create AGI systems whose beneficialness to humans and other sentient beings can be somewhat reliably predicted.

My SIAI colleague Eliezer Yudkowsky has frequently spoken about the desirability of a "(mathematically) provably Friendly AI", where by "Friendly" he means something like "beneficial and not destructive to humans" (see here for a better summary). My topic here is related, but different; and I'll discuss the relationship between the two ideas below.

This post is a sort of continuation of my immediately previous blog post, further pursuing the topic of goal-system content for advanced, beneficial AGIs. That post discussed one of Yudkowsky's ideas related to "Friendliness" -- Coherent Extrapolated Volition (CEV) -- along with a more modest and (I suggest) more feasible notion of Coherent Aggregated Volition (CAV). The ideas presented here are intended to work along with CAV, rather than serving as an alternative.

There are also some relations between the ideas presented here and Schmidhuber's Godel Machine -- a theoretical, unlikely-ever-to-be-practically-realizable AGI system that uses theorem-proving to ensure its actions will provably help it achieve its goals.

Variations of "Provably Friendly AI"

What is "Provably Friendly AI"? (a quite different notion from "predictably beneficial AGI")

In an earlier version of this blog post I gave an insufficiently clear capsule summary of Eliezer's "Friendly AI" idea, as Eliezer pointed out in a comment to that version; so this section includes his comment and tries to do a less wrong job. The reader who only wants to find out about predictably beneficial AGI may skip to the next section!

In Eliezer's comment, he noted that his idea for a FAI proof is NOT to prove something about what certain AI systems would do to the universe, but rather about what would happen inside the AI system itself:

The putative proof in Friendly AI isn't proof of a physically good outcome when you interact with the physical universe.

You're only going to try to write proofs about things that happen inside the highly deterministic environment of a CPU, which means you're only going to write proofs about the AI's cognitive processes.

In particular you'd try to prove something like "this AI will try to maximize this goal function given its beliefs, and it will provably preserve this entire property (including this clause) as it self-modifies".

So, in the context of this particular mathematical research programme ("provable Friendliness"), what Eliezer is after is what we might call an internally Friendly AI, which is a separate notion from a physically Friendly AI. This seems an important distinction.

To me, "provably internally FAI" is interesting mainly as a stepping-stone to "provably physically FAI" -- and the latter is a problem that seems even harder than the former, in a variety of obvious and subtle ways (only a few of which will be mentioned here).

All in all, I think that "provably Friendly AI" -- in the above senses or others -- is an interesting and worthwhile goal to think about and work towards; but also that it's important to be cognizant of the limitations on the utility of such proofs.... Much as I love math (I even got a math PhD, way back when), I have to admit the world of mathematics has its limits.

First of all Godel showed that mathematics is only formally meaningful relative to some particular axiom system, and that no axiom system can encompass all mathematics in a consistent way. This is worth reflecting on in the context of proofs about internally Friendly AI, especially when one considers the possibility of AGI systems with algorithmic information exceeding any humanly comprehensible axiom system. Obvious we cannot understand proofs about many interesting properties or behaviors of the latter type of AGI system.

But more critically, the connection between internal Friendliness and physical Friendliness remains quite unclear. The connection between complex mathematics and physical reality is based on science, and all of our science is based on extrapolation from a finite bit-set of observations (which I've previously called the Master Data Set -- which is not currently all gathered into one place, though, given the advance of Internet technology, it soon may be).

For example, just to pose an extreme case, there could be aliens out there who identify and annihilate any planet that gives rise to a being with an IQ over 1000. In this case a provably internally FAI might not be physically Friendly at all; and through no fault of its own. It probably makes sense to carry out proofs and informal arguments about physically FAI based on assumptions ruling out weird cases like this -- but then the assumptions do need to be explicitly stated and clarified.

So, my worry about FAI in the sense of Eliezer's above comment, isn't so much about the difficulty of the "internally FAI" proof, but rather about the difficulty of formalizing the relation between internally FAI and physically FAI in a way that is going to make sense post-Singularity.

It seems to me that, given the limitations of our understanding of the physical universe: at very best, a certain AI design could potentially be proven physically Friendly in the same sense that, in the 1800s, quantum teleportation, nuclear weapons, backwards time travel, rapid forwards time travel, perpetual motion machines and fRMI machines could have been proved impossible. I.e., those things could have been proved impossible based on the "laws" of physics as assumed at that time. (Yes, I know we still think perpetual motion machines are impossible, according to current science. I think that's probably right, but who really knows for sure? And the jury is currently out on backwards time travel.)

One interesting scenario to think about would be a FAI in a big computer with a bunch of human uploads. Then one can think about "simulated-physically FAI" as a subcase of "internally FAI." In this simulation scenario, one can also think about FAI and CEV together in a purely deterministic context. But of course, this sort of "thought experiment" leads to complexities related to the possibility of somebody in the physical universe but outside the CPU attacking the FAI and threatening it and its population of uploads...

OK, enough about FAI for now. Now, on to discuss a related quest, which is different from the quest for FAI in several ways; but more similar to the quest for physically FAI than that for internally FAI....

Predictably Beneficial AGI

The goal of my thinking about "predictably beneficial AGI" is to figure out how to create extremely powerful AGI systems that appear likely to be beneficial to humans, under reasonable assumptions about the physical world and the situations the AI will encounter.

Here "predictable" doesn't mean absolutely predictable, just: statistically predictable, given the available knowledge about the AGI system and the world at a particular point in time.

An obvious question is what sort of mathematics will be useful in the pursuit of predictably beneficial AGI. One possibility is theoretical computer science and formal logic, and I wouldn't want to discount what those disciplines could contribute. Another possibility, though, which seems particularly appealing to me, is nonlinear dynamical systems theory. Of course the two areas are not exclusive, and there are many known connections between these kinds of mathematics.

On the crudest level, one way to model the problem is as follows. One has a system S, so that

S(t+1) = q( S(t), E(t) )

E(t+1) = r(S(t), E(t) )

where E is the environment (which is best modeled as stochastic and not fully known). One has an objective function

G( E(t),...,E(t+s) )

that one would like to see maximized -- this is the "goal." Coherent Aggregated Volition, as described in my previous blog post, is one candidate for such a goal.

One may also assume a set of constraints C that the system must obey, which we may write as

C(E(t),...,E(t+s))

The functions G and C are assumed to encapsulate the intuitive notion of "beneficialness."

Of course, the constraints may be baked into the objective function, but there are many ways of doing this; and it's often interesting in optimization problems to separate the objective function from the constraints, so one can experiment with different ways of combining them.

This is a problem class that is incredibly (indeed, uncomputably) hard to solve in the general case ... so the question comes down to: given the particular G and C of interest, is there a subclass of systems S for which the problem is feasibly and approximatively solvable?

This leads to an idea I will call the Simple Optimization Machine (SOMA)... a system S which seeks to maximize the two objectives

maximize G, while obeying C
maximize the simplicity of the problem of estimating the degree to which S will "maximize G, while obeying C", given the Master Data Set

Basically, the problem of ensuring the system lies in the "nice region of problem space" is thrown to the system itself, to figure out as part of its learning process!

Of course one could wrap this simplicity criterion into G, but it seems conceptually simplest to leave it separate, at least for purposes of current discussion.

The function via which these two objectives are weighted is a parameter that must be tuned. The measurement of simplicity can also be configured in various ways!

A hard constraint could also be put on the minimum simplicity to be accepted (e.g. "within the comprehensibility threshold of well-educated, unaugmented humans").

Conceptually, one could view this as a relative of Schmidhuber's Godel Machine. The Godel Machine (put very roughly) seeks to achieve a goal in a provably correct way, and before each step it takes, it seeks to prove that this step will improve its goal-achievement. SOMA, on the other hand, seeks to achieve a goal in a manner that seems to be simply demonstrable to be likely to work, and seeks to continually modify itself and its world with this in mind.

A technical note: one could argue that because the functions q and r are assumed fixed, the above framework doesn't encompass "truly self-modifying systems." I have previously played around with using hyperset equations like

S(t+1) = S(t)[S(t)]

and there is no real problem with doing this, but I'm not sure it adds anything to the discussion at this point. One may consider q and r to be given by the laws of physics; and I suppose that it's best to initially restrict our analytical explorations of beneficial AGI to the case of AGI systems that don't revise the laws of physics. If we can't understand the case of physics-obeying agents, understanding the more general case is probably hopeless!

Discussion

I stress that SOMA is really an idea about goal system content, and not an AGI design in itself. SOMA could be implemented in the context of a variety of different AGI designs, including for instance the open-source OpenCog approach.

It is not hard to envision ways of prototyping SOMA given current technology, using existing machine learning and reasoning algorithms, in OpenCog or otherwise. Of course, such prototype experiments would give limited direct information about the behavior of SOMA for superhuman AGI systems -- but they might give significant indirect information, via helping lead us to general mathematical conclusions about SOMA dynamics.

Altogether, my feeling is that "CAV + Predictably Beneficial AGI" is on the frontier of current mathematics and science. They pose some very difficult problems that do, however seem potentially addressable in the near future via a combination of mathematics and computational experimentation. On the other hand, I have a less clear idea of how to pragmatically do research work on CEV or the creation of practically feasible yet provably physically Friendly AGI.

My hope in proposing these ideas is that they (or other similar ideas conceived by others) may serve as a sort of bridge between real-world AGI work and abstract ethical considerations about the hypothetical goal content of superhuman AGI systems.

Friday, March 12, 2010

Coherent Aggregated Volition: A Method for Deriving Goal System Content for Advanced, Beneficial AGIs

One approach to creating a superhuman AGI with a reasonably high likelihood of being beneficial to humans is to separate "goal system structure and dynamics" from "goal system content." One then has three problems:

Create an AGI architecture that makes it very likely the AGI will pursue its goal-system content in a rational way based on the information available to it
Create a goal system whose structure and dynamics render it likely for the AGI to maintain the spirit of its initial goal system content, even as it encounters radically different environmental phenomena or as it revises its own ideas or sourcecode
Create goal system content that, if maintained as goal system content and pursued rationally, will lead the AGI system to be beneficial to humans

One potential solution proposed for the third problem, the goal system content problem, is Eliezer Yudkowsky's "Coherent Extrapolated Volition" (CEV) proposal. Roko Mijic has recently proposed some new ideas related to CEV, which place the CEV idea within a broader and (IMO) clearer framework. This blog post presents some ideas in the same direction, describing a variant of CEV called Coherent Aggregated Volition (CAV), which is intended to capture much of the same spirit as CEV, but with the advantage of being more clearly sensible and more feasibly implementable (though still very difficult to implement in full). In fact CAV is simple enough that it could be prototyped now, using existing AI tools.

(One side note before getting started: Some readers may be aware that Yudkowsky has often expressed the desire to create provably beneficial ("Friendly" in his terminology) AGI systems, and CAV does not accomplish this. It also is not clear that CEV, even if it were fully formalizable and implementable, would accomplish this. Also, it may be possible to prove interesting theorems about the benefits and limitations of CAV, even if not to prove some kind of absolute guarantee of CAV beneficialness; but the exploration of such theorems is beyond the scope of this blog post.)

Coherent Extrapolated Volition

In brief, Yudkowsky's CEV idea is described as follows:

In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.

This is a rather tricky notion, as exemplified by the following example, drawn from the CEV paper:

Suppose Fred decides to murder Steve, but when questioned, Fred says this is because Steve hurts other people, and needs to be stopped. Let's do something humans can't do, and peek inside Fred's mind-state. We find that Fred holds the verbal moral belief that hatred is never an appropriate reason to kill, and Fred hopes to someday grow into a celestial being of pure energy who won't hate anyone. We extrapolate other aspects of Fred's psychological growth, and find that this desire is expected to deepen and grow stronger over years, even after Fred realizes that the Islets worldview of "celestial beings of pure energy" is a myth. We also look at the history of Fred's mind-state and discover that Fred wants to kill Steve because Fred hates Steve's guts, and the rest is rationalization; extrapolating the result of diminishing Fred's hatred, we find that Fred would repudiate his desire to kill Steve, and be horrified at his earlier self.

I would construe Fred's volition not to include Fred's decision to kill Steve...

Personally, I would be extremely wary of any being that extrapolated my volition in this sort of manner, and then tried to impose my supposed "extrapolated volition" on me, telling me "But it's what you really want, you just don't know it." I suppose the majority of humans would feel the same way. This point becomes clearer if one replaces the above example with one involving marriage rather than murder:

Suppose Fred decides to marry Susie, but when questioned, Fred says this is because Susie is so smart and sexy. Let's do something humans can't do, and peek inside Fred's mind-state. We find that Fred holds the verbal moral belief that sex appeal is never an appropriate reason to marry, and Fred hopes to someday grow into a celestial being of pure energy who won't lust at all. We extrapolate other aspects of Fred's psychological growth, and find that this desire is expected to deepen and grow stronger over years, even after Fred realizes that the Islets worldview of "celestial beings of pure energy" is a myth. We also look at the history of Fred's mind-state and discover that Fred wants to marry Susie because Susie reminds him of his mother, and the rest is rationalization; extrapolating the result of diminishing Fred's unconscious sexual attraction to his mother, we find that Fred would repudiate his desire to marry Susie, and be disgusted with his earlier self.

I would construe Fred's volition not to include Fred's decision to marry Susie...

Clearly, the Yudkowskian notion of "volition" really has little to do with "volition" as commonly construed!!

While I can see the appeal of extrapolating Fred into "the Fred that Fred would like to be," I also think there is a lot of uncertainty in this process. If Fred has inconsistent aspects, there may be many possible future-Freds that Fred could evolve into, depending on both environmental feedback and internal (sometimes chaotic) dynamics. If one wishes to define the coherent extrapolated Future-Fred as the average of all these, then one must choose what kind of average to use, and one may get different answers depending on the choice. This kind of extrapolation is far from a simple matter -- and since "self" is not a simple matter either, it's not clear that current-Fred would consider all or any of these Future-Freds as being the same person as him.

In CAV as described here, I consider "volition" in the more typical sense -- rather than in the sense of Yudkowskian "extrapolated volition" -- as (roughly) "what a person or other intelligent agent chooses." So according to my conventional definition of volition, Fred's volition is to kill Steve and marry Susie.

Mijic's List of Desirable Properties

Roko Mijic has posited a number of general "desirable properties" for a superintelligence, and presented CEV as one among many possible concrete possibilities instantiating these principles:

Meta-algorithm: Most goals the AI has will be harvested at run-time from human minds, rather than explicitly programmed in before run-time.
Factually correct beliefs: Using the AI's superhuman ability to ascertain the correct answer to any factual question in order to modify preferences or desires that are based upon false factual beliefs.
Singleton: Only one superintelligence is to be constructed, and it is to take control of the entire future light cone with whatever goal function is decided upon.
Reflection: Individual or group preferences are reflected upon and revised, in the style of Rawls' reflective equilibrium.
Preference aggregation: The set of preferences of a whole group are to be combined somehow.

My own taste is that reflection, preference aggregation and meta-algorithm-ness are good requirements. The "singleton" requirement seems to me something that we can't know yet to be optimal, and don't need to decide at this point.

The "factually correct beliefs" requirement also seems problematic, if enforced too harshly, in the sense that it's hard to tell how a person, who has adapted their beliefs and goals to certain factually incorrect beliefs, would react if presented with corresponding correct beliefs. Hypothesizing that a future AI will be able to correctly make this kind of extrapolation is not entirely implausible, but certainly seems speculative. After all, each individual's reaction to new beliefs is bound to depend on the reactions of others around them, and human minds and societies are complex systems, whose evolution may prove difficult for even a superintelligence to predict, given chaotic dynamics and related phenomena. My conclusion is that there should be a bias toward factual correctness, but that it shouldn't be taken to override individual preferences and attitudes in all cases. (It's not clear to me whether this contradicts Mijic's perspective or not.)

Coherent Aggregated Volition

What I call CAV is an attempt to capture much of the essential spirit of CEV (according to my own perspective on CEV), in a way that is more feasible to implement than the original CEV, and that is prototype-able now in simplified form.

Use the term "gobs" to denote "goal and belief set" (and use "gobses" to denote the plural of "gobs"). It is necessary to consider goals and beliefs together, rather than just looking at goals, because real-world goals are typically defined in terms whose interpretation depends on certain beliefs. Each human being or AGI may be interpreted to hold various gobses to various fuzzy degrees. There is no requirement that a gobs be internally logically consistent.

A "gobs metric" is then a distance on the space of gobses. Each person or AI may also agree with various gobs metrics to various degrees, but it seems likely that individuals' gobs metrics will differ less than their gobses.

Suppose one is given a population of intelligent agents -- like the human population -- with different gobses. Then one can try to find a gobs that maximizes the four criteria of

logical consistency
compactness
average similarity to the various gobses in the population
amount of evidence in support of the various beliefs in the gobs

The use of a multi-extremal optimization algorithm to seek a gobs defined as above is what I call CAV. The "CAV" label seems appropriate since this is indeed a system attempting to achieve both coherence (measured via compactness + consistency) and an approximation to the "aggregate volition" of all the agents in the population.

Of course there are many "free parameters" here, such as

how to carry out the averaging (for instance one could use a p'th-power average with various p values)
what underlying computational model to use to measure compactness (different gobs may come along with different metrics of simplicity on the space of computational models)
what logical formalism to use to gauge consistency
how to define the multi-extremal optimization: does one seek a Pareto optimum?; does one weight the different criteria and if so according to what weighting function?
how to measure evidence
what optimization algorithm to use

However, the basic notion should be clear, even so.

If one wants to take the idea a step further, one can seek to use a gobs metric that maximizes the criteria of

compactness of computational representation
average similarity to the gobs metrics of the minds in the population

where one must then assume some default similarity measure (i.e.m etric) among gobs metrics. (Carrying it further than this certainly seems to be overkill.)

One can also use a measure of evidence defined in a similar manner, via combination of a compactness criterion and an average similarity criterion. These refinements don't fundamentally change the nature of CAV.

Relation between CEV and CAV

It is possible that CEV, as roughly described by Yudkowsky, could lead to a gobs that would serve as a solution to the CAV maximization problem. However, there seems no guarantee of this. It is possible that the above maximization problem may have a reasonably good solution, and yet Yudkowskian CEV may still diverge or lead to a solution very far from any of the gobses in the population.

As a related data point, I have found in some experiments with the PLN probabilistic reasoning system that if one begins with a set of inconsistent beliefs, and attempts to repair it iteratively (by replacing one belief with a different one that is more consistent with the others, and then repeating this process for multiple beliefs), one sometimes arrives at something VERY different from the initial belief-set. And this can occur even if there is a consistent belief set that is fairly close to the original belief-set by commonsensical similarity measures. While this is not exactly the same thing as CEV, the moral is clear: iterative refinement is not always a good optimization method for turning inconsistent belief-sets into nearby consistent ones.

Another, more qualitative observation, is that I have the uneasy feeling CEV seeks to encapsulate the essence of humanity in a way that bypasses the essential nature of being human...

CEV wants to bypass the process of individual and collective human mental growth, and provide a world that is based on the projected future of this growth. But, part of the essence of humanity is the process of growing past one's illusions and shortcomings and inconsistencies.... Part of Fred's process-of-being-Fred is his realizing on his own that he doesn't really love Susie in the right way ... and, having the super-AI decide this for him and then sculpt his world accordingly, subtracts a lot of Fred's essential humanity.

Maybe the end-state of resolving all the irrationalities and inconsistencies in a human mind (including the unconscious mind) is something that's not even "human" in any qualitative, subjective sense...

On the other hand, CAV tries to summarize humanity, and then would evolve along with humanity, thus respecting the process aspect of humanity, not trying to replace the process of humanity with its expected end-goal... And of course, because of this CAV is likely to inherit more of the "bad" aspects of humanity than CEV -- qualitatively, it just feels "more human."

Relation of CAV to Mijic's Criteria

CAV appears to adhere to the spirit of Mijic's Meta-algorithm, Factual correctness and Preference aggregation criteria. It addresses factual correctness in a relatively subtle way, differentiating between "facts" supported by different amounts of evidence according to a chosen theory of evidence.

CAV is independent of Mijic's "singleton" criterion -- it could be used to create a singleton AI, or an AI intended to live in a population of roughly equally powerful AIs. It could also be used to create an ensemble of AIs, by varying the various internal parameters of CAV.

CAV does not explicitly encompass Mijic's "reflection" criterion. It could be modified to do so, in a fairly weak way, such as replacing the criterion

average similarity to the various gobses in the population

with

average similarity to the various gobses displayed by individuals in the population when in a reflective frame of mind

This might be wise, as it would avoid including gobses from people in the throes of rage or mania. However, it falls far short of the kind of deep reflection implied in the original CEV proposal.

One could also try to teach the individuals in the population to be more reflective on their goals and beliefs before applying CAV. This would surely be a good idea, but doesn't modify the definition of CAV, of course.

Prototyping CAV

It seems that it would be possible to prototype CAV in a fairly simple way, by considering a restricted class of AI agents, for instance OpenCog-controlled agents, or even simple agents whose goals and beliefs are expressed explicitly in propositional-logic form. The results of such an experiment would not necessarily reflect the results of CAV on humans or highly intelligent AGI agents, but nevertheless such prototyping would doubtless teach us something about the CAV process.

Discussion

I have formulated a method for arriving at AGI goal system content, intended to serve as part of an AGI system oriented beneficially toward humans and other sentient beings. This method is called Coherent Aggregated Volition, and is in the general spirit of Yudkowsky's CEV proposal as understood by the author, but differs dramatically from CEV in detail. It may be understood as a simpler, more feasible approach than CEV to fulfiling Mijic's criteria.

One thing that is apparent from the above detailed discussion of CAV is the number of free parameters involved. We consider this a feature not a bug, and we strongly suspect that CEV would also have this property if it were formulated with a similar degree of precision. Furthermore, the parameter-dependence of CEV may seem particularly disturbing if one considers it in the context of one's own personal extrapolated volitions. Depending on the setting of some weighting parameter, CEV may make a different decision as to whether Fred "really" wants to marry Susie or not!!

What this parameter-dependence means is that CAV is not an automagical recipe for producing a single human-friendly goal system content set, but rather a general approach that can be used by thoughtful humans or AGIs to produce a family of different human-friendly goal system content sets. Different humans or groups applying CAV might well argue about the different parameters, each advocating different results! But this doesn't eliminate the difference between CAV and other approaches to goal system content that don't even try to achieve broad-based beneficialness.

Compared to CEV, CAV is rather boring and consists "merely" of a coherent, consistent variation on the aggregate of a population's goals and beliefs, rather than an attempt to extrapolate what the members of the population in some sense "wish they wanted or believed." As the above discussion indicates, CAV in itself is complicated and computationally expensive enough. However, it is also prototype-able; and we suspect that in the not too distant future, CAV may actually be a realistic thing to implement on the human-population scale, whereas we doubt the same will be true of CEV. Once the human brain is well understood and non-invasively scannable, then some variant of CAV may well be possible to implement in powerful computers; and if the projections of Kurzweil and others are to be believed, this may well happen within the next few decades.

Returning to the three aspects of beneficial AGI outlined at the start of this essay: I believe that development of the currently proposed OpenCog design has a high chance of leading to an AGI architecture capable of pursuing its goal-system content in a rational way; and this means that (in my world-view) the main open question regarding beneficial AGI pertains to the stability of goal systems under environmental variation and systemic self-modification. I have some ideas for how to handle this using dynamical systems theory, but these must wait for a later post!

Saturday, February 06, 2010

Siri, the new iPhone "AI personal assistant": Some useful niche applications, not so much AI

Today I tried out Siri, the new AI "personal assistant" app for the iPhone. It has some very smart people behind it, and is based on some code and ideas from the DARPA-funded CALO project. Siri's earlier prototype version impressed me with its integration of dialogue and maps, so I was eager to check it out.

The Siri website says:

Just like a real assistant, Siri understands what you say, accomplishes tasks for you and adapts to your preferences over time.

It also describes Siri using metaphors of human learning, e.g. "like a child taking its first steps" ....

Ahem....

You may want to scroll to the end of this post, and read my dialogue with Siri, before reading the rest of what I have to say about the app.

This review has been edited in response to some comments (which you'll see below this post) by Dag, one of the Siri creators. If you're curious to see the original version of my review, it's here. There are no huge changes but I hope this revised version is an improvement.

This is the first release, and one doesn't want to judge the whole Siri project based on a first impression. But all I can report on now is my reaction to the product I just downloaded on to my phone and chatted with....

Two Perspectives on Siri

Before giving my detailed comments, I'd like to distinguish two different perspectives on Siri

Considered as a freebie iPhone app, is it funky? Is it worth downloading and playing with? Might it be useful for some purposes?
How well does it live up to the "AI Personal Assistant" label, and the description of being "like a human assistant", "like a child taking its first steps", etc.

Plenty of others can assess Siri as a freebie iPhone app as well or better than I can, so I'll make a few comments in that regard, but focus most of my attention here on the AI aspect, since that's my own area of expertise.

Overall, my take is that

Indeed, this version of Siri may be very useful for carrying out a very limited set of very specific functionalities
It's not anything like a real assistant; and worse than that, its attempts to really understand anything you say seem very limited and domain-specific at this point
The basic "chatbot" functionality seems unnecessarily crude and quirky

As an AI developer I'm well aware that sometimes you can make mediocre (or worse) products or demos based on deeply powerful technology. So I'm open to the possibility that there is some profound or at least interesting tech underlying Siri. But, to be quite blunt, I was unable to find it via playing with the product for an hour or so.

Siri from an AI Perspective

Looking at Siri from the perspective of someone who has built a bunch of AI systems, including chatbots and more serious natural language processing and reasoning systems, what I see here is:

a rather crude keyword based chatbot (i.e. crude even by the standards of keyword based chatbots), without much attempt at dialogue management
straightforward, rule-based integration with a very small set of knowledge bases (about restaurants and movies, for instance) and with a map engine
straightforward integration with TrueKnowledge for answering of factual questions
decent speech-to-text with a very nice interactive interface

What surprised me most was the crudity of the dialogue management, which you'll see in the transcript below, of my initial conversation with Siri. So often Siri's responses had nothing to do with the questions I asked.

And Siri's persistence of information between questions is rudimentary and awkward. Once you ask one question about New York, it pretty much assumes all your subsequent questions are about New York ... but it doesn't understand linguistic references to previous queries, not even simple ones.

But Is Siri Useful?

But what about the practical aspect? Is Siri useful as a virtual assistant? I suppose I might use it to find restaurants or movies, or to check flight status. And just the other day, in the midst of a conversation in the car with the kids, I wanted to know Hitler's birth year, and I asked Wikipedia on my iPhone -- it would have been nicer to ask Siri instead.

So, yeah, for a few specific functionalities, where Siri's language engine and database integration are well-tuned -- yeah, it may be genuinely useful.

But my impression is the useful functionality is really VERY narrow and brittle. If you go even slightly beyond what the application has been specifically tweaked for, the results seem to be useless and annoying.

As a single example, consider the following snippet from my first conversation with Siri, given in full at the end of this post:

Ben: What is Kate Braverman's latest book?
Siri: OK, here are some businesses named "Kate" a few miles from here

This is really an unnecessary gaffe', but it's not exceptional; Siri, in its current version, does that sort of thing quite frequently. It makes this mistake because the query is about books and authors, rather than about stuff it's tuned for: restaurants, movies, flights, TrueKnowledge facts. And even for some things it's tuned for, like flights, the results are often quite weird and confusing, as you'll see in the example dialogue below.

How about the speech-to-text? (Supplied by Nuance, and performed on a server not on the phone.) It's so-so.... Which may be a great achievement technically given the quality of the iPhone's mike -- but still, it's only so-so.

The iterative graphical interface for speech-to-text is GREAT -- being able to review Siri's interpretations of your speech and correct them on the phone before they're sent to the server is very nice. But it makes enough mistakes that, all in all, using its speech-to-text is many times slower for me than using the iPhone keyboard.

I can see some genuine niche applications for the current Siri version: restaurant and movie location, flight status checking, fact searching, and maybe a few other similar applications, while driving. Or while not driving, for users who aren't comfortable typing.

This is all very well, but it's a far far cry from being like a human assistant, right?

Does Siri Understand?

The website warns us that this is an early-stage product:

Siri is young and, like a child taking its first steps, may be awkward at times. Siri may occasionally misunderstand things you ask it to do even within its range of understanding.

but IMO, the comparison with a child is inappropriate. Most of the mistakes Siri makes are not mistakes of misunderstanding. They are mistakes of not even trying to understand -- mistakes of replying in the manner of a simplistic chatbot acting on keyword cues.

If I had an iPhone app that made mistakes of genuine misunderstanding, like a child, I'd devote time to teaching it regardless of whether it assisted me in any way. In the case of Siri, I don't get the feeling of any intelligence or learning going on.

Dag, in his comment on my first version of this review, noted that in some contexts Siri does try to understand, e.g. if you ask it "Book me a table for two at Zibibbo's" it understands that "book" refers to the making of reservations rather than the kind of book you read. Fair enough -- but after reading his comment I played around with Siri a little more and my impression is that its "understanding" of this sort is extremely specialized and focused on a handful of applications like making restaurant and movie reservations. Of course, one could argue that by scaling up this kind of specialized understanding a few hundred thousand times, one will achieve something really intelligent -- but

I tend to doubt it, because I think intelligence has more to do with the ability to learn to handle new domains, than the possession of hand-coded rules allowing "understanding" in particular domains
Even if one does believe humanlike intelligence is a patchwork of domain-specific rule-sets, then one must admit that the fraction of humanlike intelligence displayed by an application like Siri is rather miniscule. If one believes this kind of model of human intelligence, one should be building Cyc, not Siri (and the difficulties of that kind of AI approach are well known)

The current version is, for better or worse, a simplistic tool with a nice interface and a very, very limited scope. In a sense it does understand some things, but only in the very specialized domains in which its "understanding" was very specifically programmed.

Perhaps later versions will add enough functionality to constitute a more generally useful "assistant." But in my view, without some fundamentally different (and more intelligent) approach to dialogue management, the product is not likely grow into anything but an assemblage of a few dozen specialized information-gathering widgets glued together by a chatbot. I could be wrong -- it's happened before! -- but I'm just calling it as I see it....

I read Nova Spivack's very insightful discussion on Siri a number of months ago, and studied the Siri prototype fairly carefully, and based on that prior experience I actually expected more from the first release. I hoped for a little more sense of general-scope humanlike understanding, of there being an "assistant with a personality" there. Nope. Maybe the next version will have some fundamentally different technology inside it ... one can always hope.....

Apologies if this review is a bit harsh -- but as I clarified from the start, I'm reviewing Siri not just as an iPhone app, but relative to the rhetoric associated with it about being "like a child taking its first steps" and "just like a human assistant." If Siri were merely marketed as an iPhone app with a few interesting niche uses, I probably wouldn't bother to write a blog post about it.... But I've devoted much of my life to the quest to make AI systems that actually learn like children, and ultimately will display intelligence similar to and then transcending that of adult humans. The quest to make humanlike AI is a serious thing. Siri just doesn't feel to me like any kind of step along the path to serious AI systems, and I don't really like it when somebody's marketing department uses "real AI" as a marketing slogan for a product (even if a nice one in some ways) that actually has nothing to do with humanlike general intelligence.

A Look at Some Others Users Reactions

Encouraged by Dag's comment on the original version of this review, I looked at some tweets on Siri by "ordinary users" not biased by an AI background, and here are some examples, which I tried to choose in a genuinely fair-minded way:

turrean Playing with new iPhone app called "Siri Assistant." you can say, "Movies nearby" and that's what it finds. Feel like I'm on Star Trek.

Tito8181 @laur3453 finially you should download "Siri" for iPhone. It's like your own personal assistant! I love it! It's completely free

Shusmo @basemaggad Siri,launched today as a free iPhone app, is a virtual personal assistant that amazingly resembles..an actual personal assistant

aneesha Siri Brings Artificial Intelligence to the iPhone http://j.mp/9pO6Pd

tomweishaar #siri .Not perfect but interesting, I asked "when is kickoff for the super bowl football game?" Siri referred me 2 bowling ally's in my area

tomweishaar #siri same question using Google Mobile on iPhone, "what time does super bowl start?" got answer right away. Nuance engine 4 both services

What I see here is that some users are indeed impressed by the voice recognition capability (even though it's not groundbreaking by any means) and by the ability to respond usefully to some simple queries like "Movies nearby." That's great! On the other hand, some other users note that it makes simple mistakes like thinking "super bowl" is about bowling, some of which other tools like Google Mobile don't make.

Also, I note that the meme of "Siri as AI" is frequent in the tweetosphere and the media generally ... which makes me feel justified in reviewing it as an AI system and not just as a funky iPhone app. If you're going to market it as an AI system that's just like a human personal assistant and is learning like a human child, then you should expect it to be assessed that way, at least to a certain extent, right? The dazzle and mystique of AI cuts both ways: it helps lend some sex appeal to your product, but also leaves you open to being held to the standards of artificial intelligence ;-)

My First Chat with Siri

Finally, to finish off this post, here's the transcript of my first conversation with Siri, with some comments interspersed and prepended by "COMMENT." I talked to Siri some more after this conversation, but it didn't reveal any more intelligence or anything particularly surprising ... I think this transcript will be more than enough to give you the idea.

Siri: Hello, I'm Siri, I'm your new personal assistant

Ben: Where is New York?

Siri: [shows map of New York]

Ben: Do computers have dicks?

Siri translates to: Do computers have Dick's?

Ben OKs the translation (just for fun)

Siri: OK, here are some computer stores close to New York, NY: [gives list]

COMMENT: OK, the maps integration is nice. And it's not a bad answer as there are some computer stores named Dick's.

Ben: Do computers have penises?

Siri: OK, I found these computer stores close to New York, NY: [gives list]

COMMENT: Total lack of any attempt to understand the question

Ben: Do you have a brain?

[The speech-to-text took 4 tries to get this, but finally did]

Siri: OK, here are some local businesses close to New York, NY: [gives list]

COMMENT: What???

Ben: Where can I buy a new head?

Siri: Here are movies that are new to theaters playing near New York: [gives list]

COMMENT: OK, it understood "new" and used a default for the rest...

Ben: Will you call Laising for me?

Siri translates this to: Will you call great thing for me?

After many tries Siri cannot get the Chinese name "Laising" so Ben OKs the wrong translation just for fun

Siri: Here are the most popular movies playing near New York: [gives list]

COMMENT: Where the heck did that come from???

Ben: I want to buy a new phone

Siri: Here are the the most popular movies that are new to theaters playing near New York: [gives list]

Ben: I do not want to see a movie. I want to buy a new phone.

Siri: Here are the most popular movies that are new to theaters playing near New York: [gives list]

COMMENT: Again, a total lack of understanding ... no attempt to find the arguments of the verb "want" or understand the meaning of "not." Just statistical word-matching (on "new") used in a crude way.

Ben: Where can I find Mongolian food?

Siri: OK, I found these Mongolian restaurants a short drive from New York, NY

COMMENT: Useful. Again, the maps integration is nice, and it knows that restaurants are associated with food

Ben: Where can I find Mongolian women?

Siri: OK, here are some Mongolian restaurants a short drive from New York, NY

COMMENT: Oops, maybe it didn't know restaurants were associated with food. Of course, Mongolian restaurants ARE a good place to find Mongolian women, but it didn't know that. It just associated "Mongolian" with "restaurants" by default because it doesn't know anything else Mongolian in New York, and it's biased toward location queries.

Ben: What is the capital of Mongolia?

Siri: I think I learned that in grad school ... I found answer for you: [answer in box: Ulaanbaatur, the city in Ulaanbaatar, Mongolia (source:TrueKnowledge.com)]

COMMENT: Good, TrueKnowledge is integrated in the obvious way

Ben: How many people live there?

Siri: I think I learned that in grad school.... I'm not sure I can help you with that yet, but here's what a search engine has to say: [points to results of a Web search, which are useless, e.g. "Math Lesson: How many people...", "Is there bad people in Los Angeles...", etc.]

COMMENT: It can't deal with "there." Very weak understanding of dialogue.

Ben: How many people live in Ulaanbaator, Mongolia?

Siri: I found an answer for you: [in a box, 1067472]

COMMENT: But it actually knew the answer, via TrueKnowledge again...

Ben: How many of those people are men?

Siri: I'm not sure of the answer, but here's what a search engine has to say: [points to results of a Web search, which are useless...]

Ben: What is Kate Braverman's latest book?

Siri: OK, here are some businesses named "Kate" a few miles from here

COMMENT: Unnecessarily confused

Ben: I want to buy a book written by Kate Braverman

Siri: Sorry, I can't get you information on books yet, but I can help you look on the Web [points to a search engine, and to Amazon's website]

Ben: What can you tell me about Kate Braverman?

Siri: I can check on flights for you, but I need a little more information. I need to know at least one airport code, or an ariline and a flight number.

COMMENT: Unnecessarily confused again.

Ben: Are airplanes flying from Reagan National Airport now, or have they been delayed because of the snow?

Siri: Hmmm. I couldn't find any direct flights from RLA for February 8. For now, I can only find the status for direct flights. Soon I'll be able to book flights, and show connecting flights. [This is confusing since it is not Feb 8 now, but rather Feb 6]

COMMENT: That was exactly the kind of question I would like to be able to ask a virtual assistant and get an answer. But Siri badly botched it.

Ben: What is RLA?

[this took 3 tries for the speech to text]

Siri: I couldn't find any direct flights from RLA for February 8. For now, I can only find the status for direct flights. Soon I'll be able to book flights, and show connecting flights.

COMMENT: More unnecessary confusion, as my question was very simple and direct and didn't merit such a complex answer

Ben: I thought Reagan National Airport was DCA, not RLA

[this took 3 tries for the speech to text]

Siri: I'm not sure what you are asking, Ben. One of these, perhaps [points to empty list]

Ben: Who programmed you?

[after 6 tries the speech-to-text couldn't get it, and I gave up]