
Sunday, April 06, 2008

Artificial Wisdom (... episodic memory, general intelligence, the Tao of John Coltrane, and so forth)

Every now and then, someone suggests to me that, alongside the pursuit of Artificial Intelligence, we should also be pursuing "Artificial Wisdom."

I always figured the "artificial wisdom" idea was probably just a bunch of useless English-language wordplay -- but one night last week, while watching Idiocracy with the kids for the second time (great movie exploring a non-Singularity-based future by the way ... highly recommend it!), I spent a while surfing the Web on my laptop refreshing my memory on how others have construed the "wisdom" concept and musing on what it might mean for AI.

Surprisingly enough, this led in some moderately interesting directions -- nothing revolutionary, but enough to justify the couple hours spent musing about it (and another 90 minutes or so synthesizing and writing up my glorious conclusions).

My main conclusion was a perspective in which wisdom is viewed as one of three core aspects of intelligence, associated with three distinct types of memory:

  • cleverness, associated with declarative memory (and the ability to manipulate abstract, certain or uncertain declarative knowledge)
  • skillfulness, associated with procedural memory (and the ability to effectively learn and adapt new procedures based on experience)
  • wisdom, associated with episodic memory (and insightful drawing of large-scale conclusions therefrom)

This being a blog post, though, rather than just presenting my conclusion, I'll start out by recounting some of the winding and mostly irrelevant path that led me there ;-)

Classical Conceptions of Wisdom

I started out with the dictionary, and as usual found it close to useless....

A typical dictionary definition of "wisdom" -- not a heck of a lot of help -- comes from Wiktionary, which tells us that

wisdom (plural wisdoms)

means

  1. An element of personal character that enables one to distinguish the wise from the unwise.
  2. A piece of wise advice.
  3. The discretionary use of knowledge for the greatest good.
  4. The ability to apply relevant knowledge in an insightful way, especially to different situations from that in which the knowledge was gained.
  5. The ability to make a decision based on the combination of knowledge, experience, and intuitive understanding.
  6. (theology) The ability to know and apply spiritual truths.
and furthermore that

wise

means

Showing good judgement or the benefit of experience.

Hoo haw.

These definitions don't give us any particularly interesting way of distinguishing "wisdom" from "intelligence." Essentially they define wisdom as either intelligence, spiritual insight, or the application of intelligence for ethical ends. Nothing new here.

Wikipedia is slightly more useful (but only slightly). Firstly it notes that

A standard philosophical, (philos-sophia: literally "lover of wisdom"), definition says that wisdom consists of making the best use of available knowledge.

It then notes some psychological research demonstrating that in popular culture, wisdom is considered as different from intelligence. Psychological researchers are quoted as saying that though "there is an overlap of the implicit theory of wisdom with intelligence, perceptiveness, spirituality and shrewdness, it is evident that wisdom is a distinct term and not a composite of other terms."

More interestingly, Wikipedia notes, Erik Erikson and other psychologists have argued that it is, in large part, the imminence of death that gives older human beings wisdom.

The knowledge of imminent death is seen as focusing the mind on concerns beyond its own individual well-being and survival, thus inducing a broader scope of understanding and an identification with the world at large, which are associated with the concept of wisdom.

This is interesting from a transhumanist perspective in that it suggests that the death of death would be the death of wisdom! I have seen some evidence for that in the incredible, shallow-minded selfishness of a certain subset of the transhumanist community -- people who are dead-set on having their own selves live forever, without any real thought as to why this might be valuable or what this might mean in a larger perspective. But of course, I don't really think death is the only or ultimate source of wisdom, though in a human context I can believe it's one of the main forces nudging us toward wisdom.

Paul Graham on Wisdom

One of the more interesting theories of wisdom I've run across (I found it a while ago for some random reason I've forgotten, and dug it up again last week) came from a contemporary blogger, Paul Graham:

http://paulgraham.com/wisdom.html

who distinguishes wisdom from intelligence in the following way:


"Wise" and "smart" are both ways of saying someone knows what to do. The difference is that "wise" means one has a high average outcome across all situations, and "smart" means one does spectacularly well in a few.

This explanation also suggests why wisdom is such an elusive concept: there's no such thing. "Wise" means something—that one is on average good at making the right choice. But giving the name "wisdom" to the supposed quality that enables one to do that doesn't mean such a thing exists. To the extent "wisdom" means anything, it refers to a grab-bag of qualities as various as self-discipline, experience, and empathy.

Graham considers wisdom as partly a kind of de-biasing and cleansing of the mind, a notion that has some resonance with the modern notion of "Bayesian calibration" of the mind:

Recipes for wisdom, particularly ancient ones, tend to have a remedial character. To achieve wisdom one must cut away all the debris that fills one's head on emergence from childhood, leaving only the important stuff. Both self-control and experience have this effect: to eliminate the random biases that come from your own nature and from the circumstances of your upbringing respectively. That's not all wisdom is, but it's a large part of it. Much of what's in the sage's head is also in the head of every twelve year old. The difference is that in the head of the twelve year old it's mixed together with a lot of random junk.

Provocatively, Graham also posits that intelligence is quite different from wisdom, in that it has to do with accentuating rather than avoiding biases:

The path to intelligence seems to be through working on hard problems. You develop intelligence as you might develop muscles, through exercise. But there can't be too much compulsion here. No amount of discipline can replace genuine curiosity. So cultivating intelligence seems to be a matter of identifying some bias in one's character—some tendency to be interested in certain types of things—and nurturing it. Instead of obliterating your idiosyncrasies in an effort to make yourself a neutral vessel for the truth, you select one and try to grow it from a seedling into a tree.

To avoid confusion, from here on I'll sometimes refer to Graham's interpretation of these concepts as Graham-style wisdom and Graham-style intelligence, respectively.

There is some unclarity in Graham's essay as to the extent to which he thinks the kind of focusing and bias-accentuation that's part of Graham-style intelligence has to involve irrationality. My own view is that Graham-style intelligence definitely does NOT require an individual to be irrational, in the sense of making suboptimal judgments about a particular problem given the resources devoted to thinking about the problem. However, a finite system in a complex environment is always going to be irrational to some degree, due to not having enough resources to make a full analysis of any complex situation. To the extent that Graham-style intelligence involves heavy focus on some particular set of topic areas, it's going to drain resources from other areas, thus making the mind less intelligent regarding these other areas.

So, in Graham's view, intelligence has to do with focusing loads of resources on processing in a handful of narrow domains that match one's innate biases, whereas wisdom has to do with evenly distributing processing across all the different domains in one's environment.

Along these lines Graham also notes (correctly, I think) that:

The wise are all much alike in their wisdom, but very smart people tend to be smart in distinctive ways.

As Graham conceives it, wisdom is basically equivalent to general intelligence: it's intelligence averaged across a variety of situations. In mathematics there exist various sorts of averages, some of which weight extreme values more heavily than others (these are p'th power averages). Graham's view would be that "wisdom" and "intelligence" are both estimates of general intelligence (defined as intelligence averaged over different domains/tasks), but with different sorts of averaging: in the case of intelligence, an averaging that pays especial attention to extremes (say a p-power average with p=5, or whatever); and in the case of wisdom, a more typical arithmetic averaging.
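
To make that averaging idea concrete, here is one way to formalize it (my gloss, not anything Graham actually wrote down): score an agent's performance across n domains and take a power mean.

```latex
% Power-mean gloss on Graham's distinction (my formalization, not Graham's own).
% s_1, ..., s_n are an agent's performance scores across n domains.
M_p(s_1,\dots,s_n) \;=\; \left( \frac{1}{n} \sum_{i=1}^{n} s_i^{\,p} \right)^{1/p}
% "wise"  ~  M_1 : the plain arithmetic mean, where every domain counts equally
% "smart" ~  M_p for large p (say p = 5) : dominated by the few domains where s_i is extreme
```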

This is all sort of nice, but (as will become clear as the essay unfolds) I don't really think it gets at the crux of the matter.


Wisdom Goes Beyond the Individual

Another interesting perspective (that I also think doesn't get at the crux of the matter) is given in the paper "Meaning generation and artificial wisdom" with abstract

We propose an interpretation of wisdom in terms of meaning generation in social groups. Sapient agents are able to generate useful meanings for other agents beyond their own capability of generation of self-meanings. This makes sapient agents specially valuable entities in agent societies because they provide interagent reliable third-person meaning generation that provides some functional redundancy that contributes to enhance individual and social robustness and global performance.

Here wisdom is identified with the ability to generate meaning in the social group, going beyond meaning that is perceptible by the individual doing the meaning-generating. This harks back to Erikson's understanding of wisdom as related to identification with the world at large, beyond the mind/body.

This view also reminds me vaguely of Aldous Huxley's Perennial Philosophy, an attempt to distill the "wisdom teachings" of all the world's religions. In the Perennial Philosophy, wisdom teaches that the individual self is an illusion and all of us are one with the universe (and yet in a sense still distinct and individual.)

Mulling over all this, none of it really satisfied me. Of course, a folk concept like "wisdom" can't be expected to have a crisp and sensible formalistic definition ... but it still seemed to me that all the attempts at systematization and formalization I'd read about were missing some really essential aspects of the folk concept.

Wisdom, Cleverness and Skillfulness

And so, I came up with a totally different idea....

After a fair bit of musing, my mind kept drifting to the familiar distinction between declarative, procedural and episodic memory (drawn from textbook cognitive psych).

Remember:

  • Declarative knowledge = knowledge of facts, conjectures, hypotheses (abstract or concrete)
  • Procedural knowledge = knowledge of how to do things (could be physical, mental, social, etc.)
  • Episodic knowledge = knowledge of stories that have occurred in the history of intelligent beings (oneself, others one knows, others one has heard about,...)

One interesting thought that popped into my head is: The concept of wisdom, in its folk-psychology sense, has a lot to do with the ability to solve problems that are heavily dependent on context, using intuition that's based on large-scale analysis of one's episodic-memory store.

Or, less geekily: Wisdom consists of making intelligent use of experience.

A subtlety here is that this need not be one's own experience. Direct experience may be the best way to acquire wisdom (and surely this is part of the reason that wisdom is commonly associated with age) but some rare folks are remarkably gifted at absorbing wisdom from the experience of others -- absorbed via observation, via reading, or conversation, or whatever.

More broadly, this train of thought leads me to a sort of fundamental trinity of aspects of intelligence: cleverness, skillfulness and wisdom.

There's cleverness, which is the ability to appropriately manipulate, create and absorb declarative knowledge toward one's goals. This declarative knowledge may be abstract, or it may be concrete facts. Declarative knowledge is largely symbolic in nature, and cleverness is largely founded on adeptness at symbol-manipulation.

There's skillfulness, which is the ability to effectively do stuff in service of one's goals. This covers physical skills but also highly abstract mental skills like writing an essay, proving a theorem, or closing a business deal.

In some domains skillfulness can exist in the total absence of cleverness. The vast majority of shred metal guitarists would seem to fit in this category (to choose a somewhat random example based on what's playing in my headphones at the moment). These guys are so damn skilled, yet there's not much adept manipulation of meaning in their solos, or compositions. Compare the typical shred guitarist to Yngwie Malmsteen or Buckethead, who are also massively skilled (and in similar ways) -- but who are also highly clever in their symbolic manipulation of the abstract patterns characterizing the concrete sonic forms they're so skilled at producing.

In other domains, it's really hard for cleverness and skillfulness to emerge in any way except exquisitely intercombined. Mathematics is an example. Procedural knowledge at doing proofs is needed for fully understanding complex proofs -- because so many steps are left out in proofs as typically written down, if you don't know how to do proofs, you won't be able to fill in all the gaps in your head when you read a proof, so you'll never get more than a general understanding. On the other hand, it's even more obvious that deep declarative understanding and manipulation-ability regarding mathematical content is necessary to do mathematical proofs. Math is a domain where procedural and declarative intelligence have got to work in extremely tight synergy.

Finally, there's wisdom, which as I'm conceiving it here is the ability to intelligently draw conclusions from a vast repository of data regarding specific situations.

Human minds tend to organize data regarding specific situations using story-like, "narrative" structure, so that in human practice, wisdom often takes the form of the ability to mine appropriate abstract patterns from a vast pool of remembered stories.

Of course, the operation of human episodic memory is largely constructive -- we don't actually grab experiential data out of some sort of neurological database; rather, we synthesize stories from fragmentary images, stories, and such. Wisdom is about synthesizing appropriate stories from large databases of partially-remembered, ambiguous, fractional stories -- and then, as appropriate, using these stories to guide the creation of declarative or procedural knowledge.

In mathematics, wisdom is closely related to what's called "mathematical maturity" ... the general sense of how mathematics is done. Mathematical maturity guides the mind to interesting problems and interesting concepts ... and helps you choose an overall proof strategy (whereas it's cleverness and skillfulness that help you carry out the proof).

The transition from {cleverness + skillfulness} to wisdom in music is epitomized to me by the mid-to-late John Coltrane ... the Coltrane of "My Favorite Things" and "A Love Supreme." These are the solos of a man who has listened so much and played so much that he's disassembled thousands of different musical narratives and reassembled them to tell different kinds of stories, like no one ever told before. So much richer than the merely clever, skillful and emotionally moving solos of the early Coltrane. Certain works of great art manage to be intensely personal and dramatically universal at the same time, and this often results from wisdom in the sense I'm defining it here.

Note that a mature mathematician or a world-changing jazz soloist need not be "wise" in the sense of a Taoist sage. The classical conception of wisdom has to do with making intelligent judgments based on large stores of experience in everyday human life. In the old days this was pretty much the only experience there was -- everyday human life plus various shamanic and psychedelic experiences.... But now the human world has become far more specialized, and it's possible to have a specialized wisdom, because it's possible to have a huge and rich store of episodic knowledge that's restricted to some special domain, like music or mathematics, or even a sufficiently complex game like Go or chess.

This vision of wisdom would seem to contradict Graham's, cited above -- he views wisdom as related to the ability to achieve goals over a broad variety of domains, in contrast to intelligence, which he conceives as more narrowly domain-specialized.

But I don't think the contradiction is total.

I think that within a sufficiently rich and complex domain, one requires wisdom as I've defined it in order to achieve a really high level of intelligence. Learning skills and manipulating symbols is not enough. Direct and intelligent mining of massive experience-stores is needed.

I also think that wisdom, even if achieved initially and primarily within a certain domain, has a striking power to transcend domains. There are a lot of universal patterns among large stores of stories, no matter what the domain.

But even if the wisdom achieved by a great mathematician or chess player or jazz soloist helps that person to intuitively understand the way things work in other domains, this won't necessarily lead them to practical greatness in these other domains -- great achievement seems to require a synthesis of wisdom with either cleverness or skillfulness, and in some domains (like math or jazz improvisation) all three.

Defined-Problem versus Contextual Intelligence

Next, what does all this have to do with artificial intelligence?

One of the lessons learned in the last few decades of AI practice is that there is a pretty big difference between:

  1. Defined-problem intelligence: Problem-solving that occurs "after a crisply-defined problem statement has been identified", versus
  2. Contextual intelligence: problem-solving that is mainly concerned with interpreting general goals in the context of a complex situation, and, "figuring out what the context-specific problem is, in the first place" -- i.e. figuring out what crisply-defined problem, if solved in the relevant context, is likely to work toward the general goals at hand

I think this might be a more useful and more precise distinction than the "narrow AI" versus "general AI" distinction that I've often made before. It's ultimately getting at the same thing, but it's putting the point in a better way, I think.

What's narrow about "narrow AI" systems like chess-playing programs and medical diagnostic expert systems isn't merely that they're focused on specific, narrow domains. It's the fact that they operate based on defined-problem intelligence. It happens, though, that in some sufficiently specialized domains, defined-problem intelligence is enough to yield ass-kicking performance. In other domains it's not -- because in these other domains, figuring out what the problem is, is basically the problem.

I suggest that defined-problem intelligence is focused on declarative and procedural knowledge: i.e. it consists of cleverness or skillfulness or some combination thereof.

Logical reasoning systems, for example, are focused on declarative knowledge, and possess in some cases great facility at manipulating declarative knowledge.

Evolutionary learning systems and neural nets, on the other hand, are mainly focused on procedural knowledge -- on learning how to do stuff, without need for symbolic representations or symbol manipulations.

On the other hand: Contextual intelligence, I suggest, is a matter of knowing how to synthesize declarative and procedural knowledge -- representing problem-statements and problem-solutions -- out of the combination of general goals and real-world situations.

I suggest that powerful contextual intelligence always relies upon powerful use of episodic memory, and associated mechanisms for storing, accessing, manipulating and analyzing sets of stories.

Or, briefly getting less geeky again: contextual intelligence requires wisdom.

Not at the level of the Taoist sage, John Coltrane or Riemann ... but at a way higher level than possessed by any currently operational AI system.

Note that defined-problem intelligence may sometimes draw on a wide body of background knowledge -- but it uses this background knowledge in a manner constrained by certain well-defined declarative propositions, or practical constraints on procedure-learning. It uses the background knowledge in a manner that doesn't require the background knowledge to be organized or accessed episodically -- rather, it uses background knowledge as a set of declarative facts, or data items, or constraints on actions, or procedures for doing specific things in specific types of situations.

"How to make a lot of money in Russia" is a problem that requires intense contextual as well as defined-problem intelligence. Whereas, "how to make a lot of money by trading oil futures on the Russian stock exchange" is more heavily weighted toward calculational intelligence, though it could be approached in a contextual-intelligence-heavy manner as well.

For instance, in the domain of bioinformatics, figuring out a rule that can diagnose a disease based on a gene expression microarray dataset, is a well-defined problem -- a problem that can be solved via focusing strictly on a small set of reasonably well-encapsulated information items. Declarative and/or procedural focused AI works well here ... much better than human intelligence.

On the other hand, figuring out which datasets are likely to be reliable, and figuring out how to normalize these datasets in a reasonable way based on the experimental apparatus described in the associated research paper, are tasks that require much more understanding of context, more milking of subtle patterns in episodic memory. I.e., I'm suggesting, more wisdom.

In the current practice of bioinformatic data analysis, human wisdom is needed to craft well-defined problems to feed into the superior (in this domain) declarative and procedural intelligence of narrow-AI bioinformatic data-analysis systems like the ones we've created at Biomind LLC.
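
To make the well-defined subproblem concrete, here is a minimal sketch of the kind of defined-problem core a narrow-AI system chews on once a human has framed it: learn a diagnostic rule from an already-chosen, already-normalized expression matrix and estimate its accuracy. This is generic scikit-learn code over made-up data -- emphatically not Biomind's actual methods -- just an illustration of how little context survives once the problem has been crisply stated.

```python
# Generic illustration of the defined-problem core of microarray diagnosis.
# Hypothetical data and a stock learner; NOT Biomind's actual pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Assumed inputs: 100 patient samples x 2000 gene-expression features,
# already normalized, plus binary diagnostic labels.
X = rng.normal(size=(100, 2000))
y = rng.integers(0, 2, size=100)

# Once the problem is framed this crisply, a standard learner does the rest.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print("cross-validated accuracy: %.2f +/- %.2f" % (scores.mean(), scores.std()))

# The "wisdom" -- deciding which datasets to trust and how to normalize them
# given the experimental apparatus -- happened before this script was run.
```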

Doing Time in the Universal Mind

Getting back to some of the ideas introduced at the start of this essay ... it seems all this ties in moderately closely with Erikson's definition and the Perennial Philosophy definition of "wisdom."

These definitions conceive wisdom as related to an understanding of life situations in a broader context than that of the individual body and mind. Wisdom, as these thinkers conceive it, is a higher level of contextual intelligence than average humans display -- an ability to conceive daily situations in a broader-than-usual context.

This corresponds, really, to relying on a kind of collective episodic memory store, rather than just the episodic memory store corresponding to one's own life. By the time one is old, one is reviewing a longer life, and reviewing the past and future lives of one's children and grandchildren, and thinking about the whole scope of stories all these people may be involved in. A much richer contextuality.

Another ingredient of the Perennial Philosophy notion of wisdom is self-understanding, and I think that ties in here very closely too. One's own self is always part of the context, and to carry out really deep contextual understanding or problem-solving, one needs to appreciate how one's own history, knowledge and biases are affecting the situation and affecting one's own judgments. Powerful contextual intelligence -- unlike powerful calculational intelligence -- requires deep and broad self-understanding.

Wrapping Up

Sooo ... if we conceive wisdom as contextual intelligence powered by rich analysis of episodic memory, then it is clear that wisdom is a key aspect of general intelligence -- and is precisely the aspect that the AI research field has most abjectly ignored to date.

And it is also clear that ethical judgment is richly bound up with wisdom, as here conceived. Ethical judgment, in real life, is all about contextual understanding. It's not about following logical principles of ethics -- even when such principles are agreed-upon, real-life application always comes down to tricky context-specific intuitive judgments. Which comes down to understanding a vast pool of different situations, different episodes, that have existed in the lives of different human beings and groups.

Defined-problem intelligence can be useful for ethical judgments. For instance in cases where scarce resources need to be divided fairly among a large number of parties with complex interrelationships and constraints, one has a well-defined problem of figuring out the optimally ethical balance, or a reasonable approximation thereof. But this actually seems an exceptional case, and the default case of ethical judgment seems to be to rely much more heavily on contextual than defined-problem intelligence.

Just to be clear: I'm not claiming that the conception of "wisdom" I've outlined here thoroughly captures all aspects of the natural-language/folk-psychology term "wisdom." Like "mind", "intelligence" and so forth, "wisdom" is a fuzzy term that amalgamates various different overlapping meanings ... it's not the kind of thing that CAN be crisply defined and analyzed once and for all.

What I hope to have done is to extract from the folk concept of wisdom some more precise, interesting and productive ideas that closely relate to this folk concept but don't pretend to exhaust it.

In short...

  • General intelligence = defined-problem intelligence + contextual (problem-defining) intelligence
  • Calculational intelligence = cleverness (declarative intelligence) + skillfulness (procedural intelligence)
  • Contextual intelligence = in the human context, highly reliant on large-scale analysis of episodic memory
  • Wisdom = interestingly interpreted as contextual intelligence
  • Ethics = heavily reliant on wisdom

In this view, not surprisingly, the pursuit of Artificial Wisdom emerges as a subtask of the pursuit of Artificial General Intelligence. But what's interesting is it emerges as a complementary subtask to the one that most of the AI community is working on at the moment -- narrow-AI, or artificial defined-problem intelligence.

There is a bit of work in the AI community on narrative and story understanding. But most of this work seems, well, overly artificial. It has to do with formalistic systems for representing story structure. That is just not how we do things, in our human minds, and I suspect it's not an effective path at all.

I don't at the moment know any better way to give an AGI system a rich understanding of episodes in the world than to actually embed it in the world and let it learn via experience. Virtual worlds may be a great start, given the amount of rich social interaction now occurring therein.

Thus I conclude that an excessive focus on narrow-AI research is, well, un-wise ;-)

And physically or virtually embodied AGI may potentially be a wise approach...

And I return again to the apparent wisdom of integrative AI approaches. Cleverness, skillfulness and wisdom are, I suggest, separate aspects of intelligence, which are naturally implemented in an AI system as separate modules -- but modules which must be architected for close inter-operation, because the real crux of general intelligence is the synergetic fusion of the three.
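
To make that modularity claim a little more tangible, here is a toy sketch (mine, purely illustrative -- not Novamente's architecture or anyone else's) of the three memory stores sitting behind one controller, with contextual problem-framing leaning on the episodic store:

```python
# Toy sketch of cleverness / skillfulness / wisdom as three memory modules
# behind a single controller. Illustrative only; not an actual AGI design.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class DeclarativeMemory:                 # "cleverness" operates here
    facts: Dict[str, float] = field(default_factory=dict)  # statement -> confidence
    def assert_fact(self, statement: str, confidence: float) -> None:
        self.facts[statement] = confidence

@dataclass
class ProceduralMemory:                  # "skillfulness" operates here
    skills: Dict[str, Callable[..., object]] = field(default_factory=dict)
    def learn_skill(self, name: str, procedure: Callable[..., object]) -> None:
        self.skills[name] = procedure

@dataclass
class EpisodicMemory:                    # "wisdom" mines this store
    episodes: List[str] = field(default_factory=list)      # remembered stories
    def record(self, story: str) -> None:
        self.episodes.append(story)
    def similar(self, cue: str) -> List[str]:
        # Crude stand-in for constructive, large-scale episodic analysis.
        return [e for e in self.episodes if cue.lower() in e.lower()]

@dataclass
class Agent:
    declarative: DeclarativeMemory = field(default_factory=DeclarativeMemory)
    procedural: ProceduralMemory = field(default_factory=ProceduralMemory)
    episodic: EpisodicMemory = field(default_factory=EpisodicMemory)

    def frame_problem(self, goal: str, situation: str) -> str:
        """Contextual intelligence: turn a vague goal plus a situation into a
        crisply defined problem, guided by precedents in episodic memory."""
        precedents = self.episodic.similar(situation)
        return f"defined problem for '{goal}', given {len(precedents)} precedent episode(s)"
```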

Friday, March 28, 2008

Buckets of Crumbs!!!

I just posted a way deeper and more interesting blog post a couple hours ago (using multiverse theory and Occam's Razor to explain why voting may often be rational after all), but I decided to post this sillier one tonight too because I have a feeling I'll forget if I put it off till tomorrow (late at night I'm willing to devote a little time to blogging in lieu of much-needed sleep ... tomorrow when I wake up there will be loads of work I'll feel obliged to do instead!)

This blog post just re-"prints" part of a post I made to the AGI email list today, which a couple people already asked me if they could quote.

It was made in response to a poster on the AGI list who made the argument that AGI researchers would be more motivated to work on building superhuman AGI if there were more financial gain involved ... and that, in fact, desire for financial gain MUST be a significant part of their motivation ... since AGI researchers are only human too ...

What I said is really simple and shouldn't need to have been said, but still, this sort of thing seems to require constant repetition, due to the nature of the society we live in...

Here goes:



Singularitarian AGI researchers, even if operating largely or partly in the business domain (like myself), value the creation of AGI far more than the obtaining of material profits.




I am very interested in deriving $$ from incremental steps on the path to powerful AGI, because I think this is one of the better methods available for funding AGI R&D work.




But deriving $$ from human-level AGI really is not a big motivator of mine. To me, once human-level AGI is obtained, we have something of dramatically more interest than accumulation of any amount of wealth.




Yes, I assume that if I succeed in creating a human-level AGI, then huge amounts of $$ for research will come my way, along with enough personal $$ to liberate me from needing to manage software development contracts or mop my own floor. That will be very nice. But that's just not the point.





I'm envisioning a population of cockroaches constantly fighting over crumbs of food on the floor. Then a few of the cockroaches -- let's call them the Cockroach Robot Club -- decide to spend their lives focused on creating a superhuman robot which will incidentally allow cockroaches to upload into superhuman form with superhuman intelligence. And the other cockroaches insist that the Cockroach Robot Club's motivation in doing this must be a desire to get more crumbs of food. After all, just **IMAGINE** how many crumbs of food you'll be able to get with that superhuman robot on your side!!! Buckets full of crumbs!!!


(Perhaps after they're resurrected and uploaded, the cockroaches that used to live in my kitchen will come to appreciate the literary inspiration they've provided me! For the near future though I'll need to draw my inspiration elsewhere as Womack Exterminators seems to have successfully vanquished the beasties with large amounts of poisonous gas. Which I can't help feeling guilty about, being a huge fan of the film Twilight of the Cockroaches ... but really, I digress...)

I'm also reminded of a meeting I was in back in 1986, when I was getting trained as a telephone salesman (one of my lamer summer jobs from my grad school days ... actually I think that summer I had given up on grad school and moved to Las Vegas with the idea of becoming a freelance philosopher ... but after a couple months of phone sales, which was necessary because freelance philosophers don't make much money, I reconsidered and went back to grad school in the fall). The trainer, a big fat scary guy who looked and sounded like a meaner version of my ninth grade social studies teacher, was giving us trainee salespeople a big speech about how everyone wanted success, and he asked us how success was defined. Someone in the class answered MONEY and the trainer congratulated him and said: "That's right, in America success means money, and you're going to learn to make a lot of it!" The class cheered (a scene that could have been straight out of Idiocracy ... "I like money!"). Feeling obnoxious (as I usually was in those days), I raised my hand and asked the trainer if Einstein was successful or not ... since Einstein hadn't been particularly rich, I noted, that seemed to me like a counterexample to the principle that had been posited regarding the equivalence of success and financial wealth in the American context. The trainer changed the subject to how the salesman is like a hammer and the customer is like a nail. (By the way I was a mediocre but not horrible phone salesman of "pens, caps and mugs with your company name on them." I had to use the name "Ben Brown" on the phone though because no one could pronounce "Goertzel." If you were a small business owner in summer 1986 and got a phone call from an annoying crap salesman named Ben Brown, it was probably the 19 year old version of me....)


Thursday, March 27, 2008

Why Voting May Not be Such a Stupid Idea (A Multiversal Argument)

I haven't voted in any election for a heck of a long time ... but, in some conversations a couple years ago, an argument came up that actually seems like a reasonable argument why voting might be a good idea.

I'm not sure why I never blogged this before ... but I didn't ... so here goes ...


Why might voting be worthwhile, even though the chances that your vote breaks a tie in the election are vanishingly small?

Consider this: Would you rather live in a branch of the multiverse where the people like you vote, or where the people like you don't vote?

Obviously, if there are a lot of people like you, then you'll be better off in a branch where the people like you vote.

So: You should vote so as to be sure you're in one of those branches.

But, wait a minute. How do you know you won't end up in a branch where most of the people like you DON'T vote, but you vote anyway?

Well, you can't know that for sure. But, the question to ask is, which of the two swaths of possible universes are more probable overall:

Type 1) Ones in which everyone like you votes

Type 2) Ones in which most people like you don't vote, but you're the exception

Adopting an "Occam prior" that favors simpler possible universes over more complex ones, you arrive at the conclusion that Type 1 universes are more probable.
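
To spell out the Occam-prior step a bit (this is just my shorthand, nothing rigorous): weight each possible universe U by roughly two to the minus its description length K(U). A universe in which everyone like you behaves one way is describable by a single regularity; a universe in which everyone like you behaves that way except you needs the regularity plus extra bits to single out the exception.

```latex
% Toy Occam-prior comparison; K(U) = description length of universe U (my shorthand).
P(U) \;\propto\; 2^{-K(U)}
% Type 1: "all agents like me vote"                  -- length K_1
% Type 2: "all agents like me vote, except this one" -- length K_2 \approx K_1 + \text{(bits to single out the exception)}
\frac{P(\mathrm{Type\ 1})}{P(\mathrm{Type\ 2})} \;\approx\; 2^{\,K_2 - K_1} \;>\; 1
```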

Now, this isn't an ironclad, universal argument for voting. If you're such a freak that all the people like you voting wouldn't make any difference, then this argument shouldn't convince you to vote.

Another counterargument against the above argument is that free will doesn't exist in the multiversal framework. What the heck does it mean to "decide" which branch of the multiverse to go down? That's not the kind of thing you can decide. Your decision process is just some dynamics that occurs on some branches and not others. It's not like your decision process steps out of the branching-process governing the multiverse and chooses which routes you follow....

But the thing is, deciding still feels like deciding from within your own human mind -- whether or not it's REALLY deciding in any fundamental physical sense.

So, I'm not telling you to decide anything. I'm merely (because it's what my internal dynamics are doing, in this branch of the multiverse that we're in) typing in some words that my internal dynamics believe may encourage you to carry out some of your own internal dynamics that may feel to you like you're deciding something. Right? Because, this is simply the way the universe is happening ... in this branch of the multiverse....

Don't decide anything. Just notice that these words are making you reflect on which branch of the multiverse you'd rather be in -- the one where everyone like you votes, or the one where they don't....

And of course it's not just about voting. It's really about any ethical behavior ... any thing such that we'd all be better off if everyone like us did that thing.

It's about compassion, for that matter -- we'd all be better off if everyone was more compassionate.... Would you rather be in the branch of the multiverse where everyone like you is compassionate, or....

Well, you get it.

But am I voting in this year's Presidential elections?

Out of all the candidates available, I'd definitely support Obama ... but nah, I think I'll probably continue my long tradition of lame citizenship and not vote.

I just don't think there are that many people like me out there ;-)

But if I read enough other blog posts like this one, I'd decide there was a large enough population of similar people out there, and I WOULD vote....

Tuesday, March 25, 2008

Quantum Voodoo in "Classical" Systems?

Way way back in the dark ages, when I was 19 years old and in my second year of grad school, I wrote a paper called "Holistic Indeterminacy" and submitted it to the journal Mind.

The basic idea was that, in some cases, very complex "classical" physical systems might literally display the same kind of indeterminacy associated with quantum systems.

The paper was printed out crappily on a dot matrix printer with dimly printed ink, and written in a not terribly professional way. It got rejected, and I've long since lost the thing. Furthermore, I never since found time to write up the ideas in the paper again. (Had there been a Web back then I would have posted the thing on my website, but this was the mid 1980's ... if I recall correctly, I hadn't even sent an email yet, at that point. I might actually have the paper on some old floppy disk in the basement, but odds are the data's long corrupted even if the disk is still around...).

But anyways ... please pardon these reminiscences of an old man!! ... these old ideas of mine came up today in a conversation I was having with a friend over lunch, so I figured I'd take a few minutes to type them into a blog post (way less work than a paper!).

In fact these ideas are far more topical now than in the 1980's, as quantum computing is these days finally becoming a reality ... along with macroscopic quantum systems and all sorts of other fun stuff....

Partly because of these advances, and partly because the ideas have had decades to pervade my brain, I think I can now express the idea a bit more crisply than I did back then.

Still, it's a freaky and speculative train of thought, which I am not fully convinced makes any sense.

But at very least, it's some amusing hi-fi sci-fi.....

The basic idea is as follows.

Premise: Quantum logic is the logic of that which, in principle, cannot be observed. Classical logic is the logic of that which can, in principle, be observed.

The above may sound odd but it's not my idea -- it's the conclusion of a lot of work in quantum physics and the quantum theory of measurement, by serious physicists who understand such things far better than I do. It's way clearer now than it was in the mid 80's, though it was known to all the cool people even then....

Now is where things start to get weird. I want to make the above premise observer-dependent in a manner different from how quantum theory does it. Namely, I want to introduce an observer who, himself, has a finite capacity for understanding and observation -- a finite Kolmogorov complexity, for example.

This leads to my

Modest proposal: An observing system should use quantum logic to reason about anything that it, as a particular system, cannot in principle observe.

There are some things that a worm cannot observe, because it is just a worm; but I can observe. From the perspective of the worm, I suggest, these things should be reasoned about using quantum logic.

Similarly, there are some things that I cannot observe, in principle, because I am just a little old me.

Yes, I could potentially expand myself into a dramatically greater being. But, then that wouldn't help ME (i.e., my current self) to observe these things ... it would just help {some other, greater guy who had evolved out of me} to observe these things.

Of course, you can't step into the same river once ... and there is not really any ME that is persistent beyond an individual moment (and there are no individual moments!). But you can talk about a class of systems, and you can say that some observables are simply NOT observable by any system within that class. So systems within that class need to reason about these observables using quantum logic.

Where does complexity come into the picture? Well, among the things I can't in principle observe, are patterns of more complexity than can fit in my brain.

And among the things my deliberatively conscious mind can't in principle observe, are patterns of more complexity than can fit within its own very limited capacity.

So, if we interpret "quantum logic is the logic of things that can't in principle be observed" subjectively, as applying to particular real-world observing systems (including subsystems like the deliberatively conscious component of a human brain), then we arrive at the funky conclusion that maybe we should reason about each others' minds using quantum logic ... or maybe even, that we should reason about our own unconscious using quantum logic....

Funny idea, hmmm?

Way back when I wrote down some mathematics embodying these notions, but I don't feel like regenerating that right now. Although I'm a bit curious to see whether it had any validity or not ;-)

What made me think of this today was a discussion about consciousness, and the possibility (raised by the friend I was talking to) that some sort of wacky quantum voodoo is necessary to produce consciousness.

Maybe so. On the other hand, it could also be that any system complex enough to display the kind of rich deliberative consciousness we humans do, is complex enough that humans need to reason about it using quantum logic ... because in principle we cannot observe its dynamics (without becoming way more complex than we are, hence losing our self-ness...).

Ahhh... well I'll get back to doing the final edits on the Probabilistic Logic Networks book now ...

Monday, March 10, 2008

A New, Improved, Completely Whacky Theory of Evolution

This blog post presents some really weird, speculative science, which I take with multiple proverbial salt-grains ... but, well, wouldn't it be funky if it were true?

The idea came to mind in the context of a conversation with my old friend Allan Combs, with whom I co-edit the online journal Dynamical Psychology.

It basically concerns the potential synergy between two apparently radically different lines of thinking:


Morphic Fields

The basic idea of a morphic field is that, in this universe, patterns tend to continue -- even when there's not any obvious causal mechanism for it. So that, for instance, if you teach thousands of rats worldwide a certain trick, then afterwards it will be easier for additional rats to learn that trick, even though the additional rats have not communicated with the prior ones.

Sheldrake and others have gathered a bunch of evidence in favor of this claim. Some say that it's fraudulent or somehow subtly methodologically flawed. It might be. But after my recent foray into studying Ed May's work on precognition, and other references from Damien Broderick's heartily-recommended book Outside the Gates of Science (see my previous blog posts on psi), I'm becoming even more willing than usual to listen to data even when it goes against prevailing ideas.

Regarding morphic fields on the whole, as with psi, I'm still undecided, but interested. The morphic field idea certainly fits naturally with my philosophy that "the domain of pattern is primary, not the domain of spacetime."

Estimation of Distribution Algorithms

EDA's, on the other hand, are a nifty computer science idea aimed at accelerating artificial evolution (that occurs within software processes).

Evolutionary algorithms are a technique in computer science in which, if you want to find/create a certain object satisfying a certain criterion, you interpret the criterion as a "fitness function" and then simulate an "artificial evolution process" to try to evolve objects better and better satisfying the criterion. A population of candidate objects is generated at random, and then, progressively, evolving objects are crossed-over and mutated with each other. The fittest are chosen for further survival, crossover and mutation; the rest are discarded.

Google "genetic algorithms" and "genetic programming" if this is novel to you.
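
For the code-minded, here is the whole loop in miniature -- a bare-bones bitstring genetic algorithm, purely generic, not any particular production system:

```python
# Bare-bones genetic algorithm on bitstrings: random population, then repeated
# selection / crossover / mutation. Generic illustration, nothing fancy.
import random

def evolve(fitness, length=30, pop_size=100, generations=200,
           mutation_rate=0.01, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[:pop_size // 2]               # keep the fittest half
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)              # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]  # mutate
            children.append(child)
        pop = children
    return max(pop, key=fitness)

# Example criterion ("fitness function"): maximize the number of 1 bits.
best = evolve(fitness=sum)
print(sum(best), "ones out of", len(best))
```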

This approach has been used to do a lot of practical stuff -- in my own work, for example, I've evolved classification rules predicting who has cancer or who doesn't based on their genetic data (see Biomind); evolved little programs controlling virtual agents in virtual worlds to carry out particular tasks (see Novamente); etc. (though in both of those cases, we have recently moved beyond standard evolutionary algorithms to use EDA's ... see below...)

EDA's mix evolutionary algorithms with probabilistic modeling. If you want to find/create an object satisfying a certain criterion, you generate a bunch of candidates -- and then, instead of letting them cross over and mutate, you do some probability theory and figure out the patterns distinguishing the fit ones from the unfit ones. Then you generate new babies, new candidates, from this probability distribution -- throw them into the evolving population; lather, rinse, repeat.
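
And here is the same toy setting run EDA-style, with the simplest possible probabilistic model -- a univariate, PBIL/UMDA-like model over independent bits. The systems Pelikan and Looks describe use much richer models, so take this as a minimal sketch only:

```python
# Simplest-possible EDA (univariate marginal model, PBIL/UMDA-style):
# model the fit candidates with per-bit probabilities, then sample new ones.
import random

def eda(fitness, length=30, pop_size=100, generations=200,
        elite_frac=0.2, seed=0):
    rng = random.Random(seed)
    probs = [0.5] * length                       # initial per-bit model
    best = None
    for _ in range(generations):
        pop = [[int(rng.random() < p) for p in probs] for _ in range(pop_size)]
        pop.sort(key=fitness, reverse=True)
        elite = pop[:int(pop_size * elite_frac)]
        # Re-estimate the distribution from the fittest candidates...
        probs = [sum(ind[i] for ind in elite) / len(elite) for i in range(length)]
        # ...keeping probabilities away from 0/1 so the model stays exploratory.
        probs = [min(max(p, 0.02), 0.98) for p in probs]
        if best is None or fitness(pop[0]) > fitness(best):
            best = pop[0]
    return best

best = eda(fitness=sum)
print(sum(best), "ones out of", len(best))
```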

It's as if, instead of all this sexual mating bullcrap, the Federal gov't made an index of all our DNA, then did a statistical study of which combinations of genes tended to lead to "fit" individuals, then created new individuals based on this statistical information. Then these new individuals, as they grow up and live, give more statistical data to throw into the probability distribution, etc. (I'd argue that this kind of eugenics is actually a plausible future, if I didn't think that other technological/scientific developments were so likely to render it irrelevant.)

Martin Pelikan's recent book presents the idea quite well, for a technical computer science audience.

Moshe Looks' PhD thesis presents some ideas I co-developed regarding applying EDA's to automated program learning.

There is by now a lot of mathematical/computational evidence that EDA's can solve optimization problems that are "deceptive" (hence very difficult to solve) for pure evolutionary learning. To put it in simple terms, there are many broad classes of fitness functions for which pure neo-Darwinist evolution seems prone to run into dead ends, but for which EDA style evolution can jump out of the dead ends.

Morphic Fields + EDA's = ??

Anyway -- now how do these two ideas fit together?

What occurred to Allan Combs and myself in an email exchange (originating from Allan reading about EDA's in my book The Hidden Pattern) is:

If you assume the morphic field hypothesis is true, then the idea that the morphic field can serve as the "probability distribution" for an EDA (allowing EDA-like accelerated evolution) follows almost immediately...

How might this work?

One argument goes as follows.

Many aspects of evolving systems are underdetermined by their underlying genetics, and arise via self-organization (coupled to the environment and initiated via genetics). A great example is the fetal and early-infancy brain, as analyzed in detail by Edelman (in Neural Darwinism and other writings) and others. Let's take this example as a "paradigm case" for discussion.

If there is a morphic field, then it would store the patterns that occurred most often in brain-moments. The brains that survived longest would get to imprint their long-lasting patterns most heavily on the morphic field. So, the morphic field would contain a pattern P, with a probability proportional to the occurrence of P in recently living brains ... meaning that occurrence of P in the morphogenetic field would correspond roughly to the fitness of organisms containing P.

Then, when young brains were self-organizing, they would be most likely to get imprinted with the morphic-field patterns corresponding to the most-fit recent brains....

So, if one assumes a probabilistically-weighted morphic field (with the weight of a pattern proportional to the number of times it's presented) then one arrives at the conclusion that evolution uses an EDA ...

Interesting to think that the mathematical power of EDA's might underly some of the power of biological evolution!

The Role of Symbiosis?

In computer science there are approaches other than EDAs for jumping out of evolutionary-programming dead ends, though -- one is symbiosis and its potential to explore spaces of forms more efficiently than pure evolution. See e.g. Richard Watson's book from a couple years back --

Compositional Evolution: The Impact of Sex, Symbiosis, and Modularity on the Gradualist Framework of Evolution


and, also, Google "symbiogenesis." (Marginally relevantly, I wrote a bit about Schwemmler's ideas on symbiogenesis and cancer, a while back.)

But of course, symbiosis and morphic fields are not contradictory notions.

Hypothetically, morphic fields could play a role in helping organisms to find the right symbiotic combinations...

But How Could It Be True?

How the morphic fields would work in terms of physics is a whole other question. I don't know. No one does.

As I emphasized in my posts on psi earlier this year, it's important not to reject data just because one lacks a good theory to explain it.

I do have some interesting speculations to propound, though (I bet you suspected as much ;-). I'll put these off till another blog post ... but if you want a clue of my direction of thinking, mull a bit on

http://www.physics.gatech.edu/schatz/clocks.html

Sunday, March 09, 2008

Brief Report on AGI-08

Sooo....

The AGI-08 conference (agi-08.org) occurred last weekend in Memphis...!

I had hoped to write up a real scientific summary of AGI-08, but at the moment it doesn't look like I'll find the time, so instead I'll make do with this briefer and more surface-level summary...

Firstly, the conference went VERY well. The tone was upbeat, the discussions were animated and intelligent, and all in all there was a feel of real excitement about having so many AGI people in one place at one time.

Attendance was good: We originally anticipated 80 registrants but had 120+.

The conference room was a futuristic setting called "The Zone" that looked sorta like the Star Trek bridge -- with an excellent if mildly glitchy video system that, during Q&A sessions, showed the questioner up on a big screen in front of the room.

The unconventional format (brief talks followed by long discussion/Q&A sessions) was both productive and popular. The whole thing was video-ed and at some point the video record will be made available online (I don't know the intended timing of this yet).

The proceedings volume was released by IOS Press a few weeks before the conference and is a thick impressive-looking tome.

The interdisciplinary aspect of the conference seemed to work well -- e.g. the session on virtual-worlds AI was chaired by Sibley Verbeck (CEO of Electric Sheep Company) and the session on neural nets was chaired by Randal Koene (a neuroscientist from Boston University). This definitely made the discussions deeper than if it had been an AI-researchers-only crowd.

Plenty of folks from government agencies and large and small corporations were in attendance, as well as of course many AI academics and non-affiliated AGI enthusiasts. Among the AI academics were some highly-respected stalwarts of the AI community, alongside the new generation...

There seemed to be nearly as many Europeans as Americans there, which was a pleasant surprise, and some Asians as well.

The post-conference workshop on ethical, sociocultural and futurological issues drew about 60 people and was a bit of a free-for-all, with many conflicting perspectives presented quite emphatically and vociferously. I think most of that discussion was NOT captured on video (it took place in a different room where video-ing was less convenient), though the workshop talks themselves were.

The media folks in attendance seemed most energized by the session on AI in virtual worlds, because in this session the presenters (me, Andrew Shilliday, and Martin Magnusson) showed movies of cute animated characters doing stuff. This gave the nontechnical observers something to grab onto, which most of the other talks did not.

As at the earlier AGI-06 workshop, one of the most obvious observations after listening to the talks was that a lot of AGI research programs are pursuing fairly similar architectures and ideas but using different languages to describe what they're doing. This suggests that making a systematic effort at finding a common language and really understanding the true overlaps and differences of the various approaches, would be very beneficial. There was some talk of organizing a small, invitation-only workshop among practicing AGI system architects, perhaps in Fall 2008, with a view toward making progress in this direction.

Much enthusiasm was expressed for an AGI-09, and it was decided that this will likely be located in Washington DC, a location that will give us the opportunity to use the conference to help energize various government agencies about AGI.

There was also talk about the possibility of an AGI online technical journal, and a group of folks will be following that up, led by Pei Wang.

An "AGI Roadmap" project was also discussed, which would involve aligning different cognitive architectures currently proposed insofar as possible, but also go beyond that. Another key aspect of the roadmap might be an agreement on certain test environments or tasks that could be used to compare and explore various AGI architectures in more of a common way than is now possible.

Lots of ideas ... lots of enthusiasm ... a strong feeling of community-building ... so, I'm really grateful to Stan Franklin, Pei Wang, Sidney DeMello and Bruce Klein and everyone else who helped to organize the conference.

Finally, an interesting piece of feedback was given by my mother, who knows nothing about AGI research (she runs a social service agency) and who did not attend the conference but read the media coverage afterwards. What she said is that the media seems to be taking a far less skeptical and mocking tone toward AGI these days, as opposed to 7-10 years ago when I first started appearing in the media now and then. I think this is true, and it signifies a real shift in cultural attitude. This shift is what allowed The Singularity Is Near to sell as many copies as it did; and what encouraged so many AI academics to come to a mildly out-of-the-mainstream conference on AGI. Society, including the society of scientists, is starting to wake up to the notion that, given modern technology and science, human-level AGI is no longer a pipe dream but a potential near-term reality. w00t! Of course there is a long way to go in terms of getting this kind of work taken as seriously as it should be, but at least things seem to be going in the right direction.

Balancing concrete work on AGI with community-building work like co-organizing AGI is always a tricky decision for me.... But in this case, the conference went sufficiently well that I think it was worthwhile to deviate some time from the R&D to help out with it. (And now, back to the mass of other work that piled up for me during the conference!)

Yet More Rambling on Will (Beyond the Rules vs. Randomness Dichotomy)

A bit more on this nasty issue of will ... complementing rather than contradicting my previously-expressed ideas.

(A lot of these theory-of-mind blog posts are gonna ultimately get revised and make their way into The Web of Pattern, the sequel to The Hidden Pattern that I've been brewing in my mind for a while...)

What occurred to me recently was a way out of the old argument that "free will can't exist because the only possibilities are RULES versus RANDOMNESS."

In other words, the old argument goes: Either a given behavior is determined, or it's random. And in either case, where's the will? Granted, a random coin-toss (quantum or otherwise) may be considered "free" in a sense, but it's not willed -- it's just random.

What occurred to me is that this dichotomy is oversimplified because it fails to take two factors into account:

  1. A subjectively experienced moment occurs over a fuzzy span of time, not at a single physical moment
  2. "Random" always means "random with respect to some observer."

To clarify the latter point: "S is random to system X" just means "S contains no patterns that system X could identify."

System Y may be able to recognize some patterns in S, even though X can't.

And, X may later evolve into X1, which can recognize patterns in S.

Something that was random to me thirty years ago, or thirty seconds ago, may be patterned to me now.
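
A crude computational cartoon of this observer-relativity (my own toy illustration, with a compressor standing in for an observer's limited pattern-recognition capacity):

```python
# Toy illustration: "random" is relative to the observer's pattern-detector.
# Here zlib plays the observer; data is "random to it" if it won't compress.
import random
import zlib

def random_to(observer_compress, data: bytes) -> bool:
    """True if this observer finds no compressible pattern in the data."""
    return len(observer_compress(data)) >= len(data)

structured = b"abc" * 1000                          # highly patterned
noise = bytes(random.Random(0).randrange(256) for _ in range(3000))

print(random_to(zlib.compress, structured))  # False: the pattern is visible to zlib
print(random_to(zlib.compress, noise))       # True (almost surely): looks random to zlib
# A stronger "observer" (a better model) might still find patterns in `noise`;
# randomness here is a statement about the observer, not just the data.
```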

Consider the perspective of the deliberative, rational component of my mind, when it needs to make a choice. It can determine something internally, or it can draw on an outside source, whose outcome may not be predictable to it (that is, it may make a "random" choice). Regarding outside sources, options include

  1. a random or pseudorandom number generator
  2. feedback from the external physical world, or from another mind in the vicinity
  3. feedback from the unconscious (or less conscious) non-deliberative part of the mind

Any one of these may introduce a "random" stimulus that is unpatterned from the point of view of the deliberative decision-maker.

But of course, options 2 and 3 have some different properties from option 1. This is because, in options 2 or 3, something that appears random at a certain moment, may appear non-random a little later, once the deliberative mind has learned a little more (and is thus able to recognize more or different patterns).

Specifically, in the case of option 3, it is possible for the deliberative mind to draw on the unconscious mind for a "random" choice, and then a half-moment later, import more information from the unconscious that allows it to see some of the patterns underlying the previously-random choice. We may call this process "internal patternization."

Similarly, in the case of option 2, it is possible for the deliberative mind to draw on another mind for a "random" choice, and then a half-moment later, import more information from the other mind that allows it to see some of the patterns underlying the previously random choice. We may call this process "social patternization."

There's also "physical patternization" where the random choice comes from an orderly (but initially random to the perceiving mind) process in the external world.

These possibilities are interesting to consider in the light of the non-instantaneity of the subjective moment. Because, the process of patternization may occur within a single experienced moment.

The subjective experience of will, I suggest, is closely tied to the process of internal patternization. When we have the feeling of making a willed decision, we are often making a "random" choice (random from the perspective of our deliberative component), and then immediately having the feeling of seeing some of the logic and motivations under that choice (as information passes from unconscious to conscious). But the information passed into the deliberative mind is of course never complete and there's always still some indeterminacy left, due to the limited capacity of deliberative mind as compared to unconscious mind.

So, what is there besides RULES plus RANDOMNESS?

There is the feeling of RANDOMNESS transforming into RULES (i.e. patterns), within a single subjective moment.

When this feeling involves patterns of the form "Willing X is causing {Willing X plus the occurrence of S}", then we have the "free will" experience. (This is the tie-in with my discourse on free will and hypersets, a few blog posts ago.)

That is, the deliberative content of recursive willing is automatized and made part of the unconscious, through repeated enaction. It then plays a role in unconscious action determination, which is perceived as random by the deliberative mind -- until, toward the tail end of a subjective moment, it becomes more patterned (from the view of the deliberative mind) due to receiving more attention.

Getting practical for a moment: None of this, as I see it, is stuff that you should program into an AGI system. Rather it is stuff that should emerge within the system as a part of its ongoing recognition of patterns in the world and itself, oriented toward achieving its goals. In this particular case the dynamics of attention allocation is key -- the process by which low-attention items (unconscious) can rapidly gain attention (become intensely deliberatively conscious) within a single subjective moment, but can also have a decisive causal impact prior to this increase in attention. The nonlinear dynamics of attention, in other words, is one of the underpinnings of the subjective experience of will.

What I'm trying to do here is connect phenomenology, cognitive science and AGI design. It seems to work, conceptually: it accords with my own subjective experience, with known data on the human brain/mind, and with my intuition and experience in AGI design.