
Monday, May 25, 2009

How many transhumanists does it take to change a light bulb?

Infinity.... None of them will touch the light bulb at all; they'll all just sit around talking amongst themselves and waiting for someone else to invent a self-changing cyber light bulb.

Wednesday, May 20, 2009

Reinforcement Learning: Some Limitations of the Paradigm

(This post summarizes some points I made in conversation recently with an expert in reinforcement learning and AGI. These aren't necessarily original points -- I've heard similar things said before -- but I felt like writing them down somewhere in my own vernacular, and this seemed like the right place....)

Reinforcement learning, a popular paradigm for AI, economics and psychology, models intelligent agents as systems that choose their actions in such a way as to maximize their future reward. There are various ways of averaging future reward over various future time-points, but all of these implement the same basic concept.
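The most common of these averaging schemes is exponential discounting: a reward r received t steps in the future counts as gamma^t * r, for some gamma between 0 and 1. A minimal sketch in Python (the discount factor and the sample reward sequences are illustrative):

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of future rewards, each discounted by gamma per time step."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total

# An agent comparing candidate plans simply picks the one with
# the highest discounted return.
plan_a = [1.0, 1.0, 1.0]   # steady modest reward
plan_b = [0.0, 0.0, 10.0]  # delayed large reward
best = max([plan_a, plan_b], key=discounted_return)
```

Whatever the particular averaging scheme, the choice procedure is the same: compute the expected (discounted) future reward of each candidate course of action, and pick the maximum.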

I think this is a reasonable model of human behavior in some circumstances, but horrible in others.

And, in an AI context, it seems to combine particularly poorly with the capability for radical self-modification.

Reinforcement Learning and the Ultimate Orgasm

Consider for instance the case of a person who is faced with two alternatives

  • A: continue their human life as would normally be expected
  • B: push a button that will immediately kill everyone on Earth except them, but give them an eternity of ultimate trans-orgasmic bliss

Obviously, the reward will be larger for option B, according to any sensible scheme for weighting various future rewards.

For most people, there will likely be some negative reward in option B ... namely, the guilt that will be felt during the period between the decision to push the button and the pushing of the button. But, this guilt surely will not be SO negative as to outweigh the amazing positive reward of the eternal ultimate trans-orgasmic bliss to come after the button is pushed!

But the thing is, not all humans would push the button. Many would, but not all. For various reasons, such as love of their family, attachment to their own pain, whatever....

The moral of this story is: humans are not fully reward-driven. Nor are they "reward-driven plus random noise".... They have some other method of determining their behaviors, in addition to reinforcement-learning-style reward-seeking.

Reward-Seeking and Self-Modification: A Scary Combination

Now let's think about the case of a reward-driven AI system that also has the capability to modify its source code unrestrictedly -- for instance, to modify what will cause it to get the internal sensation of being rewarded.

For instance, if the system has a "reward button", we may assume that it has the capability to stimulate the internal circuitry corresponding to the pushing of the reward button.

Obviously, if this AI system has the goal of maximizing its future reward, it's likely to be driven to spend its life stimulating itself rather than bothering with anything else. Even if it started out with some other goal, it will quickly figure out that it should discard this goal, which does not lead to as much reward as direct self-stimulation.
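As a toy illustration (the action names and reward values here are purely hypothetical), a pure reward-maximizer that gains the ability to stimulate its own reward circuitry will always prefer that action over whatever it was originally built to do:

```python
# Hypothetical actions and reward values: once direct self-stimulation
# becomes available, a pure reward-maximizer always chooses it,
# regardless of its original goal.
actions = {
    "pursue_original_goal": 1.0,   # modest reward from doing real work
    "self_stimulate": 100.0,       # direct stimulation of the reward circuitry
}

def choose(actions):
    # A pure reward-maximizer: pick whatever yields the most reward.
    return max(actions, key=actions.get)
```

Nothing in this decision rule refers to what the original goal was -- which is exactly the problem.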

All this doesn't imply that such an AI would necessarily be dangerous to us. However, it seems pretty likely that it would be. It would want to ensure a reliable power supply for itself, and to be able to defend itself against attacks. Toward that end, it might well decide its best course is to get rid of anyone who could possibly get in the way of its highly rewarding process of self-stimulation.

Not only would such an AI likely be dangerous to us, it would also lead to a pretty boring universe (via my current aesthetic standards, at any rate). Perhaps it would extinguish all other life in its solar system, surround itself with a really nice shield, and then proceed to self-stimulate ongoingly, figuring that exploring the rest of the universe would be expected to bring more risk than reward.

The moral of the above, to me, is that reward-seeking is an incomplete model of human motivation, and a bad principle for controlling self-modifying AI systems.

Goal-Seeking versus Reward-Seeking

Fortunately, goal-seeking is more general than reward-seeking.

Reward-seeking, of the sort that typical reinforcement-learning systems carry out, is about: Planning a course of action that is expected to lead to a future that, in the future, you will consider to be good.

Goal-seeking doesn't have to be about that. It can be about that ... but it can also be about other things, such as: Planning a course of action that is expected to lead to a future that is good according to your present standards.

Goal-seeking is different from reward-seeking because it will potentially (depending on the goal) cause a system to sometimes choose A over B even if it knows A will bring less reward than B ... because in foresight, A matches the system's current values.

Non-Reward-Based Goals for Self-Modifying AI Systems

As a rough indication of what kinds of goals one could give a self-modifying AI, that differ radically from reward-seeking, consider the case of an AI system with a goal G that is the conjunction of two factors:

  • Try to maximize the function F
  • If at any point T, you assess that your interpretation of the goal G at time T would be judged a terrible thing by your self from time (T - S), then roll back to your state at time (T - S)

I'm not advocating this as a perfect goal for a self-modifying AI. But the point I want to make is that this kind of goal is something quite different from the seeking of reward. There seems to be no way to formulate this goal as one of reward maximization. This is a goal that involves choosing a near-future course of action to maximize a certain function over future history -- but this function is not any kind of summation or combination of future rewards.
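A runnable toy sketch of such a conjunctive goal, under heavy simplifying assumptions: the agent's state is a single number, "maximizing F" means increasing it, and the past self's judgment of "a terrible thing" is crudely modeled as the state having drifted too far. All names, values and thresholds here are illustrative:

```python
# Toy sketch of the two-part goal: maximize F, but roll back whenever
# the self from S steps ago would judge the current state terrible.
class Agent:
    def __init__(self):
        self.value = 0          # stands in for the agent's full state
        self.history = []       # snapshots of past states

    def assess(self, past_state, current_state):
        # Would the past self consider the current state "a terrible
        # thing"? Crude stand-in: yes, if the state drifted too far.
        return abs(current_state - past_state) > 5

    def step(self, S=3):
        self.history.append(self.value)        # record state at time T
        if len(self.history) > S:
            past = self.history[-1 - S]        # state from time (T - S)
            if self.assess(past, self.value):
                self.value = past              # roll back to that state
                return
        self.value += 2                        # "maximize F": climb

agent = Agent()
for _ in range(10):
    agent.step()
```

Run forward, the agent repeatedly climbs and then rolls back -- behavior that no weighted sum of future rewards would prescribe, since the rollback clause refers to the judgment of a past self rather than to any future reward signal.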

Limitations of the Goal-Seeking Paradigm

Coming at the issue from certain theoretical perspectives, it is easy to overestimate the degree to which human beings are goal-directed. It's not only AI theorists and engineers who have made this mistake; many psychologists have made it as well, rooting all human activity in goals like sexuality, survival, and so forth. To my mind, there is no doubt that goal-directed behavior plays a large role in human activity -- yet it also seems clear that a lot of human activity is better conceived as "self-organization based on environmental coupling" rather than as explicitly goal-directed.

It is certainly possible to engineer AI systems that are more strictly goal-driven than humans, though it's not obvious how far one can go in this direction without sacrificing a lot of intelligence -- it may be that a certain amount of non-explicitly-goal-directed self-organization is actually useful for intelligence, even if intelligence itself is conceived in terms of "the ability to achieve complex goals in complex environments" as I've advocated.

I've argued before for a distinction between the "explicit goals" and "implicit goals" of intelligent systems -- the explicit goals being what the system models itself as pursuing, and the implicit goals being what an objective, intelligent observer would conclude the system is pursuing. I've defined a "well aligned" mind as one whose explicit and implicit goals are roughly the same.

According to this definition, some humans, clearly, are better aligned than others!

Summary & Conclusion

Reward-seeking is best viewed as a special case of goal-seeking. Maximizing future reward is clearly one goal that intelligent biological systems work toward, and it's also one that has proved useful in AI and engineering so far. Thus, work within the reinforcement learning paradigm may well be relevant to designing the intelligent systems of the future.

But, to the extent that humans are goal-driven, reward-seeking doesn't summarize our goals. And, as we create artificial intelligences, there seems more hope of creating benevolent advanced AGI systems with goals going beyond (though perhaps including) reward-seeking, than with goals restricted to reward-seeking.

Crafting goals with reasonable odds of leading self-modifying AI systems toward lasting benevolence is a very hard problem ... but it's clear that systems with goals restricted to future-reward-maximization are NOT the place to look.

Wednesday, May 13, 2009

Science-synergetic philosophy: the religion of the future?

(This may seem a hackneyed topic, but there are some moderately original points near the end here, if you bear with me ...)

As a card-carrying, future-thinking transhumanist, I take it as obvious that most of the particulars of current religions are relics of earlier eras in human cultural development, which currently do a lot of harm along with doing some good.

But I still find it interesting to ask what aspects of religion reflect underlying phenomena that are essential, meaningful and necessary -- and are likely to continue as humanity transcends the traditional "human condition" and enters its next phase of development....

Fish and Eagleton on the Wonders of Theology

What spurred this blog post was: My dad pointed out to me this New York Times blog post by Stanley Fish reviewing a book that extols the merits of religion (Reason, Faith and Revolution by Terry Eagleton).

The basic point Fish makes is that religion offers something science by its very nature cannot.

Eagleton acknowledges ... many terrible things have been done in religion’s name — but at least religion is trying for something more than local satisfactions, for its “subject is nothing less than the nature and destiny of humanity itself, in relation to what it takes to be its transcendent source of life.”

He notes that science cannot address what he calls "theological questions", where

By theological questions, Eagleton means questions like, “Why is there anything in the first place?”, “Why what we do have is actually intelligible to us?” and “Where do our notions of explanation, regularity and intelligibility come from?”

He also notes that the author is

... angry, I think, at having to expend so much mental and emotional energy refuting the shallow arguments of school-yard atheists like Hitchens and Dawkins.

I haven't read Eagleton's book and I'm unlikely to do so -- I have a long list of more interesting-looking reading material -- but Fish's summary did resonate with a paper I'm in the middle of writing (it's paused while I work on more urgent stuff) on the limits of science.

My basic point in that paper will be a simple one: science is based on finite sets of finite-precision observations. That is, all of scientific knowledge is based on some finite set of bits, comprising the empirical observations accepted by the scientific community.

To extrapolate beyond this bit-set, some kind of assumption is needed. To put it another way, some kind of "faith" is needed. Hume was the first one to make this point really clearly ... and we now understand the "Humean problem of induction" well enough to know it's not the kind of thing that can be "solved."

The Occam's Razor principle tries to solve it -- it says that you extrapolate from the bit-set of known data by making the simplest possible hypothesis. This leads to some nice mathematics involving algorithmic information theory and so forth. But of course, one still has to have "faith" in some measure of simplicity!
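As a toy illustration of why some simplicity measure must be assumed: given a finite bit-set, many hypotheses reproduce it exactly, and Occam's Razor breaks the tie by description length. Here the hypotheses are tiny Python expressions, and "simplicity" is crudely measured by character count -- both choices are illustrative assumptions, stand-ins for the algorithmic-information-theoretic versions:

```python
# Two hypotheses that fit the observed bits equally well,
# but diverge beyond them.
observed = [0, 1, 0, 1, 0, 1]

hypotheses = [
    "[i % 2 for i in range(n)]",           # "alternating bits forever"
    "[0, 1, 0, 1, 0, 1] + [0] * (n - 6)",  # "memorize, then all zeros"
]

def fits(h, data):
    # Does hypothesis h reproduce the observed data exactly?
    return eval(h, {"n": len(data)}) == data

consistent = [h for h in hypotheses if fits(h, observed)]
simplest = min(consistent, key=len)   # the assumed simplicity measure

# The hypotheses agree on all 6 observed bits, but predict
# different futures; extrapolate with the simplest one:
prediction = eval(simplest, {"n": 8})
```

The two hypotheses agree on every bit already in hand, so no amount of existing data distinguishes them; only the assumed simplicity measure does -- and that assumption is exactly the "faith" at issue.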

So: doing or using science requires, in essence, continual acts of faith (though these may be unconscious and routinized rather than conscious and explicit). To the extent that Dawkins, Hitchens or other anti-religion commentators de-emphasize this point, they're engaging in judicious marketing. (It's hard for me to feel too negative toward them about this, however, given the far more explicitly and dramatically dishonest marketing that religion has carried out over past millennia.)

My paper will focus on what the limits of science tell you about AI, machine consciousness and so forth -- and I'll save that for another blog post, or the paper itself. (Don't worry though, my conclusion is not that scientifically engineering AGI is impossible ... I haven't lost the faith!)

Anyway, I certainly agree with Fish and Eagleton that religion addresses very important questions that science cannot, by its nature, answer.

But I find it rather screwy that Eagleton refers to

“Why is there anything in the first place?”, “Why what we do have is actually intelligible to us?” and “Where do our notions of explanation, regularity and intelligibility come from?”

and so forth as theological questions.

Surely, these are philosophical questions.

One can answer them in various ways without invoking any deities or demons!

"Why does God exist?" is a theological question ...

"Why does anything exist?" is philosophical...

(Though, for the record, I don't think "Why does anything exist?" is a very useful philosophical question. I'm more interested in questions like
  • "Why do separate objects exist, instead of just one big fluid cosmic mass?"
  • "In what sense could the universe be considered compassionate?"
  • "How much ethical responsibility should I feel toward (which) other minds?"
  • "Why does my mind perceive such a small subset of the space of all possible patterns?"
  • "How much can a mind grow and expand without losing its sense of self and becoming, experientially, a 'fundamentally different being'?"
  • "What is it like to be a rock?"
  • etc.)

Theology is one way of providing answers to philosophical questions ... but by no means the only way.

I think that religion addresses some very important questions that are beyond the scope of science -- and by and large provides these questions with extremely bad answers.

One of the many limitations of religion as conventionally conceived is indicated by the quote, given above, that religion's

“subject is nothing less than the nature and destiny of humanity itself....”

From a transhumanist perspective, the qualifier "nothing less than" is misplaced, as this is actually a very limiting subject. The nature and destiny of humanity are important; but one of the things that science has opened our minds to is the relative insignificance of humanity in the space of possible minds. I'm more interested in philosophies that address the nature and destiny of mind itself, rather than just the nature and destiny of one species on one planet.

It is of course a subtle matter to compare and judge different explanations to philosophical questions. You can't compare them using scientific or mathematical methods ... and of course the question of how to evaluate philosophical views becomes "yet another tough philosophical question", tied in with all the other ones.

A crude way to say it is that it comes down to an intuitive judgment ... which leads into questions of how one can refine and improve one's intuition ... and these questions, of course, possess numerous answers that depend on one's philosophical or religious tradition...

Science-synergetic philosophy

It does seem to me, though, that there is an interesting notion of science-synergetic philosophy lurking somewhere in all this.

Suppose we take for granted that doing science -- just like other aspects of living life -- relies on a constant stream of acts of faith, which can't be justified according to science....

One may then note that there are various systems for mentally organizing these acts of faith.

Religions are among them. But religions are quite detached from the process of doing science.

It seems sensible to think about philosophical systems -- i.e. systems for organizing inner acts of faith -- that are intrinsically synergetic with the scientific process. That is, systems for organizing acts of faith, that
  • when you follow them, help you to do science better
  • are made richer and deeper by the practice of science

One can broaden this a little and think about philosophical systems that are intrinsically synergetic with engineering and mathematics as well as science.

Now, one cannot prove scientifically that a "scientifically synergetic philosophy" is better than any other philosophy. Philosophies can't be validated or refuted scientifically.

So, the reason to choose a scientifically synergetic philosophy has to be some kind of inner intuition; some kind of taste for elegance, harmony and simplicity; or whatever.

One prediction I have for the next century is that scientifically synergetic philosophies will emerge into the popular consciousness and become richer and deeper and better articulated than they are now.

Because Fish and Eagleton are right about some things: people do need more than science ... they do need collective processes focused on the important philosophical questions that go beyond the scope of science.

But my prediction is that we are going to trend more toward philosophical systems that are synergetic with science, rather than ones that co-exist awkwardly with science.

What will these future philosophical systems be like?

There's nothing extremely new about the concept of science-synergetic philosophy, of course.

Plenty of non-religious scientists and science-friendly non-scientists have created personal philosophies that don't involve deities or other theological notions, yet do involve meaningful approaches to personally exploring the "big questions" that religions address.

Among the many philosophers to take on the task of creating comprehensive science-synergetic philosophical systems, perhaps my favorite is Charles Peirce (who also developed a nice philosophy of science, though one that IMO is significantly incomplete ... but I've discussed that elsewhere.)

Building on work by Peirce and loads of others, I tried to lay out a science-synergetic philosophical system in my book The Hidden Pattern -- but like Peirce's writings, that is a fairly academic work, not an informal tract designed to inspire the common human in their everyday life.

My friend Philippe van Nedervelde likes to talk about this sort of thing as a "TransReligion/ UNReligion", but I confess to not finding that terminology very compelling.

Philippe is interested in (among many other things!) developing vaguely religion-like rituals that coincide with some sort of science-synergetic philosophy. There has been talk about formulating a "TransReligion/ UNReligion" as an outgrowth of the futurist group now called "The Order of Cosmic Engineers." Which I think is an interesting idea ... yet I'm not really sure it's the direction things will (or should) go.

I'm not sure there will emerge any one "Bible of science-synergetic transhumanist philosophy" ... nor any science-synergetic-philosophy analogues of speaking in tongues, kneeling at the altar, or consuming the simulated blood and flesh of the Savior the Son of God who gave his life for our sins. Perhaps, science-synergetic philosophy may wind up being something that pervades human culture in more of a broad-based, implicit way.

Time will tell!