Saturday, December 27, 2008

The Subtle Structure of the Everyday Physical World = The Weakness of Abstract Definitions of Intelligence

In my 1993 book "The Structure of Intelligence" (SOI), I presented a formal definition of intelligence as "the ability to achieve complex goals in complex environments." I then argued (among other things) that pattern recognition is the key to achieving intelligence, via the rough algorithm:
  • Recognize patterns regarding which actions will achieve which goals in which situations
  • Choose actions that, according to these patterns, are expected to achieve one's goals in the current situation
The subtle question in this kind of definition is: How do you average over the space of goals and environments? If you average over all possible goals and environments, weighting each one by their complexity perhaps (so that success with simple goals/environments is rated higher), then you have a definition of "how generally intelligent a system is," where general intelligence is defined in an extremely mathematically inclusive way.
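
To make this concrete, here is a minimal toy sketch in Python (my own illustration, with made-up names and numbers, not Legg and Hutter's actual formalism): an agent's "general intelligence" is scored by summing its performance over environments, with each environment weighted by 2 to the power of minus its program length, so that success on simply-describable environments counts for more.

    # Toy sketch: weight each environment by 2**(-program_length), so simpler
    # environments contribute more to the overall intelligence score.
    def general_intelligence(performance, environments):
        # performance: function mapping an environment id to a score in [0, 1]
        # environments: list of (env_id, program_length) pairs, assumed given
        return sum(2 ** (-length) * performance(env)
                   for env, length in environments)

    # Example with entirely made-up numbers:
    envs = [("grid_world", 10), ("chess", 25), ("everyday_world", 60)]
    print(general_intelligence(lambda e: 0.5, envs))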

The line of thinking I undertook in SOI was basically a reformulation, in terms of "pattern theory," of ideas regarding algorithmic information and intelligence that originated with Ray Solomonoff; and Solomonoff's ideas have more recently been developed by Shane Legg and Marcus Hutter into a highly rigorous mathematical definition of intelligence.

I find this kind of theory fascinating, and I'm pleased that Legg and Hutter have done a more thorough job than I did of making a fully formalized theory of this nature.

However, I've also come to the conclusion that this sort of approach, without dramatic additions and emendations, just can't be very useful for understanding practical human or artificial intelligence.

What is Everyday-World General Intelligence About?

Let's define the "everyday world" as the portion of the physical world that humans can directly perceive and interact with -- this is meant to exclude things like quantum tunneling and plasma dynamics in the centers of stars, etc. (though I'll also discuss how to extend my arguments to these things).

I don't think everyday-world general intelligence is mainly about being able to recognize totally general patterns in totally general datasets (for instance, patterns among totally general goals and environments). I suspect that the best approach to this sort of totally general pattern recognition problem is ultimately going to be some variant of "exhaustive search through the space of all possible patterns" ... meaning that approaching this sort of "truly general intelligence" is not really going to be a useful way to design an everyday-world AGI or a significant component of one. (Hutter's AIXItl and Schmidhuber's Godel Machine are examples of exhaustive-search-based AGI designs.)

Put differently, I suspect that all the AGI systems and subcomponents one can really build are SO BAD at solving this general problem, that it's better to characterize AGI systems
  • NOT in terms of how well they do at this general problem
but rather
  • in terms of what classes of goals/environments they are REALLY GOOD at recognizing patterns in
I think the environments existing in the everyday physical and social world that humans inhabit are drawn from a pretty specific probability distribution (compared to, say, the "universal prior," a standard probability distribution that assigns higher probability to entities describable by shorter programs), and that for this reason, studying compression or pattern recognition across general goal/environment spaces, without everyday-world-oriented biases, is not going to lead to everyday-world AGI.

The important parts of everyday-world AGI design are the ones that (directly or indirectly) reflect the specific distribution of problems that the everyday world presents to an AGI system.

And this distribution is really hard to encapsulate in a set of mathematical test functions. Because, we don't know what this distribution is.

And this is why I feel we should be working on AGI systems that interact with the real everyday physical and social world, or the most accurate simulations of it we can build.

One could formulate this "everyday world" distribution, in principle, by taking the universal prior and conditioning it on a huge amount of real-world data. However, I suspect that simple, artificial exercises like conditioning distributions on text or photo databases don't come close to capturing the richness of statistical structure in the everyday world.

So, my contention is that
  • the everyday world possesses a lot of special structure
  • the human mind is structured to preferentially recognize patterns related to this special structure
  • AGIs, to be successful in the everyday world, should be specially structured in this sort of way too
To incorporate this everyday-world bias (or other similar biases) into the abstract mathematical theory of intelligence, we might say that intelligence relative to goal/environment class C is "the ability to achieve complex goals (in C) in complex environments (in C)".

And we could formalize this by weighting each goal or environment by a product of
  • its simplicity (e.g. measured by program length)
  • its membership in C, considering C as a fuzzy set
One can also create a formalization of this idea using Legg and Hutter's approach to defining intelligence.
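
As a toy sketch of how that product weighting might look (again my own illustration, extending the earlier sketch, with hypothetical names), treating C as a fuzzy set means treating it as a function that assigns each goal/environment a membership degree in [0, 1]:

    # Toy sketch: intelligence relative to a goal/environment class C, weighting
    # each environment by simplicity * fuzzy membership in C.
    def intelligence_wrt_class(performance, environments, membership_in_C):
        return sum(2 ** (-length) * membership_in_C(env) * performance(env)
                   for env, length in environments)

    # Example: an "everyday world" class that mostly excludes abstract games
    membership = {"grid_world": 0.3, "chess": 0.1, "everyday_world": 1.0}.get
    envs = [("grid_world", 10), ("chess", 25), ("everyday_world", 60)]
    print(intelligence_wrt_class(lambda e: 0.5, envs, membership))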

One can then characterize a system's intelligence in terms of which goal/environment sets C it is reasonably intelligent for.

OK, this does tell you something.

And, it comes vaguely close to Pei Wang's definition of intelligence as "adaptation to the environment."

But, the point that really strikes me lately is how much of human intelligence has to do, not with this general definition of intelligence, but with the subtle abstract particulars of the C that real human intelligences deal with (which equals the everyday world).

Examples of the Properties of the Everyday World That Help Structure Intelligence

The propensity to search for hierarchical patterns is one huge example of this. The fact that searching for hierarchical patterns works so well, in so many everyday-world contexts, is most likely because of the particular structure of the everyday world -- it's not something that would be true across all possible environments (even if one weights the space of possible environments using program-length according to some standard computational model).

Taking it a step further, in my 1993 book The Evolving Mind I identified a structure called the "dual network", which consists of superposed hierarchical and heterarchical networks: basically a hierarchy in which the distance between two nodes in the hierarchy is correlated with the distance between the nodes in some metric space.

Another high level property of the everyday world may be that dual network structures are prevalent. This would imply that minds biased to represent the world in terms of dual network structure are likely to be intelligent with respect to the everyday world.
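
For concreteness, here is a small toy illustration of the dual network property (my own construction, not anything from The Evolving Mind): scatter points on a line, organize them into a balanced binary hierarchy by sorted index, and check that distance in the hierarchy is positively correlated with distance in the underlying metric space.

    import random

    def hierarchy_distance(i, j, depth):
        # Leaves i and j sit in a complete binary tree of the given depth; their
        # tree distance is 2 * (depth - length of their common address prefix).
        prefix = 0
        for bit in range(depth - 1, -1, -1):
            if (i >> bit) & 1 == (j >> bit) & 1:
                prefix += 1
            else:
                break
        return 2 * (depth - prefix)

    def correlation(xs, ys):
        # Plain Pearson correlation coefficient.
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy)

    depth = 6
    points = sorted(random.uniform(0, 1) for _ in range(2 ** depth))
    pairs = [(i, j) for i in range(len(points)) for j in range(i + 1, len(points))]
    tree_d = [hierarchy_distance(i, j, depth) for i, j in pairs]
    metric_d = [abs(points[i] - points[j]) for i, j in pairs]
    # The correlation comes out clearly positive: hierarchical nearness tracks
    # metric nearness, which is the "dual network" property in miniature.
    print(correlation(tree_d, metric_d))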

The extreme commonality of symmetry groups in the (everyday and otherwise) physical world is another example: they occur so often that minds oriented toward recognizing patterns involving symmetry groups are likely to be intelligent with respect to the real world.

I suggest that the number of properties of the everyday world of this nature is huge ... and that the essence of everyday-world intelligence lies in the list of these abstract properties, which must be embedded implicitly or explicitly in the structure of a natural or artificial intelligence for that system to have everyday-world intelligence.

Apart from these particular yet abstract properties of the everyday world, intelligence is just about "finding patterns in which actions tend to achieve which goals in which situations" ... but, this simple meta-algorithm is well less than 1% of what it takes to make a mind.

You might say that a sufficiently generally intelligent system should be able to infer these general properties from looking at data about the everyday world. Sure. But I suggest that doing so would require massively more processing power than is needed by an AGI that embodies, and hence automatically utilizes, these principles. It may be that the problem of inferring these properties is so hard as to require a wildly infeasible AIXItl / Godel Machine type system.

Important Open Questions

A few important questions raised by the above:
  1. What is a reasonably complete inventory of the highly-intelligence-relevant subtle patterns/biases in the everyday world?
  2. How different are the intelligence-relevant subtle patterns in the everyday world, versus the broader physical world (the quantum microworld, for example)?
  3. How accurate a simulation of the everyday world do we need to have, to embody most of the subtle patterns that lie at the core of everyday-world intelligence?
  4. Can we create practical progressions of simulations of the everyday world, such that the first (and cruder) simulations are very useful to early attempts at teaching proto-AGIs, and the development of progressively more sophisticated simulations roughly tracks the development of progress in AGI design and development?
The second question relates to an issue I raised in a section of The Hidden Pattern, regarding the possibility of quantum minds -- minds whose internal structures and biases are adapted to the quantum microworld rather than to the everyday human physical world. My suspicion is that such minds will be quite different in nature, to the point that they will have fundamentally different mind-architectures -- but there will also likely be some important and fascinating points of overlap.

The third and fourth questions are ones I plan to explore in an upcoming paper, an expansion of the AGI-09 conference paper I wrote on AGI Preschool. An AGI Preschool as I define it there is a virtual world defining a preschool environment, with a variety of activities for young AIs to partake in. The main open question in AGI Preschool design at present is: How much detail does the virtual world need to have, to support early childhood learning in a sufficiently robust way? In other words, how much detail is needed so that the AGI Preschool will possess the subtle structures and biases corresponding to everyday-world AGI? My AGI-09 conference paper didn't really dig into this question due to length limitations, but I plan to address it in a follow-up, expanded version.

Wednesday, November 26, 2008

The Increasing Value of Peculiar Intelligence

One more sociopolitical/futurist observation ...

If you're not aware of David Brin's Transparent Society concept, you should read the book, and start with the Web page

http://www.davidbrin.com/tschp1.html

His basic idea is that, as surveillance technology improves, there are two possibilities:

  • The government and allied powers watch everybody, asymmetrically
  • Everybody watches everyone else, symmetrically (including the government and allied powers being watched)

He calls the latter possibility sousveillance ... all-watching ...

What occurs to me is that in a transparent society, there is massive economic value attached to peculiar intelligence

This is because, if everyone can see everything else, the best way to gain advantage is to have something that nobody can understand even if they see it

And it's quite possible that, even if they know that's your explicit strategy, others can't really do anything to thwart it...

Yes, a transparent society could decide to outlaw inscrutability. But this would have terrible consequences, because nearly all radical advances are initially inscrutable....

Inscrutability is dangerous. But it's also, almost by definition, the only path to radical growth.

I argued in a recent blog post that part of the cause of the recent financial crisis is the development of financial instruments so complex that they are inscrutable to nearly everyone -- so that even if banks play by the rules and operate transparently, they can still trick shareholders (and journalists) because these people can't understand what they see!

But it seems that this recent issue with banks is just a preliminary glimmering of what's to come....

Yes, maybe there is something a bit self-centered or self-serving in this theory, since I seem to have more of a differential advantage in "peculiar intelligence" than in most other qualities ... but, call it what you will, my peculiar intelligence is definitely pushing me to the conclusion that peculiar intelligence is going to be a more and more precious commodity as the Age of Transparency unfolds...

Obviously, you can also see this phenomenon in financial trading even in a non-crisis time. It's not enough to be smart at predicting the markets to make a LOT of money ... because if you're just smart, but in a non-peculiar way, then once you start trading a lot of money, others will observe your trades and pick up the pattern, and start being smart in the same way as you ... and then you'll lose your edge. The way to REALLY make a lot of money in the markets is to be very smart, and in a very peculiar way, so that even if others watch your trades carefully, they can't understand the pattern, and they can't imitate your methodology....

The best trader I know personally is Jaffray Woodriff who runs quantitative.com, and he exemplifies this principle wonderfully: very intelligent, very peculiar (though in an extremely personable, enjoyable way), and very "peculiarly intelligent" ;-)

Democratic market socialism redux

All in all, the conclusion I'm coming to lately ... as reflected in my last two blog posts, as well as in some other thinking ... is that government is going to need to do some rather careful and specific things to guide society toward a positive Singularity.

Yes, if someone creates a hard-takeoff with a superhuman AGI, then the government and other human institutions may be largely irrelevant.

But if there is a soft takeoff lasting say 1-3 decades ... before the hard takeoff comes along ... then, my view is increasingly that the market system is going to screw things up, and lead to a situation where there are a lot of unhappy and disaffected people ... which increases the odds of some kind of nasty terrorist act intervening and preventing a positive Singularity from occurring.

It seems we may need to review the general line of thinking (though not many of the specific proposals) from old democratic-socialism-style texts like Economics and the Public Purpose ...

Perhaps one positive consequence of the current economic crisis is that it may cause the US public to understand the value of well-directed government spending....

And yes, I'm well aware that most of my colleagues in the futurist community tend to be libertarian politically. I think they're just wrong. I am all in favor of getting rid of victimless crimes ... legalizing drugs and prostitution and so forth ... but, given the realities of the next century and the risks of a negative Singularity, I don't think we can afford to leave things up to the unpredictable self-organizing dynamics of the market economy ... I think we as a society will need to reflect on what we want the path to Singularity to be, and take specific concerted governmental actions to push ourselves along that path...

This is not the political dialogue for 2008 ... different issues are occupying people's minds right now ... but it may be the political dialogue for 2012 or 2015 ... or at latest, I'd guess, 2020 ...

Why the average workweek isn't decreasing faster ... and what we can do about it

This is another post on political, economic and futurist themes ... starting out with a reflection on a bogus patent, and winding up with a radical social policy proposal that just might improve life in the near future and also help pave the way to a positive Singularity... (yeah, I know I know, lack of ambition has never been one of my numerous faults ;-)

Let's start with the simple, oft-asked question: Why isn't the average workweek decreasing faster ... given all the amazing technology recently developed?

One clue is found in some news I read today: IBM has patented the idea of a specialized electronic device that makes it handier to split your restaurant bill among several people at the table:

http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=7,457,767.PN.&OS=PN/7,457,767&RS=PN/7,457,767

The patent application is really quite hilarious reading ... and the number of references to prior, equally bullshitty patents for related inventions is funny too.

What strikes me most here is the amount of effort expended by lawyers, patent examiners and judges in dealing with this crap. Not to mention their paralegals, secretaries, and on and on and on.

Why does this part of the contemporary human workload exist?

One could argue that some wasted work is a necessary part of a complex economic system, and that it would be hard to eliminate the crap without throwing out a lot of useful stuff as well.

I'm sure this is part of the truth, but I don't think it's the whole story.

Another part of the answer, I think, is: This kind of BS work exists because people have time to do it.

If people didn't have time to do this shit, because they all had to be occupied gathering food or making shelter or defending themselves against attackers -- or conceiving or manufacturing truly original and interesting inventions -- then the legal system would rapidly get adjusted so as to make bullshit patents like this illegal.

But, because we have plenty of extra resources in our economy ... due to the bounty created by advances in technology ... we (collectively and without explicitly intending to) adjust various parameters of our economy (such as the patent system) in such a way as to create loads of bullshit work for ourselves.

But there's a little more to it than this. Granted that we have the time ... but why do we choose to spend our time on this kind of crap? Instead of, say, doing less useless work and having more free time?

One important, relevant fact to grasp is that people like working.

Yes, it's not just that we're ambitious or greedy, it's that we actually like working -- in spite of the fact that we also like complaining about working. (Don't blame me -- humans are just fucked-up creatures....).

There is loads of data supporting this idea; I'll cite just a smattering here.

Americans work more and are happier than Europeans:

http://sayanythingblog.com/entry/americans_work_harder_happier_than_europeans/

Old folks who work longer are happier:

http://ideas.repec.org/p/pra/mprapa/5606.html

I can't remember the link, but studies show that people are on average happier at work than at home (if asked at random times during the day), but will when asked SAY they are happier at home than at work.... I think this study was done by Mihaly Csikszentmihalyi, who you can find out about at

http://www.time.com/time/health/article/0,8599,1606395,00.html

http://www.amazon.co.uk/Flow-Classic-Work-Achieve-Happiness/dp/0712657592

He is the pioneer of the study of "flow", the state of supreme happiness that many humans enter into when carrying out some highly absorbing, satisfying activity, like creating an artwork, participating in a sport, acting in a play or doing scientific or other professional work ... or gardening ... or chopping down a tree ... actually the activity can be almost anything, the real point is in the way the mind interacts with the activity: it has to be enough to fully occupy the mind without causing too much frustration, so that the self is continually lost and rediscovered within the ongoing dynamism of the interactive process....

Among many more interesting observations, he notes that:

"
Those at work generally report that they wish they were at home, but when they're home they often feel passive, depressed or bored. "They have in mind that free time at home will make them feel better, but often it doesn't,"
"

But, the plot thickens ... because, although we like working ... this doesn't necessarily mean we like working long hours.

People who work part-time are happier than those working full-time (84 per cent are happy versus 79 per cent):

http://www.arboraglobal.com/documents/Happiness%20at%20Work%20Index%202007.pdf

So where we seem to fail is in creating more part-time jobs...

This seems to be because, for an employer, it's always more cost efficient to hire one full time worker than two part-timers ;-(

So, instead of making use of the bounty provided by technology by creating part-time jobs and also creating more opportunities for creative, growth-producing, flow-inducing leisure ... we make use of it by creating more and more full-time jobs doing more and more bullshit work, like processing patents for "inventions" like the one I referenced above.

Given the way our brains work, we're better off working full-time than we would be not working at all.

But there is at least some evidence to suggest we'd be even better off working fewer hours....

But, given the way the market system works, there is almost never any incentive for any employer to allow part-time workers. It just isn't going to be rational from their perspective, as it will decrease their economic efficiency relative to competitors.

The market system, it seems, is going to push toward the endless creation of BS work, because no individual company cares about reducing the workweek ... whereas individual companies DO in many cases profit by creating BS work ... if they can bill other companies (or the government) for this bullshit work....

So, this leads to the idea that the government of France may have had the right idea at heart, in creating the 35 hour maximum workweek. Which they have since rescinded, because in a global context, it made them uncompetitive with other countries lacking such a restriction.

But anyway, outlawing long work hours is obviously not the answer. People should have the freedom to work long hours and get paid for them, if they want to.

So, an interesting question, policy-wise, is how the government can create incentives for reduced workweeks, without introducing punitive and efficiency-killing regulations.

One possibility would be, instead of just government projects aimed at paying people to build infrastructure, to launch government projects aimed at paying people to do interesting, creative, growth-inducing stuff in their spare time.

Basically, what I'm suggesting is: massively boosted government funding for art and science and other not-directly-economically-productive creative work ... but without making people jump through hoops of fire to get it (i.e. no long, tedious, nearly-sure-to-fail grant applications) ... and specifically designed NOT to discriminate against people who do not-directly-economically-productive creative work only part-time.

This would make it harder for companies to find full-time employees, because it wouldn't be all that much more lucrative to work full-time than to work part-time plus earn a creative-activity stipend on the side. But, unlike France's previous restrictive laws, it would enable companies that REALLY need full-time employees to hire them, so long as they were able to pay the required premium salaries ... or recruit workers who preferred the offered work to paid on-the-side creative-activity....

I suspect this would actually boost the economy, by shifting effort away from BS make-work like processing bogus patents, and toward creative work that would end up having more indirect, ancillary economic value in all sorts of ways.

This may seem a funny thing to think about in the current economic context, when the economy is in such trouble globally. But I consider it obvious that the current economic troubles are "just" a glitch (albeit an extremely annoying one) on the path to a post-scarcity economy. And even so, the government is still giving away money right now to people who are out of work. What if the payment were decreased for people who didn't engage in creative activities, and increased for people who did? People's lives would become richer, as more of them would be more involved in creative activities. And, the human world as a whole would become richer because of all of these new creations.

And this sort of thing will become all the more interesting once robots and AI software eliminate more and more jobs. At that point the market system, unrestricted, would probably create an insane overgrowth of bullshit full-time jobs ... but a part-time creative-activity incentive as I've described could interfere with this dynamic, and nudge things toward a more agreeable situation where people divide their time between genuinely economically useful work, and non-directly-economically-useful but personally-and-collectively-rewarding creative activity.

Furthermore, this would create a culture of creative activity, that would serve us well once robots and AIs eliminate jobs altogether. It would be great if we could greet the Singularity with a culture of collective creativity rather than one characterized by a combination of long hours on useless bullshit jobs, combined with leisure time spent staring at the tube (whether the TV or the YouTube) or playing repetitive shoot-em-up video games ... etc. etc.

OK -- now ... time for me to get back to work ;-)

Tuesday, November 25, 2008

The Inevitable Increase of Irrationality (more musings on the recent financial crisis)

In a recent email dialogue with my dad, I found myself spouting off again about the possible causes of the recent financial crisis ... and found myself coming to a conclusion that may have some more general implications as we move forward toward Singularity.

Basically, I came to the conclusion that the financial crisis can be interpreted, in part, as a failure of the "rational actor" model in economics.

And furthermore, it's a failure of the rational actor model for an interesting reason: technology is advancing sufficiently that the world is too damn complex for most people to be rational about it, even if they want to be.

And this situation, I suggest, is likely to get worse and worse as technology advances.

Greenspan said something related in an interview shortly after the credit crunch hit: he said he was shocked that major banks would deviate so far from rational self-interest.

To the extent this is the case, the recent crisis could be viewed as a failure of the rational-actor model -- and a validation of the need to view socioeconomic systems as complex self-organizing systems, involving psychological, sociological and economic dynamics all tangled together ... a need that will only increase as technology advances and makes it harder and harder for even smart rational-minded people to approximate rational economic judgments.

As a semi-aside: In terms of traditional economists, I'd say Galbraith's perspective probably accounts for this crisis best. He was always skeptical of over-mathematical approaches, and stressed the need to look at economic mechanisms in their social context. And of course, Obama's proposed solution to the problem is strongly Galbraithian in nature (except that Galbraith, not being an elected official, had the guts to call it "socialism" ;-)

Now, there is (of course) a counterargument to the claim that the recent financial crisis indicates a failure of the rational actor model. But one can also make a counter-counterargument, which I find more compelling.

The counterargument is: the banks as institutions were perhaps being irrational, but the individual decision-makers within the banks were being rational in that their incentive structures were asymmetric ... they got big bonuses for winning big, and small penalties for losing big. As a human being who is a banker, taking a huge gamble may be rational, if you get a huge bonus upon winning ... and upon losing, just need to go find another job (a relatively small penalty). So in that sense the individuals working in the banks may have been acting rationally ...
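
To make the asymmetry concrete, here is a toy expected-value calculation with entirely made-up numbers (my illustration, not data about any actual bank): the same gamble can have negative expected value for the bank and positive expected value for the individual banker.

    # Hypothetical payoffs: the bank bears the full downside, while the banker
    # gets a large bonus on the upside and only a mild career hit on the downside.
    p_win = 0.5
    bank_gain_if_win, bank_loss_if_lose = 100.0, -150.0    # in millions, say
    banker_bonus_if_win, banker_cost_if_lose = 5.0, -0.5

    bank_ev = p_win * bank_gain_if_win + (1 - p_win) * bank_loss_if_lose
    banker_ev = p_win * banker_bonus_if_win + (1 - p_win) * banker_cost_if_lose
    print("bank expected value:", bank_ev)      # -25.0: a bad bet for the bank
    print("banker expected value:", banker_ev)  #  2.25: a good bet for the banker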

Yet the corporations were not acting rationally: which means that the bank shareholders were the ones not acting rationally, by not constraining the bank managers to put more appropriate incentive structures in place for their employees...

But why were the shareholders acting irrationally?

Well, I suggest that the reason the bank shareholders did not act rationally was, largely, out of ignorance.

Because the shareholders were just too ignorant of the actual risks involved in these complex financial instruments, not to mention of the incentive structures in place within the banks, etc.

We have plenty of legal transparency requirements in place, so that shareholders can see what's going on inside the corporations they invest in. But this is of limited value if the shareholders can't understand what's going on.

So, getting back to rational actor theory: the novel problem that we have here (added on top of the usual human problems of dishonesty, corruption and so forth) may be that, in a world that is increasingly complex (with financial instruments defined by advanced mathematics, for example), being a rational economic actor is too difficult for almost anybody.

The rational-actor assumption fails for a lot of reasons, as many economists have documented in the last few decades ... but this analysis of the current financial crisis suggests that as technology advances, it is going to fail worse and worse.

You could argue that this effect would be counterbalanced by the emergence of an increasingly effective professional "explainer" class who will boil down complex phenomena so that individuals (like bank shareholders) can make judgments effectively.

However, there are multiple problems with this.

For one thing, with modern media, there is so much noise out there that even if the correct explanations are broadcast to the world on the Web, the average person has essentially no way to select them from among the incorrect explanations. OK, they can assume the explanations given in the New York Times are correct ... but of course there is not really any objective and independent press out there, and "believing what the authorities tell you" is a strategy with many known risks.

And, secondly, it's not clear that the journalists of the world can really keep up with developments well enough to give good, solid, objective advice and explanations, even if that is their goal!

So we can expect that as Singularity approaches, the rational-actor model will deviate worse and worse from reality, making economics increasingly difficult to predict according to traditional methods.

Some folks thought that increasing technology would somehow decrease "market friction" and make markets more "efficient" ... in the technical sense of "efficiency," meaning that everyone is paying the mathematically optimal price for the things they buy.

But in fact, increasing technology seems to be increasing "market incomprehensibility" and hence, in at least some important cases, making markets LESS efficient ...

But of course, making markets less "efficient" in the technical sense doesn't necessarily make the economy less efficient in a more general sense.

The economy is, in some senses, becoming fantastically more and more efficient (at producing more and more interesting configurations of matter and mind given less and less usage of human and material resources) ... but it's doing so via complex, self-organizing dynamics ... not via libertarian-style, rational-actor-based free-market dynamics.

Interesting times ahead....

Glocal memory, neural nets, AI, psi

The dichotomy between localized and globalized (distributed) memory in the brain and world is pretty much bullcrap.

For a long time I've had the idea of harmonizing the two, and the Novamente and OpenCog AGI designs incorporate this harmony, as do the speculative brain theories I proposed in my books in the 1990s.

But I never really articulated the concept of global/local ... i.e. glocal ... memory in a general way before, which I've now done in a semi-technical essay called "Glocal Memory"

http://www.goertzel.org/dynapsyc/2008/glocal_memory.htm

I decided to write up the glocal memory stuff now because I'm writing a technical paper on glocal Hopfield neural nets and glocal attention allocation in OpenCog, and I wanted to have something to reference in the paper on glocal-memory-in-general, and there wasn't anything....

In case that's not odd enough for you, I also posted another essay using the glocal memory concept to give a possible model for how psi might work, building on others' ideas regarding filter and transmission models of mind:

http://www.goertzel.org/dynapsyc/2008/glocal_psi.htm

Whoop-di-doodly-dandy-oh

Wednesday, November 19, 2008

A meta-theory of consciousness

Let

C = consciousness

P = physical reality

Then, the various theories of consciousness may be placed into 11 categories:
  1. C and P are the same
  2. C is a part of P
  3. P is a part of C
  4. C and P are parts of something else
  5. C and P share a common part but are nonidentical
  6. C and P are parts of each other (hyperset style)
  7. C and P are separate -- somehow correlated but not via parthood
  8. C does not exist, it's an illusion
  9. P does not exist, it's an illusion
  10. C and P are parts of each other, and also parts of something else
  11. C and P are parts of each other, and also parts of something else, which is also a part of them
The word "part" should not be overinterpreted in the above, it's just used in a generic and informal sense.

I haven't yet broken this down mathematically, but on quick observation these seem the only logically possible relationships involving C, P and parthood (and allowing for circular parthood).

Each of these theory-categories has some adherents, and if I had more time and tolerance-for-boredom, I could divide existing theories of consciousness into these categories.

The question I want to ask here is: What could the relationship between C and P be, so as to explain why different individuals would propose each of the above 11 theory-categories?

My observation is that there is ONE category of theories that would explain all the 11 theory-categories as different views of a common reality, involving in some cases errors of omission but in no case errors of commission.

This is category 11: that C and P contain each other, and are also contained in something else that is also a part of them

If we posit an underlying model of the general form

C = {C0, P, D}

P = {P0, C, D}

D = {D0, C, P}


then we can see how the other 10 theory-categories would emerge via sins of omission:
  1. "C and P are the same" results from ignoring C0 and P0
  2. "C is a part of P" results from ignoring C0
  3. "P is a part of C" results from ignoring P0
  4. "C and P are parts of something else" ignores that D is a part of C and P, and that C and P are parts of each other
  5. "C and P share a common part but are nonidentical" is correct so long as one allows this common part to be a hyperset (which is C nested within P nested within C ... etc.)
  6. "C and P are parts of each other (hyperset style)" results from ignoring D
  7. "C and P are separate -- somehow correlated but not via parthood" results from projecting C into {C0}, P into {P0}
  8. "C does not exist, it's an illusion" results from ignoring C
  9. "P does not exist, it's an illusion" results from ignoring P
  10. "C and P are parts of each other, and also parts of something else" results from ignoring that C and P contain D
It is thus tempting to conjecture that 11 is the underlying reality, and people who posit the other 10 theories are simply ignoring some of the aspects of reality -- i.e. to use the old metaphor, they are like blind people each feeling a different part of the elephant and concluding that's all there is.
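
To make the mutual-containment picture of category 11 a bit more tangible, here is a small sketch (my own illustration) that uses ordinary object references to stand in for the hyperset-style parthood relation; C0, P0 and D0 are the residual parts posited in the model above.

    class Entity:
        def __init__(self, name):
            self.name = name
            self.parts = []

    C, P, D = Entity("C"), Entity("P"), Entity("D")
    C0, P0, D0 = Entity("C0"), Entity("P0"), Entity("D0")

    C.parts = [C0, P, D]   # C = {C0, P, D}
    P.parts = [P0, C, D]   # P = {P0, C, D}
    D.parts = [D0, C, P]   # D = {D0, C, P}

    def is_part_of(x, y, seen=None):
        # Transitive parthood check, guarding against the circular containment.
        seen = seen if seen is not None else set()
        if y in seen:
            return False
        seen.add(y)
        return x in y.parts or any(is_part_of(x, z, seen) for z in y.parts)

    # C and P are parts of each other, and both contain and are contained in D.
    print(is_part_of(C, P), is_part_of(P, C), is_part_of(D, C), is_part_of(C, D))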

Question for reflection: what is D? ;-)

Friday, November 14, 2008

Ethics as an Attractor

While visiting my father a couple weeks ago, he talked a bit about what he sees as one of the core ideas of religion: the notion of some kind of "universal morality" or "universal moral force-or-field-or-whatever."

While he's not a big fan of the more superstitious aspects of historical or contemporary religions, he does believe there is a universal sense of right versus wrong. For instance, he feels that killing a random person on the street just because one is in a bad mood, is somehow REALLY wrong, not just "wrong according to some person or group's subjective belief system."

My main initial reaction to this idea was not understanding what it might mean. I'm enough of a postmodernist that absolutes don't make much sense to me.

Even "2+2=4" is only true relative to some specific formal system defining the terms involved. And " '2+2=4' is true relative to formal system F " is also only true relative to some metamathematical system -- and so on ... even in mathematics, you never escape the chain of indirections and get to something absolute.

But in the few days after we had that conversation, I thought a bit about what meaning could be assigned to his notion of universal morality within my postmodernist world-view. This is a topic I addressed in the final chapter of The Path to Posthumanity, but not from exactly this perspective: there I was more concerned with enunciating moral principles sufficiently abstract to guide the development of posthumans, rather than with debating the absoluteness or relativity of such principles.

An interesting perspective with pertinence to this issue is Mark Waser's argument that ethical, cooperative behavior (in some form) may be an "attractor" of social systems. Meaning roughly that:

  • social systems lacking such behavior are unlikely to survive, except insofar as they eventually adopt some form of ethical, cooperative behavior pattern as a norm
  • as a new social system originates and gradually grows, it is likely to evolve its own form of ethical, cooperative behavior

I think this is an interesting perspective, and in the next paragraphs I'll point out some of its limits, and then connect it back to the conversation with my father.

To make the argument more concrete, I'll begin by defining "ethics" in my own special way -- probably not precisely the same as how Mark intends it; but then, in his paper Mark doesn't give a very precisely drawn definition. First of all I'll define a "code of social behavior" as a set of rules, habits or principles that some agent in a society adopts to guide its behavior toward other agents in that society. I'll then define an ethics as a code of social behavior such that

  • it occurs in an agent that has the internal flexibility to plausibly adopt the ethics, or not, without causing some sort of immediate disaster for the agent
  • it occurs in an agent that is capable of counterfactual reasoning regarding the social situations it regularly encounters (psychologists have various ways to test this, that have been used to study animals for example)
  • it involves the agent carrying out behaviors that it reasons (counterfactually) it would not carry out if it did not adhere to the ethics
  • it involves the agent carrying out behaviors that the agent believes will benefit other agents in the society

In short, I define an ethics as a code that agents can optionally adopt, and if they adopt it, they know they're taking actions to benefit others, that they reason they wouldn't take in the absence of the code of ethics.

This reminds me of a story one of my great-uncles used to tell about my 5-year-old incarnation. He was watching me play with toys, and then my little sister Rebecca came up and asked for one of the toys I was most enjoying. After a moment of reflection, I gave it to her, and then I commented to him that "Sometimes it feels good to do something for someone else that you don't want to do."

Pertaining to Mark Waser's argument, we can then ask whether ethics in this sense is likely to be an attractor. There are two questions here, of course: will the existence of an ethics, any ethics, be an attractor; and will some specific ethics be an attractor. I'll deal mainly with the first question, because if "the existence of ethics" isn't an attractor, then obviously no specific ethics is going to be.

The main limit I see in Waser's argument (as ported to my definition of ethics) is that the argument doesn't seem to apply in cases where one or more of the members of the social system are vastly more intrinsically capable than the others (in relevant ways).

In societies of humans, it could be argued that unethical behavior is ultimately unstable, because the oppressed underclass (with asymmetrically little social power, but roughly equal intrinsic capabilities) will eventually revolt. But the possibility of revolt exists because outside of the scope of the social system, all healthy adult humans have roughly the same level of intrinsic capability.

One can imagine a scenario roughly like the one in H.G. Wells' "The Time Machine," where a subset of society that actually is strongly more capable in relevant senses (smarter, saner, stronger) takes control and oppresses the less capable majority. Perhaps any adequately capable individual among the underclass is either killed or taken into the overclass ... and any inadequately capable person among the overclass is either killed or tossed into the underclass. In this kind of scenario, after a certain number of generations, one would have a situation in which there would be pressure for ethics within each class, but not ethics between them.

Nothing like the above has ever happened in human history, of course, and nor is it likely to. However, the case of future AI minds is somewhat different. All humans are built according to the same architecture and have roughly the same amount of intrinsic computational resources, but the same won't necessarily be the case for all AIs.

I see no reason to believe that existence-of-ethics will be attracting in societies involving members with strongly asymmetric capabilities. In fact, it seems it might be easier to frame an alternate argument: that in a society consisting of two groups of radically different degrees of intrinsic capability, the attractor will be

  • ethical behavior within the overclass
  • ethical behavior within the underclass
  • exploitative behavior from the overclass to the underclass

A related situation is human behavior toward "lower animals" -- but this is a different sort of matter because animals don't meet the criteria of ethical agents I laid out above. Adult treatment toward children also doesn't quite fit the mold of this situation, because the intrinsic difference in capability between parents and children reverses as the children grow older (leading to sayings like, "Don't spank your kid too hard; when he grows up he'll be the one choosing your nursing home!").

One thus arrives at the hypothesis that a restricted form of Waser's argument might hold: maybe existence-of-ethics is an attractor in societies composed of agents with roughly equal intrinsic capabilities (in relevant situations).

As to what specific ethical codes may be attractors, it seems to me that is going to depend upon the specifics of the agents and societies. But the general phenomenon of choosing actions for others' benefit, that one knows one would not take in the absence of the ethical code, seems something that could plausibly be argued to serve as an attractor in any society of sufficiently flexible, intelligent organisms with roughly equal intrinsic relevant-capabilities.

Note what the notion of an attractor means here: essentially it means that if you have a society with the right characteristics that *almost* has ethics, then the society will eventually evolve ethics.

Ethics being an attractor doesn't imply that it must be the only attractor; there could be other attractors with very different properties. Arguing that any society with the right characteristics will necessarily evolve into a state supporting ethics, would be a stronger argument.

Another twist on this is obtained by thinking about the difference between the conscious and unconscious minds.

Let's say we're talking about a society consisting of agents with reflective, deliberative capability -- but with a lot of mental habits that aren't easily susceptible to deliberation. This is certainly the situation we humans are in: most of the unconscious behavior-patterns that govern us, are extremely difficult for us to pull into our theater of conscious reflection and reason about ... for a variety of reasons, including the limited capacity of our conscious theater, and the way much unconscious knowledge is represented in the brain, which is only laboriously translatable into the way the conscious theater wants to represent knowledge.

Then, it may be that ethics winds up getting largely packed into the unconscious part of the mind, which is hard to deliberatively reason about. This might happen, for instance, if ethics were largely taught by imitation and reinforcement, rather than by abstract instruction. And this does seem to be how early-childhood ethical instruction happens among humans. We correct a child for doing something bad and reward them for doing something good (reinforcement learning), and we indicate to them real-world everyday-life examples of ethical behavior (both via personal example and via fictional stories, movies and the like). Abstract ethical principles only make sense to them via grounding in this experiential store of ethical examples.

So, if ethics evolves in a society due to its attracting nature, and is memetically propagated largely through unconscious instruction, then in effect what is happening is that in many cases

  • the reflective, deliberative mind is thinking about the individual organism and its utility
  • the unconscious mind is thinking about the superorganism of the overall society, via the experientially inculcated ethical principles

The voice of the conscience is thus revealed as the voice of the existence-of-ethics attractor that superorganisms (I hypothesize, following Waser) inevitably settle into, assuming their member agents possess certain characteristics.

Where does this leave my dad's notion of a universal ethical force? It doesn't validate any such thing, in quite the sense that my dad seemed to mean it.

However, it does validate the notion that an unconscious sense of ethics may be universal in the sense of being an inevitable mathematical property of any society satisfying certain characteristics.

What does this mean for me, as a reasoning being with an intuitive, unconscious sense of ethics ... and also the deliberative capability to think about this ethical sense ... as well as some ability to modify this sense of ethics if I want to?

Among other things, it reminds me that the deliberative, ratiocinative aspect of the mind probably needs to be a little humbler than it sometimes gets to be, inside us hyperintellectual, nerdy types. The "I" of the deliberative mind is a symbol for the whole mind, rather than constituting the whole mind ... and there may be systematic patterns characterizing which of our mind-patterns get stored in easily-deliberatively-accessible form and which do not. So as frustrating as it can be to those of us in love with ratiocination, if one wants to be maximally rational, one must accept that sometimes "the unconscious knows best" ... even if one can't understand its reasons ... because through observation, imitation and reinforcement, it gained experiential knowledge that possesses a scale and (in some cases) a form that is not ratiocination-friendly (but may yet be rationally considered extremely useful for goal-achievement).

Unfortunately, the unconscious also makes a lot of mistakes and possesses a powerful capacity for tricking itself ... which, all in all, makes the business of being a finite-resources mind with an even-more-acutely-finite-resources deliberative/rational component, an inordinately tricky one.

I personally find these issues slightly less confusing if I view them from the perspective of pattern space. Suppose we consider the universe as set of patterns, governed by a variety of pattern-dynamics rules, including the rule that patterns tend to extend themselves. Different patterns have different degrees of power -- that is, they have differing probabilities of success in extending themselves. The arguments I've given above suggest that ethics, as an abstract pattern for organizing behaviors, is a pattern of considerable power, especially among societies of intelligent entities of roughly comparable capability. In this view there is no "universal moral force" -- the universal force is the tendency of patterns to extend themselves, and ethics as a pattern seems to contain a great deal of this force.

On these issues there are two fairly extreme points of view, which may be compactly summarized (I hope not parodied) as:

  1. there is an absolutely correct ethics, an absolute right versus wrong, which may be imperfectly known to humans but which we can hope to better and better discern through various mental and social exercises
  2. ethics is purely relative in nature: people adopt ethical codes because they were taught to, or maybe because their genes tell them to (i.e. because their species was taught to, by evolution) ... but there is no fundamental meaning to these ethics beyond social, psychological or evolutionary happenstance

Unlike my father, I never had much attraction to the first perspective. The second perspective has bedeviled me from time to time, yet I've always been nagged by a suspicion there's something deeper going on (yes, of course, someone can attribute this internal nagging to my psychology, my upbringing or my evolutionary heritage!). I don't delude myself I've fully gotten to the bottom of the issue here, but I hope that (building on the ideas of others) I've made a teeny bit of progress.

In What Sense Is "Overcoming Bias" a Good Goal?

This blog post consists of some musings that occurred to me a few weeks ago on reading the multi-author blog Overcoming Bias. They were not triggered by the contents of the blog -- though those are sometimes quite interesting -- but merely by the name. My business in this post is to probe the extent to which "overcoming bias" is actually the right way to think about the problem of (as the Welcome page for the Overcoming Bias blog informally puts it) "How can we obtain beliefs closer to reality?"

I think it's clear that "overcoming bias" is important, but I also think it's important to explore and understand the limitations of "overcoming bias" as a methodology for obtaining beliefs that are more useful in achieving one's goals.

(Note that in my own thinking, I tend to more often think in terms of "obtaining beliefs that are more useful in achieving one's goals," rather than in terms of "obtaining beliefs that are closer to reality." In many contexts this just amounts to a nitpick, but it also reflects a significant philosophical distinction. I don't make the philosophical assumption of an objective reality, to which beliefs are to be compared. My philosophical view of beliefs could be loosely considered Nietzschean, though I doubt it agrees with Nietzsche's views in all respects.)

According to Wikipedia,

"
Bias is a term used to describe a tendency or preference towards a particular perspective, ideology or result, especially when the tendency interferes with the ability to be impartial, unprejudiced, or objective. The term biased is used to describe an action, judgment, or other outcome influenced by a prejudged perspective.
"

This definition is worth dissecting because it embodies two different aspects of the "bias" concept, which are often confused in ordinary discourse. I'll call these:

Bias_1: "a tendency or preference toward a particular perspective"

Bias_2: "an instance of Bias_1 that interferes with the ability to be impartial, unprejudiced or objective", which I'll replace with "... interferes with the ability to achieve one's goals effectively."

Bias_1 is, obviously, not necessarily a bad thing.

First of all, the universe we live in has particular characteristics (relative to a universe randomly selected from a sensibly-defined space of possible universes), and being biased in a way that reflects these characteristics may be a good thing.

Secondly, the particular environments and goals that an organism lives in, may have particular characteristics, and it may benefit the organism to be biased in a way that reflects these characteristics.

Now, ideally, an organism would be aware of its environment- or goal- specific biases, so that if its environment or goals change, it can change its biases accordingly. On the other hand, it may be that maintaining this awareness detracts from the organism's ability to achieve goals, if it consumes a lot of resources that could otherwise be spent doing other stuff (even if this other stuff is done in a way that's biased toward the particular environment and goals at hand).

When discussed in a political context, "bias" is assumed to be a bad thing, as in the "especially when" clause of the above Wikipedia definition. Gender bias and racial bias are politically incorrect and according to most modern moral systems, immoral. The reason these biases are considered to be bad is rooted in the (correct, in most cases) assumption that they constitute Bias_2 with respect to the goals that most modern moral systems say we should have.

On the other hand, in cognitive science, bias is not always a bad thing. One may argue, as Eric Baum has done persuasively in What Is Thought?, that the human mind's ability to achieve its goals in the world is largely due to the inductive bias that it embodies, which is placed into it via evolutionary pressure on brain structure. In this context, bias is a good thing. The brain is a general-purpose intelligence, but it is biased to be able to more easily solve some kinds of problems (achieve some kinds of goals) than others. Without this biasing, there's no way a system with the limited computational capacity of the human brain would be able to learn and do all the things it does in the short lifespan of a human organism. The inductive bias that Baum speaks about is largely discussed as Bias_1, but also may in some cases function as Bias_2, because biases that are adaptive in some circumstances may be maladaptive in others.

One might argue that, in the case of evolved human inductive bias, it's the evolutionary process itself that has been less biased, and has (in a relatively unbiased way) evolved brains that are biased to the particular conditions on Earth. However, this is not entirely clear. The evolutionary mechanisms existing on Earth have a lot of particularities that seem adapted to the specific chemical conditions on Earth, for example.

One may argue that, even though we humans are born with certain useful biases, it is to our advantage to become reflective and deliberative enough to overcome these biases in those cases where they're not productive. This is certainly true -- to an extent. However, as noted above, it's also true that reflection and deliberation consume a lot of resources. Any organism with limited resources has to choose between spending its resources overcoming its biases (which may ultimately help it to achieve its goals), and spending its resources achieving its goals in more direct ways.

Furthermore, it's an interesting possibility that resource-constrained minds may sometimes have biases that help them achieve their goals, yet that they are not able to effectively reflect and deliberate on. Why might this be? Because the class of habits that an organism can acquire via reinforcement learning, may not fully overlap with the class of habits that the organism can study via explicit reflective, deliberative inference. For any particular mind-architecture, there are likely to be some things that are more easily learnable as "experientially acquired know-how" than as explicit, logically-analyzable knowledge. (And, on the other hand, there are going to be other things that are more easily arrived at via explicit inference than via experiential know-how acquisition.)

If a certain habit of thought is far more amenable to experiential reinforcement based learning than reflective, logical deliberation, does this mean that one cannot assess its quality, with a view toward ridding it of unproductive biases? Not necessarily. But overcoming biases in these habits may be a different sort of science than overcoming biases in habits that are more easily susceptible to reason. For instance, the best way to overcome these sorts of biases may be to place oneself in a large variety of different situations, so as to achieve a wide variety of different reinforcement signaling patterns ... rather than to reflectively and deliberatively analyze one's biases.

Many of these reflections ultimately boil down to issues of the severely bounded computational capability of real organisms. This irritating little issue also arises when analyzing the relevance of probabilistic reasoning (Bayesian and otherwise) to rationality. If you buy Cox's or de Finetti's assumptions and arguments regarding the conceptual and mathematical foundations of probability theory (which I do), then it follows that a mind, given a huge amount of computational resources, should use probability theory (or do something closely equivalent) to figure out which actions it should take at which times in order to achieve its goals. But, these nice theorems don't tell you anything about what a mind given a small or modest amount of computational resources should do. A real mind can't rigorously apply probability theory to all its judgments; it has to make some sort of heuristic assumptions ... and the optimal nature of these heuristic assumptions (and their dependencies on the actual amount of space and time resources available, and the specific types of goals and environments involved, etc.) is something we don't understand very well.

So, the specific strategy of overcoming Bias_2 by adhering more strictly to probability theory is interesting and often worthwhile, but it has not been proven (nor convincingly argued) to always be the best thing to do for real systems in the real world.

In cases where the answer to some problem can be calculated using probability theory based on a relatively small number of available data items ... or a large number of data items that interrelate in a relatively simply characterizable way ... it's pretty obvious that the right thing for an intelligent person to do is to try to overcome some of their evolutionary biases, which may have evolved due to utility in some circumstances, but which clearly act as Bias_2 in many real-world circumstances. The "heuristics and biases" literature in cognitive psychology contains many compelling arguments in this regard. For instance, in many cases, it's obvious that the best way for us to achieve our goals is to learn to replace our evolved mechanisms for estimating degrees of probability, with calculations more closely reflecting the ones probability theory predicts. Professional gamblers figured this out a long time ago, but the lesson of the heuristics and biases literature has been how pervasive our cognitive errors (regarding probability and otherwise) are in ordinary life, as well as in gambling games.
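To make concrete what "calculations more closely reflecting probability theory" means, here's a minimal worked example -- the numbers (a 1% base rate, 90% sensitivity, 9% false-positive rate) are invented purely for illustration; the point is just that the Bayesian posterior comes out far lower than most people's intuitive estimate:

    # Toy base-rate example: how probable is the hypothesis, given a positive test?
    # All numbers are made up for illustration.

    def posterior(prior, sensitivity, false_positive_rate):
        """P(hypothesis | positive evidence), via Bayes' rule."""
        p_positive = prior * sensitivity + (1.0 - prior) * false_positive_rate
        return prior * sensitivity / p_positive

    # 1% base rate, 90% sensitivity, 9% false-positive rate
    print(posterior(0.01, 0.90, 0.09))   # ~0.092, not the ~0.9 that naive intuition suggests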

On the other hand, what about problems a mind confronts that involve masses of different data items, whose interrelationships are not clear, and about which much of the mind's knowledge was gained via tacit experience rather than careful inference or scientific analysis?

Many of these problems involve contextuality, which is a difficult (though certainly not impossible) thing to handle pragmatically within formal reasoning approaches, under severe computational resource constraints.

For these problems, there seem to be two viable strategies for improving one's effectiveness at adapting one's beliefs so as to be able to more adeptly achieve one's goals:

  1. Figure out a way to transform the problem into the kind that can be handled using explicit rational analysis
  2. Since so much of the knowledge involved was gained via experiential reinforcement-learning rather than inference ... seek to avoid Bias_2 via achieving a greater variety of relevant experiences

So what's my overall takeaway message?

  • We're small computational systems with big goals, so we have to be very biased, otherwise we wouldn't be able to achieve our goals
  • Distinguishing Bias_1 from Bias_2 is important theoretically, and also important *but not always possible* in practice
  • The right way to cure instances of Bias_2 depends to some extent on the nature of the mental habits involved in the bias
  • In some cases, diversity of experience may be a better way to remove Bias_2, than explicit adherence to formal laws of rationality
  • It is unclear in which circumstances (attempted approximate) adherence to probability theory or other formal laws of rationality is actually the right thing for a finite system to do, in order to optimally achieve its goals
  • Heuristically, it seems that adherence to formal laws of rationality generally makes most sense in cases where contextuality is not so critical, and the relevant judgments depend sensitively mainly on a relatively small number of data items (or a large number of relatively-simply-interrelated data items)

Saturday, November 08, 2008

In Search of the "Machines of Loving Grace": AI, Robotics and Empathy

Empathy: the ability to feel each others’ feelings. It lies at the core of what makes us human. But how important will it be to the artificial minds we one day create? What are AI researchers doing to imbue their creations with artificial empathy ... and should they be doing more? In short, what is the pathway to the “machines of loving grace” that poet Richard Brautigan foresaw?

The mainstream of AI research has traditionally focused on the more explicitly analytical, intellectual, nerdy aspects of human intelligence: planning, problem-solving, categorization, language understanding. Recent attempts to broaden this scope have centered mainly on creating software with perceptual and motor skills: computer vision systems, intelligent automated vehicles, and so forth. Missing almost entirely from the AI field are the more social and emotional aspects of human intelligence. Chatbots attempting the Turing test have confronted these aspects directly – going back to ELIZA, the landmark AI psychotherapist from the mid-1960s – but these bots are extremely simplistic and have little connection to the main body of work in the AI field.

I think this is a major omission, and my own view is that empathy may be one of the final frontiers of AI. My opinion as an AI researcher is that, if we can crack artificial empathy, the rest of the general-AI problem will soon follow, based on the decades of successes that have already been achieved in problem-solving, reasoning, perception, motorics, planning, cognitive architecture and other areas.

(In my own AI work, involving the Novamente Cognition Engine and the OpenCog Prime system, I’ve sought to explicitly ensure the capability for empathy via multiple coordinated design aspects – but in this blog post I’m not going to focus on that, restricting myself to more general issues.)

Why would empathy be so important for AI? After all, isn’t it just about human feelings, which are among the least intelligent, most primitively animal-like aspects of the human mind? Well, the human emotional system certainly has its quirks and dangers, and the wisdom of propagating these to powerful AI systems is questionable. But the basic concept of an emotion, as a high-level integrated systemic response to a situation, is critical to the functioning of any intelligent system. An AI system may not have the same specific emotions as a human being – particular emotions like love, anger and so forth are manifestations of humans’ evolutionary heritage, rather than intrinsic aspects of intelligence. But it seems unlikely that an AI without any kind of high-level integrated systemic response (aka emotion) would be able to cope with the realities of responding to a complex dynamic world in real-time.

A closely related point is the social nature of intelligence. Human intelligence isn’t as individual as we modern Westerners often seem to think: a great percentage of our intelligence is collective and intersubjective. Cognitive psychologists have increasingly realized this in recent decades, and have started talking about “distributed cognition.” If the advocates of the “global brain” hypothesis are correct, then eventually artificial minds will synergize with human minds to form a kind of symbiotic emergent cyber-consciousness. But in order for distributed cognition to work, the minds in a society need to be able to recognize, interpret and respond to each others’ emotions. And this is where empathy comes in.

Mutual empathy binds together social networks: as we go through our lives we are perpetually embodying self-referential emotional equations like

X = I feel pain and that you feel X



Y = I feel that you feel both joy and Y

or mutually-referential ones like

A = I feel happy that you enjoy both this music and B

B = I feel surprised that you feel A

These sorts of equations bind us together: as they unfold through time they constitute much of the rhythm by which our collective intelligence experiences and creates.

So empathy is important: but how does it work?

We don’t yet know for sure ... but the best current thinking is that there are two aspects to how the brain does empathy: inference and simulation. (And I do think this is a lesson for AI: in my own AI designs I deal with these aspects separately, and then address their interaction ... and I do think this is the right approach.)

Inference-wise, empathy has to do with understanding and modeling (sometimes consciously, sometimes unconsciously) what another person must be feeling, based on the cues we perceive and our background knowledge. Psychologists have mapped out transformational rules that help us do this modeling.
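To give a crude feel for what the inferential side might look like in software -- the rules below are invented placeholders, not the transformational rules psychologists have actually catalogued -- one can imagine something like:

    # Deliberately simplistic sketch of inferential empathy: map observed cues plus
    # background knowledge to a guess about what the other person is feeling.
    # The rules are hypothetical placeholders, purely for illustration.

    RULES = [
        (lambda c: "funeral" in c["context"], "grief"),
        (lambda c: c["voice"] == "raised" and c["topic"] == "politics", "anger"),
        (lambda c: c["smiling"] and "reunion" in c["context"], "joy"),
    ]

    def infer_feeling(cues):
        for condition, feeling in RULES:
            if condition(cues):
                return feeling
        return "unknown"

    print(infer_feeling({"context": "reunion at the airport", "smiling": True,
                         "voice": "normal", "topic": "family"}))   # -> joy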

Simulative empathy is different: we actually feel what the other person is feeling. A rough analogue is virtualization in computers: running Windows on a virtual machine within Linux; emulating a vintage game console within Windows. Similarly, we use the same brain-systems that are used to run ourselves, to run a simulation of another person feeling what they seem to be feeling. And we do this unconsciously, at the body level: even though we don’t consciously notice that sad people have smaller pupils, our pupils automatically shrink when we see a sad person -- a physiological response that synergizes with our cognitive and emotional response to their sadness (and is the technique the lead character uses in Blade Runner to track down androids who lack human feeling). A long list of examples has been explored in the lab already, and we’ve barely scratched the surface yet: people feel disgust when they see others smelling a bad odor, they feel pain when they see others being pierced by a needle or given an electric shock, they sense touch when they see others being brushed, etc.

Biologists have just started to unravel the neural basis of simulative empathy, which seems to involve brain cells called mirror neurons ... which some have argued play a key role in other aspects of intelligence as well, including language learning and the emergence of the self (I wrote a speculative paper on this a couple years back).

(A mirror neuron is a neuron which fires both when an animal acts and when the animal observes the same action performed by another animal (especially one of the same species). Thus, the neuron "mirrors" the behavior of another animal, as though the observer were itself acting. These neurons have been directly observed in primates, and are believed to exist in humans and in some birds. In humans, brain activity consistent with mirror neurons has been found in the premotor cortex and the inferior parietal cortex.)

So: synergize inference and simulation, and you get the wonderful phenomenon of empathy that makes our lives so painful and joyful and rich, and to a large extent serves as the glue holding together the social superorganism.

The human capacity for empathy is, obviously, limited. This limitation is surely partly due to our limited capabilities of both inference and simulation; but, intriguingly, it might also be the case that evolution has adaptively limited the degree of our empathic-ness. Perhaps an excessive degree of empathy would have militated against our survival, in our ancestral environments?

The counterfactual world in which human empathy is dramatically more intense is difficult to accurately fathom. Perhaps, if our minds were too tightly coupled emotionally, progress would reach the stage of some ant-colony-like utopia and then halt, as further change would be too risky in terms of hurting someone else’s feelings. On the other hand, perhaps a richer and more universal empathy would cause a dramatic shift in our internal architectures, dissolving or morphing the illusion of “self” that now dominates our inner worlds, and leading to a richer way of individually/collectively existing.

One aspect of empathy that isn’t sufficiently appreciated is the way it reaches beyond the touchy-feely sides of human life: for instance it pervades the worlds of science and business as well, which is why there are still so many meetings in the world, email, Skype and WebEx notwithstanding. The main reason professionals fly across the world to hob-nob with their colleagues – in spite of the often exhausting and tedious nature of business travel (which I’ve come to know all too well myself in recent years) -- is because, right now, only face-to-face communication systematically gives enough of the right kind of information to trigger empathic response. In a face-to-face meeting, humans can link together into an empathically-joined collective mind-system, in a way that doesn’t yet happen nearly as reliably via electronically-mediated communications.

Careful study has been given to the difficulty we have empathizing with certain robots or animated characters. According to Mori’s theory of the “uncanny valley” – which has been backed up by brain imaging studies -- if a character looks very close to human, but not close enough, then people will find it disturbing rather than appealing. We can empathize more with the distorted faces of Disney cartoons or manga, than with semi-photo-realistic renditions of humans that look almost-right-but-eerily-off.

To grasp the uncanny valley viscerally, watch one of the online videos of researcher Hiroshi Ishiguro and the “geminoid” robot that is his near-physical-clone -- an extremely lifelike imitation of his own body and its contours, textures and movements. No AI is involved here: the geminoid is controlled by motion-capture apparatus that watches what Ishiguro does and transfers his movements to the robot. The imitation is amazing – until the bot starts moving. It looks close enough to human that its lack of subtle human expressiveness is disturbing. We look at it and we try to empathize, but we find we’re empathizing with a feelingless robot, and the experience is unsettling and feels “wrong.” Jamais Cascio has proposed that exactly this kind of reaction may occur toward transhumans with body modifications – so from that point of view among others, this phenomenon may be worth attending to.

It’s interesting to contrast the case of the geminoid, though, with the experience of interacting with ELIZA, the AI psychotherapist created by Joseph Weizenbaum in 1966. In spite of having essentially no intrinsic intelligence, ELIZA managed to carry out conversations that did involve genuine empathic sharing on the part of its conversation-partners. (I admit ELIZA didn’t do much for me even back then, but I encountered it knowing exactly what it was and intrigued by it from a programming perspective, which surely colored the nature of my experience.)

Relatedly, some people today feel more empathy with their online friends than with their real-life friends. And yet, I can’t help feeling there’s something key lacking in such relationships.

One of the benefits of online social life is that one is freed from the many socio-psychological restrictions that come along with real-world interaction. Issues of body image and social status recede into the background – or become the subject of wild, free-ranging play, as in virtual worlds such as Second Life. Many people are far less shy online than in person – a phenomenon that’s particularly notable in cultures like Japan and Korea where social regulations on face-to-face communication are stricter.

And the benefits can go far beyond overcoming shyness: for example, a fifty-year-old overweight trucker from Arkansas may be able to relate to others more genuinely in the guise of a slender, big-busted Asian girl with blue hair, a microskirt and a spiky tail... and in Second Life he can do just that.

On the other hand, there’s a certain falsity and emotional distance that comes along with all this. The reason the trucker can impersonate the Asian ingenue so effectively is precisely that the avenues for precise emotional expression are so impoverished in today’s virtual environments. So, the other fifty-year-old trucker from Arkansas whose purple furry avatar is engaged in obscene virtual acts with the Asian babe, has to fill in the gaps left by the simplistic technology – to a large extent, the babe he’s interacting with is a construct of his own mind, which improvises on the cues provided by the signals given by the first trucker.

Of course, all social interaction is constructive in this way: the woman I see when I talk to my wife is largely a construct of my own mind, and may be a different woman than I would see if I were in a different mood (even if her appearance and actions were precisely the same). But text-chat or virtual-world interactions are even more intensely constructive, which is both a plus and a minus. We gain the ability for more complete wish-fulfillment (except for wishes that are intrinsically tied to the physical ... though some people do impressively well at substituting virtual satisfactions for physical ones), but we lose much of the potential for growing in new directions via empathically absorbing emotional experiences dramatically different from anything we would construct on our own based on scant, sketchy inputs.

It will be interesting to see how the emotional experience of virtual world use develops as the technology advances ... in time we will have the ability to toggle how much detail our avatars project, just as we can now choose whether to watch cartoons or live action films. In this way, we will be able to adjust the degree of constructive wish-fulfillment versus self-expanding experience-of-other ... and of course to fulfill different sorts of wishes than can be satisfied currently in physical or virtual realities.

As avatars become more realistic, they may encounter the uncanny valley themselves: it may be more rewarding to look at a crude, iconic representation of someone else’s face, than a representation that’s almost-there-but-not-quite ... just as with Ishiguro’s geminoids. But just as with the geminoids, the technology will get there in time.

The gaming industry wants to cross the uncanny valley by making better and better graphics. But will this suffice? Yes, a sufficiently perfected geminoid or game character will evoke as much empathy as a real human. But for a robot or game character controlled by AI software, the limitation will probably lie in subtleties of movement. Just like verbal language, the language of emotional gestures is one where it’s hard to spell out the rules exactly: we humans grok them from a combination of heredity and learning. One way to create AIs that people can empathize with will be to make the AIs themselves empathize, and reflect back to people the sorts of emotions that they perceive, much as babies imitate adult emotions. Envision a robot or game character that watches a video-feed of your face and tailors its responses to your emotions.
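As a schematic sketch of that idea -- with detect_emotion() and the response table standing in, hypothetically, for whatever face-analysis and animation machinery a real system would use -- the core loop might look like:

    # Schematic empathic-mirroring loop for a robot or game character.
    # detect_emotion() is a hypothetical placeholder for a real facial-expression classifier.

    import random

    def detect_emotion(video_frame):
        # Stand-in: a real system would classify the user's expression here.
        return random.choice(["joy", "sadness", "surprise", "neutral"])

    MIRROR = {"joy": "smile", "sadness": "look concerned",
              "surprise": "raise eyebrows", "neutral": "attentive idle"}

    def empathic_loop(video_feed, steps=5):
        # video_feed is unused in this sketch; each loop index stands in for a video frame.
        for frame in range(steps):
            emotion = detect_emotion(frame)
            print("perceived", emotion, "->", MIRROR[emotion])

    empathic_loop(video_feed=None)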

Arguably, creating AIs capable of empathy has importance far beyond the creation of more convincing game characters. One of the great unanswered questions as the Singularity looms is how to increase the odds that once our AIs get massively smarter than we are, they still value our existence and our happiness. Creating AIs that empathize with humans could be part of the answer.

Predictably, AI researchers so far have done more with the inferential than the simulative side of empathic response. Selmer Bringsjord’s team at RPI got a lot of press earlier this year for an AI that controls a bot in Second Life, in a way that demonstrates a limited amount of “theory of mind”: the bot watches other characters with a view toward figuring out what they’re aware of, and uses this to predict their behavior and guide its interactions.

But Bringsjord’s bots don’t try to feel the feelings of the other bots or human-controlled avatars they interact with. The creation of AIs embodying simulative empathy seems to be getting very little attention. Rosalind Picard’s Affective Computing Lab at MIT has done some interesting work bringing emotion into AI decision processes but has stopped short of modeling simulative empathy. But I predict this is a subfield that will emerge within the next decade. In fact, it seems plausible that AIs will one day be far more empathic than humans are – not only with each other but also with human beings. Ultimately, an AI may be able to internally simulate you better than your best human friend, and hence demonstrate a higher degree of empathy – which will make our games more fun, our robots less eerie, and potentially help make the post-Singularity world a more human-friendly place.

Thursday, October 30, 2008

Zarathustra, Plato, Saving Boxes, Oracle Machines and Pineal Antennae

Reading over the conversation I had (with Abram Demski) in the Comments to a prior blog post

http://multiverseaccordingtoben.blogspot.com/2008/10/are-uncomputable-entities-useless-for.html

I was reminded of a conversation I had once with my son Zarathustra when he was 4 years old.

Zar was defending his claim that he actually was omniscient, and explaining how this was consistent with his apparent ignorance on many matters. His explanation went something like this:

"I actually do know everything, Ben! It's just that with all that stuff in my memory, it can take me a really really long time to get the memories out ... years sometimes...."

Of course, Zar didn't realize Plato had been there before (since they didn't cover Plato in his pre-school...).

He also had the speculation that this infinite memory store, called his "saving box", was contained in his abdomen somewhere, separate from his ordinary, limited-scope memories in his brain. Apparently his intuition for philosophy was better than for biology... or he would have realized it was actually in the pineal gland (again, no Descartes in preschool either ;-p).

This reminded me of the hypothesis that arose in the conversation with Abram, that in effect all humans might have some kind of oracle machine in their brains.

If we all have the same internal neural oracle machine (or if, say, we all have pineal-gland antennas to the same Cosmic Oracle Machine (operated by the ghost of Larry Ellison?)), then we can communicate about the uncomputable even though our language can never actually encapsulate what it is we're talking about.

Terence McKenna, of course, had another word for these quasi-neural oracle machines: machine-elves ;-)

This means that the real goal of AGI should be to create a software program that can serve as a proper antenna 8-D

Just a little hi-fi sci-fi weirdness to brighten up your day ... I seem to have caught a bad cold and it must be interfering with my thought processes ... or messing up the reception of my pineal antenna ...

P.S.

perhaps some evidence for Zar's saving-box theory:

http://hubpages.com/hub/Cellular-Memories-in-Organ-Transplant-Recipients

Tuesday, October 28, 2008

Random Memory of a Creative Mind (Paul Feyerabend)

I had a brief but influential (for me: I'm sure he quickly forgot it) correspondence with the philosopher-of-science Paul Feyerabend when I was 19.

I sent him a philosophical manuscript of mine, printed on a crappy dot matrix printer ... I think it was called "Lies and False Truths." I asked him to read it, and also asked his advice on where I should go to grad school to study philosophy. I was in the middle of my second year of grad school, working toward my PhD in math, but I was having second thoughts about math as a career....

He replied with a densely written postcard, saying he wasn't going to read my book because he was spending most of his time on non-philosophy pursuits ... but that he'd glanced it over and it looked creative and interesting (or something like that: I forget the exact words) ... and, most usefully, telling me that if I wanted to be a real philosopher I should not study philosophy academically nor become a philosophy professor, but should study science and/or arts and then pursue philosophy independently.

His advice struck the right chord and the temporary insanity that had caused me to briefly consider becoming a professional philosopher, vanished into the mysterious fog from which it had emerged ...

(I think there may have been another couple brief letters back and forth too, not sure...)

(I had third thoughts about math grad school about 6 months after that, and briefly moved to Vegas to become a telemarketer and Henry-Miller-meets-Nietzsche style prose-poem ranter ... but that's another story ... and anyways I went back to grad school and completed my PhD fairly expeditiously by age 22...)


P.S.

Even at that absurdly young age (but even more so now), I had a lot of disagreements with Feyerabend's ideas on philosophy of science -- but I loved his contentious, informal-yet-rigorous, individualistic style. He thought for himself, not within any specific school of thought or tradition. That's why I wrote to him -- I viewed him as a sort of kindred maverick (if that word is still usable anymore, given what Maverick McCain has done to it ... heh ;-p)

My own current philosophy of science has very little to do with his, but, I'm sure we would have enjoyed arguing the issues together!

He basically argued that science was a social phenomenon with no fixed method. He gave lots of wonderful examples of how creative scientists had worked outside of any known methods.

While I think that's true, I don't think it's the most interesting observation one can make about science ... it seems to me there are some nice formal models you can posit that are good approximations explaining a lot about the social phenomenon of science, even though they're not complete explanations. The grungy details (in chronological order) are at:

But, one thing I did take from Feyerabend and his friend/argument-partner Imre Lakatos was the need to focus on science as a social phenomenon. What I've tried to do in my own philosophy of science is to pull together the social-phenomenon perspective with the Bayesian-statistics/algorithmic-information perspective on science.... But, as usual, I digress!

hiccups on the path to superefficient financial markets

A political reporter emailed me the other day asking my opinion on the role AI technology played in the recent financial crisis, and what this might imply for the future of finance.

Here's what I told him. Probably it freaked him out so much he deleted it and wiped it from his memory, but hey...

There's no doubt that advanced software programs using AI and other complex techniques played a major role in the current global financial crisis. However, it's also true that the risks and limitations of these software programs were known by many of the people involved, and in many cases were ignored intentionally rather than out of ignorance.

To be more precise: the known mathematical and AI techniques for estimating the risk of complex financial instruments (like credit default swaps, and various other exotic derivatives) all depend on certain assumptions. At this stage, some human intelligence is required to figure out whether the assumptions of a given mathematical technique really apply in a certain real-world situation. So, if one is confronted with a real-world situation where it's unclear whether the assumptions of a certain mathematical technique really apply, it's a human decision whether to apply the technique or not.

A historical example of this problem was the LTCM debacle in the 90's. In that case, the mathematical techniques used by LTCM assumed that the economies of various emerging markets were largely statistically independent. Based on that assumption, LTCM entered into some highly leveraged investments that were low-risk unless the assumption failed. The assumption failed.
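A toy Monte Carlo makes the point about that assumption vivid; the numbers below (three markets, each with a 5% annual crash probability, and a position that only blows up if all three crash together) are invented for illustration, not LTCM's actual figures:

    # Toy illustration of why the independence assumption matters so much.
    # All figures are invented for the example.

    import random

    def joint_crash_prob(correlated, trials=200_000, p_crash=0.05):
        blowups = 0
        for _ in range(trials):
            if correlated:
                # A single common shock hits all three markets at once.
                shock = random.random() < p_crash
                crashes = [shock, shock, shock]
            else:
                crashes = [random.random() < p_crash for _ in range(3)]
            if all(crashes):
                blowups += 1
        return blowups / trials

    print("assuming independence:", joint_crash_prob(False))   # ~0.000125
    print("with a common shock:  ", joint_crash_prob(True))    # ~0.05 -- hundreds of times riskier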

Similarly, more recently, Iceland's financial situation was mathematically assessed to be stable, based on the assumption that (to simplify a little bit) a large number of depositors wouldn't decide to simultaneously withdraw a lot of their money. This assumption had never been violated in past situations that were judged as relevant. Oops.

A related, obvious phenomenon is that sometimes humans assigned with the job of assessing risk are given a choice between:

  1. assessing risk according to a technique whose assumptions don't really apply to the real-world situation, or whose applicability is uncertain
  2. saying "sorry, I don't have any good technique for assessing the risk of this particular financial instrument"

Naturally, the choice commonly taken is 1 rather than 2.


In another decade or two, I'd predict, we'll have yet more intelligent software, which is able to automatically assess whether the assumptions of a certain mathematical technique are applicable in a certain context. That would avoid the sort of problem we've recently seen.

So the base problem is that the software we have now is good at making predictions and assessments based on contextual assumptions ... but it is bad at assessing the applicability of contextual assumptions. The latter is left to humans, who often make decisions based on emotional bias, personal greed and so forth rather than rationality.

Obviously, the fact that a fund manager shares more in their fund's profit than in its loss has some impact on their assessments. This biases fund managers toward taking risks: if the gamble comes out well, they get a huge bonus, but if it comes out badly, the worst that happens is that they have to find another job.
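A back-of-the-envelope calculation shows how strong this asymmetry is; the figures (a coin-flip trade that gains or loses 20% of a $100M fund, with the manager keeping 20% of any profit and bearing none of the loss) are invented for illustration:

    # Toy illustration of the asymmetric incentive facing a fund manager.
    # All numbers are invented.

    fund = 100e6
    gain, loss = 0.20 * fund, -0.20 * fund
    manager_share = 0.20

    fund_ev = 0.5 * gain + 0.5 * loss                      # $0: the trade adds nothing for the fund
    manager_ev = 0.5 * (manager_share * gain) + 0.5 * 0.0  # +$2M: the manager still wants the trade

    print("fund expected value:    %12.0f" % fund_ev)
    print("manager expected value: %12.0f" % manager_ev)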

My feeling is that these sorts of problems we've seen recently are hiccups on the path to superefficient financial markets based on advanced AI. But it's hard to say exactly how long it will take for AI to achieve the needed understanding of context, to avoid this sort of "minor glitch."

P.S.

After I posted the above, there was a followup discussion on the AGI mailing list, in which someone asked me about applications of AGI to investment.

My reply was:


1)
Until we have a generally very powerful AGI, application of AI to finance will be in the vein of narrow-AI. Investment is a hard problem, not for toddler-minds.

Narrow-AI applications to finance can be fairly broad in nature though; e.g. I helped build a website called stockmood.com that analyzes financial sentiment in news (a toy sketch of that flavor of analysis follows after point 2 below).

2)
Once we have a system with roughly adult-human-level AGI, then of course it will be possible to create specialized versions of this that are oriented toward trading, and these will be far superior to humans or narrow AIs at trading the markets, and whoever owns them will win a lot of everybody's money unless the government stops them.
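As promised under point 1, here is a toy sketch of the flavor of narrow-AI sentiment analysis involved -- the word lists and scoring rule are invented placeholders, not how stockmood.com actually works:

    # Crude keyword-based sentiment scoring of a financial headline.
    # The word lists are hypothetical; a real system would be far more sophisticated.

    POSITIVE = {"beats", "surges", "upgrade", "record", "profit"}
    NEGATIVE = {"misses", "plunges", "downgrade", "lawsuit", "loss"}

    def headline_sentiment(headline):
        words = set(headline.lower().replace(",", "").split())
        return len(words & POSITIVE) - len(words & NEGATIVE)

    print(headline_sentiment("Acme beats estimates, posts record profit"))   # +3
    print(headline_sentiment("Acme plunges after downgrade and lawsuit"))    # -3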

P.P.S.

Someone on a mailing list pushed back on my mention of "AI and other mathematical techniques."

This seems worth clarifying, because the line between narrow-AI and other-math-techniques is really very fuzzy.


To give an indication of how fuzzy the line is ... consider the (very common) case of multiextremal optimization.

GAs (genetic algorithms) are optimization algorithms that are considered AI ... but is multi-start hillclimbing AI? Many would say so. Yet, some multiextremal optimization algorithms are considered operations research instead of AI -- say, multistart conjugate gradients...
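For concreteness, here is what a bare-bones multi-start hillclimber looks like -- nothing in it is specifically "AI-ish", which is exactly the point; the objective function is an arbitrary bumpy example:

    # Multi-start hillclimbing on a multiextremal (bumpy) objective function.

    import random
    from math import sin

    def f(x):
        # An arbitrary objective with several local maxima.
        return sin(3 * x) + 0.5 * sin(7 * x) - 0.05 * x * x

    def hillclimb(x, step=0.01, iters=2000):
        for _ in range(iters):
            candidate = x + random.uniform(-step, step)
            if f(candidate) > f(x):
                x = candidate
        return x

    def multistart(starts=20):
        best = max((hillclimb(random.uniform(-5.0, 5.0)) for _ in range(starts)), key=f)
        return best, f(best)

    print(multistart())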

Similarly, backprop NNs (neural networks) are considered AI ... yet, polynomial or exponential regression algorithms aren't. But they pretty much do the same stuff...

Or, think about assessment of credit risk, to determine who is allowed to get what kind of mortgage. This is done by AI data mining algorithms. OTOH it could also be done by some statistical algorithms that wouldn't normally be called AI (though I think it is usually addressed using methods like frequent itemset mining and decision trees, that are considered AI).

Are Uncomputable Entities Useless for Science?

When I first learned about uncomputable numbers, I was profoundly disturbed. One of the first things you prove about uncomputable numbers, when you encounter them in advanced math classes, is that it is provably never possible to explicitly display any example of an uncomputable number. But nevertheless, you can prove that (in a precise mathematical sense) "almost all" numbers on the real number line are uncomputable. This is proved indirectly, by showing that the real number line as a whole has an uncountable order of infinity (the cardinality of the continuum), while the set of all computer programs has only a countable one (aleph-null).
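For readers who want the counting argument spelled out, the standard sketch (nothing original here) is:

    \[
    |\{\text{programs}\}| \;\le\; |\{0,1\}^{*}| \;=\; \aleph_0,
    \qquad
    |\mathbb{R}| \;=\; 2^{\aleph_0} \;>\; \aleph_0
    \]

Every computable number is the output of at least one program, so the computable reals form a countable subset of an uncountable set -- and the uncomputable reals are everything that remains, i.e. "almost all" of the line.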

I never liked this, and I burned an embarrassing amount of time back then (I guess this was from ages 16-20) trying to find some logical inconsistency there. Somehow, I thought, it must be possible to prove this notion of "a set of things, none of which can ever actually be precisely characterized by any finite description" as inconsistent, as impossible.

Of course, try as I might, I found no inconsistency with the math -- only inconsistency with my own human intuitions.

And of course, I wasn't the first to tread that path (and I knew it). There's a philosophy of mathematics called "constructivism" which essentially bans any kind of mathematical entity whose existence can only be proved indirectly. Related to this is a philosophy of math called "intuitionism."

A problem with these philosophies of math is that they rule out some of the branches of math I most enjoy: I always favored continuous math -- real analysis, complex analysis, functional analysis -- over discrete math about finite structures. And of course these are incredibly useful branches of math: for instance, they underlie most of physics.

These continuity-based branches of math also underlie, for example, mathematical finance, even though the world of financial transactions is obviously discrete and computable, so one can't possibly need uncomputable numbers to handle it.

There always seemed to me something deeply mysterious in the way the use of the real line, with its unacceptably mystical uncomputable numbers, made practical mathematics in areas like physics and finance so much easier.

Notice, this implicitly uncomputable math is never necessary in these applications. You could reformulate all the equations of physics or finance in terms of purely discrete, finite math; and in most real applications, these days, the continuous equations are solved using discrete approximations on computers anyway. But, the theoretical math (that's used to figure out which discrete approximations to run on the computer) often comes out more nicely in the continuous version than the discrete version. For instance, the rules of traditional continuous calculus are generally far simpler and more elegant than the rules of discretized calculus.

And, note that the uncomputability is always in the background when you're using continuous mathematics. Since you can't explicitly write down any of these uncomputable numbers anyway, they don't play much role in your practical work with continuous math. But the math you're using, in some sense, implies their "existence."

But what does "existence" mean here?

To quote former President Bill Clinton, "it all depends on what the meaning of the word is, is."

A related issue arises in the philosophy of AI. Most AI theorists believe that human-like intelligence can ultimately be achieved within a digital computer program (most of them are in my view overpessimistic about how long it's going to take us to figure out exactly how to write such a program, but that's another story). But some mavericks, most notably Roger Penrose, have argued otherwise (see his books The Emperor's New Mind and Shadows of the Mind, for example). Penrose has argued specifically that the crux of human intelligence is some sort of mental manipulation of uncomputable entities.

And Penrose has also gone further: he's argued that some future theory of physics is going to reveal that the dynamics of the physical world is also based on the interaction of uncomputable entities. So that mind is an uncomputable consequence of uncomputable physical reality.

This argument always disturbed me, also. There always seemed something fundamentally wrong to me about the notion of "uncomputable physics." Because, science is always, in the end, about finite sets of finite-precision data. So, how could these mysterious uncomputable entities ever really be necessary to explain this finite data?

Obviously, it seemed to me, they could never be necessary. Any finite dataset has a finite explanation. But the question then becomes whether in some cases invoking uncomputable entities is the best way to explain some finite dataset. Can the best way of explaining some set of, say, 10 or 1000 or 1000000 numbers be "This uncomputable process, whose details you can never write down or communicate in ordinary language in a finite amount of time, generated these numbers"?

This really doesn't make sense to me. It seems intuitively wrong -- more clearly and obviously so than the notion of the "existence" of uncomputable numbers and other uncomputable entities in some abstract mathematical sense.

So, my goal in this post is to give a careful explanation of why this is wrong. The argument I'm going to give here could be fully formalized as mathematics, but I don't have the time for that right now, so I'll give it semi-verbally/semi-mathematically, trying to choose my words carefully.

As often happens, the matter turned out to be a little subtler than I initially thought it would be. To argue that uncomputables are useless for science, one needs some specific formal model of what science itself is. And this is of course a contentious issue. However, if one does adopt the formalization of science that I suggest, then the scientific uselessness of uncomputables falls out fairly straightforwardly. (And I note that this was certainly not my motivation for conceiving the formal model of science I'll suggest; I cooked it up a while ago for quite other reasons.)

Maybe someone else could come up with a different formal model of science that gives a useful role to uncomputable entities ... though one could then start a meta-level analysis of the usefulness of this kind of formal model of science! But I'll defer that till next year ;-)

Even though it's not wholly rigorous math, this is a pretty mathematical blog post that will make for slow reading. But if you have suitable background and are willing to slog through it, I think you'll find it an interesting train of thought.

NOTE: the motivation to write up these ideas (which have been bouncing around in my head for ages) emerged during email discussions on the AGI list with a large group, most critically Abram Demski, Eric Baum and Mark Waser.

A Simple Formalization of the Scientific Process

I'll start by giving a simplified formalization of the process of science.

This formalization is related to the philosophy of science I outlined in the essay http://www.goertzel.org/dynapsyc/2004/PhilosophyOfScience_v2.htm (included in The Hidden Pattern) and more recently extended in the blog post http://multiverseaccordingtoben.blogspot.com/2008/10/reflections-on-religulous-and.html. But those prior writings consider many aspects not discussed here.

Let's consider a community of agents that use some language L to communicate. By a language, what I mean here is simply a set of finite symbol-sequences ("expressions"), utilizing a finite set of symbols.

Assume that a dataset (i.e., a finite set of finite-precision observations) can be expressed as a set of pairs of expressions in the language L. So a dataset D can be viewed as a set of pairs


((d11, d12), (d21, d22), ..., (dn1, dn2))

or else as a pair D=(D1,D2) where

D1=(d11,...,dn1)
D2=(d12,...,dn2)

Then, define an explanation of a dataset D as a set E_D of expressions in L, so that if one agent A1 communicates E_D to another agent A2 that has seen D1 but not D2, nevertheless A2 is able to reproduce D2.

(One can look at precise explanations versus imprecise ones, where an imprecise explanation means that A2 is able to reproduce D2 only approximately, but this doesn't affect the argument significantly, so I'll leave this complication out from here on.)

If D2 is large, then for E_D to be an interesting explanation, it should be more compact than D2.

Note that I am not requiring E_D to generate D2 from D1 on its own. I am requiring that A2 be able to generate D2 based on E_D and D1. Since A2 is an arbitrary member of the community of agents, the validity of an explanation, as I'm defining it here, is relative to the assumed community of agents.

Note also that, although expressions in L are always finitely describable, that doesn't mean that the agents A1, A2, etc. are. According to the framework I've set up here, these agents could be infinite, uncomputable, and so forth. I'm not assuming anything special about the agents, but I am considering them in the special context of finite communications about finite observations.

The above is my formalization of the scientific process, in a general and abstract sense. According to this formalization, science is about communities of agents linguistically transmitting to each other knowledge about how to predict some commonly-perceived data, given some other commonly-perceived data.
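To make the formalism concrete, here is a minimal instantiation in code, under the purely illustrative assumptions that the language L is Python source and that agent A2 "understands" an explanation by executing it on D1:

    # Minimal concrete instantiation of the dataset/explanation formalism above.
    # Assumption (for illustration only): L = Python source, and A2 interprets an
    # explanation by executing it.

    D1 = list(range(1, 21))              # the commonly-perceived first coordinates
    D2 = [x * x for x in D1]             # the second coordinates A2 has not seen

    E_D = "lambda x: x * x"              # A1's explanation, expressed in L

    def a2_reproduces(explanation, d1, d2):
        predict = eval(explanation)      # A2 "understands" the explanation
        return [predict(x) for x in d1] == d2

    interesting = len(E_D) < len(repr(D2))   # crude proxy for "more compact than D2"

    print(a2_reproduces(E_D, D1, D2), interesting)   # True True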

The (Dubious) Scientific Value of the Uncomputable

Next, getting closer to the theme of this post, I turn to consider the question of what use it might be for A2 to employ some uncomputable entity U in the process of using E_D to generate D2 from D1. My contention is that, under some reasonable assumptions, there is no value to A2 in using uncomputable entities in this context.

D1 and E_D are sets of L-expressions, and so is D2. So what A2 is faced with, is a problem of mapping one set of L-expressions into another.

Suppose that A2 uses some process P to carry out this mapping. Then, if we represent each set of L-expressions as a bit string (which may be done in a variety of different, straightforward ways), P is then a mapping from bit strings into bit strings. To keep things simple we can assume some maximum size cap on the size of the bit strings involved (corresponding for instance to the maximum size expression-set that can be uttered by any agent during a trillion years).

The question then becomes whether it is somehow useful for A2 to use some uncomputable entity U to compute P, rather than using some sort of set of discrete operations comparable to a computer program.

One way to address this question is to introduce a notion of simplicity. The question then becomes whether it is simpler for A2 to use U to compute P, rather than using some computer program.

And this, then, boils down to one's choice of simplicity measure.

Consider the situation where A2 wants to tell A3 how to use U to compute P. In this case, A2 must represent U somehow in the language L.

In the simplest case, A2 may represent U directly in the language, using a single expression (which may then be included in other expressions). There will then be certain rules governing the use of U in the language, such that A2 can successfully, reliably communicate "use of U to compute P" to A3 only if these rules are followed. Call this rule-set R_U. Let us assume that R_U is a finite set of expressions, and may also be expressed in the language L.

Then, the key question is whether we can have

complexity(U) < complexity(R_U)

That is, can U be less complex than the set of rules prescribing the use of its symbol S_U within the community of agents?

If we say NO, then it follows there is no use for A2 to use U internally to produce D2, in the sense that it would be simpler for A2 to just use R_U internally.

On the other hand, if we say YES, then according to the given complexity measure, it may be easier for A2 to internally make use of U, rather than to use R_U or something else finite.

So, if we choose to define complexity in terms of complexity of expression in the community's language L, then we conclude that uncomputable entities are useless for science. Because, we can always replace any uncomputable entity U with a set of rules for manipulating the symbol S_U corresponding to it.
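One way to spell this out (this is my paraphrase of the argument above, not a rigorous theorem) is to take the communication-based complexity of an entity X to be

    \[
    \mathrm{complexity}_L(X) \;=\; \min \{\, \mathrm{length}(E) \;:\; E \subseteq L,\ E \text{ enables an arbitrary agent in the community to make use of } X \,\}
    \]

Under this measure, the only way to communicate "use U" is to transmit at least the symbol S_U together with the rule-set R_U governing it, so complexity(U) cannot come out smaller than complexity(R_U) -- i.e., the answer to the YES/NO question above is NO.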

If you don't like this complexity measure, you're of course free to propose another one, and argue why it's the right one to use to understand science. In a previous blog post I've presented some of the intuitions underlying my assumption of this "communication prior" as a complexity measure underlying scientific reasoning.

The above discussion assumes that U is denoted in L by a single symbolic L-expression S_U, but the same basic argument holds if the expression of U in L is more complex.

What does all this mean about calculus, for example ... and the other lovely uses of uncomputable math to explain science data?

The question comes down to whether, for instance, we have

complexity(real number line R) < complexity(axiom system for R)

If NO, then it means the mind is better off using the axioms for R than using R directly. And, I suggest, that is what we actually do when using R in calculus. We don't use R as an "actual entity" in any strong sense, we use R as an abstract set of axioms.

What would YES mean? It would mean that somehow we, as uncomputable beings, used R as an internal source of intuition about continuity ... not thus deriving any conclusions beyond the ones obtainable using the axioms about R, but deriving conclusions in a way that we found subjectively simpler.

A Postcript about AI

And, as an aside, what does all this mean about AI? It doesn't really tell you anything definitive about whether humanlike mind can be achieved computationally. But what it does tell you is that, if
  • humanlike mind can be studied using the communicational tools of science (that is, using finite sets of finite-precision observations, and languages defined as finite strings on finite alphabets)
  • one accepts the communication prior (length of linguistic expression as a measure of complexity)
then IF mind is fundamentally noncomputational, science is of no use for studying it. Because science, as formalized here, can never distinguish between use of U and use of S_U. According to science, there will always be some computational explanation of any set of data, though whether this is the simplest explanation depends on one's choice of complexity measure.