Reading over the conversation I had (with Abram Demski) in the Comments to a prior blog post

http://multiverseaccordingtoben.blogspot.com/2008/10/are-uncomputable-entities-useless-for.html

I was reminded of a conversation I had once with my son Zarathustra when he was 4 years old.

Zar was defending his claim that he actually was omniscient, and explaining how this was consistent with his apparent ignorance on many matters. His explanation went something like this:

"I actually do know everything, Ben! It's just that with all that stuff in my memory, it can take me a really really long time to get the memories out ... years sometimes...."

Of course, Zar didn't realize Plato had been there before (since they didn't cover Plato in his pre-school...).

He also had the speculation that this infinite memory store, called his "saving box", was contained in his abdomen somewhere, separate from his ordinary, limited-scope memories in his brain. Apparently his intuition for philosophy was better than for biology... or he would have realized it was actually in the pineal gland (again, no Descartes in preschool either ;-p).

This reminded me of the hypothesis that arose in the conversation with Abram, that in effect all humans might have some kind of oracle machine in their brains.

If we all have the same internal neural oracle machine (or if, say, we all have pineal-gland antennas to the same Cosmic Oracle Machine (operated by the ghost of Larry Ellison?)), then we can communicate about the uncomputable even though our language can never actually encapsulate what it is we're talking about.

Terence McKenna, of course, had another word for these quasi-neural oracle machines: machine-elves ;-)

This means that the real goal of AGI should be to create a software program that can serve as a proper antenna 8-D

Just a little hi-fi sci-fi weirdness to brighten up your day ... I seem to have caught a bad cold and it must be interfering with my thought processes ... or messing up the reception of my pineal antenna ...

P.S.

perhaps some evidence for Zar's saving-box theory:

http://hubpages.com/hub/Cellular-Memories-in-Organ-Transplant-Recipients

## Thursday, October 30, 2008

## Tuesday, October 28, 2008

### Random Memory of a Creative Mind (Paul Feyerabend)

I had a brief but influential (for me: I'm sure he quickly forgot it) correspondence with the philosopher-of-science Paul Feyerabend when I was 19.

I sent him a philosophical manuscript of mine, printed on a crappy dot matrix printer ... I think it was called "Lies and False Truths." I asked him to read it, and also asked his advice on where I should go to grad school to study philosophy. I was in the middle of my second year of grad school, working toward my PhD in math, but I was having second thoughts about math as a career....

He replied with a densely written postcard, saying he wasn't going to read my book because he was spending most of his time on non-philosophy pursuits ... but that he'd glanced it over and it looked creative and interesting (or something like that: I forget the exact words) ... and, most usefully, telling me that if I wanted to be a real philosopher I should not study philosophy academically nor become a philosophy professor, but should study science and/or arts and then pursue philosophy independently.

His advice struck the right chord, and the temporary insanity that had caused me to briefly consider becoming a professional philosopher vanished into the mysterious fog from which it had emerged ...

(I think there may have been another couple brief letters back and forth too, not sure...)

(I had third thoughts about math grad school about 6 months after that, and briefly moved to Vegas to become a telemarketer and Henry-Miller-meets-Nietzsche style prose-poem ranter ... but that's another story ... and anyways I went back to grad school and completed my PhD fairly expeditiously by age 22...)

P.S.

Even at that absurdly young age (but even more so now), I had a lot of disagreements with Feyerabend's ideas on philosophy of science -- but I loved his contentious, informal-yet-rigorous, individualistic style. He thought for himself, not within any specific school of thought or tradition. That's why I wrote to him -- I viewed him as a sort of kindred maverick (if that word is still usable anymore, given what Maverick McCain has done to it ... heh ;-p)

My own current philosophy of science has very little to do with his, but, I'm sure we would have enjoyed arguing the issues together!

He basically argued that science was a social phenomenon with no fixed method. He gave lots of wonderful examples of how creative scientists had worked outside of any known methods.

While I think that's true, I don't think it's the most interesting observation one can make about science ... it seems to me there are some nice formal models you can posit that are good approximations explaining a lot about the social phenomenon of science, even though they're not complete explanations. The grungy details (in chronological order) are at:

But, one thing I did take from Feyerabend and his friend/argument-partner Imre Lakatos was the need to focus on science as a social phenomenon. What I've tried to do in my own philosophy of science is to pull together the social-phenomenon perspective with the Bayesian-statistics/algorithmic-information perspective on science.... But, as usual, I digress!

### hiccups on the path to superefficient financial markets

A political reporter emailed me the other day asking my opinion on the role AI technology played in the recent financial crisis, and what this might imply for the future of finance.

Here's what I told him. Probably it freaked him out so much he deleted it and wiped it from his memory, but hey...

There's no doubt that advanced software programs using AI and other complex techniques played a major role in the current global financial crisis. However, it's also true that the risks and limitations of these software programs were known by many of the people involved, and in many cases were ignored intentionally rather than out of ignorance.

To be more precise: the known mathematical and AI techniques for estimating the risk of complex financial instruments (like credit default swaps, and various other exotic derivatives) all depend on certain assumptions. At this stage, some human intelligence is required to figure out whether the assumptions of a given mathematical technique really apply in a certain real-world situation. So, if one is confronted with a real-world situation where it's unclear whether the assumptions of a certain mathematical technique really apply, it's a human decision whether to apply the technique or not.

A historical example of this problem was the LTCM debacle in the 90's. In that case, the mathematical techniques used by LTCM assumed that the economies of various emerging markets were largely statistically independent. Based on that assumption, LTCM entered into some highly leveraged investments that were low-risk unless the assumption failed. The assumption failed.
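To see how much can ride on an independence assumption, here is a minimal sketch using the textbook variance formula for an equal-weighted portfolio of identically distributed assets. It is not LTCM's actual model (which isn't described here); the function name and all the numbers are illustrative assumptions.

```python
def equal_weight_portfolio_variance(
    n_assets: int, asset_variance: float, correlation: float
) -> float:
    """Variance of the average of n returns that share the same variance
    and the same pairwise correlation (standard textbook formula)."""
    return asset_variance / n_assets * (1 + (n_assets - 1) * correlation)


# Hypothetical numbers: ten markets, unit variance each.
assumed_risk = equal_weight_portfolio_variance(10, 1.0, 0.0)  # "independent" assumption
actual_risk = equal_weight_portfolio_variance(10, 1.0, 0.5)   # markets move together

print(assumed_risk, actual_risk)
```

With these made-up inputs the correlated portfolio carries several times the variance the independence assumption predicts, and leverage multiplies whatever error is made at this step.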

Similarly, more recently, Iceland's financial situation was mathematically assessed to be stable, based on the assumption that (to simplify a little bit) a large number of depositors wouldn't decide to simultaneously withdraw a lot of their money. This assumption had never been violated in past situations that were judged as relevant. Oops.

A related, obvious phenomenon is that sometimes humans assigned the job of assessing risk are given a choice between:

1. assessing risk according to a technique whose assumptions don't really apply to the real-world situation, or whose applicability is uncertain
2. saying "sorry, I don't have any good technique for assessing the risk of this particular financial instrument"

Naturally, the choice commonly taken is 1 rather than 2.

In another decade or two, I'd predict, we'll have yet more intelligent software, which is able to automatically assess whether the assumptions of a certain mathematical technique are applicable in a certain context. That would avoid the sort of problem we've recently seen.

So the base problem is that the software we have now is good at making predictions and assessments based on contextual assumptions ... but it is bad at assessing the applicability of contextual assumptions. The latter is left to humans, who often make decisions based on emotional bias, personal greed and so forth rather than rationality.

Obviously, the fact that a fund manager shares more in their fund's profit than in its loss has some impact on their assessments. This will bias fund managers to take risks, because if the gamble comes out well, they get a huge bonus, but if it comes out badly, the worst that happens is that they find another job.
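The incentive asymmetry can be put in toy arithmetic. The sketch below assumes a deliberately crude pay scheme (a fixed share of any profit, no share of any loss); the function names, probabilities and amounts are all hypothetical.

```python
def expected_fund_profit(p_win: float, gain: float, loss: float) -> float:
    """Expected profit of the fund on a single gamble."""
    return p_win * gain - (1 - p_win) * loss


def expected_manager_bonus(p_win: float, gain: float, bonus_rate: float) -> float:
    """Expected bonus when the manager shares in profits but not in losses."""
    return p_win * bonus_rate * gain


# Hypothetical gamble: 50% chance of +100, 50% chance of -150, 20% bonus on profit.
fund_ev = expected_fund_profit(0.5, 100.0, 150.0)
manager_ev = expected_manager_bonus(0.5, 100.0, 0.2)

print(fund_ev, manager_ev)
```

Under these assumptions the gamble loses money for the fund on average while still having positive expected value for the manager, which is the bias described above.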

My feeling is that these sorts of problems we've seen recently are hiccups on the path to superefficient financial markets based on advanced AI. But it's hard to say exactly how long it will take for AI to achieve the needed understanding of context, to avoid this sort of "minor glitch."

P.S.

After I posted the above, there was a followup discussion on the AGI mailing list, in which someone asked me about applications of AGI to investment.

My reply was:

1)

Until we have a generally very powerful AGI, application of AI to finance will be in the vein of narrow-AI. Investment is a hard problem, not for toddler-minds.

Narrow-AI applications to finance can be fairly broad in nature though, e.g. I helped build a website called stockmood.com that analyzes financial sentiment in news.

2)

Once we have a system with roughly adult-human-level AGI, then of course it will be possible to create specialized versions of this that are oriented toward trading, and these will be far superior to humans or narrow AIs at trading the markets, and whoever owns them will win a lot of everybody's money unless the government stops them.

P.P.S.

Someone on a mailing list pushed back on my mention of "AI and other mathematical techniques."

This seems worth clarifying, because the line between narrow-AI and other-math-techniques is really very fuzzy.

To give an indication of how fuzzy the line is ... consider the (very common) case of multiextremal optimization.

GA's are optimization algorithms that are considered AI ... but, is multi-start hillclimbing AI? Many would say so. Yet, some multiextremal optimization algorithms are considered operations research instead of AI -- say, multistart conjugate gradients...
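For readers who haven't met it, multi-start hillclimbing is about as simple as multiextremal optimization gets, which is part of why its classification is debatable. Here is a minimal sketch on a one-dimensional discrete landscape; the landscape, the start points and the function names are made up for illustration, and real implementations typically pick the start points at random.

```python
from typing import Iterable, Sequence


def hill_climb(landscape: Sequence[float], start: int) -> int:
    """Greedy local search: step to the better neighbour until none improves."""
    current = start
    while True:
        neighbours = [i for i in (current - 1, current + 1) if 0 <= i < len(landscape)]
        if not neighbours:
            return current
        best = max(neighbours, key=lambda i: landscape[i])
        if landscape[best] <= landscape[current]:
            return current  # local maximum reached
        current = best


def multi_start_hill_climb(landscape: Sequence[float], starts: Iterable[int]) -> int:
    """Run hill_climb from several start points and keep the best local maximum."""
    return max((hill_climb(landscape, s) for s in starts), key=lambda i: landscape[i])


landscape = [1, 3, 2, 5, 4, 9, 0]  # three local maxima: indices 1, 3 and 5
single = hill_climb(landscape, 0)
multi = multi_start_hill_climb(landscape, [0, 2, 4])

print(single, multi)
```

A single climb from index 0 gets stuck on the first small peak, while restarting from several points finds the tallest one. Nothing here looks more "intelligent" than a multistart gradient method, which is the point about the fuzzy boundary.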

Similarly, backprop NN's are considered AI .. yet, polynomial or exponential regression algorithms aren't. But they pretty much do the same stuff...

Or, think about assessment of credit risk, to determine who is allowed to get what kind of mortgage. This is done by AI data mining algorithms. OTOH it could also be done by some statistical algorithms that wouldn't normally be called AI (though I think it is usually addressed using methods like frequent itemset mining and decision trees, that are considered AI).

### Are Uncomputable Entities Useless for Science?

When I first learned about uncomputable numbers, I was profoundly disturbed. One of the first things you prove about uncomputable numbers, when you encounter them in advanced math classes, is that it is provably never possible to explicitly display any example of an uncomputable number. But nevertheless, you can prove that (in a precise mathematical sense) "almost all" numbers on the real number line are uncomputable. This is proved indirectly, by showing that the real number line as a whole has one order of infinity (the cardinality of the continuum, which is aleph-one if you assume the continuum hypothesis) and the set of all computer programs has another, smaller order of infinity (aleph-null).
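The aleph-null half of that argument is easy to make concrete: every program is a finite string over a finite alphabet, and all such strings can be listed one after another. The sketch below does this for a two-symbol alphabet (the function name and the choice of alphabet are just for illustration). Cantor's diagonal argument shows that no such list can cover the real numbers.

```python
from itertools import count, islice, product
from typing import Iterator


def all_finite_strings(alphabet: str = "01") -> Iterator[str]:
    """Yield every finite string over the alphabet, shortest first.

    Any particular string shows up after finitely many steps, which is
    what it means for the set of finite strings to be countable.
    """
    for length in count(0):
        for symbols in product(alphabet, repeat=length):
            yield "".join(symbols)


first_seven = list(islice(all_finite_strings(), 7))
print(first_seven)
```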

I never liked this, and I burned an embarrassing amount of time back then (I guess this was from ages 16-20) trying to find some logical inconsistency there. Somehow, I thought, it must be possible to prove this notion of "a set of things, none of which can ever actually be precisely characterized by any finite description" as inconsistent, as impossible.

Of course, try as I might, I found no inconsistency with the math -- only inconsistency with my own human intuitions.

And of course, I wasn't the first to tread that path (and I knew it). There's a philosophy of mathematics called "constructivism" which essentially bans any kind of mathematical entity whose existence can only be proved indirectly. Related to this is a philosophy of math called "intuitionism."

A problem with these philosophies of math is that they rule out some of the branches of math I most enjoy: I always favored continuous math -- real analysis, complex analysis, functional analysis -- over discrete math about finite structures. And of course these are incredibly useful branches of math: for instance, they underlie most of physics.

These continuity-based branches of math also underlie, for example, mathematical finance, even though the world of financial transactions is obviously discrete and computable, so one can't possibly need uncomputable numbers to handle it.

There always seemed to me something deeply mysterious in the way the use of the real line, with its unacceptably mystical uncomputable numbers, made practical mathematics in areas like physics and finance so much easier.

Notice, this implicitly uncomputable math is never necessary in these applications. You could reformulate all the equations of physics or finance in terms of purely discrete, finite math; and in most real applications, these days, the continuous equations are solved using discrete approximations on computers anyway. But, the theoretical math (that's used to figure out which discrete approximations to run on the computer) often comes out more nicely in the continuous version than the discrete version. For instance, the rules of traditional continuous calculus are generally far simpler and more elegant than the rules of discretized calculus.
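As one small illustration of the "simpler and more elegant" point, compare product rules. In continuous calculus, (fg)' = f g' + g f'. For forward differences with unit step there is an extra cross term: Δ(fg) = f Δg + g Δf + Δf Δg. The sketch below just checks this numerically; the two example functions are arbitrary choices.

```python
from typing import Callable


def forward_difference(func: Callable[[int], int], n: int) -> int:
    """Discrete analogue of the derivative with unit step: func(n+1) - func(n)."""
    return func(n + 1) - func(n)


def f(n: int) -> int:
    return n * n


def g(n: int) -> int:
    return 3 * n + 1


def fg(n: int) -> int:
    return f(n) * g(n)


n = 4
df = forward_difference(f, n)
dg = forward_difference(g, n)

continuous_style = f(n) * dg + g(n) * df   # what the continuous rule would suggest
discrete_rule = continuous_style + df * dg  # the correct discrete rule adds a cross term
exact = forward_difference(fg, n)

print(continuous_style, discrete_rule, exact)
```

The cross term disappears in the continuous limit, which is one concrete sense in which the continuous rules come out cleaner.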

And, note that the uncomputability is always in the background when you're using continuous mathematics. Since you can't explicitly write down any of these uncomputable numbers anyway, they don't play much role in your practical work with continuous math. But the math you're using, in some sense, implies their "existence."

But what does "existence" mean here?

To quote former President Bill Clinton, "it depends upon what the meaning of the word 'is' is."

A related issue arises in the philosophy of AI. Most AI theorists believe that human-like intelligence can ultimately be achieved within a digital computer program (most of them are in my view overpessimistic about how long it's going to take us to figure out exactly how to write such a program, but that's another story). But some mavericks, most notably Roger Penrose, have argued otherwise (see his books The Emperor's New Mind and Shadows of the Mind, for example). Penrose has argued specifically that the crux of human intelligence is some sort of mental manipulation of uncomputable entities.

And Penrose has also gone further: he's argued that some future theory of physics is going to reveal that the dynamics of the physical world is also based on the interaction of uncomputable entities. So that mind is an uncomputable consequence of uncomputable physical reality.

This argument always disturbed me, also. There always seemed something fundamentally wrong to me about the notion of "uncomputable physics." Because, science is always, in the end, about finite sets of finite-precision data. So, how could these mysterious uncomputable entities ever really be necessary to explain this finite data?

Obviously, it seemed to me, they could never be necessary. Any finite dataset has a finite explanation. But the question then becomes whether in some cases invoking uncomputable entities is the best way to explain some finite dataset. Can the best way of explaining some set of, say, 10 or 1000 or 1000000 numbers be "This uncomputable process, whose details you can never write down or communicate in ordinary language in a finite amount of time, generated these numbers"?

This really doesn't make sense to me. It seems intuitively wrong -- more clearly and obviously so than the notion of the "existence" of uncomputable numbers and other uncomputable entities in some abstract mathematical sense.

So, my goal in this post is to give a careful explanation of why this is wrong. The argument I'm going to give here could be fully formalized as mathematics, but I don't have the time for that right now, so I'll just give it semi-verbally/semi-mathematically, and I'll try to choose my words carefully.

As often happens, the matter turned out to be a little subtler than I initially thought it would be. To argue that uncomputables are useless for science, one needs some specific formal model of what science itself is. And this is of course a contentious issue. However, if one does adopt the formalization of science that I suggest, then the scientific uselessness of uncomputables falls out fairly straightforwardly. (And I note that this was certainly not my motivation for conceiving the formal model of science I'll suggest; I cooked it up a while ago for quite other reasons.)

Maybe someone else could come up with a different formal model of science that gives a useful role to uncomputable entities ... though one could then start a meta-level analysis of the usefulness of this kind of formal model of science! But I'll defer that till next year ;-)

Even though it's not wholly rigorous math, this is a pretty mathematical blog post that will make for slow reading. But if you have suitable background and are willing to slog through it, I think you'll find it an interesting train of thought.

NOTE: the motivation to write up these ideas (which have been bouncing around in my head for ages) emerged during email discussions on the AGI list with a large group, most critically Abram Demski, Eric Baum and Mark Waser.

A Simple Formalization of the Scientific Process

I'll start by giving a simplified formalization of the process of science.

This formalization is related to the philosophy of science I outlined in the essay http://www.goertzel.org/dynapsyc/2004/PhilosophyOfScience_v2.htm (included in The Hidden Pattern) and more recently extended in the blog post http://multiverseaccordingtoben.blogspot.com/2008/10/reflections-on-religulous-and.html. But those prior writings consider many aspects not discussed here.

Let's consider a community of agents that use some language L to communicate. By a language, what I mean here is simply a set of finite symbol-sequences ("expressions"), utilizing a finite set of symbols.

Assume that a dataset (i.e., a finite set of finite-precision observations) can be expressed as a set of pairs of expressions in the language L. So a dataset D can be viewed as a set of pairs

((d11, d12), (d21,d22) ,..., (dn1,dn2))

or else as a pair D=(D1,D2) where

D1=(d11,...,dn1)

D2=(d12,...,dn2)

Then, define an explanation of a dataset D as a set E_D of expressions in L, so that if one agent A1 communicates E_D to another agent A2 that has seen D1 but not D2, nevertheless A2 is able to reproduce D2.

(One can look at precise explanations versus imprecise ones, where an imprecise explanation means that A2 is able to reproduce D2 only approximately, but this doesn't affect the argument significantly, so I'll leave this complication out from here on.)

If D2 is large, then for E_D to be an interesting explanation, it should be more compact than D2.

Note that I am not requiring E_D to generate D2 from D1 on its own. I am requiring that A2 be able to generate D2 based on E_D and D1. Since A2 is an arbitrary member of the community of agents, the validity of an explanation, as I'm defining it here, is relative to the assumed community of agents.

Note also that, although expressions in L are always finitely describable, that doesn't mean that the agents A1, A2, etc. are. According to the framework I've set up here, these agents could be infinite, uncomputable, and so forth. I'm not assuming anything special about the agents, but I am considering them in the special context of finite communications about finite observations.

The above is my formalization of the scientific process, in a general and abstract sense. According to this formalization, science is about communities of agents linguistically transmitting to each other knowledge about how to predict some commonly-perceived data, given some other commonly-perceived data.
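To make the definitions concrete, here is a toy instance in which expressions in L are plain strings and the receiving agent A2 is an ordinary function. Everything specific in it is an assumption made for illustration: the "slope,intercept" encoding of the explanation, the particular agent, and the use of character count as a stand-in for compactness.

```python
from typing import Callable, List, Sequence

Expression = str  # a finite symbol sequence in the language L
Agent = Callable[[Sequence[Expression], Sequence[Expression]], List[Expression]]


def linear_agent(explanation: Sequence[Expression], d1: Sequence[Expression]) -> List[Expression]:
    """Hypothetical agent A2: reads E_D as "slope,intercept" and applies it to D1."""
    slope, intercept = (int(part) for part in explanation[0].split(","))
    return [str(slope * int(x) + intercept) for x in d1]


def is_explanation(agent: Agent, explanation: Sequence[Expression],
                   d1: Sequence[Expression], d2: Sequence[Expression]) -> bool:
    """E_D explains D (relative to this agent) if the agent rebuilds D2 from E_D and D1."""
    return agent(explanation, d1) == list(d2)


def size(expressions: Sequence[Expression]) -> int:
    """Crude size measure: total number of symbols."""
    return sum(len(e) for e in expressions)


D1 = ["0", "1", "2", "3", "4", "5"]
D2 = ["7", "10", "13", "16", "19", "22"]
E_D = ["3,7"]

valid = is_explanation(linear_agent, E_D, D1, D2)
compact = size(E_D) < size(D2)
print(valid, compact)
```

Note that the explanation only works relative to an agent who knows how to read it: hand the same E_D to an agent with different conventions and it explains nothing, which is why validity is defined relative to the community.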

The (Dubious) Scientific Value of the Uncomputable

Next, getting closer to the theme of this post, I turn to consider the question of what use it might be for A2 to employ some uncomputable entity U in the process of using E_D to generate D2 from D1. My contention is that, under some reasonable assumptions, there is no value to A2 in using uncomputable entities in this context.

D1 and E_D are sets of L-expressions, and so is D2. So what A2 is faced with, is a problem of mapping one set of L-expressions into another.

Suppose that A2 uses some process P to carry out this mapping. Then, if we represent each set of L-expressions as a bit string (which may be done in a variety of different, straightforward ways), P is then a mapping from bit strings into bit strings. To keep things simple we can assume some maximum size cap on the size of the bit strings involved (corresponding for instance to the maximum size expression-set that can be uttered by any agent during a trillion years).

The question then becomes whether it is somehow useful for A2 to use some uncomputable entity U to compute P, rather than using some sort of set of discrete operations comparable to a computer program.

One way to address this question is to introduce a notion of simplicity. The question then becomes whether it is simpler for A2 to use U to compute P, rather than using some computer program.

And this, then, boils down to one's choice of simplicity measure.

Consider the situation where A2 wants to tell A3 how to use U to compute P. In this case, A2 must represent U somehow in the language L.

In the simplest case, A2 may represent U directly in the language, using a single expression S_U (which may then be included in other expressions). There will then be certain rules governing the use of S_U in the language, such that A2 can successfully, reliably communicate "use of U to compute P" to A3 only if these rules are followed. Call this rule-set R_U. Let us assume that R_U is a finite set of expressions, and may also be expressed in the language L.

Then, the key question is whether we can have

complexity(U) < complexity(R_U)

That is, can U be less complex than the set of rules prescribing the use of its symbol S_U within the community of agents?

If we say NO, then it follows there is no use for A2 to use U internally to produce D2, in the sense that it would be simpler for A2 to just use R_U internally.

On the other hand, if we say YES, then according to the given complexity measure, it may be easier for A2 to internally make use of U, rather than to use R_U or something else finite.

So, if we choose to define complexity in terms of complexity of expression in the community's language L, then we conclude that uncomputable entities are useless for science. Because, we can always replace any uncomputable entity U with a set of rules for manipulating the symbol S_U corresponding to it.

If you don't like this complexity measure, you're of course free to propose another one, and argue why it's the right one to use to understand science. In a previous blog post I've presented some of the intuitions underlying my assumption of this "communication prior" as a complexity measure underlying scientific reasoning.

The above discussion assumes that U is denoted in L by a single symbolic L-expression S_U, but the same basic argument holds if the expression of U in L is more complex.

What does all this mean about calculus, for example ... and the other lovely uses of uncomputable math to explain science data?

The question comes down to whether, for instance, we have

complexity(real number line R) <>

If NO, then it means the mind is better off using the axioms for R than using R directly. And, I suggest, that is what we actually do when using R in calculus. We don't use R as an "actual entity" in any strong sense, we use R as an abstract set of axioms.

What would YES mean? It would mean that somehow we, as uncomputable beings, used R as an internal source of intuition about continuity ... not thus deriving any conclusions beyond the ones obtainable using the axioms about R, but deriving conclusions in a way that we found subjectively simpler.

A Postcript about AI

And, as an aside, what does all this mean about AI? It doesn't really tell you anything definitive about whether humanlike mind can be achieved computationally. But what it does tell you is that, if

I never liked this, and I burned an embarrassing amount of time back then (I guess this was from ages 16-20) trying to find some logical inconsistency there. Somehow, I thought, it must be possible to prove this notion of "a set of things, none of which can ever actually be precisely characterized by any finite description" as inconsistent, as impossible.

Of course, try as I might, I found no inconsistency with the math -- only inconsistency with my own human intuitions.

And of course, I wasn't the first to tread that path (and I knew it). There's a philosophy of mathematics called "constructivism" which essentially bans any kind of mathematical entity whose existence can only be proved indirectly. Related to this is a philosophy of math called "intuitionism."

A problem with these philosophies of math is that they rule out some of the branches of math I most enjoy: I always favored continuous math -- real analysis, complex analysis, functional analysis -- over discrete math about finite structures. And of course these are incredibly useful branches of math: for instance, they underlie most of physics.

These continuity-based branches of math also underlie, for example, mathematical finance, even though the world of financial transactions is obviously discrete and computable, so one can't possibly need uncomputable numbers to handle it.

There always seemed to me something deeply mysterious in the way the use of the real line, with its unacceptably mystical uncomputable numbers, made practical mathematics in areas like physics and finance so much easier.

Notice, this implicitly uncomputable math is never necessary in these applications. You could reformulate all the equations of physics or finance in terms of purely discrete, finite math; and in most real applications, these days, the continuous equations are solved using discrete approximations on computers anyway. But, the theoretical math (that's used to figure out which discrete approximations to run on the computer) often comes out more nicely in the continuous version than the discrete version. For instance, the rules of traditional continuous calculus are generally far simpler and more elegant than the rules of discretized calculus.
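One standard illustration of that last point, taking the forward difference with step h as the assumed discretization of the derivative (the notation \Delta_h is introduced here only for this example): the continuous product rule has two terms, while its discrete counterpart picks up an extra cross term.

```latex
\[
\frac{d}{dx}\bigl(f g\bigr) = f\,\frac{dg}{dx} + g\,\frac{df}{dx}
\]
\[
\Delta_h (f g)(x) = f(x)\,\Delta_h g(x) + g(x)\,\Delta_h f(x) + \Delta_h f(x)\,\Delta_h g(x),
\qquad \Delta_h f(x) = f(x+h) - f(x)
\]
```

The cross term is what vanishes in the limit, and it is the kind of bookkeeping that makes the discretized rules less elegant.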

And, note that the uncomputability is always in the background when you're using continuous mathematics. Since you can't explicitly write down any of these uncomputable numbers anyway, they don't play much role in your practical work with continuous math. But the math you're using, in some sense, implies their "existence."

But what does "existence" mean here?

To quote former President Bill Clinton, "It depends upon what the meaning of the word 'is' is."

A related issue arises in the philosophy of AI. Most AI theorists believe that human-like intelligence can ultimately be achieved within a digital computer program (most of them are in my view overpessimistic about how long it's going to take us to figure out exactly how to write such a program, but that's another story). But some mavericks, most notably Roger Penrose, have argued otherwise (see his books The Emperor's New Mind and Shadows of the Mind, for example). Penrose has argued specifically that the crux of human intelligence is some sort of mental manipulation of uncomputable entities.

And Penrose has also gone further: he's argued that some future theory of physics is going to reveal that the dynamics of the physical world is also based on the interaction of uncomputable entities. So that mind is an uncomputable consequence of uncomputable physical reality.

This argument always disturbed me, also. There always seemed something fundamentally wrong to me about the notion of "uncomputable physics." Because, science is always, in the end, about finite sets of finite-precision data. So, how could these mysterious uncomputable entities ever really be necessary to explain this finite data?

Obviously, it seemed to me, they could never be necessary. Any finite dataset has a finite explanation. But the question then becomes whether in some cases invoking uncomputable entities is the best way to explain some finite dataset. Can the best way of explaining some set of, say, 10 or 1000 or 1000000 numbers be "This uncomputable process, whose details you can never write down or communicate in ordinary language in a finite amount of time, generated these numbers"?

This really doesn't make sense to me. It seems intuitively wrong -- more clearly and obviously so than the notion of the "existence" of uncomputable numbers and other uncomputable entities in some abstract mathematical sense.

So, my goal in this post is to give a careful explanation of why this is wrong. The argument I'm going to give here could be fully formalized as mathematics, but I don't have the time for that right now, so I'll just give it semi-verbally/semi-mathematically, choosing my words carefully.

As often happens, the matter turned out to be a little subtler than I initially thought it would be. To argue that uncomputables are useless for science, one needs some specific formal model of what science itself is. And this is of course a contentious issue. However, if one does adopt the formalization of science that I suggest, then the scientific uselessness of uncomputables falls out fairly straightforwardly. (And I note that this was certainly not my motivation for conceiving the formal model of science I'll suggest; I cooked it up a while ago for quite other reasons.)

Maybe someone else could come up with a different formal model of science that gives a useful role to uncomputable entities ... though one could then start a meta-level analysis of the usefulness of this kind of formal model of science! But I'll defer that till next year ;-)

Even though it's not wholly rigorous math, this is a pretty mathematical blog post that will make for slow reading. But if you have suitable background and are willing to slog through it, I think you'll find it an interesting train of thought.

NOTE: the motivation to write up these ideas (which have been bouncing around in my head for ages) emerged during email discussions on the AGI list with a large group, most critically Abram Demski, Eric Baum and Mark Waser.

A Simple Formalization of the Scientific Process

I'll start by giving a simplified formalization of the process of science.

This formalization is related to the philosophy of science I outlined in the essay http://www.goertzel.org/dynapsyc/2004/PhilosophyOfScience_v2.htm (included in The Hidden Pattern) and more recently extended in the blog post http://multiverseaccordingtoben.blogspot.com/2008/10/reflections-on-religulous-and.html. But those prior writings consider many aspects not discussed here.

Let's consider a community of agents that use some language L to communicate. By a language, what I mean here is simply a set of finite symbol-sequences ("expressions"), utilizing a finite set of symbols.

Assume that a dataset (i.e., a finite set of finite-precision observations) can be expressed as a set of pairs of expressions in the language L. So a dataset D can be viewed as a set of pairs

((d11,d12), (d21,d22), ..., (dn1,dn2))

or else as a pair D=(D1,D2) where

D1=(d11,...,dn1)

D2=(d12,...,dn2)

Then, define an explanation of a dataset D as a set E_D of expressions in L, so that if one agent A1 communicates E_D to another agent A2 that has seen D1 but not D2, nevertheless A2 is able to reproduce D2.

(One can look at precise explanations versus imprecise ones, where an imprecise explanation means that A2 is able to reproduce D2 only approximately, but this doesn't affect the argument significantly, so I'll leave this complication out from here on.)

If D2 is large, then for E_D to be an interesting explanation, it should be more compact than D2.
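The definitions so far can be sketched in a few lines of Python. This is only a toy rendering under stated assumptions: expressions are plain strings, an agent is modeled as a function from (E_D, D1) to its reconstruction of D2, "more compact" is read as "fewer symbols in total", and the names `is_explanation`, `is_interesting` and `toy_agent` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Expression = str  # a finite symbol sequence in the language L
Dataset = Sequence[Tuple[Expression, Expression]]  # pairs (d_i1, d_i2)
# An agent maps (explanation E_D, observed D1) to its reconstruction of D2.
Agent = Callable[[Sequence[Expression], Sequence[Expression]], List[Expression]]


def split(dataset: Dataset) -> Tuple[List[Expression], List[Expression]]:
    """View D as the pair (D1, D2)."""
    d1 = [first for first, _ in dataset]
    d2 = [second for _, second in dataset]
    return d1, d2


def size(expressions: Sequence[Expression]) -> int:
    """Assumed compactness measure: total number of symbols."""
    return sum(len(e) for e in expressions)


def is_explanation(e_d: Sequence[Expression], dataset: Dataset,
                   community: Sequence[Agent]) -> bool:
    """E_D explains D if every agent in the community, given E_D and D1,
    reproduces D2 exactly (the precise, not the approximate, case)."""
    d1, d2 = split(dataset)
    return all(agent(e_d, d1) == d2 for agent in community)


def is_interesting(e_d: Sequence[Expression], dataset: Dataset,
                   community: Sequence[Agent]) -> bool:
    """An interesting explanation is also more compact than D2 itself."""
    _, d2 = split(dataset)
    return is_explanation(e_d, dataset, community) and size(e_d) < size(d2)


def toy_agent(e_d: Sequence[Expression], d1: Sequence[Expression]) -> List[Expression]:
    """Hypothetical agent that understands exactly one explanation, 'double'."""
    if list(e_d) == ["double"]:
        return [str(2 * int(x)) for x in d1]
    return []


# Toy dataset: each observation pairs a number with its double.
D = [(str(n), str(2 * n)) for n in range(100, 120)]
print(is_interesting(["double"], D, [toy_agent]))
```

Note that the check runs over the agents, not over E_D alone: the same explanation can be valid for one community and meaningless for another.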

Note that I am not requiring E_D to generate D2 from D1 on its own. I am requiring that A2 be able to generate D2 based on E_D and D1. Since A2 is an arbitrary member of the community of agents, the validity of an explanation, as I'm defining it here, is relative to the assumed community of agents.

Note also that, although expressions in L are always finitely describable, that doesn't mean that the agents A1, A2, etc. are. According to the framework I've set up here, these agents could be infinite, uncomputable, and so forth. I'm not assuming anything special about the agents, but I am considering them in the special context of finite communications about finite observations.

The above is my formalization of the scientific process, in a general and abstract sense. According to this formalization, science is about communities of agents linguistically transmitting to each other knowledge about how to predict some commonly-perceived data, given some other commonly-perceived data.

The (Dubious) Scientific Value of the Uncomputable

Next, getting closer to the theme of this post, I turn to consider the question of what use it might be for A2 to employ some uncomputable entity U in the process of using E_D to generate D2 from D1. My contention is that, under some reasonable assumptions, there is no value to A2 in using uncomputable entities in this context.

D1 and E_D are sets of L-expressions, and so is D2. So what A2 is faced with, is a problem of mapping one set of L-expressions into another.

Suppose that A2 uses some process P to carry out this mapping. Then, if we represent each set of L-expressions as a bit string (which may be done in a variety of different, straightforward ways), P is a mapping from bit strings into bit strings. To keep things simple we can assume some cap on the size of the bit strings involved (corresponding for instance to the maximum size expression-set that can be uttered by any agent during a trillion years).
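Here is a minimal sketch of one such encoding, assuming UTF-8 text with a newline as separator (so a newline must not be a symbol of L) and a small hypothetical cap `MAX_BITS`. It also shows why the cap matters: once inputs are bounded, the behavior of P on any finite set of inputs can be recorded as a finite table. The function names are illustrative only.

```python
from typing import Callable, Dict, Iterable, List

MAX_BITS = 256  # hypothetical size cap on any encoded expression set


def encode(expressions: Iterable[str]) -> str:
    """Encode a sequence of expressions as a bit string (one scheme of many)."""
    joined = "\n".join(expressions)  # assumes "\n" is not a symbol of L
    bits = "".join(format(byte, "08b") for byte in joined.encode("utf-8"))
    if len(bits) > MAX_BITS:
        raise ValueError("expression set exceeds the size cap")
    return bits


def decode(bits: str) -> List[str]:
    """Invert encode() for non-empty expression sequences."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8").split("\n")


def tabulate(process: Callable[[List[str]], List[str]],
             inputs: Iterable[List[str]]) -> Dict[str, str]:
    """Record P, restricted to the given inputs, as a bit-string lookup table."""
    return {encode(x): encode(process(x)) for x in inputs}


def shout(expressions: List[str]) -> List[str]:
    """Stand-in for P; any bounded mapping could be tabulated the same way."""
    return [e.upper() for e in expressions]


table = tabulate(shout, [["ab", "cd"], ["x"]])
print(decode(table[encode(["ab", "cd"])]))
```

Whatever A2 does internally to compute P, the input-output behavior that the rest of the community can observe is captured by a table of this kind.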

The question then becomes whether it is somehow useful for A2 to use some uncomputable entity U to compute P, rather than using some sort of set of discrete operations comparable to a computer program.

One way to address this question is to introduce a notion of simplicity. The question then becomes whether it is simpler for A2 to use U to compute P, rather than using some computer program.

And this, then, boils down to one's choice of simplicity measure.

Consider the situation where A2 wants to tell A3 how to use U to compute P. In this case, A2 must represent U somehow in the language L.

In the simplest case, A2 may represent U directly in the language, using a single expression S_U (which may then be included in other expressions). There will then be certain rules governing the use of S_U in the language, such that A2 can successfully, reliably communicate "use of U to compute P" to A3 only if these rules are followed. Call this rule-set R_U. Let us assume that R_U is a finite set of expressions, and may also be expressed in the language L.

Then, the key question is whether we can have

complexity(U) < complexity(R_U)

That is, can U be less complex than the set of rules prescribing the use of its symbol S_U within the community of agents?

If we say NO, then it follows there is no use for A2 to use U internally to produce D2, in the sense that it would be simpler for A2 to just use R_U internally.

On the other hand, if we say YES, then according to the given complexity measure, it may be easier for A2 to internally make use of U, rather than to use R_U or something else finite.

So, if we choose to define complexity in terms of complexity of expression in the community's language L, then the answer is NO, and we conclude that uncomputable entities are useless for science: we can always replace any uncomputable entity U with a set of rules for manipulating the symbol S_U corresponding to it.
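A toy rendering of this measure, under two loudly stated assumptions: complexity is simply the total symbol count of a set of L-expressions, and the only cost of U that the community can register is the cost of communicating its symbol together with its usage rules. The rules listed in `R_U` are made up for illustration.

```python
from typing import Sequence


def complexity(expressions: Sequence[str]) -> int:
    """Assumed communication prior: total length of the linguistic expressions."""
    return sum(len(e) for e in expressions)


# U itself is not an L-expression. What can be written down is its symbol
# and the finite rule-set governing that symbol (both hypothetical here).
S_U = "U"
R_U = [
    "U takes one bit string and returns one bit string",
    "U(x) may appear wherever a bit string may appear",
]


def communicable_cost(symbol: str, rules: Sequence[str]) -> int:
    """Cost of telling another agent how to use U: the symbol plus its rules."""
    return complexity([symbol]) + complexity(rules)


# Under these assumptions, using U is never cheaper than using R_U alone.
print(communicable_cost(S_U, R_U) >= complexity(R_U))
```

The inequality holds by construction, which is the point: with complexity defined on expressions in L, there is nothing else about U for the measure to see.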

If you don't like this complexity measure, you're of course free to propose another one, and argue why it's the right one to use to understand science. In a previous blog post I've presented some of the intuitions underlying my assumption of this "communication prior" as a complexity measure underlying scientific reasoning.

The above discussion assumes that U is denoted in L by a single symbolic L-expression S_U, but the same basic argument holds if the expression of U in L is more complex.

What does all this mean about calculus, for example ... and the other lovely uses of uncomputable math to explain science data?

The question comes down to whether, for instance, we have

complexity(real number line R) < complexity(axioms for R)

If NO, then it means the mind is better off using the axioms for R than using R directly. And, I suggest, that is what we actually do when using R in calculus. We don't use R as an "actual entity" in any strong sense, we use R as an abstract set of axioms.

What would YES mean? It would mean that somehow we, as uncomputable beings, used R as an internal source of intuition about continuity ... not thus deriving any conclusions beyond the ones obtainable using the axioms about R, but deriving conclusions in a way that we found subjectively simpler.

A Postscript about AI

And, as an aside, what does all this mean about AI? It doesn't really tell you anything definitive about whether humanlike mind can be achieved computationally. But what it does tell you is that, if

- humanlike mind can be studied using the communicational tools of science (that is, using finite sets of finite-precision observations, and languages defined as finite strings on finite alphabets)
- one accepts the communication prior (length of linguistic expression as a measure of complexity)

## Tuesday, October 07, 2008

### Cosmic, overblown Grand Unified Theory of Development

In the 80's I spent a lot of time in the "Q" section of various libraries, which hosted some AI books, and a lot of funky books on "General Systems Theory" and related forms of interdisciplinary scientifico-philosophical wackiness.

GST is way out of fashion in the US, supplanted by Santa Fe Institute style "complexity theory" (which takes the same basic ideas but fleshes them out differently using modern computer tech), but I still have a soft spot in my heart for it....

Anyway, today when I was cleaning out odd spots of the house looking for a lost item (which I failed to find and really need, goddamnit!!) I found some scraps of paper that I scribbled on a couple years back while on some airline flight or another, sketching out the elements of a general-systems-theory type Grand Unified Theory of Development ... an overall theory of the stages of development that complex systems go through as they travel from infancy to maturity.

I'm not going to type in the whole thing here right now, but I made a table depicting part of it, so as to record the essence of the idea in some nicer, more permanent form than the fading dirty pieces of notebook paper....

The table shows the four key stages any complex system goes through, described in general terms, and then explained in a little more detail in the context of two examples: the human (or humanlike) mind as it develops from infancy to maturity, and the maturation of life from proto-life up into its modern form.

I couldn't get the table to embed nicely in this blog interface, so it's here as a PDF:

This was in fact the train of thought that led to two papers Stephan Bugaj and I wrote over the last couple years, on the stages of cognitive development of uncertain-inference based AI systems, and the stages of ethical development of such AI systems. While not presented as such in those papers, the stages given there are really specialized manifestations of the more general stages outlined in the above table.

Stephan and I are (slowly) brewing a book on hyperset models of mind and reality, which will include some further-elaborated, rigorously-mathematized version of this general theory of development...

Long live General Systems thinking ;-)
