I never liked this, and I burned an embarrassing amount of time back then (I guess this was from ages 16-20) trying to find some logical inconsistency there. Somehow, I thought, it must be possible to prove this notion of "a set of things, none of which can ever actually be precisely characterized by any finite description" as inconsistent, as impossible.
Of course, try as I might, I found no inconsistency with the math -- only inconsistency with my own human intuitions.
And of course, I wasn't the first to tread that path (and I knew it). There's a philosophy of mathematics called "constructivism" which essentially bans any kind of mathematical entity whose existence can only be proved indirectly. Related to this is a philosophy of math called "intuitionism."
A problem with these philosophies of math is that they rule out some of the branches of math I most enjoy: I always favored continuous math -- real analysis, complex analysis, functional analysis -- over discrete math about finite structures. And of course these are incredibly useful branches of math: for instance, they underly most of physics.
These continuity-based branches of math also underly, for example, mathematical finance, even though the world of financial transactions is obviously discrete and computable, so one can't possibly need uncomputable numbers to handle it.
There always seemed to me something deeply mysterious in the way the use of the real line, with its unacceptably mystical uncomputable numbers, made practical mathematics in areas like physics and finance so much easier.
Notice, this implicitly uncomputable math is never necessary in these applications. You could reformulate all the equations of physics or finance in terms of purely discrete, finite math; and in most real applications, these days, the continuous equations are solved using discrete approximations on computers anyway. But, the theoretical math (that's used to figure out which discrete approximations to run on the computer) often comes out more nicely in the continuous version than the discrete version. For instance, the rules of traditional continuous calculus are generally far simpler and more elegant than the rules of discretized calculus.
And, note that the uncomputability is always in the background when you're using continuous mathematics. Since you can't explicitly write down any of these uncomputable numbers anyway, they don't play much role in your practical work with continuous math. But the math you're using, in some sense, implies their "existence."
But what does "existence" mean here?
To quote former President Bill Clinton, "it all depends on what the meaning of the word is, is."
A related issue arises in the philosophy of AI. Most AI theorists believe that human-like intelligence can ultimately be achieved within a digital computer program (most of them are in my view overpessimistic about how long it's going to take us to figure out exactly how to write such a program, but that's another story). But some mavericks, most notably Roger Penrose, have argued otherwise (see his books The Emperor's New Mind and Shadows of the Mind, for example). Penrose has argued specifically that the crux of human intelligence is some sort of mental manipulation of uncomputable entities.
And Penrose has also gone further: he's argued that some future theory of physics is going to reveal that the dynamics of the physical world is also based on the interaction of uncomputable entities. So that mind is an uncomputable consequence of uncomputable physical reality.
This argument always disturbed me, also. There always seemed something fundamentally wrong to me about the notion of "uncomputable physics." Because, science is always, in the end, about finite sets of finite-precision data. So, how could these mysterious uncomputable entities ever really be necessary to explain this finite data?
Obviously, it seemed tome, they could never be necessary. Any finite dataset has a finite explanation. But the question then becomes whether in some cases invoking uncomputable entities is the best way to explain some finite dataset. Can the best way of explaining some set of, say, 10 or 1000 or 1000000 numbers be "This uncomputable process, whose details you can never write down or communicate in ordinary language in a finite amount of time, generated these numbers."
This really doesn't make sense to me. It seems intuitively wrong -- more clearly and obviously so than the notion of the "existence" of uncomputable numbers and other uncomputable entities in some abstract mathematical sense.
So, my goal in this post is to give a careful explanation of why this wrong. The argument I'm going to give here could be fully formalized as mathematics, but, I don't have the time for that right now, so I'll just give it semi-verbally/semi-mathematically, but I'll try to choose my words carefully.
As often happens, the matter turned out to be a little subtler than I initially thought it would be. To argue that uncomputables are useless for science, one needs some specific formal model of what science itself is. And this is of course a contentious issue. However, if one does adopt the formalization of science that I suggest, then the scientific uselessness of uncomputables falls out fairly straightforwardly. (And I note that this was certainly not my motivation for conceiving the formal model of science I'll suggest; I cooked it up a while ago for quite other reasons.)
Maybe someone else could come up with a different formal model of science that gives a useful role to uncomputable entities ... though one could then start a meta-level analysis of the usefulness of this kind of formal model of science! But I'll defer that till next year ;-)
Even though it's not wholly rigorous math, this is a pretty mathematical blog post that will make for slow reading. But if you have suitable background and are willing to slog through it, I think you'll find it an interesting train of thought.
NOTE: the motivation to write up these ideas (which have been bouncing around in my head for ages) emerged during email discussions on the AGI list with a large group, most critically Abram Demski, Eric Baum and Mark Waser.
A Simple Formalization of the Scientific Process
I'll start by giving a simplified formalization of the process of science.
This formalization is related to the philosophy of science I outlined in the essay http://www.goertzel.org/dynapsyc/2004/PhilosophyOfScience_v2.htm (included in The Hidden Pattern) and more recently extended in the blog post http://multiverseaccordingtoben.blogspot.com/2008/10/reflections-on-religulous-and.html. But those prior writing consider many aspects not discussed here.
Let's consider a community of agents that use some language L to communicate. By a language, what I mean here is simply a set of finite symbol-sequences ("expressions"), utilizing a finite set of symbols.
Assume that a dataset (i.e., a finite set of finite-precision observations) can be expressed as a set of pairs of expressions in the language L. So a dataset D can be viewed as a set of pairs
((d11, d12), (d21,d22) ,..., (dn1,dn2))
or else as a pair D=(D1,D2) where
D1=(d11,...,dn1)
D2=(d12,...,dn2)
Then, define an explanation of a dataset D as a set E_D of expressions in L, so that if one agent A1 communicates E_D to another agent A2 that has seen D1 but not D2, nevertheless A2 is able to reproduce D2.
(One can look at precise explanations versus imprecise ones, where an imprecise explanation means that A2 is able to reproduce D2 only approximately, but this doesn't affect the argument significantly, so I'll leave this complication out from here on.)
If D2 is large, then for E_D to be an interesting explanation, it should be more compact than D2.
Note that I am not requiring E_D to generate D2 from D1 on its own. I am requiring that A2 be able to generate D2 based on E_D and D1. Since A2 is an arbitrary member of the community of agents, the validity of an explanation, as I'm defining it here, is relative to the assumed community of agents.
Note also that, although expressions in L are always finitely describable, that doesn't mean that the agents A1, A2, etc. are. According to the framework I've set up here, these agents could be infinite, uncomputable, and so forth. I'm not assuming anything special about the agents, but I am considering them in the special context of finite communications about finite observations.
The above is my formalization of the scientific process, in a general and abstract sense. According to this formalization, science is about communities of agents linguistically transmitting to each other knowledge about how to predict some commonly-perceived data, given some other commonly-perceived data.
The (Dubious) Scientific Value of the Uncomputable
Next, getting closer to the theme of this post, I turn to consider the question of what use it might be for A2 to employ some uncomputable entity U in the process of using E_D to generate D2 from D1. My contention is that, under some reasonable assumptions, there is no value to A2 in using uncomputable entities in this context.
D1 and E_D are sets of L-expressions, and so is D2. So what A2 is faced with, is a problem of mapping one set of L-expressions into another.
Suppose that A2 uses some process P to carry out this mapping. Then, if we represent each set of L-expressions as a bit string (which may be done in a variety of different, straightforward ways), P is then a mapping from bit strings into bit strings. To keep things simple we can assume some maximum size cap on the size of the bit strings involved (corresponding for instance to the maximum size expression-set that can be uttered by any agent during a trillion years).
The question then becomes whether it is somehow useful for A2 to use some uncomputable entity U to compute P, rather than using some sort of set of discrete operations comparable to a computer program.
One way to address this question is to introduce a notion of simplicity. The question then becomes whether it is simpler for A2 to use U to compute P, rather than using some computer program.
And this, then, boils down to one's choice of simplicity measure.
Consider the situation where A2 wants to tell A3 how to use U to compute P. In this case, A2 must represent U somehow in the language L.
In the simplest case, A2 may represent U directly in the language, using a single expression (which may then be included in other expressions). There will then be certain rules governing the use of U in the language, such that A2 can successfully, reliably communicate "use of U to compute P" to A3 only if these rules are followed. Call this rule-set R_U. Let us assume that R_U is a finite set of expressions, and may also be expressed in the language L.
Then, the key question is whether we can have
complexity(U) < complexity(R_U)
That is, can U be less complex than the set of rules prescribing the use of its symbol S_U within the community of agents?
If we say NO, then it follows there is no use for A2 to use U internally to produce D2, in the sense that it would be simpler for A2 to just use R_U internally.
On the other hand, if we say YES, then according to the given complexity measure, it may be easier for A2 to internally make use of U, rather than to use R_U or something else finite.
So, if we choose to define complexity in terms of complexity of expression in the community's language L, then we conclude that uncomputable entities are useless for science. Because, we can always replace any uncomputable entity U with a set of rules for manipulating the symbol S_U corresponding to it.
If you don't like this complexity measure, you're of course free to propose another one, and argue why it's the right one to use to understand science. In a previous blog post I've presented some of the intuitions underlying my assumption of this "communication prior" as a complexity measure underlying scientific reasoning.
The above discussion assumes that U is denoted in L by a single symbolic L-expression S_U, but the same basic argument holds if the expression of U in L is more complex.
What does all this mean about calculus, for example ... and the other lovely uses of uncomputable math to explain science data?
The question comes down to whether, for instance, we have
complexity(real number line R) <>
If NO, then it means the mind is better off using the axioms for R than using R directly. And, I suggest, that is what we actually do when using R in calculus. We don't use R as an "actual entity" in any strong sense, we use R as an abstract set of axioms.
What would YES mean? It would mean that somehow we, as uncomputable beings, used R as an internal source of intuition about continuity ... not thus deriving any conclusions beyond the ones obtainable using the axioms about R, but deriving conclusions in a way that we found subjectively simpler.
A Postcript about AI
And, as an aside, what does all this mean about AI? It doesn't really tell you anything definitive about whether humanlike mind can be achieved computationally. But what it does tell you is that, if
- humanlike mind can be studied using the communicational tools of science (that is, using finite sets of finite-precision observations, and languages defined as finite strings on finite alphabets)
- one accepts the communication prior (length of linguistic expression as a measure of complexity)
6 comments:
Ben,
I am having trouble "seeing" the argument.
It seems to slip by the idea that if a mind contained uncomputable entities, actual information could be obtained by them.
So, for example, if all the agents in the scientific community were born with halting oracles, some descriptions of sensory data could be somewhat shorter by invoking the halting oracle. (Other descriptions would be longer as a result, but if the world generally had halting oracles in it, then the sensory-data would probably be more easily described using the halting oracle.)
The specific point at which the argument seems to trip for me is at the "If we say no... if we say yes..." point. You talk about replacing U with R_U, the set of rules associated with using S_U. It seems to me that U could not be replaced so easily, since those manipulation rules would not give all the answers a halting oracle could give. Furthermore, it seems to me that the rules for using S_U would include actual references to U, like "close your eyes and meditate and the answer will come to you". So I don't even see how it would make sense to replace U with R_U.
Perhaps you could clarify?
Abram ...
There is only a finite number of possible rules for mapping L-expression-sets of size less than N into L-expression-sets of size less than N.
So, whatever voodoo happens involving U, in the context of mapping L-expression-sets of size less than N into other ones, it must be equivalent to some subset of this rule-set. There must be some finite set of rules that does *exactly* what U does, within the specified size constraint.
Remember, there are no infinities involved here.
Whatever the halting oracle does **within the specified finite domain** could be emulated by some set of rules.
I could prove this to you in any specific case. Given a halting oracle U, you could make a finite table of its answer to all questions that can be posed within the finite domain. Then, there is some minimal Turing program that will give the same answers as the oracle to all those questions.
The question is whether you want to consider this corresponding set of rules (the Turing program) as being simpler than U or not (e.g. simpler than the oracle or not). If you want to consider the oracle as being simpler than the corresponding set of rules, then that's your right. The problem is that the oracle cannot be communicated ... but you're right that if *everyone in the community had the same halting oracle in their brain*, then they could do science in an uncomputable way, if they agree to measure simplicity relative to their brains rather than relative to their (discrete, finite language).
But, then they are not using the communication prior. Then they are using a prior that is, effectively, something like "complexity relative to the common brain structure of the members of the community."
This is interesting, but it's more like "shared intuition" than like what we think of as science. Because the common means of making judgments is not something that the community of agents is able to explicitly articulate in language.
I'm not sure when I will find time to write that argument up in a fully mathematical way, which might be what it takes to get it across to you, I dunno?
Ben,
OK, that makes perfect sense. For some reason I thought you were *assuming* that every agent had the same uncomputable oracle U.
Abram: well, if every agent had the same oracle in their brains, how could they know this? If they need to verify this thru linguistic communication, then you run up against the issues I mentioned in my post. But on the other hand, if they know this via some non-linguistic, non-finite-data-driven, intuitive voodoo method, then, yeah, you've got a loophole, as you mentioned...
Ben,
They would know it in the same way that we know that anyone has the ability to learn to count... the uncomputable stuffs would just be another mechanism of the mind that everyone would be able to use.
The reason I assumed that this was what you meant was because you are talking about an agent communicating the idea of using U to compute something. So, I figured that U must be something that all the agents had access to.
I think I may have mentioned this before... but it seems like your arguments could be applied just as well to argue that Turing machines are unnecessary and all we need is finite-state machines. (Which is, of course, literally all we have.) If someone claimed to have found a pattern in nature that was generated by a Turing machine (let's call it a Turing object, T), then they would need to communicate the idea to others using a symbol, S_T, and a set of manipulation rules that define the "turing-ness" of T, R_T. The rules would of course need to be implementable on a finite-state machine. So, the agent would be unable to convince its peers, because R_T would be a working finite-state explanation, unless all of the agents happened to have some bias towards objects conveniently described with R_T, in which case they would subjectively like to think that T is really Turing-computed rather than finite-state-computed.
If the above argument doesn't sound analogous to your argument, then take the disanalogies to be places where I still don't understand your argument.
Abram, yes you seem to be right, my argument also seems to show that for the purposes of science (defined in terms of finite datasets and finite linguistic utterances) we don't need TM's but only FSM's.
This of course doesn't bother me at all.
Note that I am not saying that uncomputable entities or TM's don't "exist" in any deep philosophical sense. I'm just saying that, from the point of view of science, they might as well not exist. Science isn't necessarily everything....
You raise an interesting point which is that if all the agents in a community share the same internal uncomputable entity U ... and if they each assume that each other share it (without need for experimental evidence) ... then, they can coherently use this to explain scientific datasets.
This is consistent with what I showed above, so long as it means that the community considers U simpler than its linguistic explication.
I am reminded of how we each implicitly assume each other are conscious, without empirical evidence.
Yet, I am wary of the approach of taking this kind of shared, unsubstantiable intuition as a basis for scientific process....
Post a Comment