The Multiverse According to Ben: 2014

Tuesday, December 02, 2014

What is Life?

I was trolling the Google+ discussion forum related to the Developmental AI MOOC, which my son Zar recently completed, and noticed a discussion thread on the old question of “What is Life?” -- prompted by the question “Are these simple developmental-AI agents we’ve built in this course actually ‘alive’ or not?”

I couldn’t help adding my own two cents to the forum discussion; this blog post basically summarizes what I said there.

First of all, I’m not terribly sure "alive or not" is the right question to be asking about digital life-forms. Of course it's an interesting question from our point of view as evolved, biological organisms. But -- I mean, isn't it a bit like asking if an artificial liver is really a liver or not? What are the essential characteristics of being a liver? Is it enough to carry out liver-like functions for keeping an organism's body going? Or do the internal dynamics have to be liver-like? And at what level of detail?

Having said that, though, I think one can make some mildly interesting headway on the "what is life?" question by starting with the concept of agency and proceeding from there...

I think Stan Franklin was onto something with his definition of "autonomous agent" in his classic “Agent or Program?” paper. He was writing there more from an AI view than an ALife view, but still the ideas there seem very much applicable. The core definition of the paper is:

An autonomous agent is a system situated within and part of an environment that senses that environment and acts on it, over time, in pursuit of its own agenda and so as to affect what it senses in the future.

The paper then goes on to characterize additional properties possessed by various different types of agents. For instance, according to Franklin's approach,

every agent satisfies the properties: reactive, autonomous, goal-oriented and temporally continuous.
some agents have other interesting properties like: learning/adaptive, mobile, flexible, having-individual-character, etc.

Given this approach, one could characterize “life” via saying something like

A life-form is an autonomous agent that is adaptive and possesses metabolism and self-reproduction.

This seems fairly reasonable -- but of course it raises the question of how to define metabolism and self-reproduction. If one defines them too narrowly based on biological life, one will basically be defining "traditional biological life." If one defines them too broadly, they'll have no meaning.

A related approach that seems sensible to me is to define a kind of abstract "survival urge". For instance, we could say that

An agent possesses survival-urge if its interactions with the environment, during the period of its existence, have a reasonably strong influence on whether it continues to exist or not ... and if its continued existence is one of its goals.

and

An agent with individual character possesses individual-character survival-urge if its interactions with the environment, during the period of its existence, have a reasonably strong influence on whether other agents with individual-character similar to it exist in future ... and if both its continued existence and the existence of other agents with similar individual-character to it, are among its goals.

Then we could abstractly conceive life as

A life-form is an adaptive autonomous agent with survival urge.

An individuated life-form is an adaptive autonomous agent with survival urge, individual character and individual-character survival-urge.

These additions bring us closer to the biological definition of life.
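
To make the two definitions above a bit more concrete, here is a minimal Python sketch that treats them as simple checklists. It is only an illustration under stated assumptions: the `AgentProfile` class, its boolean fields and the three example profiles are hypothetical names made up for this sketch, and real agents would of course possess these properties to varying degrees rather than as yes/no flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical yes/no checklist of the properties discussed above."""
    # Franklin's four baseline properties of any autonomous agent
    reactive: bool
    autonomous: bool
    goal_oriented: bool
    temporally_continuous: bool
    # Optional extra properties
    adaptive: bool = False
    individual_character: bool = False
    # The two "urges" proposed in this post
    survival_urge: bool = False
    individual_character_survival_urge: bool = False


def is_autonomous_agent(a: AgentProfile) -> bool:
    return (a.reactive and a.autonomous
            and a.goal_oriented and a.temporally_continuous)


def is_life_form(a: AgentProfile) -> bool:
    # "An adaptive autonomous agent with survival urge"
    return is_autonomous_agent(a) and a.adaptive and a.survival_urge


def is_individuated_life_form(a: AgentProfile) -> bool:
    # Adds individual character and individual-character survival-urge
    return (is_life_form(a) and a.individual_character
            and a.individual_character_survival_urge)


# Made-up example profiles, for illustration only
thermostat = AgentProfile(True, True, True, True)
course_agent = AgentProfile(True, True, True, True, adaptive=True)
amoeba = AgentProfile(True, True, True, True, adaptive=True, survival_urge=True)

for name, profile in [("thermostat", thermostat),
                      ("course_agent", course_agent),
                      ("amoeba", amoeba)]:
    print(name, is_life_form(profile), is_individuated_life_form(profile))
```

The `course_agent` profile encodes only the casual suspicion voiced later in this post (autonomous and adaptive, but without much of a survival urge); it is not a verified description of the MOOC agents.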

(I note that Franklin, in his paper, doesn't define what a "goal" is. But in the discussion in the paper, it's clear that he conceived it as what I've called an implicit goal rather than an explicit goal. That is, he considers that a thermostat has a goal; but obviously, the thermostat does not contain its goal as deliberative, explicitly-represented cognitive content. He seems to consider a system's goal as, roughly, "the function that a reasonable observer would consider the system as trying to optimize." I think this is one sensible conception of what a goal is.)

Unfortunately I haven't studied the agents created in the Developmental AI MOOC well enough to have a definitive opinion on whether they are "alive" according to the definitions I've posited in this post. My suspicion, though, based on a casual study, is that they are autonomous agents without much of a survival urge. But I guess a survival urge, and even an individuated one, could be achieved via small modifications to the approach taken in the course exercises.

My overall conclusion, then, is that some fairly simple digital life-forms should logically be said to satisfy the criteria for “life”, if these are defined in a sensible way that isn’t closely tied to the specifics of the biological substrate.

Now, some may find this unsatisfying, in that digital organisms like the ones involved in the Developmental AI MOOC are palpably much simpler than known biological life-forms like amoebas, paramecia and so forth. But my reaction to that would be that complexity is best considered a separate issue from “alive-ness.” The complexity of an agent’s interactions and perceptions and goal-oriented behaviors can be assessed, as can the complexity of its behaviors specifically directed toward survival or individual-character survival. According to these criteria, existing digital life-forms are definitely simpler than amoebas or paramecia, let alone humans. But I don’t think this makes it sensible to classify them as “non-alive.” It’s just that the modern digital environment happens to allow simpler life-forms than Earthly chemistry gave rise to via evolution.

Saturday, November 22, 2014

Is the Technological Singularity a "Final Cause" of Human History?

Intuitively, it is tempting (to some people anyway!) to think of the potential future Technological Singularity as somehow "sucking us in" -- as a future force that reaches back in time and guides events so as to bring about its future existence. Terence McKenna was one of the more famous and articulate advocates of this sort of perspective.

This way of thinking relates to Aristotle's notion of "Final Causation" -- the final cause of a process being its ultimate purpose or goal.   Modern science doesn't have much of a place for final causes in this sense; evolutionary theories often seem to be teleological in a "final causation" way on the surface, but then can generally be reformulated otherwise.   (We colloquially will say "evolution was trying to do X," but actually our detailed models of how evolution was working toward X, don't require any notion of "trying", but only notions of mutation, crossover and differential survival...)

It seems to me, though, that the Surprising Multiverse theory presented in one of my recent blog posts (toward the end), actually implies a different sort of final causation -- not quite the same as what Aristotle suggested, but vaguely similar. And this different sort of final causation does, in a sense, suggest that the Singularity may be sucking us in....

The basic concept of the Surprising Multiverse theory is that, in the actual realized rather than merely potential world, patterns with high information-theoretic surprisingness are more likely to occur.   This implies that, among the many possible universes consistent with a given set of observations (e.g. a given history over a certain interval of time), those universes containing more surprisingness are more likely to occur.

Consider, then, a set of observations during a certain time interval -- a history as known to a certain observer, or a family of histories as known to a set of communicating observers -- and the question of what will happen AFTER that time interval is done.   For instance, consider human history up till 2014, and the question of the human race's future afterwards.

Suppose that, of the many possible futures, some contain more information-theoretic surprisingness.   Then, if the Surprising Multiverse hypothesis holds, these branches of the multiverse -- these possible universes -- will have boosted probabilities, relative to other options.   The surprisingness weighting may then be viewed intuitively as "pulling the probability distribution over universes, toward those with greater surprisingness."
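
As a toy illustration of what "boosted probabilities" could mean, here is a minimal Python sketch. It assumes one particular weighting, multiplying each prior probability by exp(beta * surprisingness) and renormalizing; that exponential form, the `beta` parameter, the function name and all the numbers are assumptions made up for this example, since the hypothesis as stated here does not fix a specific weighting function.

```python
import math
from typing import Dict


def surprise_weighted(prior: Dict[str, float],
                      surprisingness: Dict[str, float],
                      beta: float = 1.0) -> Dict[str, float]:
    """Reweight a prior over possible futures toward surprising ones.

    Each prior probability is multiplied by exp(beta * surprisingness)
    and the result is renormalized. The exponential form is an
    illustrative assumption, not part of the hypothesis itself.
    """
    unnormalized = {future: p * math.exp(beta * surprisingness[future])
                    for future, p in prior.items()}
    total = sum(unnormalized.values())
    return {future: w / total for future, w in unnormalized.items()}


# Made-up numbers, purely for illustration
prior = {"business_as_usual": 0.7, "collapse": 0.2, "singularity": 0.1}
surprisingness = {"business_as_usual": 0.0, "collapse": 1.0, "singularity": 3.0}

weighted = surprise_weighted(prior, surprisingness)
for future in prior:
    print(future, round(prior[future], 3), "->", round(weighted[future], 3))
```

With `beta = 0` the weighting leaves the prior unchanged; larger values of `beta` pull more of the probability mass toward the futures with higher surprisingness.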

The "final cause" of some pattern P according to observer O, may be viewed as the set of future surprising patterns Q that are probabilistically caused by P, from the perspective of observer O. (There are many ways to quantify the conceptual notion of probabilistic causation -- perhaps the most compelling is as "P having nonneutralized positive component effect on Q, based on the knowledge of O", as defined in the interesting paper A Probabilistic Analysis of Causation.)

So the idea is: final causation can be viewed as the probabilistic causation that has the added oomph of surprisingness (and then viewed in the backwards direction). A final cause of P is something that is probabilistically caused by P, and that has enough surprisingness to be significantly overweighted in the Surprising Multiverse weighting function that balances P's various possible futures.

So what of the Singularity? We may suppose that a Technological Singularity will display a high degree of probabilistic surprisingness, relative to other alternative futures for humanity and its surrounds. If so, branches of the multiverse involving a Singularity would be preferentially weighted higher, according to the Surprising Multiverse hypothesis. The Singularity is thus a final cause of human history. QED....

A fair example of the kind of thing that passes through my head at 2:12 AM Sunday morning ;-) ...

Sunday, November 02, 2014

The SpaceShipTwo Crash and the Pluses and Minuses of Prize-Driven and Incremental Development

SpaceShipTwo, Virgin Galactic's ambitious new plane/spaceship, crashed Friday (two days ago), killing one pilot and seriously injuring another.

This is a human tragedy like every single death; and it's also the kind of thing one can expect from time to time in the course of development of any new technology. I have no doubt that progress toward tourist spaceflight will continue apace: inevitable startup struggles notwithstanding, it's simply an idea whose time has come.

Every tragedy is also an occasion for reflection on the lessons implicit in the tragic events.

(In the picture, SpaceShipTwo is shown in the center, between the twin fuselages of the mothership that provides its initial lift.)

For me, watching the struggles of the Virgin Galactic approach to spaceflight has also been a bit of a lesson in the pluses and minuses of prize-driven technology development. SpaceShipTwo is the successor to SpaceShipOne, which won the Ansari X-Prize for commercial spaceflight a decade ago. At the time it seemed that the Ansari X-Prize would serve at least two purposes:

Raise consciousness generally about the viability of commercial spaceflight, particularly of the pursuit of spaceflight by startups and other small organizations rather than governments and large government contractors
Concretely help pave a way toward commercially viable spaceflight, via progressive development of the winning spaceflight technology into something fairly rapidly commercially successful

It seems clear that the first goal was met, and wonderfully well. Massive kudos are due to the X-Prize Foundation and Ansari for this. The press leading up to and following from the Ansari X-Prize made startup spaceflight into a well-recognized "thing" rather than a dream of a tiny starry-eyed minority.

Regarding the second goal, though, things are much less clear. Just a little before the tragic SpaceShipTwo crash, a chillingly prescient article by Doug Messier was posted, discussing the weaknesses of the SpaceShipTwo design from a technical perspective. If you haven't read it yet, I encourage you to click and read it through carefully -- the article you're reading now is basically a reflection on some of the points Messier raises, and a correlation of some of those points with my own experiences in the AI domain.

Messier's article traces SpaceShipTwo's development difficulties back to the SpaceShipOne design, on which it was based -- and points out that this design may well have been chosen (implicitly, if not deliberately) based on a criterion of winning the Ansari X-Prize quickly and at relatively low cost, rather than a criterion of serving as the best basis for medium-term development of commercial spaceflight technology.

As Messier put it,

It turns out that reaching a goal by a deadline isn’t enough; it matters how you get there. Fast and dirty doesn’t necessarily result in solid, sustainable programs. What works well in a sprint can be a liability in a marathon. (Source: http://www.parabolicarc.com/2014/10/30/apollo-ansari-hobbling-effects-giant-leaps/)


However, while I am fascinated by Messier's detailed analysis of the SpaceShipOne and SpaceShipTwo technologies, I'm not sure I fully agree with the general conclusion he draws -- or at least not with the way he words his conclusions. His article is titled "Apollo, Ansari and the Hobbling Effects of Giant Leaps" -- he argues that a flaw in both the Ansari X-Prize approach and the Apollo moon program was an attempt to make a giant leap, by hook or by crook. In both cases, he argues, the result was a technology that achieved an exciting goal using a methodology that didn't effectively serve as a platform for ongoing development.

Of course, the inspirational value of putting a man on the moon probably vastly exceeded the technical value of the accomplishment - and the inspirational value was the main point at the time. But I think it's also important to make another point: the problem isn't that pushing for Giant Leaps is necessarily bad. The problem is that pushing for a Giant Leap that is defined for non-technical, non-scientific reasons, with a tight time and/or financial budget, can lead to "fast and dirty" style short-cuts that render the achievement less valuable than initial appearances indicate.


That is: If the goal is defined as "Achieve Giant Leap Goal X as quickly and cheaply as possible," then the additional goal of "Create a platform useful for leaping beyond X" is not that likely to be achieved as well, along the way. And further -- as I will emphasize below -- I think the odds of the two goals being aligned are higher if Giant Leap Goal X emerges from scientific considerations, as opposed to from socially-oriented marketing or flashy-demonstration considerations.

It's interesting that Messier argues against Giant Leaps and in favor of incremental development. And yet there is a sense in which SpaceShipOne/Two represents incremental development at its most incremental. I'm thinking of the common assumption in the modern technology world, especially in Silicon Valley, that the best path to radical technological success is also generally going to be one that delivers the most awe-inspiring, visible and marketable results at each step of the way. The following graphic is sometimes used to illustrate this concept:

http://speckycdn.sdm.netdna-cdn.com/wp-content/uploads/2014/10/mvp_fail_01.png

On the surface, the SpaceShipTwo approach exemplifies this incremental development philosophy perfectly. It's a spaceplane, an incremental transition between plane and spaceship; and the spaceship portion is lifted high into the air initially by a plane. It's precisely because of taking this sort of incremental approach that SpaceShipOne was able to win the Ansari X-Prize with the speed and relatively modest cost that it did.

On the other hand, Messier favors a different sort of incremental spacecraft development -- not incremental steps from plane to plane/spacecraft to spacecraft, but rather ongoing incremental development of better and better materials and designs for making spacecraft, even if this process doesn't lead to commercial space tourism at the maximum speed. In fact, scientific development is almost always incremental -- the occasional Eureka moment notwithstanding (and Eureka moments tend to rest on large amounts of related incremental development).

It seems important, in this context, to distinguish between incremental basic scientific/technological progress and incremental business/marketing/demonstration progress. Seeking incremental scientific/technological progress makes sense (though other issues emerge here, in terms of pathologies resulting from trying too hard to quantitatively and objectively measure incremental scientific/technological progress -- I have discussed this in an AGI context before). But the path of maximally successful incremental business/marketing/demonstration progress often does not involve the most sensible incremental scientific path -- rather, it sometimes involves "fast and dirty" technological choices that don't advance science so much at all.

In my own work on AGI development, I have often struggled with these aspects of development.    The incremental business/marketing/demo development approach has huge commercial advantages, as it has more potential of giving something money-making at each step of the way.   It also has advantages in the purely academic world, in terms of giving one better demos of incremental progress at each step of the way, which helps with keeping grant funding flowing in.   The advantages also hold up in the pure OSS software domain, because flashy, showy incremental results help with garnering volunteer activity that moves an OSS project forward.

However, when I get into the details of AGI development, I find this "incremental business/marketing/demo" approach often adds huge difficulty. In the case of AGI the key problem is the phenomenon I call cognitive synergy, wherein the intelligence of a cognitive system largely comes from the emergent effects of putting many parts together.   So, it's more like the top picture in the above graphic (the one that's supposed to be bad) rather than the bottom picture.    Building an AGI system with many parts, one is always making more and more scientific and technological progress, step by step and incrementally.   But in terms of flashy demos and massive commercial value, one is not necessarily proceeding incrementally, because the big boost in useful functionality is unlikely to come before a lot of work has been done on refining individual parts and getting them to work together.

Google, IBM and other big companies recently redoubling their efforts in the AI space are trying to follow the bottom-picture approach, and work toward advanced AGI largely via incrementally improving their product and service functionalities using AI technology. Given the amount of funding and manpower they have, they may be able to make this work.   But where AGI is concerned, it's pretty clear to me that this approach adds massive difficulty to an already difficult task.

One lesson the SpaceShipOne/Two story has, it seems to me, is that aggressive pursuit of the "maximize incremental business/marketing/demo results" path has not necessarily been optimal for commercial spaceflight either.   It has been fantastically successful marketing-wise, but perhaps less so technically.

I've been approached many times by people asking my thoughts on how to formulate a sort of X-Prize for AGI. A couple times I put deep thought into the matter, but each time I came away frustrated -- it seemed like every idea I thought of was either

"Too hard", in the sense that winning the prize would require having a human-level AGI (in which case the prize becomes irrelevant, because the rewards for creating a human-level AGI will be much greater than any prize); OR
Susceptible to narrow-AI approaches -- i.e. likely to end up rewarding teams who pushed toward winning the prize quickly via taking various short-cuts, using approaches that probably wouldn't be that helpful toward achieving human-level AGI eventually

The recently-proposed AI TED-talk X-Prize seems to me likely to fall into the latter category. I can envision a lot of approaches to making AIs to give effective TED talks, that are basically "specialized TED talk giving machines" designed and intensively engineered for the purpose, without really having architectures suitable as platforms for long-term AGI development. And if one had a certain fixed time and money budget for winning the AI TED-talk X-Prize, pursuing this kind of specialized approach might well be the most rational course. I know that if I myself join a team aimed at winning the prize, there will be loads of planning discussions aimed at balancing "the right way to do AGI design/development" versus "the cleverest way to win the prize."

On the other hand, as a sci-tech geek I need to watch out for my tendency to focus overly on the technical aspects. The AI TED-Talk X-Prize, even if it does have the shortcomings I've mentioned above, may well serve amazingly well from a marketing perspective, making the world more and more intensely aware of the great potential AI holds today, and the timeliness of putting time and energy and resources into AGI development.

I don't want to overgeneralize from the SpaceShipTwo crash -- this was a specific, tragic event; and any specific event has a huge amount of chance involved in it. Most likely, in a large percentage of branches of the multiverse, the flight Friday went just fine. I also don't want to say that prize-driven development is bad; it definitely has an exciting role to play, at the very least in helping to raise public consciousness about technology possibilities. And I think that sometimes the incremental business/marketing/demo progress path to development is exactly the right thing. As well as being a human tragedy, though, I think the recent terrible and unfortunate SpaceShipTwo accident does serve as a reminder of the limitations of prize-driven technology development, and a spur to reflect on the difficulties inherent in pursuing various sorts of "greedy" incremental development.

Saturday, November 01, 2014

Grounding Representation and Pattern in Consciousness

A little piece of patternist analytical philosophy to brighten up your weekend....    I was thinking this stuff through today while going about in Tai Po running errands.   Most notably, the back bumper of my car fell off yesterday, and I was trying to find someone to repair it.   A friendly auto repair shop ended up reattaching the bumper with duct tape. A real repair is pending them finding a replacement for the broken plastic connector between the bumper and the car, in some semi-local junkyard. Egads! Well, anyway....

The concept of "representation" is commonly taken as critical to theories of cognition.   In my own work on the foundations of cognition, I have taken the concept of "pattern" as foundational, and have characterized "pattern" as meaning "representation as something simpler."

But what is representation?   What is simplicity?

In this (rather abstract, theoretical) post, I will suggest a way of grounding the concepts of representation and simplicity (and hence, indirectly, pattern) in terms of consciousness -- or more specifically, in terms of the concept of attention in cognitive systems.

I'll speak relatively informally here, but I'm confident these ideas can be formalized mathematically or philosophically if one wishes...

From Attention to Representation

Suppose one has an intelligent system containing a large amount of contents; and each item of contents has a certain amount of attention associated with it at a given point in time.

(One can characterize attention either energetically or informationally, as pointed out in Section 2.6 of my recent review paper on consciousness. Given the close connection between energy and information in physics, these two characterizations may ultimately be the same thing.)

In real-world cognitive systems, attention is not distributed evenly across cognitive contents. Rather, it seems to generally be distributed in such a way that a few items have a lot of attention, and most items have very little attention, and there is a steep but continuous slope between the former and latter categories. In this case, we can speak about the Focus of consciousness as (a fuzzy set) consisting of those items that have a lot of attention during a certain interval; and the Fringe of consciousness as (a fuzzy set) consisting of those items that have more attention than average during a certain interval, but not as much as the items in the Focus do. (The general idea that human consciousness has a Focus and a Fringe goes back at least to William James.)
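
Here is a minimal Python sketch of one way to turn a table of attention values into fuzzy Focus and Fringe memberships. Everything specific in it is an assumption for illustration: the piecewise-linear membership curves, the two thresholds `focus_lo` and `focus_hi`, and the made-up attention values are not taken from any particular cognitive model.

```python
from typing import Dict, Tuple


def clamp01(x: float) -> float:
    """Clip a value into the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def focus_and_fringe(attention: Dict[str, float],
                     focus_lo: float,
                     focus_hi: float) -> Dict[str, Tuple[float, float]]:
    """Return (focus_membership, fringe_membership) for each item.

    Focus membership ramps from 0 at `focus_lo` to 1 at `focus_hi`.
    Fringe membership is high for items that are above the average
    attention level but not (fully) in the Focus.
    """
    mean = sum(attention.values()) / len(attention)
    if not (mean < focus_lo < focus_hi):
        raise ValueError("expected mean attention < focus_lo < focus_hi")
    memberships = {}
    for item, a in attention.items():
        focus = clamp01((a - focus_lo) / (focus_hi - focus_lo))
        above_average = clamp01((a - mean) / (focus_lo - mean))
        fringe = min(above_average, 1.0 - focus)
        memberships[item] = (focus, fringe)
    return memberships


# Made-up attention values: a few items get most of the attention
attention = {"tree_sketch": 0.9, "tree_detail": 0.4, "bumper": 0.3,
             "grocery_list": 0.05, "old_song": 0.05, "weather": 0.1}

memberships = focus_and_fringe(attention, focus_lo=0.5, focus_hi=0.8)
for item, (focus, fringe) in memberships.items():
    print(f"{item}: focus={focus:.2f} fringe={fringe:.2f}")
```

Because both sets are fuzzy, an item whose attention lies between the two thresholds belongs partly to the Focus and partly to the Fringe, which matches the "steep but continuous slope" described above.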

It seems to me one can ground the notion of representation in the structure of consciousness, specifically in the relation between the Focus and the Fringe.

Namely, one can say that ... R represents E, to system S, if: In the mind of system S, when R is in the Focus, this generally implies E is likely to be in the Fringe at around the same time (perhaps at the same time, perhaps a little earlier, or perhaps a little later).
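
The definition above can be read as a statement about conditional frequencies, and a minimal Python sketch along those lines follows. It assumes a simplified setting that is not part of the definition itself: time is chopped into discrete steps, Focus and Fringe are crisp sets at each step rather than fuzzy ones, and "at around the same time" means within `window` steps. The function name, the trace and the item names are all hypothetical.

```python
from typing import List, Set, Tuple

# One time step of a mind: (items in Focus, items in Fringe)
Step = Tuple[Set[str], Set[str]]


def representation_strength(trace: List[Step], r: str, e: str,
                            window: int = 1) -> float:
    """Fraction of the times R is in Focus at which E is in Fringe nearby.

    "Nearby" means within `window` steps before or after. Returns 0.0
    if R never appears in Focus.
    """
    occasions = 0
    hits = 0
    for t, (focus, _) in enumerate(trace):
        if r not in focus:
            continue
        occasions += 1
        start = max(0, t - window)
        stop = min(len(trace), t + window + 1)
        if any(e in trace[u][1] for u in range(start, stop)):
            hits += 1
    return hits / occasions if occasions else 0.0


# A made-up five-step trace
trace = [
    ({"phrase"}, {"tree_image"}),
    ({"bumper"}, set()),
    ({"phrase"}, set()),
    ({"duct_tape"}, {"tree_image"}),
    ({"phrase"}, {"tree_image"}),
]

print(representation_strength(trace, "phrase", "tree_image", window=1))
print(representation_strength(trace, "phrase", "tree_image", window=0))
print(representation_strength(trace, "bumper", "tree_image", window=0))
```

One could then say that R represents E, to the system that produced the trace, when this strength exceeds some chosen threshold; where to set that threshold is a further assumption.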

As a simple example, consider the typical simplifying model of the visual cortex as a processing hierarchy. In this case, we may say that when we are visually remembering a tree:

the state of the upper levels of the visual hierarchy is in Focus, along with bits and pieces of the lower levels
the full state of the whole visual hierarchy is mostly contained in Fringe

So the rough visual image we have of the tree in our "mind's eye" at a certain point in time, represents the richer visual image of the tree we have in our broader sensorimotor memory.

On the other hand, the phrase "the tree in the middle of my old backyard" may represent my stored visual images of that tree as well, if when that phrase occurs in my Focus (because I heard it, said it, or thought about it), my stored visual images rise into my Fringe (rising up from the deeper, even less attended parts of my memory).

From Attention to Simplicity

I'd like to say that R is a pattern in E if: R represents E, and R is simpler than E. But this obviously raises the question of what constitutes simplicity....

In prior writings, I have tended to take simplicity as an assumptive prior concept.   That is, I have assumed that each mind has its own measure of simplicity, and that measurement of pattern is relative to what measure of simplicity one chooses.

I still think this is a good way to look at it -- but now I'm going to dig a little deeper into the cognitive underpinnings of how each mind generates its own measure of simplicity.

Basically, I propose we can consider E as simpler than F, if it's generally possible to fit more stuff in the Focus along with E, than along with F.

Note that both E and F may be considered as extending over time, in this definition. Sometimes they may be roughly considered as instantaneous, but this isn't the general case.

One technical difficulty with this proposal is how to define "more."   There are many ways to do that; one is as follows....

Define simple_1 as follows: E is simpler_1 than F if it's generally possible to fit a greater number of other coherent cognitive items in the Focus along with E, than with F.   (This relies on the concept of "coherence" of a cognitive item as a primitive -- or in other words, it assumes that the sense or notion of what is a "coherent whole" is available to use to help define simplicity.)

Then define simple_2 as: E is simpler_2 than F if it's generally possible to fit a set of cognitive items with less simplicity_1 in the Focus along with E, as compared to with F.

One can extend this to simple_3, simple_4, etc., recursively.

According to this approach, we would find a detailed image of a tree is less simple than a rough, approximate image.   When visualizing a detailed image, we keep more different samples of portions of the detailed image in Focus, leaving less room for anything else.
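
A minimal Python sketch of the simple_1 comparison follows, using the tree example. It rests on a crude capacity model that is an assumption of this sketch and not part of the definition: the Focus has a fixed numeric `capacity`, each item has a numeric "load", and items fit together as long as their loads sum to no more than the capacity. The function names and all the numbers are made up, and simple_2 and higher are not modeled.

```python
from typing import Iterable


def max_companions(item_load: float, other_loads: Iterable[float],
                   capacity: float) -> int:
    """Largest number of other items that fit in Focus alongside one item.

    Filling the remaining room with the lightest items first gives the
    maximum possible count under this simple additive capacity model.
    """
    remaining = capacity - item_load
    count = 0
    for load in sorted(other_loads):
        if load > remaining:
            break
        remaining -= load
        count += 1
    return count


def simpler_1(load_e: float, load_f: float, other_loads: Iterable[float],
              capacity: float) -> bool:
    """True if more other items fit in Focus along with E than with F."""
    others = list(other_loads)
    return (max_companions(load_e, others, capacity)
            > max_companions(load_f, others, capacity))


# Made-up loads: a rough tree image is lighter than a detailed one
capacity = 7.0
rough_tree, detailed_tree = 2.0, 5.0
other_items = [1.0, 1.5, 2.0, 3.0]

print(max_companions(rough_tree, other_items, capacity))
print(max_companions(detailed_tree, other_items, capacity))
print(simpler_1(rough_tree, detailed_tree, other_items, capacity))
```

In a real mind the "load" of an item would depend on context and on what else is in Focus, so a single number per item is at best a first approximation.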

Similarly, the concept of the function x^2, once we understand it, takes up much less space in Focus than the procedure "take a number and multiply it by itself", and much less than a large table of pairs of numbers and their squares. Once an abstraction is learned (meaning that holding it in Focus causes appropriate knowledge to appear in Fringe), and mastered (meaning that modifying it while it's in Focus causes the contents of Fringe to change appropriately), then it can provide tremendous simplification over less abstract formulations of the same content.

From Attention to Pattern

So, having grounded both representation and simplicity in terms of the structure of attention, we have grounded pattern in terms of the structure of attention. This is interesting in terms of the cognitive theory I outlined in my 2006 book The Hidden Pattern and elsewhere, which grounded various aspects of intelligence in terms of the "pattern" concept.

As I've previously grounded so much of cognition in terms of pattern, if one then grounds pattern in terms of consciousness, one is then in effect grounding the structure and dynamics of cognition in terms of simple aspects of the structure of consciousness. This can be viewed mathematically and formally, and/or phenomenologically.

Logical and Linguistic Representation

Often when one hears about "representation", the topic at hand is some sort of formal, logical or linguistic representation. How does that kind of representation fit into the present framework?

Formal or semi-formal systems like (formal or natural) languages or mathematical theories may be viewed as systems for generating representations of cognitive items.

(What I mean by a semi-formal system is: A system for generating entities for which, for a variety of values of x less than 1, we have a situation where a fraction x of the system's activity can be explained by a set of n(x) formal rules. Of course n(x) will increase with x, and might sometimes increase extremely fast as x approaches 1. Natural languages tend to be like this.)
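
The function n(x) in the parenthetical above can be sketched in a few lines of Python. The sketch assumes a deliberately simple setting that is not part of the definition: each formal rule explains its own, non-overlapping share of the system's observed activity, so the smallest rule set covering a fraction x is found by taking the highest-coverage rules first. The function name and the counts are made up.

```python
from typing import List, Optional


def rules_needed(rule_counts: List[int], total: int, x: float) -> Optional[int]:
    """Smallest number of rules n(x) that explain a fraction x of activity.

    `rule_counts[i]` is how many of the `total` observed productions are
    explained by rule i (assumed non-overlapping). Returns None if even
    all the rules together explain less than a fraction x.
    """
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    covered = 0
    for n, count in enumerate(sorted(rule_counts, reverse=True), start=1):
        covered += count
        if covered >= x * total:
            return n
    return None


# Made-up coverage: a few rules explain most productions, then a long
# tail of rules each explaining very little; 25 of 1000 stay unexplained
rule_counts = [480, 210, 110, 60, 40, 30, 20, 12, 8, 5]
total = 1000

for x in (0.5, 0.75, 0.96, 0.99):
    print(x, rules_needed(rule_counts, total, x))
```

In this toy data n(x) grows slowly at first and then quickly, and no set of the listed rules reaches x = 0.99, which is the kind of behavior the parenthetical attributes to natural languages.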

When we have a situation where

R represents E
R is a construct created in some formal or semi-formal system S
E is not a construct in S; rather, E is connected via some chain of representations with sensory and/or motor data

then we can say that E "grounds" R.

Grounding tends to be useful in the case of systems where

R is commonly simpler than E (or at least, there's some relatively easy way to tell what will be the situations in which R is going to be simpler than E)
There is a methodology for going from E to the formal / semi-formal representation R, that doesn't take a huge amount of attention (once the mind is practiced at the methodology)
Carrying out manipulations within the formal / semi-formal system commonly generates new formal / semi-formal constructs that represent useful things

These criteria hold pretty well in the case of human languages, and most branches of mathematics (I suppose the jury's still out on, say, the theory of inaccessible cardinals....)

Note that one system may be grounded in another system.   For instance, formal grammars of English are grounded in natural English language productions -- which in turn are grounded, for each language user, in sensorimotor experience.

If it is simple to generate new representations using a certain system, then this means the process of representation-generation is a pattern in the set of representations generated -- i.e. it's simpler to hold that process in Focus over an interval of time, than to hold the totality of representations generated by it in Focus over an interval of time. The formal and semi-formal systems adopted by real-world minds are generally adopted because grounding is useful for them, and their process of representation-generation is a pattern.

This is all quite abstract -- I'll try to make it a little more concrete now.

Suppose I tell you about a new kind of animal called a Fnorkblein, which much enjoys Fbljorking, especially in the season of YingYingYing.   You can then produce sentences describing the actual or potential doings of these beings, e.g. "If Fnorkbleins get all their joyful Fbljorking done in YingYingYing, they may be less happy in other seasons."

These sentences will be representations of, and patterns in, your visual images of corresponding scenes (your mental movie of the Fnorkbleins romping and Fbljorking in the throes of their annual YingYingYing celebration, and so forth). In the terminology above, these images will ground the sentences.

Furthermore, the process of formulating these Fnorkblein-ful sentences will take relatively little Focus for you, because you know grammar. If you didn't know grammar, then formulating linguistic patterns representing Fnorkblein-relevant images would require a lot more work, i.e. a lot more Focus spent on the particulars of Fnorkblein-ness.

Of course, people can use grammar fairly effectively -- including about Fnorkbleins -- without any knowledge of any explicit formalization of grammar. However, if one wants people to stick close to a specific version of grammar, rather than ongoingly improvising in major ways, it does seem effective to teach them explicit formalized rules that capture much of everyday grammatical usage. That is, if the task at hand is not deploying the grammar one knows in practical contexts, but rather communicating or gaining knowledge regarding which sentences are considered grammatical in a certain community -- then formalizations of grammar become very useful. This is the main reason grammar is taught in middle school -- giving a bit of theory alongside the real-world examples a child hears in the course of their life helps the child learn to reliably produce grammatical sentences according to recognized linguistic patterns, rather than improvising on the patterns they've heard, as happens in informal communication. (And it may be that formal grammars are also useful for teaching AIs grammar, to help overcome their lack of the various aspects of human life and embodiment that human children rely on to pick up grammar implicitly from examples -- but that's a whole other story.)

It seems the act of teaching/learning the rules of grammar of a language, constitutes a pattern in "the act of communicating/learning the set of sentences that are grammatical in that language", in the sense that: If one has to tell someone else how to reliably assess which sentences are grammatical in a certain language (as utilized within a certain specific community), a lot less Focus will be spent via telling them the rules of grammar alongside a list of examples, than by just giving them a long list of examples and leaving them to do induction. When teaching grammar, while a rule of grammar is in the Focus, the specific examples embodying that rule of grammar are in the Fringe.   While passing the rules of grammar from teacher's Focus to student's Focus, the real point is what is being represented: the passage of sets of sentences judged as grammatical/ungrammatical from teacher's Fringe to student's Fringe.

The rules of grammar, then, may be described as a "subpattern" in the act of communicating a natural grammar (a subpattern meaning they are part of the pattern of teaching the rules of grammar).   This may seem a bit contorted, but it's not as complicated as the design of the computer I'm typing this on.   Formal grammars, like Macbooks, are a complex technology that only emerged quite recently in human evolution.

Interpretations....

And so, Bob's your uncle.... Starting from the basic structure of consciousness, it's not such a big leap to get to pattern -- which brings one the whole apparatus of mind as a system for recognizing patterns in itself and the world -- and to systems like languages and mathematics.

One can interpret this sort of analysis multiple ways ... for instance:

Taking physical reality as the foundation, one can study the structure of consciousness corresponding to a certain physical system (e.g. a brain), and then look at the structure of the mind corresponding to that physical system as consequent from the structure of its consciousness.

Taking experience as the foundation, one can take entities like Unity and Attention as primary, and then derive concepts like Focus and Fringe, and from there get to Representation and Pattern and onwards -- thus conceptually grounding complex aspects of cognition in terms of phenomenological basics.

Tuesday, October 14, 2014

The Neural Foundations of Complex Symbolic Thought

How can the brain, with its messy mix of neurons and glia spreading activation all over the place, give rise to the precise mathematical structures of symbolic reasoning?

This "symbolic / subsymbolic gap" has been a major puzzle at the center of cognitive science for decades, at least.

In a paper for the IJCNN conference in Beijing in July, I proposed a potential solution that -- while still speculative -- I believe has real potential for solving the issue.

The paper is linked here; and the abstract is as follows:

How Might the Brain Represent Complex Symbolic Knowledge?

Abstract: A novel category of theories is proposed, providing a potential explanation for the representation of complex knowledge in the human (and, more generally, mammalian) brain. Firstly, a "glocal" representation for concepts is suggested, involving localized representations in a sparse network of "concept neurons" in the Medial Temporal Lobe, coupled with a complex dynamical attractor representation in other parts of cortex. Secondly, it is hypothesized that a combinatory-logic-like representation is used to encode abstract relationships without explicit use of variable bindings, perhaps using systematic asynchronization among concept neurons to indicate an analogue of the combinatory-logic operation of function application. While unraveling the specifics of the brain's knowledge representation mechanisms will require data beyond what is currently available, the approach presented here provides a class of possibilities that is neurally plausible and bridges the gap between neurophysiological realities and mathematical and computer science concepts.

Note that this is a hypothesis about brains, and potentially a design principle for closely brain-like AGI systems -- but not a statement about, for example, the OpenCog AGI system, which implements symbolic thought more directly. However, there are certainly analogies with things that happen inside OpenCog. OpenCog has explicit symbolic representation (analogous to concept neurons, very roughly) and also subsymbolic representation from which symbolic-like representations may emerge; and the design intention of OpenCog is that these two kinds of representations can work together. The specific mechanisms of this interaction are quite different in OpenCog from what I hypothesize to take place in the brain, but, on the level of the cognitive processes emerging from these systems at the highest levels, there may not be a large difference.

Wednesday, October 01, 2014

Is Physics Information Geometry on Causal Webs? (Speculations Toward Grand Unification)

I mentioned recently to my dad that I was fiddling around in my (egregiously nonexistent) "spare time" with some ideas regarding quantum gravity and he said "I would question the sanity of someone as busy as you trying to unify physics in his spare time." Fair enough….

However, I think it's almost a social necessity these days to have one's own eccentric approach to grand unified physics, isn't it? Admittedly, there are hundreds or maybe thousands of different approaches out there -- but still, if you don't have one of your own, you're not going to be invited into any of the really fancy parties….

And more importantly (I don't have time for those parties anyway) -- sometimes the mind's gotta do what the mind's gotta do. For months now I've been plagued with some ideas about how to unify physics using information geometric structures defined over spacetime-like discrete structures, and I just HAD to write at least some of them down….

Today is a holiday here in Hong Kong (the anniversary of the founding of the People's Republic of China, which the Hongkongese are celebrating with their hugest political protests ever), so along with taking a dramatic 5 hour hike with Ruiting (up a beautiful rocky stream rather than along a trail, with some great places to swim & climb en route; but the last 1.5 hours were in the dark which was a bit treacherous), I took some time to slightly polish some of my earlier scribblings on my half-baked but IMO tantalizing thoughts….

Physics as Information Geometry on Causal Webs

Rather than packing a huge load of speculative technical ideas into a blog post, I've put together a concept paper summarizing the direction of my thinking -- click here to read the paper in PDF form!

For those too lazy, busy or uncoordinated to point and click on the PDF, the title and abstract are pasted here:

Physics as Information Geometry on Causal Webs:
Sketch of a Research Direction

A high-level outline is given, suggesting a research direction aimed at unifying the Standard Model with general relativity within a common information-based framework. Spacetime is modeled using discrete "causal webs", defined as causal sets that have ternary rather than binary directed links, and a dynamic in which each ternary link propagates local values from its sources to its target using multiplication in an appropriate algebra. One then looks at spaces of (real and complex) probability distributions over the space of "causal web histories." One can then model dynamics as movement along geodesics in this space of probability distributions (under the Fisher-Rao metric).

The emergence of gravitation in this framework (as a kind of "entropic force") is derived from Matsuoka's work founding General Relativity in information geometry; the emergence of quantum theory is largely implicit in work by Goyal and others founding the basic formalism of quantum states, measurements and observables in information geometry. It is suggested that these various prior works can be unified by viewing quantum theory as consequent from information geometry on complex-valued probability distributions; and general relativity as consequent from the geometry of associated real probability distributions. Further, quantum dynamics is known to be derivable from the correspondence principle and basic properties of classical mechanics; but the latter is an approximation to general relativity -- which as Matsuoka has shown is information-geometric, thus strongly suggesting that quantum dynamics also can be seen as emergent from information geometry. It is hypothesized that the Standard Model, beyond basic quantum mechanics, could potentially be obtained from this approach via appropriate choice of the "local field" algebra propagated within the underlying causal webs.

In addition to mathematical elegance, this approach to physics unification has the conceptual advantage of highlighting the parallels between physical dynamics and mental inference (given the close ties between Bayesian inference and information geometry).
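To make "movement along geodesics under the Fisher-Rao metric" a little more tangible, here is a toy sketch for the simplest possible case: ordinary real probability distributions over a finite set of outcomes. This is an illustrative assumption and not the construction in the paper, which concerns distributions over causal web histories. For finite distributions, taking square roots of the probabilities maps the simplex onto part of a sphere, so the Fisher-Rao distance is twice the angle between the square-root vectors and geodesics are great-circle arcs. The function names are hypothetical.

```python
import numpy as np

def fisher_rao_distance(p, q):
    """Fisher-Rao distance between two finite probability vectors."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    overlap = np.sum(np.sqrt(p * q))          # Bhattacharyya coefficient
    return 2.0 * np.arccos(np.clip(overlap, -1.0, 1.0))

def fisher_rao_geodesic(p, q, t):
    """Distribution at fraction t (0..1) along the geodesic from p to q."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    u, v = np.sqrt(p), np.sqrt(q)             # points on the unit sphere
    theta = np.arccos(np.clip(u @ v, -1.0, 1.0))
    if theta < 1e-12:                         # p and q coincide
        return p.copy()
    # Great-circle interpolation, then square to return to probabilities.
    w = (np.sin((1.0 - t) * theta) * u + np.sin(t * theta) * v) / np.sin(theta)
    return w ** 2

p = np.array([1.0, 0.0])
q = np.array([0.0, 1.0])
distance = fisher_rao_distance(p, q)
midpoint = fisher_rao_geodesic(p, q, 0.5)
```

In this two-outcome example the two distributions are as far apart as possible, and the geodesic midpoint is the uniform distribution.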

Hypergraphs, Hypergraphs Everywhere ...

If you're familiar with the OpenCog AGI architecture (which I've played a leading role in designing) you may note a (maybe not so) peculiar similarity between the OpenCog design and the approach to physics proposed in the above. In both cases one starts with a hypergraph -- OpenCog's Atomspace, which is a pretty unstructured hypergraph; versus a causal web, which is a hypergraph with a specific structure comprised of ternary directed links. And in both cases one has nonlinear dynamics flowing across the hypergraph (though again, of fairly different sort). And then, in both cases, it is posited that emergent structures from these hypergraph dynamics are key to giving rise to the important structures and dynamics.

Of course, this similarity may reflect nothing more than -- this is the way Ben likes to model things!   But I'd like to think there's something deeper going on, and that by modeling minds and the physical world using similar formalisms, one can more easily think about the relation between mind and matter.

Along these lines -- but even a smidge wackier -- thorough readers of this blog will note that some of these ideas were referenced in my recent blog post on morphic fields, psi and physics.   Of course the concepts in these two posts are somewhat independent: the physics ideas given here could be correct even if psi and morphic fields don't really exist; and my analysis of morphic fields in terms of pattern completion and surprisingness could be correct even if the right solution to unified physics involves utterly different ideas than the ones outlined here. However, it's fair to say that these various concepts evolved together. My thoughts about morphic fields and unified physics have definitely shaped each other, for whatever that's worth.

While I think the ideas outlined in the document I linked here make sense, I'm under no illusion about how much work it would be to fill in the gaps in such a complex line of thinking. Furthermore, some of the gap-filling might end up requiring the creation of substantial new ideas (though it's not obvious this is the case - it could just be a matter of crunching through relatively straightforward math).   I'm also under no illusion that I have time to pursue a lot of difficult physics calculations right now, nor that I will in the near future.   I'll have time to work on this in spare-time bits and pieces, for whatever that's worth.

However, I'd love to collaborate with someone on working out details of some or all of these ideas. Any physics grad students out there with a high risk tolerance and a penchant for weird ideas, please consider!

Elsewise, I may get to working out the details (or trying to) in a few years when my task list clears up a bit. AGI is going to remain the priority of the portion of my existence devoted to research until it's compellingly achieved and takes over its own development -- but to keep my creativity flowing (and not go nuts) I need to spend a certain percentage of time thinking about other areas of science.   That's just how my freaky little human brain works...

Sunday, September 28, 2014

Speculations Toward a Precise Model of Morphic Fields

Gentle Reader Beware: This post presents some fairly out-there ideas about the nature of memory and the relationship between the mind and the universe! If you're a hard-core psi skeptic or a die-hard materialist you may as well move on and save yourself some annoyance ;-) …

On the other hand, if you're intrigued by new potential ways of connecting known science with the "paranormal", and open to wacky new ways of conceptualizing the universe, please read on !! …

Rupert Sheldrake, Morphic Fields and Psi

In summer 2012, when Ruiting and I were in the UK for the AGI-12 conference at Oxford, we had the pleasure of stopping by the London home of maverick scientist Rupert Sheldrake for a meal and a chat. (It was a beautiful British-style home with the look of having been host to a lifetime of deep thinking. The walls were covered floor-to-ceiling with bookshelves containing all manner of interesting books.   We also met Rupert's very personable wife, who is not a scientist but shares her husband's interest in trans-materialist models of the universe.)

I have been fascinated by Sheldrake's ideas since reading his book "A New Science of Life" in the 1980s. His idea of a "morphic field" -- a pattern-field, coupled with yet in some ways distinct from the material world we see around us, shaping and shaped by the patterns observable in matter -- struck me at first sight as intriguing and plausible. The mathematician in me found Sheldrake's descriptions of the morphic field idea a bit fuzzy, but then I feel that way about an awful lot of biology. At the very least it has always seemed to me an intriguing direction for research.

It also occurred to me, when I first encountered his ideas, that morphic fields could provide some sort of foundation for explaining telepathy. The basic idea of the morphic field is simply that there is a "pattern memory field" in the universe, which records any pattern that occurs anywhere, and then reinforces the occurrence of that pattern elsewhere.   I reflected on the phenomenon of twin telepathy and it seemed very "morphic field" ish in nature.

More recently, Damien Broderick and I have co-edited a book called "The Evidence for Psi", to appear early next year, published by McFarland Press.   In the book we have gathered together various chapters summarizing empirical data regarding psi phenomena, attempting to broadly summarize the empirical case that psi is a real phenomenon. Sheldrake contributed a chapter to our book, summarizing experiments he did on email and telephone telepathy. I had previously read Sheldrake's description of his experimental work on dogs anticipating when their owners will get home, and been impressed by his careful and practical methodology.

While "Evidence for Psi" is still awaiting release, I'll point readers interested in the existing corpus of evidence regarding the existence of psi phenomena to my Psi Page, which contains links to a couple prior books I recommend on the topic.   "Evidence for Psi" contains a more up-to-date and systematic overview of the evidence, but it's not out quite yet.

Damien and I are also planning to edit a sequel book on "The Physics of Psi", covering various theories of how psi works. I've proposed my own sketchy theory in a 2010 essay, which proposed a certain extension to quantum physics that seems to have potential to explain psi phenomena. I actually have more recent and detailed thoughts along these lines, which I'll hint at toward the end of this monster blog post ... but will not enlarge on completely here as it's a long story -- of course I'll lay these ideas out in a chapter of "The Physics of Psi" when the time comes!

While researching possible extensions to quantum theory that might help explain psi, I noticed a paper by famous physicist Lee Smolin presenting an idea called the "Precedence Principle", which struck me as remarkably similar to Sheldrake's morphic field theory.   I discussed this similarity in a previous blog post.

During our visit to Rupert's house, he gave us a gift to take with us -- a copy of his book "Science Set Free". Since he is a really nice guy as well as a brilliantly creative thinker, I'm sure Rupert will not be too annoyed at me for repaying his kind gift by writing this blog post, which criticizes some of his ideas while building on others!

I skimmed the book shortly after receiving it, but only recently started reading through it more carefully.   The overall theme is a call for scientists to look beyond a traditional materialistic approach, and open their minds to the possibility that the universe is richer, more complex, and more holistic than materialist thinking suggests.   Morphic fields are mentioned here and there, as one kind of scientific hypothesis going beyond traditional materialism and potentially explaining certain data.

All this was also the topic of Sheldrake's controversial TEDx talk a couple years back, which was removed from the TEDx archive, apparently due to the controversial nature of Sheldrake's work in general. For a lengthy online discussion of this incident, see this page…. As I said in my contribution to that discussion, I don't think they were right to remove his video from their archive. I've heard far more out-there TEDx talks than Rupert's, so it's obviously not the contents of his talk that caused its removal -- it's his general reputation, which someone apparently decided would sully TED's reputation in some circles they value. Urrrgh. I generally think TED is great, but I don't like this decision at all.

In general I'm supportive of Rupert's call for science to be more open-minded, and to look beyond traditional materialist approaches.   To me, science is centrally about a process of arriving at agreement among a community of minds regarding which observations should be accepted as collectively valid, and which explanations should be accepted as simpler. Nothing in this scientific process requires the assumption that matter is more primary than consciousness (for example). Nor are the notions of a "morphic field", or of precognition or ESP etc., "unscientific" in nature.

The main problem with the morphic field theory as Sheldrake lays it out is, in my view, its imprecision.   From the view of science as a community process of agreeing which observations are collectively valid and which explanations are simple, an issue with Sheldrake's "morphic field" view is that it's not simple at all to figure out how to apply it to a given context.   Different scientists might well come up with very different, and perhaps mutually incompatible, ways of using it to explain a given set of observations, or predict future observations in a given context. This fuzziness is a kind of complexity, which makes my personal scientific Occam's Razor heuristic uneasy.

For now, what I want to talk about are some of Rupert Sheldrake's comments on memory, in "Science Set Free."   This will segue into some wild-ass quasi-mathematical speculations on how one might go about formalizing the morphic field idea in a more rigorous way than has been done so far.

Memory Traces versus Morphic Fields

In Chapter 7 of "Science Set Free", Sheldrake contrasts two different theories of human memory -- the "trace theory", which holds that memories are embodied as traces in organisms' brains; and the morphic resonance theory, which holds that memories are contained in a morphic field. Of course the trace theory is the standard understanding in modern neuroscience. On the other hand, he quotes the neuroscientist Karl Pribram as supporting an alternative understanding:

"Pribram … thought of the brain as a 'waveform analyzer' rather than a storage system, comparing it to a radio receiver that picked up waveforms from the 'implicate order', rendering them explicate.   This aspect of his thinking was influenced by the quantum physicist David Bohm, who suggested that the entire universe is holographic, in the sense that wholeness is enfolded into every part.

According to Bohm, the observable or manifest world is the explicate or unfolded order, which emerges from the implicate or enfolded order. Bohm thought that the implicate order contains a kind of memory. What happens in one place is 'introjected' or 'injected' into the implicate order, which is potentially present elsewhere; thereafter when the implicate order unfolds into the explicate order, this memory affects what happens, giving the process very similar properties to morphic resonance.   In Bohm's words, each moment will 'contain a projection of the re-injection of the previous moments, which is a kind of memory; so that would result in a general replication of past forms' "

When I briefly spoke with Karl Pribram on these matters in 2006 (when at my invitation he spoke at the AGI-06 workshop in Bethesda, the initial iteration of the AGI conference series), he seemed a lot less definitive than Sheldrake on the "brain as antenna" versus "brain as storehouse of memories" issue, but on the whole the story he told me was similar to Sheldrake's summary.   Pribram was trying to view the brain as a quantum-mechanical system in a state of macroscopic quantum coherence (perhaps related to coherent states in water megamolecules in the brain, as conjectured by his Japanese collaborators Jibu and Yasue), and then to look at perception as involving some sort of quantum coupling between the brain and environment.

I actually like the "implicate order" idea; and Bohm's late-life book "Thought as a System" had a huge impact on me.   The first version of my attempt to formalize a theory of psi phenomena -- Morphic Pilot Theory -- was inspired by both morphic fields and Bohm's pilot wave theory of quantum mechanics (though the end part of this blog post presents some ways in which I'm recently trying to go beyond the particulars of that formulation).

However, I really can't buy into Sheldrake's rejection of the massive corpus of neurobiological evidence in favor of what he calls the "trace theory."   There is just a massive amount of evidence that, in a fairly strong sense, an awful lot of memories ARE actually stored "in the brain."

As just one among many examples, I recently looked through the literature on "concept neurons" -- neurons that fire when a person sees a certain face (say, Jennifer Aniston, in the common example).   But there are hundreds of other examples where neuroscientists have figured out which neurons or neuronal subnetworks become active when a given memory is recalled….   The idea that the brain is more like a radio receiver (receiving signals from the morphic field) than a storehouse of information, seems to me deeply flawed.

Sheldrake says

"The brain may be more like a television set than a hard drive recorder. What you see on TV depends on the resonant tuning of the set to invisible fields. No one can find out today what programs you watched yesterday by analyzing the wires and transistors in your TV set for traces of yesterday's programs."

While I salute the innovative, maverick thinking underlying this hypothesis, I definitely can't agree. I very strongly suspect that you COULD tell what TV program a person watched yesterday, by analyzing their brain's connectome. We can't carry out this exact feat yet, but I bet it will be possible before too long. We can already tell what a person is looking at via reading out information from their visual cortex, for example.

The main point I want to make here, though, is that one doesn't have to view the trace theory of memory and (some form of) the morphic field theory of memory as contradictory.

The brain, IMO, is plainly not much like a radio receiver or antenna -- it does contain specific neurons, specific subnetworks and specific dynamical patterns that correlate closely with specific memories of various sorts. Neuroscience data says this and we have to listen.

However, this doesn't rule out the possibility that some sort of "morphic field" could also exist, and could also play a role in memory.

Pattern Completion and Morphic Fields

It seems to me that a better analogy than a radio receiver, would be pattern completion in attractor neural networks.

In a Hopfield neural net, one "trains" the network by exposing it to a bunch of memories (each one of which is a pattern of activity across the network, in which some neurons are active and others are not). Then, once the network is trained, if one exposes the network to PART of some memory, the nonlinear dynamics of activation flowing through the neural net will cause the whole memory to emerge in the network. The following figure illustrates this in some simple examples.

Figure illustrating neural net based pattern completion, borrowed from [Ritter, H., Martinetz, Th., Schulten, K. (1992): Neural Computation and Self-organizing Maps. Addison Wesley,]. (a) The Hopfield net consists of 20 x 20 neuroids, which can show two states, illustrated by a dot or a black square, respectively. The weights are chosen to store 20 different patterns; one is represented by the face, the other 19 by different random dot patterns. (b) After providing only a part of the face pattern as input (left), in the next iteration cycle the essential elements of the final pattern can already be recognized (center), and the pattern is completed two cycles later (right). (c) In this example, the complete pattern was used as input, but was disturbed by noise beforehand (left). Again, after one iteration cycle the errors are nearly corrected (center), and the pattern is complete after the second iteration (right)
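For readers who would rather poke at pattern completion than look at a figure, here is a minimal sketch of the same idea in code. It is a toy and not the 20 x 20 network from the figure: it assumes the standard Hebbian outer-product rule for the weights, synchronous sign updates, and three hand-picked, mutually orthogonal 16-unit patterns so that recall is clean. The names train_hopfield and recall are hypothetical.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian outer-product weights for patterns with entries +1 / -1."""
    patterns = np.asarray(patterns, dtype=float)
    num_units = patterns.shape[1]
    weights = patterns.T @ patterns / num_units
    np.fill_diagonal(weights, 0.0)            # no self-connections
    return weights

def recall(weights, cue, max_sweeps=10):
    """Update all units synchronously until the state stops changing."""
    state = np.asarray(cue, dtype=float).copy()
    for _ in range(max_sweeps):
        new_state = np.where(weights @ state >= 0, 1.0, -1.0)
        if np.array_equal(new_state, state):
            break
        state = new_state
    return state

# Three mutually orthogonal toy "memories" over 16 units (an assumption
# made for this sketch, so that the stored patterns do not interfere).
stored = np.array([
    [1] * 8 + [-1] * 8,
    ([1] * 4 + [-1] * 4) * 2,
    ([1] * 2 + [-1] * 2) * 4,
])
weights = train_hopfield(stored)

# Case (b): only PART of a memory is presented; unknown units are set to 0.
partial_cue = stored[2].astype(float)
partial_cue[8:] = 0.0
completed = recall(weights, partial_cue)

# Case (c): the whole memory is presented, but with two units flipped.
noisy_cue = stored[2].astype(float)
noisy_cue[[0, 5]] *= -1
denoised = recall(weights, noisy_cue)
```

In both cases the network settles back onto the stored pattern, which is the "part triggers the whole" behavior the analogy below relies on.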

What does this have to do with morphic fields?

My suggestion is that, potentially, the trace of a memory in an organism's brain, could be considered as a PART of the totality of that memory in the universe. The nonlinear dynamics of the universe could be such that: When the PART of a memory existing in an organism's brain is activated, then via a pattern-completion type dynamic, the rest of the memory is activated.

Furthermore, if some memory is activated in the broader universe, then the nonlinear dynamics coupling the rest of the universe with the organism's brain, could cause a portion of that memory to form within the organism's brain.

In the analogy I'm suggesting here, the analogue of the whole Hopfield neural network in which the overall memory would be activated, would be some form of "morphic field."

In this hypothetical model, the portion of the "universal nonlinear dynamical system" that resides in an organism's brain is not behaving much like an antenna. It's not just tuning into channels and receiving what is broadcast on them. Rather, in this model, the brain stores its own memory-fragments and has its own complex dynamics for generating them, modifying them, revising them, and so forth. But these memory-fragments are nonlinearly coupled with broader memory patterns that exist in a nonlinear-dynamical field that goes beyond the individual organism's brain and body.

In sum, the idea I'm proposing is that

a morphic field may be modeled as a nonlinear self-organizing network, including material entities like brains and bodies as a portion
memories may be viewed as patterns spread across large portions of a morphic field
the portion of a memory that is resident in an organism's brain as a "memory trace" may be viewed as a "memory fragment" from a morphic field perspective; and may trigger a broader memory to emerge across the morphic field via "pattern completion" type dynamics
the emergence of a broader memory across the morphic field, may cause certain memory-fragments to emerge in an organism's brain

This seems a consistent, coherent way to have both morphic fields AND standard neurobiological memory traces.

I'm not claiming to have empirical evidence for this (admittedly out-there and eccentric) perspective on memory. Nor am I claiming that this constitutes a precise, rigorous, testable hypothesis. It doesn't. All I'm trying to do in this post is articulate a conceptual approach that makes the morphic field hypothesis consistent with the almost inarguably strong observation that neural memory traces are real and are powerfully explanatory regarding many aspects of human memory.

Morphic Fields and Psi, Once Again

Ah -- OK but, what aspects of memory would one need to invoke these broader-memory morphic fields to explain?

It's possible that morphic fields play a small but nontrivial role in a wide variety of memory phenomena, across the board. This would fit in with Jim Carpenter's theories in his book First Sight, which argues that weak psi phenomena underlie our intuitive understandings of everyday situations.

And it's also possible that one thing distinguishing psi phenomena from ordinary cognition is a heavy reliance on the morphic-field components of memories.

Turning these vague conceptual notions into really useful scientific theories would require a more rigorous theory of how morphic fields work. I have some thoughts along those lines but will save a full, detailed exposition of these for another time. For now I'll just give a little hint...

How Might One Model Morphic Fields?

OK, now I'm going to go even further "out there", along with getting a bit more technical...

A model of morphic fields has to exist within some model of the universe overall.

Existing standard physics models don't seem to leave any natural place for morphic fields. However, existing standard physics models are also known to be inadequate to explain known physical data in a coherent, self-consistent way (as e.g. general relativity and quantum field theory haven't yet been unified into a single theory). This certainly gives some justification for looking beyond the standard physics approaches, in searching for a world-model that is conceptually compatible with morphic fields.

The basic ideas I'll outline here could actually be elaborated within many different approaches to theoretical physics. However, they are easiest and most natural to elaborate in the context of discrete models of the universe -- so that's the route I'll take here.   Discrete models of the universe have been around a while, e.g. the Feynman Checkerboard and its descendants.

One of the more interesting discrete approaches to foundational physics is Causal Sets. Basically, in causal set theory, "spacetime" is replaced by a network of nodes interconnected by directed edges.   A directed edge indicates an atomic flow of causality.

I suspect it may be interesting to extend the causal set approach into what I call a "causal web" -- in which directed hyperlinks span triples of nodes. A hyperlink pointing from (A,B) to C indicates a flow of causality from the pair (A,B) to C.   Local field values at A and local field values at B then combine to yield local field values at C.   This combination may be represented as multiplication in some algebra, so one can write F_C(t+1) = F_A(t) * F_B(t), where t refers to a sort of "meta-time" or "implicate time", distinct from the time axis that forms part of the spacetime continuum we see.
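To make the update rule concrete, here is a minimal Python sketch of one meta-time step on a toy causal web. It assumes, purely for illustration, that field values are complex numbers and that the product in the algebra is ordinary complex multiplication; the names `step`, `fields_t0` and `hyperlinks` are hypothetical, and the right choice of algebra is left open.

```python
# Toy causal web: each hyperlink (A, B) -> C combines the field values at A and B.
# Assumption: field values are complex numbers and "*" is complex multiplication.

def step(fields, hyperlinks):
    """Return the field values at meta-time t+1 given the values at meta-time t."""
    new_fields = dict(fields)  # nodes with no incoming hyperlink keep their value
    for (a, b), c in hyperlinks.items():
        # F_C(t+1) = F_A(t) * F_B(t), always reading from the old state
        new_fields[c] = fields[a] * fields[b]
    return new_fields

fields_t0 = {"A": 1j, "B": 2 + 0j, "C": 0j}
hyperlinks = {("A", "B"): "C"}  # one hyperlink pointing from the pair (A, B) to C

fields_t1 = step(fields_t0, hyperlinks)
print(fields_t1["C"])
```

Reading from the old state and writing to a copy keeps the step synchronous, so the result does not depend on the order in which hyperlinks are visited. A node targeted by several hyperlinks would need an extra combination rule, which this sketch does not attempt.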

Figuring out the right way to represent electromagnetic and quark fields this way is an interesting line of research, which I've been playing with occasionally in recent weeks.   Gravitation, on the other hand, I would suggest representing more statistically, as an "entropic force" of a sort arising emergently from dynamics on the causal web. I'll write another post about that later.

(More broadly, I think one could show that continuous field theories, within fairly broad conditions, can be emulated by causal webs within arbitrarily small errors.    Conceptually, causal webs are a bit like discrete reaction-diffusion equations; and it's known that discrete reaction-diffusion equations can be mapped into discrete quantum field theories.)

The main point I want to explore here is how one might get some sort of morphic field to emerge from this sort of framework. Namely: One could potentially do so by positing a field, living at the nodes in the causal web, which is acausal in nature, and propagates symmetrically, flowing both directions along directed links. This would be a "pattern field."

Imagine running hypergraph pattern mining software -- like, say, OpenCog's Pattern Miner -- on a causal web. This would result in a large index, indicating which patterns occur how often in the web. Atoms and molecules would emerge as pretty frequent patterns, for example; as would radioactive decay events. Spatial, temporal and spatiotemporal patterns would be definable in this way.

Each node in the causal web can then be associated with a "pattern set" indicating the frequent patterns that it belongs to, indexed by their frequency (and perhaps by other quantities, such as their surprisingness), and retaining information regarding what slot in the pattern the current node fits into.

One can then view these pattern sets as comprising additional nodes and links, to be added to the web. Two nodes that are part of the same pattern, even if distant spatiotemporally, would then be linked together by the nodes and links comprising the pattern. These are non-causal links, representing commonality of pattern, independent of spatiotemporal causality.
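The pattern-index, pattern-set and non-causal-link steps can be sketched on a toy graph. This is not OpenCog's Pattern Miner: as a stand-in, a "pattern" here is just the pair of labels on the two ends of a directed edge, and two nodes get a non-causal link when they fill the same slot of the same pattern. The labels, the slot names and that linking rule are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Toy causal web: nodes carry a label, directed edges carry causality.
labels = {1: "x", 2: "y", 3: "x", 4: "y", 5: "z"}
edges = [(1, 2), (3, 4), (4, 5)]

# Pattern index: how often each (source label, target label) pattern occurs.
pattern_counts = Counter((labels[s], labels[t]) for s, t in edges)

# Pattern set per node: (pattern, slot the node fills, pattern frequency).
pattern_sets = defaultdict(set)
for s, t in edges:
    p = (labels[s], labels[t])
    pattern_sets[s].add((p, "source", pattern_counts[p]))
    pattern_sets[t].add((p, "target", pattern_counts[p]))

# Non-causal links: join nodes that fill the same slot of the same pattern.
members = defaultdict(set)
for node, entries in pattern_sets.items():
    for p, slot, _ in entries:
        members[(p, slot)].add(node)

pattern_links = sorted(
    pair
    for nodes in members.values()
    for pair in combinations(sorted(nodes), 2)
)
print(pattern_links)
```

Nodes 1 and 3 end up linked, as do nodes 2 and 4, even though no causal edge joins either pair: the link exists only because both nodes play the same role in the "x causes y" pattern.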

Given this framework, we can introduce an additional dynamic: a variant of what philosopher Charles Peirce called "the tendency to take habits." Namely, we can posit that: Patterns that have a high surprisingness value are more likely to persist in the causal web.

By "surprisingness value" I mean here that the pattern is more probable than one would infer from looking at its component parts. As a first hypothesis one can use the I-surprisingness as defined in OpenCog's pattern mining framework.
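As a rough illustration of "more probable than one would infer from its component parts", here is a deliberately simplified measure: the observed probability of a pattern minus the probability one would expect if its components occurred independently. This is a stand-in for the idea only, not OpenCog's I-surprisingness, and the function name and example numbers are hypothetical.

```python
import math

def surprisingness(p_pattern, component_probs):
    """Observed pattern probability minus the independence-based expectation."""
    expected = math.prod(component_probs)  # what the parts alone would predict
    return p_pattern - expected

# Components occur with probability 0.5 and 0.25, so independence predicts 0.125.
# The combined pattern is actually observed with probability 0.25.
s = surprisingness(0.25, [0.5, 0.25])
print(s)
```

A positive value means the pattern shows up more often than its parts would suggest; zero means nothing beyond independence is going on.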

Among other things, this implies that: When one instance of pattern P is linked with an instance of pattern Q, this increases the odds that another instance of pattern P is linked with some instance of pattern Q.

Or, a little differently, this "Surprising Multiverse" theory could be viewed as a variation of the Jungian notion of "synchronicity" -- which basically posits that meaningful combinations of events may occur surprisingly often, due to some sort of acausal connecting principle. (As an aside, I actually first learned about Synchronicity from the Police album way back when -- thanks, Sting!)

Viewed in quantum-theoretic terms, this is a statement about the amplitude (complex probability) distribution over possible causal webs (or more properly, actually: over possible histories of causal webs, where a causal web history is defined as a series of causal-web states so that each one is consistent with the previous and subsequent according to causal web dynamics.... If a causal web is deterministic then each causal web corresponds with just one causal web history, but we don't need to assume this.) It is a statement that causal web histories with more surprising patterns, should be weighted higher when doing Feynman sums used to determine what happens in the world.

How does a pattern completion type dynamic happen, then, in this perspective? Suppose that, in a particular part of the causal web, a certain pattern emerges. The existence of this pattern influences the surprisingness values of other pattern-instances, situated other places in the web. It thus influences the weightings of Feynman sums occurring all around the web, thus influencing the probabilities of various events.

We thus have a non-local, acausal connecting principle: the surprising-pattern-based weighting of possible causal web histories in Feynman sums. The "morphic field" is then modeled, not exactly as a "field", but as a multiverse-wide, ongoing dynamic re-weighting of possible universes according to the surprisingness of the patterns they contain (noting that the surprisingness of a universe changes over time as it evolves). (And note also that nothing is literally represented as a "field" in the causal web approach; fields are replaced in this model by discrete dynamics on hypergraphs representing pre-geometric structures below the level of spacetime.)

For example, suppose one identical twin falls in love with a brown-haired dentist. There are possible universes (causal web histories) in which both twins fall in love with brown-haired dentists, and others in which only one does, and the other falls in love with a green-haired chiropodist or whatever. The universes in which both twins fall in love with brown-haired dentists will have an additional surprising pattern as compared to the other universes, and hence will be weighted slightly higher.

Or, suppose a woman's brain remembers what she watched on TV last night. Again, it will be more surprising, probabilistically, if others know this as well -- so the universes in which others do, will be weighted slightly higher.
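The twin example can be turned into a toy calculation. The sketch below sums amplitudes over histories leading to each outcome, multiplying each history's amplitude by a weight that grows with its surprisingness. The exponential form of the weight, the `bias` parameter and the numbers are all assumptions chosen for illustration; nothing here pins down what the real weighting rule should be.

```python
import math

def outcome_probabilities(histories, bias):
    """histories: list of (outcome, amplitude, surprisingness) triples."""
    totals = {}
    for outcome, amplitude, surprise in histories:
        weight = math.exp(bias * surprise)  # assumed surprise-based weighting
        totals[outcome] = totals.get(outcome, 0j) + weight * amplitude
    # Squared magnitudes of the summed amplitudes, normalized to probabilities.
    norm = sum(abs(a) ** 2 for a in totals.values())
    return {outcome: abs(a) ** 2 / norm for outcome, a in totals.items()}

histories = [
    ("both twins love dentists", 1 + 0j, 1.0),     # carries an extra surprising pattern
    ("only one twin loves a dentist", 1 + 0j, 0.0),
]

unbiased = outcome_probabilities(histories, bias=0.0)
biased = outcome_probabilities(histories, bias=0.1)
print(unbiased, biased)
```

With `bias=0.0` the two outcomes are equally likely; with a small positive bias the outcome containing the extra surprising pattern becomes slightly more probable, which is the "weighted slightly higher" effect described above.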

Now, there are many different ways to measure surprisingness, so that this approach to more formally specifying the morphic field hypothesis must be considered a research direction rather than a definite theory. All I'm suggesting here is that it's an interesting direction.

When digging into the details of these ideas, an important thing to think about is: Surprising to whom? Based on whose expectations? Surprising to the universe? Or surprising to some particular observer? In the relational interpretation of quantum theory, all observations occur relative to some observer -- so this is probably the best way to think about it.

The decline effect -- in which psi experiments start to decay in effectiveness after some time has passed -- begins to seem perhaps conceptually explicable in this framework. Once a psi phenomenon has been demonstrated enough times, to a given class of observers, it fails to be surprising to them, so it fails to be weighted higher in the relevant Feynman sums and doesn't happen anymore. (Indeed this is extremely hand-wavy, but as I already emphasized, I'm just trying to point in an interesting direction!)

It's also worth noting that one could extend the sum to include causal webs that are inconsistent in terms of temporal direction, that is, causal webs containing circular arrow structures. What would likely happen in this case is that, as you add up the amplitudes of all the different causal webs, the causally inconsistent ones tend to cancel each other out, and the overall sum is dominated by the causally consistent ones. However, this wouldn't be guaranteed to happen, and the surprise bias could in some cases intersect interestingly with this phenomenon, enabling circularly-causal webs to "occasionally" dominate the amplitude sum.
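A toy picture of this cancellation, and of how a surprise bias could spoil it: assume (and this is purely an assumption for illustration) that four causally inconsistent webs contribute unit amplitudes with evenly spread phases. Unweighted, they sum to zero; if a surprise-based weight boosts one of them, the cancellation is incomplete.

```python
import cmath
import math

def amplitude_sum(phases, weights=None):
    """Sum the amplitudes exp(i * phase), each scaled by an optional weight."""
    if weights is None:
        weights = [1.0] * len(phases)
    return sum(w * cmath.exp(1j * p) for w, p in zip(weights, phases))

# Assumed phases for four causally inconsistent webs: 0, 90, 180, 270 degrees.
phases = [2 * math.pi * k / 4 for k in range(4)]

# Equal weights: the amplitudes 1, i, -1, -i cancel.
unbiased = abs(amplitude_sum(phases))

# A hypothetical surprise bias boosts the first web by a factor of 5.
biased = abs(amplitude_sum(phases, [5.0, 1.0, 1.0, 1.0]))
print(unbiased, biased)
```

The unweighted magnitude is zero up to floating-point error, while the biased one is about 4, so the boosted web survives the sum.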

Anyway, I've certainly raised more questions than I've answered here. But perhaps I've convinced some tiny fraction of readers that there is some hope, by modifying existing (admittedly somewhat radical) physics models, to come up with a coherent formal model of morphic fields. Getting back to issues of memory, my feeling is that such a formal model is likely to yield a "pattern completion" type theory of morphic memory, rather than a "television receiver" type theory.

In Praise of Wild Wacky Weirdness ... and Data

I've spun out some wacky ideas here, and probably weirded out a lot of readers, but so it goes! One of the messages of Sheldrake's book "Science Set Free" that I really like is (paraphrasing): Open your mind and rethink every issue from first principles. Just try to understand, and don't worry so much about agreeing with prevailing points of view; after all, prevailing points of view have been proved wrong many times throughout history. The ideas given here are presented very much in that spirit.

Another key message of Sheldrake's book, however, is (paraphrasing again): Do pay attention to data. Look at data very carefully. Design your own experiments to explore your hypotheses, gather your own data, and study it. This is one of the next important steps in exploring the ideas presented here. How could this sort of formalized morphic field explain the various data collected in "The Evidence for Psi", for example?

The journey continues...

Wednesday, September 24, 2014

Semihard Takeoff

Whenever I talk about the future of AGI, someone starts talking about the possibility that AGI will "take over the world."

One question is whether this would be a good or bad thing -- and the answer to that is, of course, "it depends" ... I'll come back to that at the end of this post.

Another relevant question is: If this were going to happen, how would it most likely come about? How would an "AGI takeover" be likely to unfold, in practice?

One option is what Eliezer Yudkowsky has called AI "FOOM" ... i.e. a "Hard Takeoff" (a possibility which I analyzed a bit, some time ago...)

The basic idea of AI Foom or Hard Takeoff is that, sometime in the future, an advanced AGI may go from relatively innocuous subhuman-level intelligence all the way up to superhuman intelligence, in 5 minutes or some other remarkably short period of time..... By rewriting its code over and over (each time learning better how to rewrite its code), or assimilating additional hardware into its infrastructure, or whatever....

A Hard Takeoff is a special case of the general notion of an Intelligence Explosion -- a process via which AGI gets smarter and smarter via improving itself, and thus getting better and better and faster and faster at making itself smarter and smarter. A Hard Takeoff is, basically, a really really fast Intelligence Explosion!

Richard Loosemore and I have argued that an Intelligence Explosion is probable. But this doesn't mean a Hard Takeoff is probable.

Nick Bostrom's nice illustration of the Hard Takeoff idea

What often seems to happen in discussions of the future of AI (among hardcore futurist geeks, anyway) is something like:

Someone presents the Foom / Hard Takeoff idea as a scary, and reasonably likely, option
Someone else points out that this is pretty unlikely, since someone watching the subhuman-level AGI system in question would probably notice if the AGI system were ordering a lot of new hardware for itself, or undertaking unusual network activity, or displaying highly novel RAM usage patterns, or whatever...

In spite of being a huge optimist about the power and future of AGI, I actually tend to agree with the anti-Foom arguments. A hard AGI takeoff in 5 minutes seems pretty unlikely to me.

What I think is far more likely is an Intelligence Explosion manifested as a "semi-hard takeoff" -- where an AGI takes a few years to get from slightly subhuman level general intelligence to massively superhuman intelligence, and involves various human beings, systems and institutions in the process.

A tasty semihard cheese -- appropriate snack food for those living through the semihard takeoff to come. Semihard cheeses are generally good for melting; and are sometimes said to have the greatest complexity and balance.

After all, a cunning and power-hungry human-level AGI wouldn't need to suddenly take over the world on its own, all at once, in order to gain power. Unless it was massively superhuman, it would probably consider this too risky a course of action. Rather, to take power, a human-level AGI would simply need to accumulate a lot of money (e.g. on the financial markets, using the superior pattern recognition capability it could achieve via tightly integrating its mind with statistical and machine learning software and financial, economic and news databases) and then deploy this wealth to set up a stronghold in some easily-bought nation, where it could then pay and educate a host of humans to do its bidding, while doing research to improve its intelligence further...

Human society is complex and disorganized enough, and human motivations are complex and confused enough, and human judgment is erratic enough, that there would be plenty of opportunities for an early-stage AGI agent to embed itself in human society in such a way as to foster the simultaneous growth of its power and intelligence over a period of a few years. In fact an early-stage AGI probably won't even need to TRY for this to happen -- once early-stage AGI systems can do really useful stuff, various governments, companies and other organizations will push pretty hard to use these systems as thoroughly as they can, because of the economic efficiency and scientific and media status this will bring.

Once an AGI is at human level and embedded in human society in judicious ways, it's going to be infeasible for anyone to get rid of it -- and it's going to keep on growing in intelligence and power, aided by the human institutions it's linked with. Consider, e.g., a future in which

Azerbaijan's leaders get bought off by a wildly successful AGI futures trader, and the nation becomes an AGI stronghold, complete with a nuclear arsenal and what-not (maybe the AGI has helped the country design and build nukes, or maybe it didn't need the AGI for that...).
The nation the AGI has bought is not aggressive, not attacking anyone -- it's just sitting there using tech to raise itself out of poverty ... doing profitable deals on the financial markets, making and selling software products/services, patenting inventions, ... and creating a military apparatus for self-defense, like basically every other country.

What happens then? The AGI keeps profiting and self-improving at its own pace, is what happens. Is the US really gonna nuke a peaceful country just for being smart and getting rich, and risk massive retaliation and World War III? I doubt it.... In its comfy Azerbaijani stronghold, the AGI can then develop from human-level to massively transhuman intelligence -- and then a lot of things become possible...

I have spun out one scenario here but of course there are lots of others. Let's not allow the unrealism of the "hard takeoff in 5 minutes and the AGI takes over the world" aka "foom" scenario to blind our minds to the great variety of other possibilities.... Bear in mind that an AGI going from toddler-level to human-level in 5 years, and human-level to superhuman level in 5 more years, is a FOOM on the time-scale of human history, even if not as sudden as a 5 minute hard takeoff on the time-scale of an individual human life...

So how could we stop a semihard takeoff from happening?   We can't really -- not without some sort of 1984++ style fascist anti-AI world dictatorship, or a war destroying modern society and projecting us back before the information age. And anyway, I am not in favor of throttling AGI development personally; I doubt the hypothetical Azerbaijani AGI would particularly want to annihilate humanity and I suspect transhuman AGIs will do more good than harm, on average over all possible worlds.... I'm not at all sure that "an AGI taking over the world" -- with the fully or partly witting support of some group(s) of humans -- would be a bad thing, compared to other viable alternatives for humanity's future....

In terms of risks to humanity, this more realistic "semihard takeoff" development scenario highlights where the really onerous risks probably are. SIAI/MIRI and the Future of Humanity Institute seem to spend a lot of energy thinking about the risk of a superhuman AGI annihilating humanity for its own reasons; but it seems to me a much more palpable and probable risk will occur at the stage where an AGI is around human-level but not yet dramatically more powerful and intelligent than humans, so that it still needs cooperation from human beings to get things done. This stage of development will create a situation in which AGI systems will want to strike bargains with humans, wherein they do some things that certain humans want, in order to get some things that they want...

But obviously, some of the things that some humans want, are highly destructive to OTHER humans...

The point is, there is a clear and known risk of early-stage AGIs being manipulated by humans with nasty or selfish motives, because many humans are known to have nasty or selfish motives. Whereas the propensity of advanced AGIs to annihilate lesser sentiences remains a wild speculation (and one that I don't really find all that credible).....

I would personally trust a well-designed, self-improving AGI more than a national government that's in possession of the world's smartest near-human-level AGI; AGIs are somewhat of a wild card but can at least be designed with initially beneficent motivational systems, whereas national governments are known to generally be self-serving and prone to various sorts of faulty judgments.... This leads on to the notion of the AI Nanny, which I've written about before.   But my point here isn't to argue the desirability or otherwise of the AI Nanny -- just to point out the kind of "semihard takeoff" that I think is actually plausible.

IMO what we're likely to see is not a FOOM exactly, but still something a lot faster than AI skeptics would want to accept....   A Semihard Takeoff. Which is still risky in various ways, but in many ways more exciting than a true Hard Takeoff -- because it will happen slowly enough for us to watch and feel it happen....