
Saturday, December 10, 2016

The Tech-Startup Attractor: Musings on the Thiel Fellowships, Singularity University, and the value of good old-fashioned universities


This Business Insider article on the outcome so far of the Thiel Fellowship experiment is interesting, though not surprising.

As you may recall, Peter Thiel launched the Thiel Fellowship program as a "20 under 20" initiative, with a stated aim of showing that -- for bright ambitious youth anyway -- college is not necessary and is in many ways not the best way to spend 4 years of one's young adulthood.   The Thiel Fellows were each given $100K over 2 years, with a goal of supporting them in pursuing their own thoughts, dreams and visions....

As the Business Insider article reports, as the Fellowship experiment has continued for a few years, it has evolved a bit ... in the beginning it seemed like it was going to focus broadly on ambitious and brilliant youth with all sorts of creative new ideas and direction, and on giving them space to flesh out their thinking without needing to worry about paying the bills ... but it seems to have gravitated more toward a sort of "social network for young entrepreneurs", focusing mostly on young people with tech business projects reasonably likely to create near-term profit ... including many who are already having significant business success.   And the main value-add of the program is coming out to be, not the cash stipend, but rather the social network to which the Fellowship gives access.

All of which is great, and surely moves technology and business and society forwards a bit.   However, it does not whatsoever show that dropping out of college is a great path forward for youth in general.   What it shows is more like: IF you are young and want to start a tech biz based on an idea that appears to the Silicon Valley tech community to have significant near-term financial potential, THEN dropping out of school and into an extremely influential social network (well-connected with a host of high-net-worth individuals and impactful tech companies and VC funds etc. etc.) is a damn good idea, if you get the opportunity...

Well, yeah....  But this doesn't really say much about the pluses or minuses of going to college or getting a college degree if you DON'T have the opportunity to get rapidly embraced by a world-class social network like this...

I'm not especially an apologist for the contemporary university system, which annoys me in many ways, with  (among other problems) its focus on rote learning, its obsession with dividing knowledge into irrelevant disciplinary bins, and its tendency to squelch individual and group creativity.   

On the other hand, I have to admit that universities are the one area in human society that has consistently, over a long period of time (nearly a millennium!), provided an environment in which learning new things and developing radical new ideas is generally encouraged, apart from the short-term reward that such learning and ideas may bring.   All the egregious flaws of the university system aside, this is not to be scoffed at.

Society gives all of us a lot of pressure to pursue short-term reward in various ways, and on various levels.   Even as universities come to focus more on career-preparation majors, and professors are pushed to pull in grant funding rather than work on obscure  or out-there topics that funding agencies ignore -- still, compared to the other aspects of our society, universities seem by far the MOST supportive of learning and creation and invention not tied to short-term reward.

Of course there are non-university institutions that out-do universities in this regard, but they are small and scattered and end up not being accessible to most people.

Of course, nearly anyone in the developed world can find other ways to spend their time learning and creating, without enrolling in university.   But we are all susceptible to various social pressures, so -- even with all the information and communities available on the Internet -- it is still valuable to have a physical environment where learning and creation are core to the mission and vibe.

One point I often end up making in conversations about AI is that every one of the "deep neural net" algorithms being used by big tech companies these days, was invented by university professors and published in the academic literature.   Then the big tech companies took these (often via hiring said professors or their grad students) and implemented them more scalably and got amazing practical results.   But the core deep learning algorithms were invented in the university setting not the tech company setting, and they were invented alongside thousands of other algorithms, none of which had widely obvious commercial value at the time they were invented.   Many of these other algorithms will never prove practically valuable; some may ultimately prove far more valuable than deep neural networks.

Quantum computing obviously is the same way.  Where was quantum theory developed?  And where were the original ideas underlying quantum computing worked out?  Where are the speculative designs and lab experiments and math papers being done today, that are laying the groundwork for the quantum computers we'll have in our compute clouds 15-25 years from now ... for the Quantum Processing Units (QPUs) we'll have in our smartphones and smartwatches, in our robots' brains, maybe implanted in our own brains?   Hint: mostly not in venture-funded startups, nor in the labs of big tech companies...

The Thiel Fellowships are a cool program, but they don't seem to be fostering the kind of wide-ranging intellectual exploration and concept creation that universities -- in their screwed-up, contorted and semi-archaic way -- have so often fostered.   Rather they seem to be fostering some young people to do what Silicon Valley does best -- take ideas already formed by other folks and commercialize and market them, make them scalable and slick.   I don't want to discount this sort of work; I love my Android phone and Macbook and Google Search and all that too....   But this is a very particular sort of pursuit, and the fact that getting embedded in an awesome social network is more useful than university for this sort of thing, is pretty bloody obvious, right?

I see some parallels with how Singularity University (the non-degree granting educational organization, founded by Ray Kurzweil and Peter Diamandis and others in Silicon Valley) has developed.   While I'm currently an advisor for SU's AI and robotics track, I'm not that intensively involved with the organization these days.    However, I was fairly heavily involved with SU when it was founded, and before it was founded. 

The initial legwork for putting SU together was largely done by Amara Angelica (who runs KurzweilAI.net for Ray) and Bruce Klein, who at the time was working for me as President of Novamente LLC.   I was paying Bruce a modest salary for his Novamente work, which covered his basic bills while he spent 6 months doing social networking trying to put together the founding meeting for Singularity University.   The founding meeting -- which I ended up not attending as I was busy with so much other stuff -- was a big success ... Ray Kurzweil and Peter Diamandis shared their vision wonderfully and recruited the needed founding donations from the various individuals Bruce and Amara and their colleagues had gathered together (with help from Ray and Peter as well) ... and SU was off to the races ...

When Bruce and Amara and I were first talking about SU, however, our discussions had a pretty strongly Singularitarian vibe.   We were talking about how to radically accelerate progress toward AGI, mind uploading, Drexlerian nanotech, radical longevity, and so forth.   And in the first couple years of its existence, SU was fairly much in this vein, though already a bit more "startup bootcamp" oriented than we had been thinking.

Looking at SU now, it's awesome for what it is -- but it's become far more focused on short-term hi-tech business opportunities than it initially seemed would be the case.   SU has done a heap of good for the world, by bringing future-minded entrepreneurs and others from all around the world together, to social network with Silicon Valley tech leaders and brainstorm on how to create new startups using advanced tech to improve the world. 

And much as with the Thiel Fellowship, I believe the main value-add SU has ended up providing to its students is the social network.   Definitely, for a future-oriented business executive or scientist from Qatar or China or Bolivia or Ethiopia, the chance to get to know dozens to hundreds of Silicon Valley and international tech-biz geeks can be pretty priceless...

Perhaps much of what we see in both the Thiel Fellowship and Singularity University cases is merely the power of the "tech startup attractor" for programs based in Silicon Valley.    Silicon Valley is damn good at tech startups, and is not necessarily equally good at providing alternative means of giving young people broad education or space to wild-mindedly create new ideas ... nor at encouraging people to make huge leaps toward the Singularity in ways that don't promise short-term business success.... 

Of course, Silicon Valley doesn't have to be everything -- it's a big world out there, with lots of wealth and brilliance and capability in so many different places -- and it's incredibly impressive what things like the Thiel Fellowship and Singularity University are contributing to the world.   But the relatively small role these sorts of things play in the bigger picture of what's going on on the planet, should also not be lost sight of...

So far, universities are still pretty damn useful, in terms of providing environments for young people to learn how to learn, and space for young people to create and grow without the world's usual pressures.... 

And so far, the challenge of directing significant resources to really ambitious Singularitarian goals like AGI, mind uploading and Drexlerian nanotech, has yet to be met....   We are moving toward these goals anyway, and progress is excitingly fast by historical standards.   Yet it seems to me that we could be progressing faster and in many ways more beneficially and interestingly, if not for the tendency of more visionary initiatives to get sucked into attractors related to short-term profit-seeking.... 

And it's also clear to me, that, in our current path toward a radically better future, good old traditional universities are continuing to play a very central role, in spite of all their archaic peculiarities.  I would love for far better modes of social organization to emerge, but this process is still underway; and currently the Silicon Valley tech-startup network -- with all its diverse and fascinating manifestations -- is more a complement to traditional universities than an alternative...






Monday, October 31, 2016

SEMREM: The Search for Extraterrestrial, Morphically-REsonating Mathematicians

An interesting idea came up in an email thread with my dad Ted Goertzel, his friend Bill McNeely, and my son Zar Goertzel…

Suppose that morphic resonance works – so that when a pattern arises somewhere in the universe, it then becomes more likely to appear other places in the universe.   Suppose that, like quantum entanglement, it operates outside the scope of special relativity – so that when a pattern occurs on this side of the universe, its probability of occurrence is immediately increased way on the other side of the universe. 

(As with quantum entanglement, the language of causation is not really the best one to use here – rather than saying “pattern X occurring here increases the odds of pattern Y occurring there”, it’s better to say “in our universe, the odds of the same pattern occurring in two distant locations, sometimes with a time lag, are higher than one would expect based on straightforward independence assumptions” – this has the same empirical consequences and less needless metaphysical baggage.   I’ve pointed this out here.)

Suppose also that the physical universe contains multiple intelligent species and civilizations, flung all over the place – scattered across our galaxy and/or multiple galaxies.

It would follow that when one intelligent civilization creates a certain pattern, other civilizations halfway across the galaxy or universe would have a higher probability of happening upon that same pattern.   And perhaps there would be an increasing-returns type dynamic here: once half the intelligent civilizations in the universe have manifested a certain pattern, the odds of the rest coming to manifest it would be much higher.

But what kinds of patterns would be most likely to get propagated in this way?   A pattern highly specific to Earthly life would not be likely to get picked up by gas-cloud aliens in some other galaxy – because morphic resonance, if it works, would only mildly increase the odds of a pattern being found in one location or context, based on it already having been found in another.    Most likely its mechanism of action would involve slightly nudging the internal stochastic dynamics of existing processes – and there is a limit to how much change can be enacted via such nudging.   If the odds of a certain Earthly pattern being formed in the world of the gas-cloud aliens is very low, morphic resonance probably isn’t going to help.

Probably the most amenable patterns for morphic resonance based cross-intelligent-civilization transmission would be the most abstract ones, the ones that are of interest to as many different intelligent civilizations as possible, regardless of their particular cultural or physical  or psychological makeup.    Mathematics would seem the best candidate.

So, if this hypothesis is right, then mathematical theorems and structures that have already been discovered by alien civilizations elsewhere, would be especially easy for us to find – we would find ourselves somehow mysteriously/luckily guided to finding them.

It’s not hard to imagine how we might test this hypothesis.   What if we built a giant AGI mathematical theorem prover, and set it about searching for new mathematical theorems, proofs and structures in a variety of different regions of math-space?   Based on all this activity, it would be able to develop a reasonably decent estimator of how difficult it should be, on average, to discover new theorems and proofs in a certain area of mathematics.

Suppose this AGI mathematician then finds that certain areas of mathematics are unreasonably easy for it – that in these areas, it often seems to “just get lucky” in finding the right mathematical patterns, without having to try as hard as its general experience would lead it to suspect.   These areas of mathematics would be the prime suspects for the role of “focus area of the intergalactic, cross-species community of morphically resonating mathematicians.”
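
A minimal Python sketch of the bookkeeping such a test would need, under loud assumptions: the effort units, the log format of (predicted, actual) pairs, the function names and the threshold are all hypothetical, and a strongly negative score only says the prover's difficulty estimator was surprised in that area, not why.

```python
import math
from statistics import mean, stdev

def surprise_score(pairs):
    """t-like statistic of log(actual / predicted) effort for one area.

    Strongly negative values mean proofs there took less effort than predicted.
    `pairs` is a hypothetical log of (predicted_effort, actual_effort) tuples.
    """
    logs = [math.log(actual / predicted) for predicted, actual in pairs]
    m, s = mean(logs), stdev(logs)
    if s == 0:
        # No spread at all: the sign of the mean is the whole story.
        return math.copysign(math.inf, m) if m else 0.0
    return m / (s / math.sqrt(len(logs)))

def unusually_easy_areas(effort_log, threshold=-3.0):
    """Return the areas whose surprise score falls below the (assumed) threshold."""
    return sorted(
        area
        for area, pairs in effort_log.items()
        if len(pairs) >= 2 and surprise_score(pairs) < threshold
    )

# Toy data: "area_A" is consistently about twice as easy as predicted.
effort_log = {
    "area_A": [(100, 50), (100, 40), (100, 60), (100, 45)],
    "area_B": [(100, 110), (100, 90), (100, 105), (100, 95)],
}
flagged = unusually_easy_areas(effort_log)
```

On the toy data only "area_A" gets flagged. In practice the hard part would be ruling out mundane explanations, such as an estimator that is simply miscalibrated for that region of math-space.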

Suppose the AGI mathematician is trying to solve some problem, and has to choose between two potential strategies, A and B.   If A lies in a region of math-space that seems to have lots of morphic resonance going on, then on the whole it’s going to be a better place to look than B.    But of course, every alien species is going to be reasoning the same way.   So without any explicit communication, the community of mathematically-reasoning species (which will probably  mostly be AGIs of some form or another, since it’s unlikely evolved organisms are going to be nearly as good at math as AGIs) will tend to help each other and collectively explore math-space.

This is an utterly different form of Search for Extraterrestrial Intelligence – I’ll label it the “Search for Extraterrestrial Morphically-REsonating Mathematicians”, or SEMREM.  

As soon as we have some highly functional AGI theorem-provers at our disposal, work on SEMREM can begin!

P.S. -- After reading the above, Damien Broderick pointed out that species doing lots of math but badly could pollute the morphic math-space, misdirecting all the other species around the cosmos.   Perhaps this will be the cause of some future intergalactic warfare --- AI destroyer-bots will be sent out to nuke the species polluting the morphic math metaverse with wrong equations or inept, roundabout proofs ... or, more optimistically, to properly educate them in the ways of post-Singularity mathemagic...

Saturday, October 29, 2016

Symbiobility

I want to call attention here to a concept that seems to get insufficient attention: “symbiobility”, or amenability to symbiosis.

The word “symbiobility” appears to have been used quite infrequently, according to Google; but I haven’t found any alternative with the same meaning and more common usage.   The phrase “symbiotic performance” is more commonly used in biology and seems to mean about the same thing, but it’s not very concise or euphonious.

What I mean by symbiobility is: The ability to enter into symbiotic unions with other entities.

In evolutionary theory (and the theory of evolutionary computation) one talks sometimes about the “evolution of evolvability” – where “evolvability” means the ability to be improved via mutation and crossover.   Similarly, it is important to think about the evolution and symbiogenesis of symbiobility.

There are decent (though still a bit speculative) arguments that symbiogenesis has been a major driver of biological evolution on Earth, perhaps even as critical as mutation, crossover and selection.  Wikipedia gives a conservative review of the biology of symbiogenesis.  Schwemmler has outlined a much more sweeping perspective on the role of symbiogenesis, including a symbiogenesis-based analysis of the nature of cancer; I reviewed his book in 2002.

One can think about symbiobility fairly generally, on various levels of complex systems.   For instance,

  •  Carbon-based compounds often have a high degree of symbiobility – they can easily be fused with other compounds to form larger compounds.  
  • Happily married couples in which both partners are extraverts also have a high degree of symbiobility, in the sense that they can be relatively easily included in larger social groups (without dissolving but also without withdrawing into isolation).


These usages could be considered a bit metaphorical, but no more so than many uses of the term “evolution.”

One of the weaknesses of most Artificial Life research, I would suggest, is that the Alife formalisms created have inadequate symbiobility.   I have been thinking about this a fair bit lately due to musing about how to build an algorithmic-chemistry-type system in OpenCog (see my blog post on Cogistry).    A big challenge there is to design an algorithmic-chemical (“codelet”) formalism so that the emergent systems of codelets (“codenets”) will have a reasonably high degree of symbiobility.  

My hope with Cogistry is to achieve symbiobility via using very powerful and flexible methods (e.g. probabilistic logic) to figure out how to merge two entities A and B into a new entity symbiotically combining A and B.   This requires that A and B be composed in a way that enables the logic engine in use to draw conclusions about how to best compose A and B, based on a reasonable amount of resource usage.

In terms of the Maximum Pattern Creation Principle I have written about recently, it seems that symbiogenesis is often a powerful way for a system to carry out high-speed high-volume pattern creation.   In ideal cases the symbiotic combination of A and B can carry out basically the same sorts of pattern creation that A and B can, plus new ones besides.


As the world gets more and more connected and complex, each of us acts more and more as a part of larger networks (culminating in the so-called “Global Brain”).   This means that symbiobility is a more and more important characteristic for all of us to cultivate – along with evolvability generally, which is a must in a world so rapidly and dramatically changing.

Thursday, October 27, 2016

MaxPat: The Maximum Pattern Creation Principle


I will argue here that, in natural environments (I’ll explain what this means below), intelligent agents will tend to change in ways that locally maximize the amount of pattern created.    I will refer to this putative principle as MaxPat.

The argument I present here is fairly careful, but still is far from a formal proof.  I think a formal proof could be constructed along the lines of this argument, but obviously it would acquire various conditions and caveats along the route to full formalization.

What I mean by “locally maximize” is, roughly: If an intelligent agent in a natural environment has multiple possible avenues it may take, on the whole it will tend to take the one that involves more pattern creation (where “degree of patternment” is measured relative to its own memory’s notion of simplicity, a measure that is argued to be correlated with the measurement of simplicity that is implicit in the natural environment).

This is intended to have roughly the same conceptual form as the Maximum Entropy Production Principle (MEPP), and there may in fact be some technical relationship between the two principles as well.   I will indicate below that maximizing pattern creation also involves maximizing entropy in a certain sense, though this sense is complexly related to the sort of entropy involved in MEPP.

Basic Setting: Stable Systems and Natural Environments

The setting in which I will consider MaxPat is a universe that contains a large number of small “atomic” entities (atoms, particles,  whatever), which exist in space and time, and are able to be assembled (or to self-assemble) into larger entities.   Some of these larger entities are what I’ll call Stable Systems (or SS’s), i.e. they can persist over time.   A Stable System may be a certain pattern of organization of small entities, i.e. some or all of the specific small entities comprising it may change over time, and the Stable System may still be considered the same system.  (Note also that a SS as I conceive it here need not be permanent; stability is not an absolute concept...)

By a “natural environment” I mean one in which most Stable Systems are forming via heavily stochastic processes of evolution and self-organization, rather than e.g. by highly concerted processes of planning and engineering.  

In a natural environment, systems will tend to build up incrementally.   Small SS’s will build up from atomic entities.   Then larger SS’s will build up from small SS’s and atomic entities, etc.    Due to the stochastic nature of SS formation, all else equal, smaller combinations will be more likely to get formed than bigger ones.  On the other hand, if a bigger SS does get formed eventually, if it happens to be highly stable it may still stay around a while.

To put it a little more explicitly: The odds of an SS surviving in a messy stochastic world are going to depend on various factors, including its robustness and its odds of getting formed.   If formation is largely stochastic and evolutionary there will be a bias toward: smaller SS’s, and SS’s that can be built up hierarchically via combination of previous ones…  Thus there will be a bias toward survival of SS’s that can be merged with others into larger SS’s….   If a merger of S1 and S2 generally leads to S3 so that the imprint of S1 and S2 can still be seen in the observations produced by S3 (a kind of syntax-semantics continuity) then we have a set of observations with hierarchical patterns in it…

Intelligent Agents Observing Natural Environments

Now, consider the position of an intelligent agent in a natural environment, collecting observations, and making hypotheses about what future observations it might collect.

Suppose the agent has two hypotheses about what kind of SS might have generated the observations it has made so far: a big SS of type X, or a small SS of type Y.   All else equal, it should prefer the hypothesis Y, because (according to the ideas outlined above) small SS’s are more likely to form in its (assumed natural) environment.   That is, in Bayesian terms, the prior probability of small SS’s should be considered greater.

Suppose the agent has memory capacity that is quite limited compared to the number of observations it has to process.  Then the SS’s it observes and conjectures have to be saved in its memory, but some of them will need to be forgotten as time passes; and compressing the SS’s it does remember will be important for it to make the most of its limited memory capacity.   Roughly speaking the agent will do better to adopt a memory code in which the SS’s that occur more often, and have a higher probability of being relevant to the future, get a shorter code.
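
To make the "shorter code for more frequent SS's" idea concrete, here is a small sketch using Shannon code lengths, i.e. roughly -log2 of the observed frequency, rounded up. The function name and the toy SS labels are illustrative assumptions, not part of any actual agent design.

```python
import math
from collections import Counter

def code_lengths(observed_ss):
    """Assign each SS label a code length of ceil(-log2(relative frequency)) bits."""
    counts = Counter(observed_ss)
    total = sum(counts.values())
    return {ss: math.ceil(-math.log2(n / total)) for ss, n in counts.items()}

# Toy memory: one very common SS, one moderately common, two rare ones.
memory = ["pebble"] * 8 + ["puddle"] * 4 + ["crystal"] * 2 + ["vortex"] * 2
lengths = code_lengths(memory)
```

Here the SS seen half the time gets a 1-bit code, while the two rare ones get 3 bits each, so an agent storing many observations spends most of its memory on short codes.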

So, concision in the agent’s internal “computational model” should end up corresponding roughly to concision in the natural environment’s “computational model.”

The agent should then estimate that the most likely future observation-sets will be those that are most probable given the system’s remembered observational data, conditioned on the understanding that those generated by smaller SS’s will be more likely.  

To put it more precisely and more speculatively: I conjecture that, if one formalizes all this and does the math a bit, it will turn out that: The most probable observation-sets O will be the ones minimizing some weighted combination of

  • Kullback-Leibler distance between: A) the distribution over entity-combinations on various scales that O demonstrates, and B) the distribution over entity combinations on various scales that’s implicit in the agent’s remembered observational data
  •  The total size of the estimated-likely set of SS generators for O


As KL distance is relative entropy, this is basically a “statistical entropy/information based on observations” term, and then an “algorithmic information” type term reflecting a prior assumption that more simply generated things are more likely.
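
Written out as a formula, one possible reading of the conjecture is the following. All of the notation is my own assumption rather than anything worked out formally: P_O is the distribution over entity-combinations that O demonstrates, P_M is the one implicit in the agent's remembered observations, G(O) is the estimated-likely set of SS generators for O, |g| is the size of a generator, and alpha and beta are nonnegative weights. The direction of the KL term, and reading "total size" as a sum of generator sizes, are also guesses.

```latex
O^{*} \;=\; \arg\min_{O} \Big[\, \alpha \, D_{\mathrm{KL}}\big( P_{O} \,\|\, P_{M} \big) \;+\; \beta \sum_{g \in G(O)} |g| \,\Big],
\qquad
D_{\mathrm{KL}}\big( P_{O} \,\|\, P_{M} \big) \;=\; \sum_{x} P_{O}(x) \, \log \frac{P_{O}(x)}{P_{M}(x)}
```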

Now, what does this mean in terms of “pattern theory”?  -- in which a pattern in X is a function that is simpler than X but (at least approximately) produces X?   If one holds the degree of approximation equal, then the simpler the function is, the more “intense” it is said to be as a pattern.
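
As a rough illustration of this notion of pattern, one can let a general-purpose compressor stand in for "a simpler function that produces X": the compressed bytes, fed to the fixed decompressor, reproduce X exactly. The sketch below assumes intensity can be proxied by the fraction of length saved, which is my simplification and not a formal definition; it also ignores the size of the decompressor itself.

```python
import zlib

def compression_intensity(data: bytes) -> float:
    """Fraction of length saved by zlib; 0.0 when compression does not help."""
    if not data:
        return 0.0
    compressed = zlib.compress(data, 9)
    return max(0.0, 1.0 - len(compressed) / len(data))

repetitive = b"ab" * 500          # highly regular: a short description exists
no_repeats = bytes(range(256))    # every byte value once: nothing for zlib to exploit

regular_score = compression_intensity(repetitive)
irregular_score = compression_intensity(no_repeats)
```

The regular string scores close to 1, and the string with no repeated structure scores 0, since its "description" is no shorter than the data.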

In the present case, the most probable observation-sets will be ones that are the most intense patterns relative to the background knowledge of the agent’s memory.  They will be the ones that are most concise to express in terms of the agent’s memory, since the agent is expressing smaller SS generators more concisely in its memory, overall.

Intelligent Agents Acting In Natural Environments

Now let us introduce the agent’s actions into the picture. 

If an agent, in interaction with a natural environment, has multiple possible avenues of action, then ones involving setting up smaller SS’s will on the whole be more appealing to the agent than ones involving setting up larger SS’s.

Why?  Because they will involve less effort -- and we can assume the system has limited energetic resources and hence wants to conserve effort. 

Therefore, since smaller SS’s are also the ones the agent’s memory encodes most concisely (and so count as the most intense patterns relative to that memory), the agent’s activity will be more likely to result in scenarios with more patterns than in ones with fewer patterns.   That is -- the agent’s actions will, roughly speaking, tend to lead to maximal pattern generation -- conditioned on the constraints of moving in the direction of the agent’s goals according to the agent’s “judgment.”

MaxPat

So, what we have concluded is that: Given the various avenues open to it at a certain point in time, an intelligent agent in a natural environment will tend to choose actions that locally maximize the amount of pattern it understands itself to create (i.e., that maximize the amount of pattern created, where “pattern intensity” is measured relative to the system’s remembered observations, and its knowledge of various SS’s in the world with various levels of complexity.)    

This is what I call the Maximum Pattern Creation Principle – MaxPat.

If the agent has enough observations in its memory, and has a good enough understanding of which SS’s are small and which are not in the world, then measuring pattern intensity relative to the agent will be basically the same as measuring pattern intensity relative to the world.  So a corollary is that: A sufficiently knowledgeable agent in a natural environment, will tend to choose actions that lead to locally maximum pattern creation, where pattern intensity is measured relative to the environment itself.


There is nothing tremendously philosophically surprising here; however, I find it useful to spell these conceptually plain things out in detail sometimes, so I can more cleanly use them as ingredients in other ideas.    And of course, going from something that is conceptually plain to a real rigorous proof can still be a huge amount of work; this is a task I have not undertaken here.

Saturday, September 17, 2016

Musing about inference control, biography and episodic memory

This is just some notes-to-myself type rambling about declarative and episodic memory and reasoning ... stuff I'm thinking through in the back of my mind related to some ongoing OpenCog detailed-design work....

It occurred to me last week that inference control (the control of which logical inference steps to take, in trying to resolve some question using reasoning based on knowledge) has a lot in common with the decisions a person makes about how to live their life --- both over the course of their whole lifetime, and in various specific contexts.  

Further, I think this analogy may be important in terms of guiding the interaction between semantic (declarative) and episodic memory.   That is -- I think that, in many cases, real-life or imagined episodes stored in episodic memory may serve as high-level structural templates for inference control...

At a very crude level, the analogy I see is: both an inference process aimed at resolving a question, and a series of life-decisions aimed at navigating a certain everyday situation, are concerned with achieving certain goals in a limited time-frame, using limited resources, and via a series of choices, where each choice made guides the set of choices that are next available.

At a more precise level, what I also see is that: both in inference control and in real-life everyday episodic human decision-making, the "fitness landscape" is such that it is a reasonably useful strategy to iteratively focus attention on local regions of decision-space, each of which can

  • be explored with reasonable accuracy in, say, 1-3 orders of magnitude less time than is available for the overall process
  • be explored more thoroughly, yielding generally better results, in the same order of magnitude of time as is available for the overall process

So, in the inference context, one can break one's inference goal into subgoals in multiple ways, where crudely exploring each way of breaking the goal into subgoals may take 1/10 or 1/500 the amount of time one has available for the inference.    Thoroughly exploring any one way of breaking the goal into subgoals may then take longer -- indeed it may take as much time as one has.

In the episodic context, for instance, if one is unsure what career to choose or who to marry, one can crudely explore a certain job or a certain potential mate in 1/5 or 1/500 the total amount of time one has to choose a career or mate.  On the other hand, thoroughly exploring and optimizing the possibilities offered by a given job or a given mate, if the choice is not a terrible one, may take as much time as one has.

So in both cases one is carrying out a sequence of actions over time, in a context where the available actions depend on the actions one has taken previously -- and in both cases one heuristically has time to crudely explore maybe dozens, hundreds or thousands of local regions of action-space, before one's time runs out ... but one has time to thoroughly explore only a small handful of local regions of action-space, before one's time runs out...

In this sort of context, it seems, a reasonable approach will often be to:

  • Start out by trying a bunch of different stuff, gaining information about "where in the space of possibilities a good answer may lie"
  • Then, when one's time starts to run out, one should pick some of the best options one has found (maybe just one of them) and explore it more deeply.

In part this is just straightforward "exploration versus exploitation" stuff.
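To make the two-step recipe above concrete, here is a minimal Python sketch of an "explore first, then commit" strategy under a fixed time budget.  It is only a toy: the function name, the parameters, and the idea that a crude look at an option yields its true value plus Gaussian noise are all my assumptions for illustration, not a claim about how inference control or life decisions actually work.

```python
import random

def explore_then_commit(true_values, total_time, probe_time, noise, seed=0):
    """Toy explore-then-commit strategy (all names here are hypothetical).

    true_values: dict mapping option name -> payoff per unit of time.
    total_time:  the whole time budget available.
    probe_time:  time spent on one crude look at one option.
    noise:       standard deviation of the error in a crude look.
    """
    if total_time < probe_time:
        raise ValueError("not enough time to explore even one option")
    rng = random.Random(seed)
    estimates = {}
    time_left = total_time
    # Phase 1: try a bunch of different stuff, cheaply and noisily.
    for name, value in true_values.items():
        if time_left < probe_time:
            break  # no time left for further crude exploration
        estimates[name] = value + rng.gauss(0, noise)
        time_left -= probe_time
    # Phase 2: pick the best-looking option and spend the rest of the time on it.
    choice = max(estimates, key=estimates.get)
    payoff = true_values[choice] * time_left
    return choice, time_left, payoff

# With noise-free probes the strategy finds the truly best option.
options = {"a": 1.0, "b": 3.0, "c": 2.0}
print(explore_then_commit(options, total_time=100, probe_time=5, noise=0))
# → ('b', 85, 255.0)
```

Raising `noise`, or raising `probe_time` relative to `total_time`, shows the tradeoff: cruder or costlier looks make the final commitment less reliable or leave less time to benefit from it.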

For instance, in the everyday life context: When young, one should try out many different jobs and date many different people, and try to understand what one can about the landscape.    But then once one gets middle-aged, the time has often come to pick a single mate or a single career area and focus on that.   The logic behind this is: In the years one has on Earth, one probably only has time to thoroughly explore and become great at a small number of careers, and to develop deep love relationships with (or build families with) a small number of partners.   However, some careers and some mates will suit one better than others.  So given the nature of the fitness landscape, the best strategy is to look around for a while and then, before one gets too close to the end, pick something "good enough" (something that seems about as good as one is likely to find) and go with it.   A review of this sort of process in the mate-selection context is given in this article.

In inference one has the same basic issue.   Suppose one wants to find an X such that X is a person, X lives in China, and X knows both p-adic analysis and particle physics.   But suppose one doesn't have much time to carry out the search.   One option is to look at online publications in those areas, and check out which papers have authors based in China.   Another option is to look at Chinese university websites and the listing of professors and their publications.   Obviously it's a bad idea to choose only one approach and pursue it solely, unless one is very, very short on time.  Instead it makes more sense to attempt each approach a bit and see which one is more promising.    This is just "exploration versus exploitation."

But the nature of inference, and the nature of life-decisions, is that one has a series of exploration-versus-exploitation type choices, where the choices that one is presented with depend on the choices one has made previously ... and where exploring each choice often takes an amount of time that is meaningful but not humongous relative to one's total available time.

The same sort of structure applies to social decision-making in contexts briefer and less consequential than choosing who to marry: for instance, figuring out how to entertain a specific person on a date, or figuring out how to get ahead in a specific company.  In each of these cases there is a limited amount of time, a series of sequential decisions to make, and a situation where one can explore a bunch of options roughly but very few options thoroughly.

An interesting question, then, is how much the analogy between inference-control decisions and everyday-life decisions helps a human-like mind in its thinking.   Are we actually helped by being able to consider our inferences as being like stories?  

A typical story has a beginning, middle and end -- where the beginning is about setting out context and making clear what possibilities exist, the middle is about exploring some particular possibility in depth (typically with great uncertainty about whether this possibility will yield a good result or not), and the end is about what happens from exploring this particular possibility (which ideally should have some surprising aspect, whether the exploration is successful or not).    A typical inference process will have the same sort of beginning, middle and end ... though it may be a bit more like a big fat epic Russian novel, with multiple possibilities, involving different characters, explored in depth during the middle section.

What does seem likely is that the brain re-uses the same mechanisms it uses for managing stories in episodic memory, for managing memories and patterns of inference control.    Evolutionarily, it is not clear to me whether sophisticated episodic memory came before sophisticated inference or not.   Perhaps the two co-evolved.

Using similar mechanisms for controlling inference and for guiding episodic memory and everyday-life decision-making should ease "cognitive synergy" between declarative reasoning and episodic recollection.   When declarative reasoning gets stuck, it can be helped out via recollection or simulation of episodes; and vice versa.

For instance, suppose a robot needs to figure out how to amuse a certain person.  Episodic memory might give it examples of things that have amused that person, or associated people, in prior situations.   Declarative reasoning might note that the person has a Pearl Jam T-shirt on, and might then tell the system to look for funny statements involving rock music.   The same goal can be explored both via episodic mind's-eye simulations, and via logical reasoning based on general background knowledge.   And the overlap can recurse.  For instance, if logic tells the system to look for funny statements involving rock music, episodic memory search might come up with specific past situations in which funny statements involving rock music have been uttered.   If episodic memory digs up certain people closely associated with the person in question, logic might come up with some clever conclusions regarding what would have amused these associates.   Etc.

This sort of cross-memory-mode helping-out would seem to be eased by using the same sort of representation for both processes.

This is interesting to me in terms of our next phase of OpenCog work, because we need to figure out both the best way to represent real-life episodes from a robot's everyday life in the system's memory, and the best way to represent abstractions from probabilistic-logic inference chains.  What this rambling, musing discussion makes clear is that we want to use essentially the same representation for both cases.   This way the commonality of the decision-processes involved in the two cases can be manifested in the software, and the ease of achieving episodic-declarative cognitive synergy will be maximized.




Morphics and Ethics

Reading the news about Duterte, the new Philippine leader, killing thousands of accused drug dealers and drug users without trial ... and noticing so many generally good-hearted Filipinos defend him on the grounds that he's "cleaning up the country" ... reminded me how far the human world is from understanding the weakness of simplistic utilitarian approaches to ethics ...

My own inclination, to be quite open about it, is toward a highly peace-biased conditional pacifism in the style professed by, for instance, Einstein and Bertrand Russell.....   I.e., I don't believe violence should be eschewed in every case, but I think it should be avoided except in extreme cases where -- after as much careful, compassionate reflection as the situation allows -- there really seems no plausible alternative but to do some violence to avoid even worse violence...

What I'll do in this post is connect these ethical issues with some more metaphysical and complex-systems-dynamical points....   I will lay out what I see as a  fairly conceptually obvious connection between the notion of morphic resonance aka “tendency to take habits”, and the reasons why a naïve utilitarian approach to ethics could be expected to generally fail in reality, whereas conditional pacifism could be expected to do better...

Morphic Systems

I'll start in a fairly airy abstract realm, and then eventually get back to the practicalities of pacifism....   

I'll start by formulating the notion of “morphic dynamics” in a highly general way.   My notion of morphic dynamics is inspired loosely by Rupert Sheldrake's thinking on morphic resonance, but is not quite the same as his idea....  (Rupert is a great guy and a deep, honest, adventurous thinker.  We have discussed these ideas a few times, and we don't exactly disagree profoundly on anything; we just have different intellectual styles and orientations.)

I suggest that: A system may be said to be “morphic” if its dynamics manifest the “tendency to take habits”  (the latter phrase being drawn from the philosophy of Charles Peirce) – i.e. if it’s the case that, within subsystems of the system S, the odds of the future resembling the past are surprisingly high.   

What does “surprisingly” mean?   That gets subtle, but one way to formulate it is: “surprisingly often” means significantly more often than in a random possible world meeting the specified conditions.

Suppose there are 10 different subsystems of the system S, in each of which one has observed pattern P 5 times during the last hour.    Then, across these subsystems, what will be the distribution of the number of occurrences of P during the next hour?   

In general, one would expect the mean of this distribution to be 5.  But in a morphic system S, the distribution will be much narrower (i.e. its variance will be much smaller) than one would find by looking at random systems, because there would be a surprising tendency for the pattern distribution in the future of a subsystem to resemble the pattern distribution in its past.
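The comparison between a morphic system and a "random" one can be illustrated with a small simulation.  The following Python sketch is a toy model of my own devising, not a definition: I assume that in the random world pattern P occurs independently in each of 60 minutes with probability 5/60, and that in the morphic world each subsystem simply repeats its past count with some probability `habit_strength` (a hypothetical parameter).  I also use many more than 10 subsystems so that the estimated variances are stable.

```python
import random
import statistics

def next_hour_counts(past_count, n_subsystems, habit_strength, seed=0):
    """Simulate next-hour counts of pattern P across subsystems (toy model)."""
    rng = random.Random(seed)
    minutes = 60
    p = past_count / minutes  # keeps the expected count equal to the past count
    counts = []
    for _ in range(n_subsystems):
        if rng.random() < habit_strength:
            # "Habit" case: the subsystem's future copies its past.
            counts.append(past_count)
        else:
            # "Random world" case: each minute is an independent coin flip.
            counts.append(sum(rng.random() < p for _ in range(minutes)))
    return counts

baseline = next_hour_counts(5, 10_000, habit_strength=0.0)
morphic = next_hour_counts(5, 10_000, habit_strength=0.7)

# Both worlds have a mean near 5; the morphic one has much smaller variance.
print(statistics.mean(baseline), statistics.pvariance(baseline))
print(statistics.mean(morphic), statistics.pvariance(morphic))
```

In this toy model the variance shrinks roughly in proportion to one minus `habit_strength`, while the mean stays put, which is the "same mean, narrower distribution" signature described above.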

Smolin’s “precedence principle” suggests that the physical universe is morphic in a similar sense (though he uses a quite different language), and derives aspects of quantum mechanics therefrom.   Sheldrake’s morphic resonance theory suggests the biological, psychological, physical and metaphysical universe is morphic in a similar sense, and seeks to explain a variety of phenomena such as psi, epigenesis and the origin of life in these terms.   


Regardless of whether the universe is foundationally morphic, though, it may still be the case that particular systems like, say, human minds or human societies are morphic.

One way that society would get to be morphic, apart from any general principle of morphic resonance, would simply be via the tendency of people to jump to conclusions emotionally (even before the evidence merits it probabilistically), and the tendency of people to copy each other.   Both of these tendencies are, of course, very real and well documented.

If human societies are morphic then this has significant ethical implications.  It affects the logic of voting – in a morphic society, on the whole, whether one person votes today has a surprising impact on the number of people who vote tomorrow.   As another example, it also has implications regarding the argument for pacifism.

Pacifism and Morphic Dynamics

My father’s parents were Quakers and devout pacifists.   I grew up with a highly pacifist orientation as well, to the point where up till a certain point in high school, I tended to let other kids beat me up, simply because I felt it would be wrong to hit them back.   I hated being beaten up, but I also had no desire to hurt the other kids, and felt hurting them would still be wrong, even though they were hurting me.   At a certain point I was just getting beaten up too often, though, because certain bad-hearted kids had decided it was really fun to beat the crap out of the local pacifist every day after school.   I started fighting back, which predictably decreased the incidence of attacks on me.   I wasn’t quite convinced this was philosophically correct, but in practice life pushed me to adopt what philosophers would call a conditional pacifism, with something of a utilitarian flavor.

Einstein and Bertrand Russell were also conditional pacifists – they were strongly biased against violence, yet both advocated fighting back against Hitler, feeling that in this case taking a pacifist hard line would cause much more harm than good.   In general, if one is conditionally pacifist in their style, one believes that violence should generally be avoided, but that in some extreme cases it may be the most ethical course to take.

On the other hand, as I write these words, the new leader of the Philippines, Duterte, is making headlines for large-scale extrajudicial killing of suspected drug dealers or drug users.   It seems clear that some false positives are occurring in this process – i.e. some folks who are being killed were not actually drug dealers or drug users.   But a utilitarian argument could be made that this is a justifiable cost, if it results in a massive decrease in drug use across the nation, because the drug epidemic is killing so many people.

Clearly the conditional pacifism of Einstein and Russell was not the kind of simplistic utilitarianism that would justify Duterte's recent actions.  Yet, from childhood on into adulthood, I have often been vexed by the question of what the conceptual foundation of their more sophisticated style of conditional pacifism actually is.

Even fairly extreme pacifists such as Gandhi have acknowledged the inevitably conditional nature of real-world pacifism.  Gandhi noted that living in the world almost inevitably involves doing harm to some other beings; but that if one acts with awareness and compassion toward all living beings, one will be able to minimize the amount of harm one causes.   This may not have been the emotional orientation of Einstein and Russell, but it would have been quite possible to fight Hitler’s army while feeling compassion for Hitler himself and his soldiers as well.    This is similar to how one may feel compassion for a rabid dog, yet still shoot it to prevent it from spreading rabies and thus causing even greater harm (and in that case, to end its own suffering as well).

But conditional pacifism does not equate to naive utilitarianism in which one kills or harms whenever a simplistic calculation suggests one may save 2 lives by killing one guy.

One argument for having a strong bias toward peace – in the style of Einstein and Russell -- is that, to put it simply, violence tends to breed violence – and sometimes in non-obvious ways.

The “violence breeds violence” argument against Duterte-style utilitarian murder holds that solving problems with violence tends to lead to further problems indirectly, down the line.   So that, for instance: Even if it’s true that lives are saved via the extrajudicial killing intimidating people into not selling or using drugs, achieving this goal via this means creates subtler problems.  It creates people who hate the government due to its murder of their innocent friends and family members.  It scars the minds of the killers themselves in various ways, with various (mostly bad) consequences.   And it also leaves people with the same psychological and social problems that pushed them to use drugs in the first place – which may then find an outlet via other means … suicide, emotional abuse of friends or family members, etc.

The particulars via which “violence breeds violence” may be quite complicated.  But the point I want to make here is that the existence of SOME such particulars would follow naturally from the assertion that society is morphic … whether due to manifesting a broader cosmic principle of morphic resonance, or “just” due to that being part of the nature of its self-organizing dynamic.

So, for instance: If one holds that life is generally valuable, then the hypothesis of a morphic society would lead one to the conclusion that one should not generally kill N people to save N+1 people, because doing violence often has indirect consequences that are bad (for life-forms associated with the violence in various ways).   In some cases one should kill N people to save N+K people for various K; but the more morphic society is, on the whole, the larger K would be.
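The "N versus N+K" point can be put as a bit of toy arithmetic.  In the Python sketch below I assume, purely for illustration, that each act of killing causes some average number of further deaths down the line (`indirect_harm_per_death`, a hypothetical stand-in for how morphic the society is), and then ask how large K must be before killing N to save N+K comes out ahead.  Real indirect consequences are of course nothing like this tidy.

```python
import math

def min_margin_to_justify(n_killed, indirect_harm_per_death):
    """Smallest whole K such that killing N to save N + K is a net gain.

    Toy model (an assumption, not a real ethical calculus):
        net lives = (N + K) saved - N killed - indirect_harm_per_death * N
    so the act is a net gain only when K > indirect_harm_per_death * N.
    """
    if n_killed < 0 or indirect_harm_per_death < 0:
        raise ValueError("inputs must be non-negative")
    return math.floor(indirect_harm_per_death * n_killed) + 1

print(min_margin_to_justify(10, 0))    # → 1   (naive utilitarian: N+1 suffices)
print(min_margin_to_justify(10, 0.5))  # → 6
print(min_margin_to_justify(10, 2))    # → 21  (more morphic: much larger K)
```

With no indirect harm the function recovers the naive "kill N to save N+1" rule; as the assumed indirect harm grows, so does the required K, which is the qualitative claim made above.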

In terms of the practicalities of ethics and pacifism, I have certainly broken no new ground here.   In terms of conventional philosophical categories, I suppose my point regards the potential derivation of certain ethical stances (e.g. variants of conditional pacifism) from either a) metaphysics or b) empirical facts regarding the nature of complex social systems.
