Monday, July 22, 2019

Toward an Abstract Energetics of Computational Processes (in Brains, Minds, Physics and Beyond)


Given the successes of energy-based formalisms in physics, it is natural to want to extend them into other domains like computation and cognition.

In this vein: My aim here is to sketch what I think is a workable approach to an energetics of computational processes (construed very broadly).  

By this I mean: I will explain how one can articulate highly general principles of the dynamics of computational processes, that take a similar form to physics principles such as the stationary action principle (which often takes the form of "least action") and the Second Law of Thermodynamics (the principle of entropy non-decrease).

Why am I interested in this topic?   Two related reasons, actually.

First, I would like to create a "General Theory of General Intelligence" -- or to be more precise, a general theory of what kinds of systems can display what levels of general intelligence in what environments, given realistically limited (space, time and energy) resources.   Marcus Hutter's Universal AI theory is great, but it doesn't say much about general intelligence under realistic resource assumptions; most of its power is limited to the case of AI systems with unrealistically massive processing power.   I have published some ideas on this before -- e.g. formalizing Cognitive Synergy in terms of category theory, and articulating the Embodied Communication Prior relative to which human-like agents attempt to be intelligent -- but nothing remotely near fully satisfying.  So I'm searching for new directions.

Second, I would like to come up with a real scientific theory of psi phenomena.  I am inclined toward what I call "euryphysical" theories -- i.e. theories that involve embedding our 4D spacetime continuum in a larger space (which could be a higher-dimensional space or could be a non-dimensional topological space of some sort).   However, this raises the question of what this larger space is like -- what rules govern "dynamics" in this space?   In my paper on Euryphysics, I give some rough ideas in this direction, but again nothing fully satisfying.

It would be nice if mind dynamics -- both in a traditional AI setting and in a more out-there euryphysical setting -- could be modeled on dynamical theories in physics, which are based on ideas like stationary action.   After all, if as Peirce said "matter is just mind hide-bound with habit" then perhaps the laws of matter are in some way simplifications or specializations of the laws of mind -- and  maybe there are laws of mind with roughly analogous form to some of the current laws of physics.

A Few Comments on Friston's Free Energy Ideas

Friston's "free energy principle" represents one well-known effort in the direction of modeling cognition using physics-ish principles.  It seems to me that Friston's ideas have some fundamental shortcomings -- but reviewing these shortcomings has some value for understanding how to take a more workable approach.

I should clarify that my own thinking described in this blog post was not inspired by Friston's thinking to any degree, but rather by long-ago reading in the systems-theory literature -- i.e. reading stuff like Ilya Prigogine's Order Out of Chaos, Erich Jantsch's The Self-Organizing Universe and Hermann Haken's Synergetics.    These authors represented a tradition within the complex-systems research community of using far-from-equilibrium thermodynamics as a guide for thinking about life, the universe and everything.

Friston's "free energy principle" seems to have a somewhat similar conceptual orientation, but confusingly to me, doesn't seem to incorporate the lessons of far-from-equilibrium thermodynamics that thoroughly, being based more on equilibrium-thermodynamics-ish ideas.  

I haven't read everything Friston has written, but I have skimmed various papers of his over the years, and recently looked at the much-discussed paper The Markov blankets of life: autonomy, active inference and the free energy principle.

My general confusion about Friston's ideas is largely the same as that expressed by the authors of blog posts such as

As the latter post notes, regarding perception, Friston basically posits that neural and cognitive systems are engaged in trying to model the world they live in, and do so by looking for models with maximum probability conditioned on the data they've observed.   This is a useful but not adventurous perspective, and one can formulate it in terms of trying to find models with minimum KL-divergence to reality -- which is one among many ways to describe Bayesian inference, and which can be mathematically viewed as attempting to minimize a certain "free energy" function.

Friston then attempts to extend this principle to action via a notion of "active inference", and this is where things get dodgier.   As the above-linked "Markov Blankets" paper puts it,

"Active inference is a cornerstone of the free energy principle. This principle states that for organisms to maintain their integrity they must minimize variational free energy.  Variational free energy bounds surprise because the former can be shown to be either greater than or equal to the latter. It follows that any organism that minimizes free energy thereby reduces surprise—which is the same as saying that such an organism maximizes evidence for its own model, i.e. its own existence

...

This interpretation means that changing internal states is equivalent to inferring the most probable, hidden causes of sensory signals in terms of expectations about states of the environment

...

[A] biological system must possess a generative model with temporal depth, which, in turn, implies that it can sample among different options and select the option that has the greatest (expected) evidence or least (expected) free energy. The options sampled from are intuitively probabilistic and future oriented. Hence, living systems are able to ‘free’ themselves from their proximal conditions by making inferences about probabilistic future states and acting so as to minimize the expected surprise (i.e. uncertainty) associated with those possible future states. This capacity connects biological qua homeostatic systems with autonomy, as the latter denotes an organism’s capacity to regulate its internal milieu in the face of an ever-changing environment. This means that if a system is autonomous it must also be adaptive, where adaptivity refers to an ability to operate differentially in certain circumstances.

...

The key difference between mere and adaptive active inference rests upon selecting among different actions based upon deep (temporal) generative models that minimize the free energy expected under different courses of action.

This suggests that living systems can transcend their immediate present state and work towards occupying states with a free energy minimum."

If you are a math/physics oriented person and find the above quotes frustratingly vague, unfortunately you will find that the rest of the paper is equally vague on the confusing points, as are Friston's other papers.

What it sounds like to me (doing some "active inference" myself to try to understand what the paper is trying to say) is that active inference is being portrayed as a process by which cognitive systems take actions aimed at putting themselves in situations that will be minimally surprising, i.e. in which they will have the most accurate models of reality.    If taken literally this cannot be true, as it would predict that intelligent systems systematically seek simpler situations they can model better -- which is obviously not a full description of human motivation, for instance.   We do have a motivation to put ourselves in comprehensible, accurately model-able situations -- but we also have other motivations, such as the desire to perceive novelty and to challenge ourselves, which sometimes contradict our will to have a comprehensible environment.

The main thing that jumps out at me when reading what Friston and colleagues write about active inference is that it's too much about states and not enough about paths.   To model far-from-equilibrium thermodynamics using energy-based formalisms, one needs to think about paths and path entropies and such, not just about things like "work[ing] towards occupying states with a free energy minimum."    Instead of interpreting ideas like "selecting among different actions based upon deep (temporal) generative models that minimize the free energy expected under different courses of action" in terms of states with minimum free energy, one needs to be thinking about action selection in terms of stationarity of action functionals evaluated along multiple paths.
 
Energetics for Far-From-Equilibrium Thermodynamics

It seems clear that equilibrium thermodynamics isn’t really what we want to use as a guide for cognitive information processing.  Fortunately, the recent thermodynamics literature contains some quite interesting results regarding path entropy in far-from-equilibrium thermodynamics.


David Rogers and Susan Rempe, in "A First and Second Law for Nonequilibrium Thermodynamics: Maximum Entropy Derivation of the Fluctuation-Dissipation Theorem and Entropy Production Functionals", describe explicitly the far-from-equilibrium "path free energy", but only for the case of processes with short memory, i.e. where the state at time i+1 depends on the state at time i but not on earlier states (which is often fine but not totally general).

The following table from Rogers and Rempe summarizes some key points concisely.

Conceptually, the key point is that we need to think not about the entropy of a state, but about the "caliber" of a path -- a normalization of the number of ways that path can be realized.   This then leads to the notion of the free energy of a certain path.    

It follows from this body of work that ideas like "free energy minimization" need to be re-thought dynamically rather than statically.   One needs to think about systems as following paths with differential probability based on the corresponding path free energies.    This is in line with the "Maximum Caliber principle"  which is a generalization of the Maximum Entropy principle to dynamical systems (both first proposed in clear form by E.T. Jaynes, though Maximum Entropy has been more widely developed than Maximum Caliber so far).
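
To give the flavor of the Maximum Caliber idea in symbols (this is my own condensed rendering of the generic Max Cal setup, not a formula taken from the Rogers and Rempe paper): one maximizes path entropy over a distribution p_Γ on paths Γ, subject to normalization and to constraints on the expected values of path observables A_k,

\[
\max_{p}\; -\sum_{\Gamma} p_\Gamma \ln p_\Gamma
\quad \text{s.t.} \quad
\sum_{\Gamma} p_\Gamma = 1, \qquad
\sum_{\Gamma} p_\Gamma A_k(\Gamma) = \langle A_k \rangle ,
\]

which yields the exponential-family path distribution

\[
p_\Gamma = \frac{1}{Z(\lambda)} \exp\Big( \sum_k \lambda_k A_k(\Gamma) \Big),
\qquad
Z(\lambda) = \sum_{\Gamma} \exp\Big( \sum_k \lambda_k A_k(\Gamma) \Big),
\]

with the multipliers λ_k fixed by the constraints; a "path free energy" then enters via -ln Z(λ), playing the role that the ordinary free energy plays in the equilibrium theory.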

Extending these notions further, Diego Gonzalez outlines a Hamiltonian formalism that is equivalent to path entropy maximization, building on math from his earlier paper  Inference of trajectories over a time-dependent phase space distribution.

Action Selection and Active Inference

Harking back to Friston for a moment, it follows that the dynamics of an intelligent system should be viewed not as an attempt by the system to find a state with minimum free energy or surprisingness etc., but rather as a process of the system evolving dynamically along paths chosen probabilistically to have stationary path free energy.

But  of course, this would be just as true for an unintelligent system as for an intelligent system -- it's not a principle of intelligence but just a restatement of how physics works (in far from equilibrium cases; in equilibrium cases one can collapse paths to states).   

If we want to say something unique about intelligent systems in this context, we can look at the goals that an intelligent system is trying to achieve.   We may say that, along each potential path of the system's evolution, its various goals will be achieved to a certain degree.   The system can then be viewed as having a certain utility distribution across paths -- some paths are more desirable to it than others.   A guiding principle of action selection would then be: take an action A so that, conditioned on action A, the predicted probability distribution across paths is as close as possible to the distribution implied by the system's goals.

This principle of action selection can be formalized as KL-divergence minimization if one wishes, and in that sense it can be formulated as a "free energy minimization" principle.   But it's a "free energy" defined across ensembles of paths, not across states.
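
As a toy illustration of this path-based action-selection principle, here is a minimal sketch in Python (entirely my own invention for illustration -- the action names, the coarse-graining of futures into four "paths", and the numbers are made up, not drawn from any of the papers discussed here).  Each candidate action induces a predicted distribution over paths; the goals define a desired distribution over the same paths; the chosen action minimizes the KL-divergence between the two:

import numpy as np

def kl(p, q, eps=1e-12):
    # KL divergence D(p || q) between two discrete distributions
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

# each candidate action induces a predicted distribution over four coarse-grained future paths
predicted = {
    "explore": [0.10, 0.40, 0.40, 0.10],
    "exploit": [0.70, 0.20, 0.05, 0.05],
    "rest":    [0.25, 0.25, 0.25, 0.25],
}

# the system's goals define a desired distribution over the same paths
goal = [0.30, 0.40, 0.20, 0.10]

# pick the action whose predicted path distribution best matches the goal distribution
scores = {a: kl(goal, p) for a, p in predicted.items()}
print(min(scores, key=scores.get), scores)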

As a side note, it's important to understand that the desirability of a path to an intelligent system need not be expressible as the expected future utility at all moments of time along that path.   The desirability of a path may be some more holistic function of everything that happens along that path.    Considering only expected utility as a form of goal leads to various pathologies related to wireheading, as I argued in a long-ago blog post on ultimate orgasms and such.


Algorithmic Thermodynamics

Now let's dig a little deeper.   Can we apply these same ideas beyond the realm of physics, to more general types of processes that change over time?

I am inspired by a general Whiteheadian notion of processes as fundamental things.   However, to keep things concrete, for now I'm going to provisionally assume that the "processes" involved can be formulated as computer programs, in some standard Turing-equivalent framework, or maybe a quantum-computing framework.   I think the same ideas actually apply more broadly, but -- one step at a time...

Let us start with Kohtaro Tadaki's truly beautiful, simple, elegant paper titled A statistical mechanical interpretation of algorithmic information theory.

Section 6 of Tadaki outlines a majorly aesthetic, obvious-in-hindsight parallel between algorithmic information theory and equilibrium thermodynamics.   There is seen to be a natural mapping between temperature in thermodynamics and compression ratio in algorithmic information theory.   A natural notion of "algorithmic free energy"  is formulated, as a sort of weighted program-length over all possible computer programs (where the weights depend on the temperature).
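
If I am recalling Tadaki's formalism correctly, the flavor of the construction is roughly as follows (a compressed paraphrase, so take the details with a grain of salt): for an optimal prefix-free universal computer U, one defines a partition-function analogue by summing over all halting programs p,

\[
Z(T) = \sum_{p \in \mathrm{dom}(U)} 2^{-|p|/T}, \qquad 0 < T \le 1 ,
\]

where the "temperature" T turns out to play the role of compression ratio; free energy, energy and entropy analogues then follow from the usual thermodynamic manipulations of log Z (e.g. a free-energy analogue of the form F(T) = -T log_2 Z(T)), and at T = 1 the partition function reduces to Chaitin's halting probability Ω.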

The following table (drawn from Tadaki's presentation here) summarizes the key mappings in Tadaki's theory.

To ground the mappings he outlines, Tadaki gives a simple statistical mechanical interpretation to algorithmic information theory.   He models an optimal computer as decoding equipment at the receiving end of a noiseless binary communication channel.   In this context, he regards programs for this computer as codewords (finite binary strings) and regards computation results (also finite binary strings) as decoded "symbols."    For simplicity he assumes that the infinite binary string sent through the channel -- constituting a series of codewords in a prefix-free code -- is generated by infinitely repeated tosses of a fair coin.   Based on this simple reductive model, Tadaki formulates computation-theoretic analogues to core constructs of traditional equilibrium thermodynamics.

Now let's start putting some pieces together.

Perhaps the most useful observation I will make in this blog post is:   It seems one could port the path-entropy-based treatment of far-from-equilibrium thermodynamics (as seen in the papers I've linked above) to Tadaki's algorithmic-information context, by looking at sources emitting bits that are not independent of each other but rather have some probabilistic dependencies.

By doing so, one would obtain an “algorithmic energy” function that measures the energy of an algorithmic process over a period of time -- without assuming that it’s a memoryless process like Tadaki does in his paper.

To get this to work, so far as I can tell without doing all the math (which I don't have time for at the moment, alas), one needs to assume that the knowledge one has of the dependencies among the bits produced by the process is given in the form of expectations -- e.g. that we know the average value of f_k(x_{i+1}, x_i) for various observables f_k.   Plus one needs to make some other slightly funny assumptions that are probably replaceable (the paper assumes "the number of possible transitions does not depend on the starting point" -- but I wonder if this could be replaced by some assumption about causality...)
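
Here is a minimal numerical sketch of the sort of thing I have in mind -- entirely my own toy construction with an invented path observable, not a worked-out version of the Rogers/Rempe or Tadaki math.  All binary strings of a fixed length are treated as candidate "paths"; a constraint is imposed on the expected value of an observable of consecutive bits (here, the number of bit flips); and the maximum-entropy (maximum-caliber) path distribution, which is exponential in that observable, is solved for numerically:

import itertools
import numpy as np
from scipy.optimize import brentq

L = 8                                   # path length in bits
paths = list(itertools.product([0, 1], repeat=L))

def flips(path):
    # observable on consecutive bits: number of 0->1 or 1->0 transitions
    return sum(abs(path[i + 1] - path[i]) for i in range(len(path) - 1))

A = np.array([flips(p) for p in paths], dtype=float)
target = 2.0                            # constraint: expected number of flips per path

def expected_A(lam):
    w = np.exp(lam * A)
    return float((w * A).sum() / w.sum())

# solve for the Lagrange multiplier that satisfies the constraint
lam = brentq(lambda l: expected_A(l) - target, -20.0, 20.0)

# maximum-caliber path distribution: p(path) proportional to exp(lam * A(path))
probs = np.exp(lam * A)
probs /= probs.sum()

path_entropy = -float((probs * np.log(probs)).sum())
print("lambda =", lam, " <A> =", float((probs * A).sum()), " path entropy =", path_entropy)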

If I'm not mistaken, this should give us something like Friston’s free energy principle that actually works and has meaning….  I.e. we have a rigorous sense in which complex algorithmic systems are minimizing free energy.   The catch is that it’s an algorithmic path energy -- but hey...

More precisely, relative to an observer S who is observing a system S1 in a  certain way (by tabulating conditional probabilities of “how often some event of type A occurs at time T+s, given some event of type B occurred at time T”) … we may say the evolution of S1 in S’s perspective obeys an energy minimization principle, where energy is defined algorithmic-informationally (following my proposed, not-yet-fleshed-out non-equilibrium generalization of Tadaki’s approach)…

Into the Quantum Rabbit Hole...

Now that we've gone this far, we may as well plunge in a bit deeper, right?

Tadaki deals with classical computers but -- gesticulating only moderately wildly -- it seems one could generalize his approach to quantum computers without much trouble.

Then one is looking at series of qubits rather than bits, and instead of tabulating conditional probabilities one is tabulating amplitudes.  

The maximum entropy principle is replaced with the stationary quantropy principle, and one still has the situation that: relative to S, who is observing S1 using some standard linear quantum observables, S1 may be said to evolve according to a stationary-quantropy trajectory, where quantropy is here defined by generalizing the non-equilibrium generalization of Tadaki's algorithmic-informational entropy, replacing the real values with complex values.

So we may well get a kind of free-energy principle for quantum systems also.

If we want to model cognitive stuff using bits or qubits, then we have here a physics-ish theory of cognitive stuff….  Or at least a sketch of the start of one…

Out Toward the Eurycosm

One of the motivations for these investigations was some discussions on higher-dimensional and more broadly eurycosmic models of psi.  If there are non-physical dimensions that connect spatiotemporally distant entities, then what are the dynamical laws in these dimensions?   If we can model them as information dimensions, then maybe the dynamics should be modeled as I’m alluding here…

Physics dynamics should be recoverable as a special case of algorithmic-information dynamics where one adds special constraints.   I.e. the constraints posed by spatial structure and special relativity etc. should reflect themselves in the conditional probabilities observed between various classes of bits or qubits.

Then the linear symmetries of spacetime structure should mean that when you calculate maximum-path-algorithmic-information distributions relative to these physics constraints, you end up getting maximum-Shannon-path-entropy distributions -- because macrophysics results from computing with ensembles of randomly chosen computer programs (i.e. chosen subject to given constraints...).

Suppose we want to model a eurycosm that works according to a principle like Peirce's "tendency to take habits", aka Smolin's Precedence Principle, aka Sheldrake's morphic resonance.   Well then, one can assume that the probability distribution underlying the emanation of codewords in Tadaki's model obeys this sort of principle.   I.e., one can assume that the prior probability of a certain subsequence is higher if that subsequence, or another subsequence sharing some of the same patterns, has occurred earlier in the overall sequence.   Of course there are many ways to modify Tadaki's precise computational model, and many ways to formalize the notion that "subsequences with historically more frequent patterns should be more frequent going forward."    But conceptually this is quite straightforward.
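
As one toy concretization (again my own invented illustration, not a formalization of Peirce, Smolin or Sheldrake): replace Tadaki's fair-coin source with an emitter in which the probability of the next k-bit block is reinforced in proportion to how often that block has already appeared in the sequence so far, Polya-urn style:

import itertools
import random
from collections import Counter

def habit_biased_sequence(n_blocks, k=2, strength=2.0, seed=0):
    # emit n_blocks blocks of k bits; blocks that have occurred more often in the past
    # become more probable in the future (a crude "tendency to take habits")
    rng = random.Random(seed)
    blocks = [''.join(b) for b in itertools.product('01', repeat=k)]
    counts = Counter()
    out = []
    for _ in range(n_blocks):
        weights = [1.0 + strength * counts[b] for b in blocks]
        block = rng.choices(blocks, weights=weights, k=1)[0]
        counts[block] += 1
        out.append(block)
    return ''.join(out), counts

seq, counts = habit_biased_sequence(1000)
print(counts)   # block frequencies drift far from uniform as early "habits" entrench themselves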

One is forced however to answer the following question.   Suppose we assume that the probability of pattern P occurring in a subsequence beginning at time T is in some way proportional to the intensity with which P has occurred as a pattern in the subsequence prior to time T.   What language of processes are we using to formalize the patterns P?   If -- in line with the framework I articulate in The Hidden Pattern and elsewhere -- we formalize a pattern P in X as a process P that produces X and is simpler than X -- what is the language in which patterns are expressed?    What is the implicit programming language of our corner of the eurycosm?  

For simplicity I have been following Tadaki's conventional Turing-machine-based computational models here -- with a brief gesture toward quantum computing -- but of course the broad approach outlined here goes beyond these computing paradigms.   What if we ported Tadaki's ideas to a series of bits emanated by, say, a hypercomputer like the Zeno Machine?   Then we don't just get a single infinite bit string as output, but a more complex ordinal construction with infinite bit strings of infinite bit strings etc. -- but the math could be worked out.   If the size of a Zeno Machine program can be quantified by a single real number, then one can assess Zeno Machine programs as patterns in data, and one can define concepts like compression ratio and algorithmic entropy and energy.   The paradigm sketched here is not tied to a Turing Machine model of eurycosmic processes, though TMs are certainly easier for initial sketches and calculations than ZMs or even weirder things.

I have definitely raised more questions than I've answered in this long and winding blog post.   My goal has been to indicate a direction for research and thinking, one that seems not a huge leap from the current state of research in various fields, but perhaps dramatic in its utility and implications.


 




Sunday, July 14, 2019

"Deriving" Quantum Logic from Reason, Beauty and Freedom

One Basic Principle of this Corner of the Eurycosm: The Universe Maximizes Freedom Given Constraints of Reason and Beauty

I was musing a bit about the basic concept at the heart of quantum logic and quantum probability: That a particular observer, when reasoning about properties of a system that it cannot in principle ever observe, should use quantum logic / quantum probabilities instead of classical ones.

I kept wondering: Why should this be the case?

Then it hit me: It’s just the maximum-entropy principle on a higher level!

The universe tends to maximize entropy/uncertainty as best it can given the imposed constraints.   And quantum amplitude (complex probability) distributions are in a way “more uncertain” than classical ones.   So if the universe is maximizing entropy it should be using quantum probabilities wherever possible.

A little more formally, let’s assume that an observer should reason about their (observable or indirectly assessable) universe in a way that is:


  1. logically consistent: the observations made at one place or time should be logically consistent with those made at other places and times
  2. pleasantly symmetric: the ways uncertainty and information values are measured should obey natural-seeming symmetries, as laid out e.g. by Knuth and Skilling in their paper on Foundations of Inference [https://arxiv.org/abs/1008.4831], and followup work on quantum inference [https://arxiv.org/abs/1712.09725]
  3. maximally entropic: having maximum uncertainty given other imposed constraints.  Anything else is assuming more than necessary.  This is basically an Occam’s Razor type assumption.


(I note that logical consistency is closely tied to the potential for useful abstraction.  In an inconsistent perceived/modeled world, one can't generalize via the methodology of making a formal abstraction and then deriving implications of that formal abstraction for specific situations, because one can't trust the "deriving" part.  In procedural terms, if a process (specified in a certain language L) starting from a certain seed produces a certain result, then we need it still to be the case, later and elsewhere, that the same process from the same seed will generate the same result; if that doesn't hold then "pattern recognition" doesn't work so well.   So this sort of induction involving patterns expressed in a language L appears equivalent to logical consistency, via Curry-Howard type correspondences.)

To put the core point more philosophico-poetically, these three assumptions basically amount to declaring that an observer’s subjective universe should display the three properties of:


  1. Reason
  2. Beauty
  3. Freedom


Who could argue with that?

How do reason, beauty and freedom lead to quantum logic?

I’m short on time as usual so I’m going to run through this pretty fast and loose.   Obviously all this needs to be written out much more rigorously, and some hidden rocks may emerge.   Let’s just pretend we’re discussing this over a beer and a joint with some jazz in the background…

We know basic quantum mechanics can be derived from a principle of stationary quantropy (complex valued entropy) [https://arxiv.org/abs/1311.0813], just as basic classical physics can be derived from a principle of stationary entropy …
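
For readers who haven't encountered quantropy: roughly speaking (and this is my paraphrase of the setup in the linked paper, so take the details with a grain of salt), one assigns each history x the usual path-integral amplitude a_x ∝ e^{iA(x)/ħ}, with A the classical action, and considers the complex-valued analogue of Shannon entropy

\[
Q = -\sum_x a_x \ln a_x .
\]

Requiring Q to be stationary subject to a fixed expectation value of the action singles out exactly these amplitudes, in the same way that maximizing entropy subject to a fixed expected energy singles out the Boltzmann distribution.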

Quantropy ties in naturally with  Youssef’s complex-valued truth values [https://arxiv.org/abs/hep-th/0110253], though one can also interpret/analyze it otherwise…

It seems clear that modeling a system using complex truth values in a sense reflects MORE uncertainty than modeling a system using real truth values.   What I mean is: The complex truth values allow system properties to have the same values they would if modeled w/ real truth values, but also additional values.

Think about the double-slit experiment: the quantum case allows the electrons to hit the same spots they would in the classical case, but also other spots.

On the whole, there will be greater logical entropy [https://arxiv.org/abs/1902.00741] for the quantum case than the classical case, i.e. the percentage of pairs of {property-value assignments for the system} that are considered different will be greater.   The double-slit experiment is a clear example here as well.
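
For concreteness, the logical entropy referred to here (following Ellerman's definition in the linked paper, as I understand it) of a probability distribution p is

\[
h(p) = 1 - \sum_i p_i^2 = \sum_{i \neq j} p_i p_j ,
\]

i.e. the probability that two independent samples land on distinct outcomes.  Spreading probability over more outcomes -- as the quantum amplitudes do in the double-slit case -- increases the fraction of outcome pairs that count as distinct.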

So, suppose we had the meta-principle: When modeling any system’s properties, use an adequately symmetric information-theoretic formalism that A) maximizes uncertainty in one’s model of the system, B) will not, in any possible future reality, lead to logical contradictions with future observations.

By these principles — Reason, Beauty and Freedom — one finds that


  • for system properties whose values cannot in principle be observed by you, you should use quantum logic, complex truth values, etc. in preference to regular probabilities (because these have greater uncertainty and there is no problematic contradiction here)
  • for system properties whose values CAN in principle be observed by you, you can’t use the complex truth values because in the possible realities where you observe the system state, you may come up with conclusions that would contradict some of the complex truth-value assignments


(E.g. in the double-slit experiment, in the cases where you can in principle observe the electron paths, the quantum assumptions can’t be used as they will lead to conclusions contradictory to observation…)

A pending question here is why not use quaternionic or octonionic truth values, which Youssef shows also display many of the pleasant symmetries needed to provide a reasonable measure of probability and information.  The answer has to be that these lack some basic symmetry properties we need to have a workable universe….  This seems plausibly true but needs more detailed elaboration…

So from the three meta-principles


  1. logical consistency of our models of the world at various times
  2.  measurement of uncertainty according to a formalism obeying certain nice symmetry axioms
  3.  maximization of uncertainty in our models, subject to the constraints of our observation


we can derive the conclusion that quantum logic / complex probability should be used for those things an observer in principle can’t measure, whereas classical real probability should be used for those things they can…

That is, some key aspects of our world seem to be derivable from the principle that: The Universe Maximizes Freedom Given Constraints of Reason and Beauty.

What is the use of this train of thought?

I’m not sure yet.  But it seems interesting to ground the peculiarity of quantum mechanics in something more fundamental.

The weird uncertainty of quantum mechanics may seem a bit less weird if one sees it as coming from a principle of assuming the maximum uncertainty one can, consistent with principles of consistency and symmetry. 

Assuming the maximum uncertainty one can is simply a matter of not assuming more than is necessary.  Which seems extremely natural -- even if some of its consequences, like quantum logic, can seem less than natural if (as evolution has primed us humans to do) you bring the wrong initial biases to thinking about them.

Tuesday, July 09, 2019

The Simulation Hypothesis -- Not Nearly Crazy Enough to Be True






The "Simulation Hypothesis", the idea that our universe is some sort of computer simulation, has been getting more and more airtime lately.  

The rising popularity of the meme is not surprising since virtual reality and associated tech have been steadily advancing, and at the same time physicists have further advanced the formal parallels between physics equations and computation theory.    

The notion of the universe as a computer simulation does bring to the fore some important philosophical and scientific concepts that are generally overlooked.  

However, in various online and real-world conversations I have been hearing various versions of the simulation hypothesis that don't make a lot of sense from a scientific or rational point of view.   So I wanted to write down briefly what does and doesn't make sense to me in the simulation-hypothesis vein...

One thing that has gotten on my nerves is hearing the simulation hypothesis used to advocate for religious themes and concepts -- often in ways that profoundly stretch logic.   There are some deep correspondences between the insights of mystical wisdom traditions and the lessons of modern physics and computation theory -- but I have heard people talk about the simulation hypothesis in ways that reach far beyond these correspondences, in ways that fallaciously make it seem like the science and math give evidence for religious themes such as the existence of a vaguely anthropomorphic "creator" of our universe.  This is, I suppose, what has led some commentators, like AGI researcher Eray Ozkural, to label the simulation hypothesis a new form of creationism (the link to his article "Simulation Argument and Existential AI Risk: New Age Creationism?" seems to be down at the moment).

The idea that our universe might be a computer simulation is not a new one, and appeared in the science fiction literature many times throughout the second half of the previous century.   Oxford philosopher Nick Bostrom's essay titled "The Simulation Argument" is generally credited with introducing the idea to the modern science and technology community.    Now Rizwan Virk's book titled "The Simulation Hypothesis" is spreading the concept to an even wider audience.   Which is part of what motivated me to write a few words here on the topic.

I don't intend to review Virk's book here, because frankly I only skimmed it.   It seems to cover a large variety of interesting topics related to the simulation hypothesis, and the bits and pieces I read were smoothly written and accurate enough. 

Fundamentally, I think the Simulation Hypothesis as it's generally being discussed is not nearly crazy enough to be true.  But it does dance around some interesting issues.

Bostrom's Rhetorical Trickery

I have considerable respect for Nick Bostrom's rhetorical and analytical abilities, and I've worked with him briefly in the past when we were both involved in the World Transhumanist Association, and when we organized a conference on AI ethics together at his Future of Humanity Institute.   However, one issue I have with some of Nick's work is his tendency to pull the high school debating-team trick of arguing that something is POSSIBLE and then afterward speaking as if he has proved this thing was LIKELY.   He did this in his book Superintelligence, arguing for the possibility of superintelligent AI systems that annihilate humanity or turn the universe into a vast mass of paperclips -- but then afterward speaking as if he had argued such outcomes were reasonably likely or even plausible.   Similarly, in his treatment of the simulation hypothesis, he makes a very clear argument as to why we might well be living in a computer simulation -- but then projects a tone of emphatic authority, making it seem to the naive reader like he has somehow shown this is  a reasonably probable hypothesis.

Formally what Bostrom's essay argues is that

... at least one of the following propositions is true: (1) the human species is very likely to go extinct before reaching a “posthuman” stage; (2) any posthuman civilization is extremely unlikely to run a significant number of simulations of their evolutionary history (or variations thereof); (3) we are almost certainly living in a computer simulation.

The basic argument goes like this: Our universe has been around 14 billion years or so, and in that time-period a number of alien civilizations have likely arisen in various star systems and galaxies... and many of these civilizations have probably created advanced technologies, including computer systems capable of hosting massive simulated virtual-reality universes.   (Formally, he argues something like this follows if we assume (1) and (2) are false.)   So if we look at the history of our universe, we have one base universe and maybe 100 or 1000 or 1000000 simulated universes created by prior alien civilizations.   So what are the odds that we live in the base universe rather than one of the simulations?  Very low.  Odds seem high that, unless (1) or (2) is true, we live in one of the simulations.

The obvious logical problem with this argument is: If we live in a simulation programmed by some alien species, then the 14 billion year history of our universe is FAKE, it's just part of that simulation ... so that all reasoning based on this 14 billion year history is just reasoning about what kind of preferences regarding fake evidence were possessed by the aliens who programmed the simulation we're living in.   So how do we reason about that?   We need to place a probability distribution over the various possible motivational systems and technological infrastructures of various alien species?
(For a more detailed, slightly different run-through of this refutation of Bostrom's line of argument, see this essay from a Stanford University course).

Another way to look at it is: Formally, the problem with Bostrom's argument is that the confidence with which we can know the probability of (1) or (2) is very low if indeed we live in a simulation.   Thus all his argument really shows is that we can't confidently assign low probabilities to (1) and (2) -- because if we did, we could derive as a conclusion that the confidence with which we know these probabilities is low.

Bostrom's argument is essentially self-refuting: What it demonstrates is mostly just that we have no frickin' idea about the foundational nature of the universe we live in.   Which is certainly true, but is not what he claims to be demonstrating.  


An Array of Speculative Hypotheses

To think seriously about the simulation hypothesis, we have to clearly distinguish between a few different interesting, speculative ideas about the nature of our world.  

One is the idea that our universe exists as a subset of some larger space, which has different properties than our universe.   So that the elementary particles that seem to constitute the fundamental building-blocks of our physical universe, and the 3 dimensions of space and one dimension of time that seem to parametrize our physical experience, are not the totality of existence -- but only one little corner of some broader meta-cosmos.  

Another is the idea that our universe exists as a subset of some larger space, which has different properties than our universe, and in which there is some sort of coherent, purposeful individual mind or society of individual minds, who created our universe for some reason.

Another is that our universe has some close resemblance to part or all of the larger space that contains it, thus being in some sense a "simulation" of this greater containing space...

It is a valid philosophical point that any of these ideas could turn out to be the reality.    As philosophy, one implication here is that maybe we shouldn't take our physical universe quite as seriously as we generally do -- if it's just a tiny little corner in a broader meta-cosmos. 

One is reminded of the tiny little Who empire in Dr. Seuss's kids' book "Horton Hears a Who."   From the point of view of the Whos down there in Whoville, their lives and buildings and such are very important.   But from Horton the Elephant's view, they're just living in a tiny little speck within a much bigger world.

From a science or engineering view, these ideas are only really interesting if there's some way to gather data about the broader meta-cosmos, or hack out of our limited universe into this broader meta-cosmos, or something like that.   This possibility has been explored in endless science fiction stories, and also in the movie The Matrix -- in which there are not only anthropomorphic creators behind the simulated universe we live in, but also fairly simple and emotionally satisfying ways of hacking out of the simulation into the meta-world ... which ends up looking, surprise surprise, a lot like our own simulated world.  

The Matrix films also echo Christian themes in very transparent ways -- the process of saving the lives and minds of everyone in the simulation boils down to finding one savior, one Messiah-type human, with unique powers to bridge the gap between simulation and reality.   This is good entertainment, partly because it resonates so well with various of our historical and cultural tropes, but it's a bit unfortunate when these themes leak out of the entertainment world and into the arena of supposedly serious and thoughtful scientific and philosophical discourse.

In a 2017 article, I put forth some of my own speculations about what sort of broader space our physical universe might be embedded in.   I called this broader space a Eurycosm ("eury" = wider), and attempted to explore what properties such a Eurycosm might have in order to explain some of the more confusing aspects of our physical and psychological universe, such as ESP, precognition, remote viewing, reincarnation, mediumistic seances, and so forth.   I don't want to bog down this article with a discussion of these phenomena, so I'll just point the reader who may be interested in exploring the scientific evidence in this regard to a list of references I posted some time ago.   For now, my point is just: If you believe that some of these "paranormal" phenomena are sometimes real, then it's worth considering that they may be ways to partially hack out of our conventional 4D physical universe into some sort of broader containing space.

As it happens, my own speculations about what might happen in a Eurycosm, a broader space in which our own physical universe is embedded, have nothing to do with any creator or programmer "out there" who programmed or designed our universe.    I'm more interested to understand what kinds of information-theoretic "laws" might govern dynamics in this sort of containing space.

What seems to be happening in many discussions I hear regarding the simulation hypothesis is: The realization that our 4D physical universe might not be all there is to existence, that there might be some sort of broader world beyond it, is getting all fuzzed up with the hypothesis that our 4D physical universe is somehow a "simulation" of something, and/or that our universe is somehow created by some alien programmer in some other reality.

What is a "simulation" after all?  Normally that word refers to an imitation of something else, created to resemble that thing which it simulates.   What is the evidence, or rational reason for thinking, our universe is an imitation or approximation of something else?

Simulations like the ones we run in our computers today, are built by human beings for specific purposes -- like exploring scientific hypotheses, or making entertaining games.    Again, what is the evidence, or rational reason for thinking, that there is some programmer or creator or game designer underlying our universe?   If the only evidence or reason is Bostrom's argument about prior alien civilizations, then the answer is: Basically nothing.

It's an emotionally appealing idea if you come from a Christian background, clearly.   And it's been a funky idea for storytelling since basically the dawn of humanity, in one form or another.   I told my kids a bunch of simulation-hypothesis bedtime stories when they were young; hopefully it didn't twist their minds too badly.   My son Zebulon, when he was 14, wrote a novel about a character on a mission to find the creators of the simulation we live in -- specifically, to track down the graphic designer who had created the simulation and hold a gun to his head, forcing him to modify the graphics behind our universe to make people less ugly.   Later on he became a Sufi, a mystical tradition which views the physical universe as insubstantial in much subtler ways.

There is good mathematics and physics behind the notion that our physical universe can be modeled as a sort of computer -- where the laws of physics are a sort of "computer program" iterating our universe through one step after the next.    This is not the only way to model our universe, but it seems a valid one that may be useful for some purposes.  

There is good philosophy behind the notion that our apparently-so-solid physical reality is not necessarily foundationally real, and may be just a tiny aspect of a broader reality.   This is not a new point but it's a good one.   Plato's parable of the cave drove this home to the Greeks long ago, and as Rizwan Virk notes these themes have a long history in Indian and Chinese philosophy, and before that in various shamanic traditions.   Virk reviews some of these predecessors in his book.

But there is nothing but funky entertainment and rampant wishful thinking behind the idea that our universe is a simulation of some other thing, or that there is some alien programmer or other vaguely anthropomorphic "creator" behind the origin or maintenance of our universe.

We Probably Have Very Little Idea What Is Going On

I have two dogs at home, and I often reflect on what they think I am doing when I'm sitting at my computer typing.  They think I'm sitting there, guarding some of my valued objects and wiggling my fingers peculiarly.   They have no idea that I'm controlling computational processes on faraway compute clouds, or talking to colleagues about mathematical and software structures.  

Similarly, once we create AGI software 1000 times smarter than we are, this software will understand aspects of the universe that are opaque to our little human minds.   Perhaps we will merge with this AGI software, and then the new superintelligent versions of ourselves will understand these additional aspects of the universe as well.    Perhaps we will then figure out how to hack out of our current 4D spacetime continuum into some broader space.   Perhaps at that point, all of these concepts I'm discussing here will seem to my future self like absolutely ridiculous nonsense.

I have a lot of respect for the limitations of human intelligence, and a fairly strong confidence that we currently understand a very minimal percentage of the overall universe.   To the extent that discussion of the simulation hypothesis points in this direction, it's possibly valuable and productive.   We shouldn't be taking the 4D spacetime continuum current physics models as somehow fundamentally real, we shouldn't be assuming that it delimits reality in some ultimate and cosmic sense.

However, we also shouldn't be taking seriously the idea that there is some guy, or girl, or alien, or society or whatever "out there" who programmed a "simulation" in which our universe is running.   Yes, this is possible.   A lot of things are possible.  There is no reason to think this is decently probable.

I can see that, for some people, the notion of a powerful anthropomorphic creator is deeply reassuring.   Freud understood this tendency fairly well -- there's an inner child in all of us who would like there to be some big, reliable Daddy or Mommy responsible for everything and able to take care of everything.   Some bad things may happen, some good things will happen, and in the end Mom and Dad understand more than we do and will make sure it all comes out OK in the end.   Nick Bostrom, for all his brilliance, seems repeatedly drawn to themes of centralized control and wisdom.   Wouldn't it be reassuring if, as he suggests in Superintelligence, the UN took over the creation of AGI and hired some elite vetted AI gurus to make sure it's developed in an appropriate way?   If we can't have a Christian God watching over us and assuring us a glorious afterlife, can't we at least have an alien programmer monitoring the simulation we're running in?  Can't the alien programmer at least be really good looking, let's say, maybe like a Hollywood movie star?

As far as I can tell, given my current sorely limited human mind, the universe seems to be a lot more about open-ended intelligence, a concept my friend Weaver at the Global Brain Institute has expertly articulated.   The universe -- both our 4D physical spacetime and whatever broader spaces exist beyond -- seems to be a complex, self-organizing system without any central purpose or any centralized creator or controller.   Think the creative self-organizing ocean in Lem's Solaris, rather than bug-eyed monsters coming down in spaceships to enslave us or stick needles into our bellybuttons.

So the simulation hypothesis takes many forms.   In its Bostromian form, or in the form I often hear it in casual conversations, it is mostly bullshit -- but still, it does highlight some interesting issues.   It's a worthwhile thought experiment but in the end it's most valuable as a pointer toward other, deeper ideas.   The reality of our universe is almost surely way crazier than any story about simulations or creators, and almost surely way beyond our current imaginations.