Quantum mechanics remains intuitively mysterious, in spite
of its practical successes. It is,
however, very mathematically elegant.
A literature has arisen attempting to explain why,
mathematically speaking, quantum mechanics is actually completely natural and
inevitable. One aspect of this
literature has remained intuitively unsatisfactory to me: the argument as to
why complex numbers, rather than real numbers, should be used to quantify
uncertainty.
Recently, though, I think I found an argument that feels
right to me. Here I will sketch that
argument. I haven't done all the math to
prove the details (due to lack of time, as usual), so this will be sketchy and
there might be something wrong here. (I
almost always put together an argument at this hand-wavy level before plunging
into doing a proof. But in recent years
I'm so overwhelmed with other things to do that I often just leave it at the
hand-wavy argument and don't bother to do the proof. This isn't as fully satisfying, but life in
our current realm of scarcity is full of such trade-offs...)
Part of the background here is Saul Youssef's formulation of quantum theory in terms of exotic probabilities. Youssef argues (physicist style, not
mathematician style) that the Kolmogorov axioms for probabilities, if one
removes the axiom that probabilities be real numbers, are still satisfied by complex,
quaternionic and octonionic probabilities.
He then argues that if we assume probabilities are complex numbers, the
Schrödinger equation essentially follows.
His complex-number probabilities are subtly different from the
amplitudes normally used in quantum mechanics.
If we buy Youssef's formulation (as being, at least, one
meaningful version of the QM truth), then the question "why QM"
pretty much comes down to "why complex probabilities?" This is what I think I may see a novel
argument for -- based on tweaking some outstanding work by Knuth and Skilling on the algebraic foundations of inference.
Why Tuple Probabilities?
Most of my argument in this post will be mathematical (with big
gaps where the details are lacking).
But I'll start with a more philosophical-ish point.
Let's assume the same basic setup regarding observers and
observations that I used in my previous blog post, on maximum entropy.
I will make a conceptual argument for why, in some cases, it makes
sense to think of probabilities as tuples with more than one entry.
The key question at hand is: How should O1 reason about the
observations that O can distinguish but O1 cannot? For
instance, suppose O1 cannot distinguish obs1 from obs2, at least not in some
particular context (e.g. in the context of having observed obs3).
Suppose then there are two observations in bin t1, obsA and
obsB. Either of these could be obs1, and
then the other would be obs2. We have
an utter lack of knowledge here.
One way to phrase this situation is to postulate a number of
different parallel “universes”, so that instead of a proposition
P = “obsA = obs1”
having a truth value on its own, it will have a relative
truth value
tv(P,U)
where U is a certain universe.
The use of the word "universe" here is
questionable. But that's a fairly boring
point, I think -- the application of human language to these rarefied domains
is bound to be confusing. Had Everett
referred to the multiple universes of quantum theory as "potentiality
planes" or some such his theory would have been much less popular even
with the exact same content. "Many
potentiality planes hypothesis" just doesn't have the same zing as
"Many worlds" or "Many universes"! But instead of saying there are many
worlds/universes, you could just as well say that there are many potentiality
planes, and the oddity of quantum theory is that it doesn't explain how any of
them becomes actual -- it just shifts around the weights of the different
potentialities. So either actuality is a
bogus concept, or it needs to be explained by something other than quantum
theory (in the Everett approach, as opposed to e.g. the von Neumann approach in
which there is a "collapse" operation that translates certain potentialities
into actualities).
Verbiage aside, the
above gives rise to the question: What are the logical requirements for these
values tv(P,U)?
It seems to me that, by looking at the basic commonsensical
symmetry properties of the tv(P,U), we can show that these must be complex
numbers, thus arriving at a sort of abstract logical derivation of Youssef's
complex truth values.
We start out with the requirement that sometimes tv(P,U) and
tv(P,U1) should be different ... so that the dependence on U is not degenerate.
Tuplizing Knuth and Skilling
The rest of my argument follows Knuth and Skilling's
beautiful paper "Foundations of Inference." What I would like to do is follow their
arguments, but looking at valuations that are tuples (e.g. pairs) of reals
rather than single reals. This section
of the blog post will only make sense to you after you read their paper
carefully.
I note that Goyal, Knuth and Skilling have used a different method
to argue why complex quantum amplitudes exist, based on extending their
elementary symmetry arguments from "Foundations of Inference" in a
different way than I suggest here. You
can find their paper here.
I think that is a fantastic paper,
yet I feel like it doesn't quite get at the essence. I think one can get complex probabilities
without veering that far from the very nice ideas in "Foundations of
Inference," and without introducing so many other complications. However, most likely the way to make the ideas in this blog post fully rigorous would be to borrow and slightly tweak a bunch of the math from the Goyal, Knuth and Skilling paper. We're all sorta rearranging the same basic math in related ways.
To avoid problems with Wordpress plugins, I have deviated in
notation from Knuth and Skilling a little here:
· where they use a plus in a circle, I use +.
· where they use an x with a circle in it, I use x.
· where they use the typical "times" symbol for direct product, I use X
· where they use a period with a circle around it, I use o.
Mostly: their "circle around" an operator has become my "period after."
Consider a space of tuples of
ordered lattice elements, e.g.
x = (x1, x2, ..., xk)
where each xi is a lattice element
Define a partial order on such tuples via
x < y iff ( x1 < y1 and ... and xk < yk )
Note, this is not a total order. For instance, one complex number x is less
than another complex number y if both the real and imaginary parts of x are
less than the corresponding parts of y.
If the real part of x is bigger than the real part of y, but the imaginary
part of x is smaller than the imaginary part of y, then x and y are not
comparable according to this partial order.
Next, define join on tuples as
e.g.
(x1,x2) OR (y1, y2) = (x1 OR y1 , x2 OR y2)
and define cross on tuples as e.g.
(x1, x2) X (y1, y2) = (x1 X y1, x2 X y2)
Next, define a valuation on tuples, where x^ denotes the value
associated with a tuple. Suppose that
values are tuples of real numbers.
We will define addition on value tuples via
x^ +. y^ = (x OR y)^
and multiplication on value tuples via
x^ *. y^ = (x X y)^
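To make these componentwise definitions concrete, here is a minimal Python sketch. It embodies my own illustrative assumptions -- lattice elements are modeled as sets of atomic outcomes, so that join is set union and the direct product is the elementary "all combinations" operation -- and is just a toy, not a claim about the general case.

    # Minimal sketch (illustrative assumptions): lattice elements are frozensets
    # of atomic outcomes; join is set union, direct product is "all combinations".

    def join(a, b):
        return a | b  # set union

    def cross(a, b):
        return frozenset((p, q) for p in a for q in b)  # all combinations

    # Tuple-wise versions: apply the operation in each coordinate.
    def tuple_join(xs, ys):
        return tuple(join(x, y) for x, y in zip(xs, ys))

    def tuple_cross(xs, ys):
        return tuple(cross(x, y) for x, y in zip(xs, ys))

    # The partial order on tuples holds only when it holds in every coordinate,
    # so some pairs of tuples are simply incomparable.
    def tuple_leq(xs, ys):
        return all(x <= y for x, y in zip(xs, ys))  # <= is subset inclusion here

    a = (frozenset({"obsA"}), frozenset({"obsB", "obsC"}))
    b = (frozenset({"obsA", "obsB"}), frozenset({"obsB"}))
    print(tuple_join(a, b))   # coordinate-wise unions
    print(tuple_leq(a, b))    # False: the second coordinate is not contained
    print(tuple_leq(b, a))    # False the other way too -- a and b are incomparable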
Consider a chain [x,t] of two tuples so that x < t.
We may suppose chaining is
associative, so that
[[x, y], [y, z]] , [z, t] =
[x, y], [[y, z], [z, t]]
We may associate each chain with a value tuple p(x|t);
associativity then implies
p(x|z) = p(x|y) .o p(y|z)
where .o represents a composition operator.
In terms of chains, what the partiality of the order <
means is that sometimes two lattice-tuples can't be arranged in a chain in
either order -- they are in this sense "logically incommensurable";
neither one implies nor refutes the other.
That is Knuth and Skilling's setup, ported to tuples of
lattice elements rather than individual lattice elements, in what seems to me
the most straightforward way.
Now the question is how many of their arguments carry over
to the case of tuples I'm considering here.
I haven't had time to do the calculations, but after some thinking, my
intuitive conclusion is that they probably all should. (If I'm wrong please tell me why!)
Symmetries 1-5 and Axioms 1-5 seem to all work OK, based on
a quick think. The symmetries involving
< must be considered as restricted to those cases where the partial order
< actually holds between the tuples involved.
And now I verge into some sketchier
educated guesses.
First it seems to me that their Associative Theorem
(Appendix A) should still work OK on tuples.
It should work because if there is a different isomorphism Psi
for each component of a tuple, then the tuple of these component-wise
isomorphisms should be an isomorphism on tuples of numbers. Plus acts componentwise anyway.
Second (the bigger leap), it seems to me that their Multiplication
Theorem (Appendix B) should work for complex numbers, much like it does for
real numbers. But their proof will
obviously need to be altered a lot to handle this case -- i.e. the case where the
function Psi in their multiplication functional equation maps into pairs of
reals rather than single reals.
It's clear that the complex exponential fulfills this
pairwise version of their "multiplication equation." So one
direction works, trivially. Uniqueness
is the trickier part.
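As a quick numerical sanity check of that easy direction, here is a sketch. It is my own, and it assumes that the pairwise version of the multiplication equation amounts to requiring the map Psi, from pairs to pairs, to carry componentwise addition over to the pair product (a,b)(c,d) = (ac - bd, ad + bc), i.e. complex multiplication written out on pairs. The complex exponential does exactly that.

    # Sketch: the complex exponential, written as a map on pairs of reals,
    # turns componentwise addition into complex multiplication on pairs.
    import math, random

    def pair_exp(p):
        a, b = p
        r = math.exp(a)
        return (r * math.cos(b), r * math.sin(b))  # exp(a + bi) as a pair

    def pair_add(p, q):
        return (p[0] + q[0], p[1] + q[1])  # componentwise addition

    def pair_mul(p, q):
        a, b = p
        c, d = q
        return (a * c - b * d, a * d + b * c)  # complex multiplication on pairs

    random.seed(0)
    for _ in range(5):
        p = (random.uniform(-1, 1), random.uniform(-1, 1))
        q = (random.uniform(-1, 1), random.uniform(-1, 1))
        lhs = pair_exp(pair_add(p, q))
        rhs = pair_mul(pair_exp(p), pair_exp(q))
        assert all(abs(u - v) < 1e-9 for u, v in zip(lhs, rhs))
    print("pair_exp maps componentwise addition to the complex pair product")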
If the Multiplication Theorem holds for pair-valued Psi,
yielding the complex exponential, this would explain very satisfyingly (to me) why
quantifying propositions with pairs leads to complex number values.
Basically: We want multiplication to be morphic to direct
product, and doing that on pairs forces you into the complex numbers (because
complex multiplication is the only way to multiply pairs that has the needed symmetries -- or so I am
conjecturing....).
But why complex probabilities, rather than quaternionic or
octonionic ones? Octonionic probabilities would not
be associative, hence could not be morphic to the direct product. The
argument to rule out quaternionic probabilities may be subtler, as
commutativity is not strictly needed for Knuth and Skilling's arguments. On the other hand, I suspect that given the
weaker sort of ordering I've used for my tuples, commutativity may end up being
needed to make some of the arguments work.
This needs more detailed digging.
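For what it's worth, the algebraic contrast at issue is easy to check numerically. The sketch below is again my own toy illustration: it verifies that the complex pair product is commutative while the quaternion product (on 4-tuples) is not, since i*j = k but j*i = -k.

    # Sketch: complex pairs commute; quaternions, written as 4-tuples (w, x, y, z), do not.
    def cmul(p, q):
        a, b = p
        c, d = q
        return (a * c - b * d, a * d + b * c)

    def qmul(p, q):
        w1, x1, y1, z1 = p
        w2, x2, y2, z2 = q
        return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
                w1*x2 + x1*w2 + y1*z2 - z1*y2,
                w1*y2 - x1*z2 + y1*w2 + z1*x2,
                w1*z2 + x1*y2 - y1*x2 + z1*w2)

    i = (0.0, 1.0, 0.0, 0.0)
    j = (0.0, 0.0, 1.0, 0.0)
    print(qmul(i, j))            # (0, 0, 0, 1)  = k
    print(qmul(j, i))            # (0, 0, 0, -1) = -k : not commutative
    print(cmul((1, 2), (3, 4)))  # (-5, 10)
    print(cmul((3, 4), (1, 2)))  # (-5, 10) : commutative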
So -- in brief -- the use of complex numbers emerges from
the realization that single numbers are not enough, but that if we go too far beyond
the complex numbers, we can't have the (join, direct product) algebra
anymore. But the (join, direct product)
algebra is assumed as elemental: join is just set union, and the direct
product is just the elementary "taking of all combinations" of
elements in two sets.
Summary
To recap:
We start with basic symmetries of lattices, because logical
propositions are basic, and propositions form a lattice.
We want to associate tuples of numbers with tuples of
lattice elements, because it's nice to be able to measure and compare things
quantitatively.
We want to combine these number tuples in ways that are morphic
to join and direct product.
But getting this morphism to work for the case where the
tuples are pairs forces us to the complex numbers.
And we (I think) can't get it to
work for tuples bigger than pairs.
But we need tuples that are at least pairs, to model the
case where multiple possibilities cannot be distinguished and must be
considered in parallel.
So we must value propositions with complex numbers.
Quod Erat Handwavium....
(QEH. I like that!
Surely not original....)
Propositional Logic as Pre-Physics
I am reminded of something I read when I was 16 years old, back
in the early 1980s, reading through Gravitation, the classic General Relativity
text by Misner, Thorne and Wheeler. (I
didn't understand it fully the first time through -- I gradually grokked it
better over the next 2.5 years as I went through undergrad and grad courses on
differential geometry -- but it was good for me to get a basic view of GR first
to guide my study of differential geometry.)
One of the nice aspects of that book, at least for me on that first reading,
was the large number of digressive asides in inset boxes. One of these asides regarded Wheeler's
speculative idea of "propositional logic as pregeometry". He was speculating that, somehow, the geometry of spacetime would emerge from
the large-scale statistics of logical propositions. This notion has stuck in my mind ever since.
The emergence of general relativity -- and hence the
geometry of spacetime -- from large-scale statistics has been a topic of lots
of recent attention ("gravity as entropy", etc. -- I have exploited
this in my speculations on causal webs, for example). Large-scale statistics of what? Of underlying quantum reality. But of course Wheeler already knew very well
that quantum mechanics was modeled using lattices, and that lattice
structure modeled the structure of logical propositions.
In the ordinary quantum logic framework, one looks at meet
and join operations and constructs a non-Boolean logic in this way. In Youssef's exotic probability framework, one
sticks with the Boolean lattice, and then puts complex valuations on the
elements of the Boolean lattice.
One thing that Knuth and Skilling show is that the algebra
of joins and direct products is important.
They show that this algebra morphs directly to the algebra of numerical
sums and products on (real-valued) probabilities. What I suggest (but have only hand-waved
about, not yet actually shown rigorously) is that, if one looks at tuples of
lattice elements, then this algebra (joins and direct products) maps directly
to the algebra of numerical sums and products on complex-valued probabilities. Thus getting Youssef's exotic probability
formulation of quantum mechanics out of some basic symmetries.
I still feel Wheeler's intuition was basically right, but,
as often occurs, getting the details right involves multiple complexities...