The Multiverse According to Ben: In What Sense Does Deep Learning Reflect the Laws of Physics?

Saturday, September 10, 2016

“Technology Review” is making a fuss about an article by Lin and Tegmark on why deep learning works. To wit:

Physicists have discovered what makes neural networks so extraordinarily powerful
Nobody understands why deep neural networks are so good at solving complex problems. Now physicists say the secret is buried in the laws of physics

It's a nice article, but as often happens, the conclusion is a bit more limited -- and rather less original -- than the popular media account suggests...

Stripping away the math, the basic idea they propose in their paper is a simple and obvious one: that the physical universe has a certain bias encoded into it regarding what patterns tend to occur. Some mathematically possible patterns are common in our physical universe; others are less common.

As one example, since the laws of physics limit communication between distant points but the universe is spread out, there tend to arise patterns involving multiple variables, many of which are only loosely dependent on each other.
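One way to make the "loosely dependent" point concrete is to count how many numbers it takes to describe a probability distribution over n binary variables. What follows is a minimal sketch, not anything taken from the Lin and Tegmark paper; the function names and the choice of a nearest-neighbour chain as the "local" model are illustrative assumptions.

```python
def full_joint_params(n: int) -> int:
    """Free parameters in an unrestricted joint table over n binary variables."""
    # 2**n outcomes, minus one because the probabilities must sum to 1.
    return 2 ** n - 1


def chain_params(n: int) -> int:
    """Free parameters when each variable depends only on its predecessor.

    Assumed model: p(x1) * p(x2 | x1) * ... * p(xn | x_{n-1}).
    p(x1) needs 1 number; each conditional needs 2 (one per parent value).
    """
    return 1 + 2 * (n - 1)


if __name__ == "__main__":
    for n in (2, 10, 30):
        print(n, full_joint_params(n), chain_params(n))
```

The unrestricted table grows exponentially in n while the chain grows linearly, which is the sense in which loosely coupled variables are cheap to model.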

As another example, hierarchical patterns are uncommonly common in our universe — because the laws of physics, at least in the regimes we’re accustomed to, tend to lead to the emergence of hierarchical structures (e.g. think particles building up atoms building up molecules building up compounds building up cells building up organisms building up ecosystems…).

Since the physical universe has certain habitual biases regarding what sorts of patterns tend to occur in it, it follows that a pattern recognition system biased to recognize THESE types of patterns is going to be more efficient than one with different biases. It’s going to be inefficient for a pattern recognition system to spend a lot of time searching physical-world data for patterns that are extremely unlikely to occur in our physical universe, given the nature of the laws of physics.
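A rough illustration of what such a bias buys, in neural-network terms: the sketch below compares how many weights a single layer needs under three assumptions about the data. The layer types, the function names, and the example sizes are my own illustrative choices and are not drawn from the paper.

```python
def dense_weights(n_in: int, n_out: int) -> int:
    """No bias assumed: every output unit may depend on every input."""
    return n_in * n_out


def local_weights(n_out: int, window: int) -> int:
    """Locality assumed: each output unit sees only `window` nearby inputs."""
    return n_out * window


def shared_local_weights(window: int) -> int:
    """Locality plus translation symmetry: one window of weights reused everywhere."""
    return window


if __name__ == "__main__":
    n, window = 1000, 5  # hypothetical 1-D signal length and neighbourhood size
    print("dense:       ", dense_weights(n, n))
    print("local:       ", local_weights(n, window))
    print("local+shared:", shared_local_weights(window))
```

Fewer free weights means a smaller space of candidate patterns to search, at the cost of being unable to represent patterns that violate the assumed locality or symmetry.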

So -- this is a quite valid point, but not at all a new one. For instance, I made that same point in this paper a few years ago (presented at an IEEE conference on Human-Level Intelligence in Singapore, published in the conference proceedings, and mostly reprinted in my book Engineering General Intelligence as part of the early preliminary material).

Now, my mathematical formalization of this idea was quite different from Lin and Tegmark’s, since I tend to be more abstract-mathy and computer-sciency than physicsy. What I said formally is:

MIND-WORLD CORRESPONDENCE PRINCIPLE: For an organism with a reasonably high level of intelligence in a certain world, relative to a certain set of goals, the mind-world path transfer function is a goal-weighted approximate functor

Formalism aside, the basic idea here is this: if you have a system that is supposed to achieve a high degree of goal-achievement in a world with a certain habitual structure, then the best way for it to do so using limited resources is to internally contain structures that are morphic to the habitual structures in the world.

I explicitly introduced the example of hierarchical structure in the world — and pointed out that intelligent systems trying to achieve goals in a hierarchical world will do best, using limited resources, if they internally have a hierarchical structure (in a way that manifests itself specifically in their goal-seeking behavior).

Deep neural networks are an example of a kind of system that manifests hierarchical structure internally in this way.

Certainly I am not claiming any sort of priority regarding this general conceptual point, though — I am sure others made that same point way before I did, expressing it in different language...

One also shouldn’t overestimate the importance of this sort of point, though. Lin and Tegmark point out that “properties such as symmetry, locality, compositionality and polynomial log-probability” come out of the laws of physics, and also are easily encoded into the structure of neural networks. This is all true and good, but of course self-organizing systems add a lot of complexity to the picture, so many patterns in the portion and level of the physical universe that is relevant to us do NOT actually display these properties. That is why simply-structured neural networks like deep neural networks are not actually adequate for AGI.
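To make "polynomial log-probability" a little more concrete, the simplest textbook case is the Gaussian, whose negative log-density is exactly a degree-2 polynomial in x. The sketch below checks this numerically; it is my own illustration rather than an example from the paper, and the function names and parameter values are assumptions.

```python
import math


def gaussian_neg_log_p(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """-ln p(x) for a normal density, computed directly from the density."""
    p = math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    return -math.log(p)


def quadratic(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """The same quantity written out as a*x^2 + b*x + c."""
    a = 1.0 / (2 * sigma ** 2)
    b = -mu / sigma ** 2
    c = mu ** 2 / (2 * sigma ** 2) + math.log(sigma * math.sqrt(2 * math.pi))
    return a * x * x + b * x + c


if __name__ == "__main__":
    for x in (-2.0, 0.0, 0.5, 3.0):
        print(x, gaussian_neg_log_p(x, 1.0, 2.0), quadratic(x, 1.0, 2.0))
```

Data produced by self-organizing processes need not have this kind of low-order structure, which is the caveat raised above.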

Specifically, we may note that current deep neural networks do best at recognizing patterns in sensory data, which makes sense because sensory data (as opposed to stuff that is more explicitly constructed by mind and society) is more transparently and directly structured via “physical law.”

It's cool to see the popular media, and more and more scientists from various disciplines, finally paying attention to these deep and important ideas.... But as more attention comes, we have to ward off oversimplification. Tegmark and Lin are solid thinkers and smart people, and they know it's not so simple as "deep neural nets are the key to intelligence because they reflect aspects of the laws of physics" -- and they may well even know that diverse others have made very similar points to theirs dozens of times over the preceding decades. Let's just remember these are subtle matters, and there is still much to be understood -- and any one special class of algorithms and structures, like deep neural networks, is only going to be one modest part of the AGI picture, conceptually or pragmatically.

Comments:

Jef Allbright said...: For the same reason, our most coherent of morality will be hierarchical in the orthogonal domains of values and instrumental methods.; 1:29 PM
marcalpv said...: Basically we are trying to build an interpolating function between input and output variables. So we observe that intermediate variables aid the process. Why should some variables not be a function of other variables?; 1:50 PM
Unknown said...: If memory serves, you once in an interview questioned correlation between intelligence and fitness. Many of your arguments have me considering the probability that normative complexity over time brings about greater intelligence as a logical conclusion. Your blog is always a good read. Thanks for sharing.; 4:55 AM
Robin de Lange said...: Excellent reply! Will come back to your blog for my research on VR as a learning tool.; 10:36 AM
marcalpv said...: Here is an alternative to deep learning that treats it as an interpolating function: least-squares-fe-for-ann; 12:19 PM