Saturday, September 10, 2016

In What Sense Does Deep Learning Reflect the Laws of Physics?


“Technology Review” is making a fuss about an article by Lin and Tegmark on why deep learning works. To wit:

Physicists have discovered what makes neural networks so extraordinarily powerful
Nobody understands why deep neural networks are so good at solving complex problems. Now physicists say the secret is buried in the laws of physics

It's a nice article, but as often happens, the conclusion is a bit more limited -- and rather less original -- than the popular media account suggests...

Stripping away the math, the basic idea they propose in their paper is a simple and obvious one: that the physical universe has a certain bias encoded into it regarding which patterns tend to occur…. Some mathematically possible patterns are common in our physical universe; others are less common.

As one example, since the laws of physics limit communication between distant points but the universe is spread out, there tend to arise patterns involving multiple variables, many of which are only loosely dependent on each other.
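
To make that concrete, here is a quick toy sketch in Python (my own illustration, not anything from Lin and Tegmark's paper) of how purely local coupling yields many loosely dependent variables:

import numpy as np

# Toy simulation: a chain of variables in which each is coupled only to its
# immediate neighbor, mimicking the locality of physical interaction.
# Correlation then falls off rapidly with distance, so distant variables
# end up only loosely dependent on each other.
rng = np.random.default_rng(0)
n_vars, n_samples, coupling = 50, 20000, 0.8

x = np.zeros((n_samples, n_vars))
x[:, 0] = rng.standard_normal(n_samples)
for i in range(1, n_vars):
    x[:, i] = coupling * x[:, i - 1] + rng.standard_normal(n_samples)

corr = np.corrcoef(x, rowvar=False)
for dist in (1, 5, 20):
    print(f"|correlation| at distance {dist:2d}: {abs(corr[0, dist]):.3f}")
# Nearby variables are strongly correlated; variables twenty steps apart
# are nearly independent -- many variables, mostly loose dependencies.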

As another example, hierarchical patterns are uncommonly common in our universe — because the laws of physics, at least in the regimes we’re accustomed to, tend to lead to the emergence of hierarchical structures (e.g. think particles building up atoms building up molecules building up compounds building up cells building up organisms building up ecosystems…).
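
A toy generative process of this sort is easy to write down; the vocabulary and branching factor below are arbitrary choices of mine, just to show the shape of the thing:

import random

# Illustrative hierarchical generation: each level is composed from a few
# units of the level below, echoing particles -> atoms -> molecules -> ...
random.seed(0)

def compose(units, group_size=3, n_outputs=8):
    """Each higher-level object is a small combination of lower-level objects."""
    return [tuple(random.choices(units, k=group_size)) for _ in range(n_outputs)]

particles = ["e", "p", "n"]
atoms     = compose(particles)   # patterns over particles
molecules = compose(atoms)       # patterns over atoms
compounds = compose(molecules)   # patterns over molecules

print(compounds[0])
# Data built this way is describable level by level -- the signature of the
# hierarchical structure that deep, layered recognizers are biased toward.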

Since the physical universe has certain habitual biases regarding what sorts of patterns tend to occur in it, it follows that a pattern recognition system that is biased to recognize THESE types of patterns is going to be more efficient than one that has different biases. It’s going to be inefficient for a pattern recognition system to spend a lot of time searching physical-world data for possible patterns that are extremely unlikely to occur in our physical universe, given the nature of the laws of physics.
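
A back-of-envelope comparison makes the efficiency gap vivid. The numbers below are arbitrary choices of mine, but the contrast is generic: a layer biased toward locality and shared (translation-symmetric) structure has vastly fewer parameters to fit than an unbiased, fully general one:

# Illustrative parameter counts for one layer processing a small image.
h, w, c_in, c_out, k = 32, 32, 3, 16, 3

dense_params = (h * w * c_in) * (h * w * c_out)  # every input-output weight
conv_params  = (k * k * c_in) * c_out + c_out    # one shared local filter bank

print(f"unbiased fully-connected layer: {dense_params:,} parameters")
print(f"locality-biased conv layer:     {conv_params:,} parameters")
# Roughly 50 million versus about 450: the recognizer whose biases match
# the world's habitual patterns has enormously less hypothesis space to search.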

So -- this is quite a valid point, but not at all a new one; for instance, I made the same point in this paper a few years ago (presented at an IEEE conference on Human-Level Intelligence in Singapore, and published in the conference proceedings... and mostly reprinted in my book Engineering General Intelligence as part of the early preliminary material)…

Now, my mathematical formalization of this idea was quite different from Lin and Tegmark’s, since I tend to be more abstract-mathy and computer-sciency than physics-y … what I said formally is:

MIND-WORLD CORRESPONDENCE PRINCIPLE: For an organism with a reasonably high level of intelligence in a certain world, relative to a certain set of goals, the mind-world path transfer function is a goal-weighted approximate functor.

Formalism aside, the basic idea here is that if you have a system that is supposed to achieve a high degree of goal-achievement in a world with a certain habitual structure, then the best way for this system to do so using limited resources is to internally contain structures that are morphic to the habitual structures in the world.
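
For the mathematically curious, the flavor of the formalization can be sketched roughly as follows. This is a simplified schematic rather than the precise statement from the paper, and the notation is just shorthand for this post:

% Schematic only: read W as a category of world-paths, M as a category of
% mind-paths, F as the map sending each world-path to the mind-path that
% models it, d as a metric on mind-paths, and w_g as a weighting of path
% pairs by goal-relevance.
\[
F : \mathcal{W} \to \mathcal{M}
\]
\[
\sum_{(p_1,\, p_2)} w_g(p_1, p_2)\,
  d\big( F(p_2 \circ p_1),\; F(p_2) \circ F(p_1) \big) \;\le\; \epsilon
\]
% That is, F approximately preserves the composition of paths (it is an
% approximate functor), with the approximation required to be tightest on
% the paths that matter most for the system's goals.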

I explicitly introduced the example of hierarchical structure in the world — and pointed out that intelligent systems trying to achieve goals in a hierarchical world will do best, using limited resources, if they internally have a hierarchical structure (in a way that manifests itself specifically in their goal-seeking behavior).

Deep neural networks are an example of a kind of system that manifests hierarchical structure internally in this way.
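
The structural point is almost tautological once you write a deep network down: it is a nested composition of simple transformations, level upon level. A minimal sketch (sizes and weights here are arbitrary placeholders):

import numpy as np

# A deep network is literally a nested composition of simple maps, so its
# internal shape mirrors a hierarchically composed world.
rng = np.random.default_rng(0)
layer_sizes = [64, 32, 16, 8]  # features built up level by level

weights = [rng.standard_normal((m, n)) * 0.1
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    # Each layer assembles its features from the layer below's features,
    # much as molecules are assembled from atoms assembled from particles.
    for W in weights:
        x = np.tanh(x @ W)
    return x

print(forward(rng.standard_normal(64)).shape)  # -> (8,)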

Certainly I am not claiming any sort of priority regarding this general conceptual point, though — I am sure others made that same point way before I did, expressing it in different language...

One also shouldn’t overestimate the importance of this sort of point. Lin and Tegmark point out that "properties such as symmetry, locality, compositionality and polynomial log-probability” come out of the laws of physics, and also are easily encoded into the structure of neural networks. This is all true and good … but of course self-organizing systems add a lot of complexity to the picture, and many patterns in the portion and level of the physical universe that is relevant to us do NOT actually display these properties… which is why simply-structured neural networks like deep neural networks are not actually adequate for AGI....
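
The "easily encoded" half of that claim is simple to verify concretely. Here is a quick illustrative check that a shared-weight local filter, the core ingredient of a convolutional layer, builds in locality and translation symmetry (using periodic boundaries so the symmetry is exact):

import numpy as np

# Circular convolution commutes with cyclic shifts: shifting the input
# just shifts the output. Weight sharing encodes translation symmetry;
# the short kernel encodes locality.
rng = np.random.default_rng(0)
signal = rng.standard_normal(64)
kernel = rng.standard_normal(5)

def circular_conv(x, k):
    k_full = np.zeros_like(x)
    k_full[:len(k)] = k
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(k_full)))

shift = 7
print(np.allclose(np.roll(circular_conv(signal, kernel), shift),
                  circular_conv(np.roll(signal, shift), kernel)))  # True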

Specifically, we may note that current deep neural networks do best at recognizing patterns in sensory data, which makes sense because sensory data (as opposed to stuff that is more explicitly constructed by mind and society) is more transparently and directly structured via “physical law.”

It's cool to see the popular media, and more and more scientists from various disciplines, finally paying attention to these deep and important ideas.... But as more attention comes, we have to ward off oversimplification. Tegmark and Lin are solid thinkers and smart people, and they know it's not as simple as "deep neural nets are the key to intelligence because they reflect aspects of the laws of physics" -- and they may well even know that diverse others have made very similar points to theirs dozens of times over the preceding decades. Let's just remember these are subtle matters, and there is still much to be understood -- and any one special class of algorithms and structures, like deep neural networks, is only going to be one modest part of the AGI picture, conceptually or pragmatically.

Comments:

  1. For the same reason, our most coherent morality will be hierarchical in the orthogonal domains of values and instrumental methods.

  2. Basically, we are trying to build an interpolating function between input and output variables, and we observe that intermediate variables aid the process. Why should some variables not be a function of other variables?

  3. If memory serves, you once questioned, in an interview, the correlation between intelligence and fitness. Many of your arguments have me considering the probability that normative complexity over time brings about greater intelligence as a logical conclusion. Your blog is always a good read. Thanks for sharing.

  4. Excellent reply! Will come back to your blog for my research on VR as a learning tool.

  5. Here is an alternative to deep learning that treats it as an interpolating function:
    least-squares-fe-for-ann
