Saturday, March 22, 2014

Lessons from Deep Mind & Vicarious

Recently we've seen a bunch of Silicon Valley money going into "deep learning" oriented AI startups -- an exciting trend for everyone in the AI field.  Even for those of us who don't particularly aspire to join the Silicon Valley crowd, the symbolic value of these developments is dramatic.   Clearly AI is getting some love once again.

The most recent news is a US$40M investment from Mark Zuckerberg, Elon Musk, Ashton Kutcher and others into Vicarious Systems, a "deep learning computer vision" firm led by Dileep George, who was previously Jeff Hawkins' lead researcher at Numenta.

A couple of months ago, the big story was Google acquiring London deep reinforcement learning firm Deep Mind for something like US$500M. Rumor has it this was largely an "acqui-hire", but with 60 employees or so, that would set the price per employee at roughly US$8M, way above the $1M-$2M value assigned to a Silicon Valley engineer in a typical acqui-hire transaction. Clearly a tightly-knit team of machine learning theory and implementation experts is worth a lot to Google these days, dramatically more than a comparable team of application programmers.

Both of these are good companies led by great researchers, whom I've admired in the past. I've met Deep Mind's primary founder, Demis Hassabis, at a few conferences, and found him to have an excellent vision of AGI, plus a deep knowledge of neuroscience and computing. One of Deep Mind's other founders, Shane Legg, worked for me at Webmind Inc. during 1999-2001. I know Dileep George less well, but we had some interesting conversations last summer in Beijing, when he came at my invitation to speak at the AGI-13 conference.

Vicarious's focus so far has been on visual object recognition -- identifying the objects in a picture. As Dileep described his plans at AGI-13: once they crack object recognition, they will move on to recognizing events in videos. Once they crack that, they will move on to other aspects of intelligence. Dileep, like his mentor Jeff Hawkins, believes that perceptual data processing is the key to general intelligence... and that vision is the paradigm case of human perceptual data processing...

Zuckerberg's investment in Vicarious makes a lot of sense to me. Given Facebook's large trove of pictures and the nature of their business, it seems they would get great value from software that can effectively identify objects in pictures.

Note that Facebook just made a big announcement about the amazing success of their face recognition software, which they saddled with the probably suboptimal name "DeepFace" (a bit Linda Lovelace, no?). If you dig into the research paper behind the press release, you'll see that DeepFace actually uses a standard, well-known "textbook" AI algorithm (convolutional neural nets) -- but they deployed it across a huge amount of data, hence their unprecedented success...

Lessons to Learn?

So what can other AGI entrepreneurs learn from these recent big-$$ infusions to Deep Mind (via acquisition) and Vicarious (by investment)?

The main lesson I take from this is the obvious one: a great, genuinely working demo (not a quasi-faked-up demo like one often sees) goes a long way...

Not long ago Vicarious beat CAPTCHA -- an accomplishment very easy for any Internet user to understand.

Meanwhile, the Deep Mind demo that impressed Larry Page was the ability to beat simple video games via reinforcement learning.

Note that (analogously to IBM Watson), both of these demos involve making the AI meet a challenge that was not defined by the AI makers themselves, but was rather judiciously plucked from the space of challenges posed by the human world....

I.e.: doing something easily visually appreciable, that previously only humans could do...

Clearly Deep Mind and Vicarious did not excel particularly in terms of business model, as compared to many other firms out there...

Other, also fairly obvious, points from these deals are:
  1. For an acquihire-flavored acquisition at a high price, you want a team of engineers in a First World country, who fit the profile of people the acquiring company would want to hire.
  2. Having well-connected, appropriately respected early investors goes a long way. Vicarious and Deep Mind both had Founders Fund investment. Of course FF investment didn't save Halcyon Molecular, so it's no guarantee, but having the right early-stage investors is certainly valuable.


Bubble or Start of a Surge?

And so it goes.  These are interesting times for AI, indeed.    

A cynic could say it's the start of a new AI bubble -- that this wave of hype and money will be followed by disappointment in the meager results obtained by all the effort and expense, and then another "AI winter" will set in.

But I personally don't think so.   Whether or not the Vicarious and Deep Mind teams and technologies pay off big-time for their corporate investors (and I think they do have a decent chance to, given the brilliant people and effective organizations involved), I think the time is now ripe for AI technology to have a big impact on the world. 
DeepFace is going to be valuable for Facebook, just as machine learning and NLP are already valuable to Google in their core search and ads businesses, and will doubtless deliver even more value with the infusion of the Deep Mind team, not to mention Ray Kurzweil's efforts as a Google Director of Engineering.

The love that Silicon Valley tech firms are giving AI is going to help spur many others all around the world to put energy into AI -- including, increasingly, AI projects verging on AGI -- and the results are going to be amazing.


Are Current Deep Learning Methods Enough for AGI?

Another lesson we have learned recently is that contemporary "deep learning" based machine learning algorithms, scaled up on current-day big data and big hardware, can solve a lot of hard problems.

Facebook has now pretty convincingly solved face recognition, via a simple convolutional neural net, dramatically scaled. Self-driving cars are not here yet -- but a self-driving car can, I suspect, be achieved via a narrow-AI integration of various components, without any underlying general intelligence. IBM Watson won at Jeopardy, and a similar approach can likely succeed in other specialized domains like medical diagnosis (which was actually addressed fairly well by simpler expert systems decades ago, even without Watson's capability to extract information from large bodies of text). Vicarious, or others, can probably solve the object recognition problem pretty well, even with a system that doesn't understand much about the objects it's recognizing -- "just" by recognizing patterns in massive image databases.

Machine translation is harder than the areas above, but if one is after translation of newspaper text or similar, I suppose it may ultimately be achievable via statistical ML methods. That said, the rate of improvement of Google Translate has not been that amazing in recent years -- it may have hit a limit in terms of what can be done by these methods. The MT community is looking more at hybrid methods these days.

It would be understandable to conclude from these recent achievements that these statistical machine learning / deep learning algorithms basically have the AI problem solved, and that a focus on different sorts of Artificial General Intelligence architectures is unnecessary.

But such a conclusion would not be correct. It's important to note that all the problems I've just mentioned are ones that have been focused on lately precisely because they can be addressed fairly effectively by narrow-AI statistical machine learning methods on today's big data/hardware...

If you picked other problems like
  • being a bicycle messenger on a crowded New York street
  • writing a newspaper article on a newly developing situation
  • learning a new language based on real-world experience
  • identifying the most meaningful human events, among all the interactions between people in a large crowded room
then you would find that today's statistical / ML methods aren't so useful...

In terms of my own work with OpenCog, my goal is not to outdo CNNs or statistical MT on the particular problems for which they were developed.  The goal is to address general intelligence...

The recent successes of deep learning technology and other machine learning / statistical learning approaches are exciting, in some cases amazing. Yet these technologies address only certain aspects of the broader AI problem.

One hopes that the enthusiasm and resource allocation brought by the successes of these algorithms will cause more attention, excitement and funding to flow into the AI and AGI worlds as a whole, enabling more rapid progress on all the different aspects of the AGI problem.


Unknown said...


There is an air of optimism here. Do you think that this might be a stepping stone toward your "Sputnik moment"?

Dean Horak said...


I agree that for the most part the investments and advances are narrow AI only and will be unlikely to address AGI domain problems. However, I'm not so sure that is the case for Vicarious. While their technology is held close to the vest, an understanding of Dileep's research history and statements made in the years since his split from Numenta gives us some strong clues into what they are doing. As I read it, they are focusing their Recursive Cortical Network on vision initially because the vision system in the brain is one of the best understood and researched functional regions. However, the algorithms developed for this region are likely to be adaptable to other functional regions at some future point. I see this as a plausible approach to AGI - building up various disparate functional attributes, conserving methods as much as possible, extending as needed, until all major aspects are covered.

Additionally, by the time this coverage of higher-level cognitive features is achieved, the narrow AI technologies such as basic speech, handwriting, language and facial recognition will have advanced to the point that all the hard work will have been done, so that the AGI layer can simply be the icing on the cake.

Unknown said...

I like Tory Wright's question. Sputnik moment for AGI?

focus2000x said...

Neither Vicarious (RCN) nor DeepMind/Google (reinforcement
trick) nor OpenCog (DeSTIN) nor Numenta (HTM) nor any other
deep-learning-derivative (be it CNN-based or not) is capable
of solving common vision problems.

None of those systems scale enough to provide a generic
multidimensional correlator of the input data required.

All of them are just glorified low-hanging fruit pickers.

Unknown said...

Facebook also bought Oculus Rift (a VR headset) for $2 billion. Gaming is a huge market and I would love to see game characters in VR that are semi-intelligent!

Anonymous said...

If you had actually bothered to read the DeepFace paper, you would have realised that Facebook did not use convnets. On the other hand, a paper published simultaneously (the same week) by Face++ did use convnets and outperformed DeepFace on the same dataset.

Anonymous said...

All purely "quantitative" approaches (statistical, neural net, deep or otherwise) are not even close to being a paradigm for reasoning, language understanding, knowledge acquisition, not to mention real AI. How can so-called scientists dismiss thousands of years of foundational work in logic? Amazing. A new AI winter indeed... a little knowledge is so dangerous.

Mel said...

Hi, great reading your blog