Friday, December 02, 2005

More Venting about Scientific Narrowmindedness and Superintelligent Guinea Pigs

I spent the day giving a talk about bioinformatics to some smart medical researchers and then meeting with them to discuss their research and how advanced narrow-AI informatics tools could be applied to help with it.

AAARRRGGHHH!!! Amazing how difficult it is to get even clever, motivated, knowledgeable biologists to understand math/CS methods. The techniques I presented to them (a bunch of Biomind stuff) would genuinely help with their research, and are already implemented in stable software -- there's nothing too fanciful here. But the "understanding" barrier is really hard to break through -- and I'm not that bad at explaining things; in fact I've often been told I'm really good at it....

We'll publish a bunch of bioinformatics papers during the next year and eventually, in a few more years, the techniques we're using (analyzing microarray and SNP and clinical data via learning ensembles of classification rules; then data mining these rule ensembles, and clustering genes together based on whether they tend to occur in the same high-accuracy classification rules, etc.) will become accepted by 1% or 5% of biomedical researchers, I suppose. And in 10 years probably it will all be considered commonplace: no one will imagine analyzing genetics data without using such techniques....
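To make the gene-clustering idea concrete, here is a toy sketch in Python. It is not the Biomind code: the rule format (a tuple of gene names plus an accuracy), the thresholds `min_accuracy` and `min_count`, and the use of connected components as the clustering step are all assumptions made for illustration. It counts how often two genes appear together in high-accuracy rules, then groups genes that are linked by frequent co-occurrence.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(rules, min_accuracy=0.8):
    """Count how often each gene pair appears together in a high-accuracy rule.

    `rules` is a list of (genes, accuracy) pairs; this format is hypothetical.
    """
    counts = Counter()
    for genes, accuracy in rules:
        if accuracy < min_accuracy:
            continue  # ignore rules that do not classify well
        for a, b in combinations(sorted(set(genes)), 2):
            counts[(a, b)] += 1
    return counts


def cluster_genes(counts, min_count=2):
    """Group genes into connected components of the co-occurrence graph."""
    neighbors = {}
    for (a, b), n in counts.items():
        if n >= min_count:
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)

    clusters, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first walk over linked genes
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            stack.extend(neighbors[gene] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Made-up rule ensemble, purely for illustration.
rules = [
    (("G1", "G2"), 0.92),
    (("G1", "G2", "G3"), 0.88),
    (("G2", "G3"), 0.85),
    (("G4", "G5"), 0.90),
    (("G4", "G5"), 0.83),
    (("G1", "G5"), 0.55),  # low accuracy, so it is skipped
]
counts = cooccurrence_counts(rules)
clusters = cluster_genes(counts)
print(clusters)
```

A real analysis would presumably weight pairs by rule accuracy and use a proper clustering algorithm; connected components are just the simplest thing that shows the idea.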

Whether Biomind will manage to get rich during this process is a whole other story -- it's well-known that the innovative companies at the early stage of a revolution often lose out financially to companies that enter the game later once all the important ideas have already been developed. But finances aside, I'm confident that eventually, little by little, the approach I'm taking to genetic data analysis will pervade and transform the field, even if the effect is subtle and broad enough that I don't get that much credit for it....

And yet, though this Biomind stuff is complex enough to baffle most bioinformaticists and to be really tough to sell, it's REALLY REALLY SIMPLE compared to the Novamente AI design, which is one or two orders of magnitude subtler. I don't think I'm being egomaniacal when I say that no one else has really appreciated most of the subtlety in the Novamente design -- not even the other members of the Novamente team, many of whom have understood a lot. Which is verrrry different from the situation with Biomind: while the Biomind methods are too deep for most biologists, or most academic journal referees who review our papers, to understand, everyone on the Biomind team fully "gets" the algorithms and ideas.

Whether the subtlety of the Novamente design ever gets to be manifested in reality remains to be determined -- getting funding to pay a small team to build the Novamente system according to the design remains problematic, and I am open to the possibility that it will never happen, dooming me (as I've joked before) to a sort of Babbagedom. What little funding there is for AGI-ish research tends to go to folks who are better at marketing than I am, and who are willing to tell investors the story that there's some kind of simple path to AGI. Well, I don't think there is a simple path. There's at least one complex path (Novamente) and probably many other complex paths as well; and eventually someone will follow one of them if we don't annihilate ourselves first. AGI is very possible with 3-8 years of effort by a small, dedicated, brilliant software team following a good design (like Novamente), but if the world can't even understand relatively simple stuff like Biomind, getting any understanding for something like Novamente is obviously going to continue to be a real uphill battle!

Relatedly, a couple weeks ago I had some long conversations with some potential investors in Novamente. But the investors ended up not making any serious investment offer -- for a variety of reasons, but I think one of them was that the Novamente design was too complex for them to easily grok. If I'd been able to offer them some easily comprehensible apparent path to AGI, I bet they would have invested. Just like it would be easier to sell Biomind to biologists if they could grok the algorithms as well as the Biomind technical team. Urrrghh!

Urrrgghhh!! urrrgghh!! ... Well, I'll keep pushing. There are plenty of investors out there. And the insights keep coming: interestingly, in the last few days a lot of beautiful parallels have emerged between some of our commercial narrow-AI work in computational linguistics and our more fundamental work in AGI (relating to making Novamente learn simple things in the AGI-SIM simulation world). It turns out that there are nice mathematical and conceptual parallels between algorithms for learning semantic rules from corpuses of texts, and the process of learning the functions of physical objects in the world. These parallels tell us a lot about how language learning works -- specifically, about how structures for manipulating language may emerge developmentally from structures for manipulating images of physical objects. This is exactly the sort of thing I want to be thinking about right now: now that the Novamente design is solid (though many details remain to be worked out, these are best worked out in the course of implementation and testing), I need to be thinking about "AGI developmental psychology," about how the learning process can be optimally tuned and tailored. But instead, to pay the bills and send the kids to college yadda yadda yadda, I'm trying to sell vastly simpler algorithms to biologists who don't want to understand why it's not clever to hunt for biomarkers for a complex disease by running an experiment with only 4 Cases and 4 Controls. (Answer: because complex diseases have biomarkers that are combinations of genes or mutations rather than individual genes/mutations, and to learn combinational rules distinguishing one category from another, a larger body of data is needed.)
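The 4-Cases-and-4-Controls point can be illustrated with a back-of-the-envelope calculation. The sketch below is a toy model under loud assumptions: every candidate rule is treated as an independent random binary test that "fires" on a sample with a fixed probability, and the numbers (10,000 genes, AND-rules over gene pairs firing 25% of the time) are invented for illustration. It estimates how many rules would separate cases from controls perfectly by pure chance.

```python
from math import comb


def chance_perfect_fit(p_fire, n_cases, n_controls):
    """Probability that a random rule fires on every case and on no control.

    Assumes the rule fires independently on each sample with probability
    p_fire, which is a simplifying assumption rather than a biological fact.
    """
    return (p_fire ** n_cases) * ((1.0 - p_fire) ** n_controls)


def expected_spurious_pair_rules(n_genes, p_fire, n_cases, n_controls):
    """Expected number of gene-pair rules that fit pure noise perfectly."""
    n_rules = comb(n_genes, 2)  # one candidate rule per unordered gene pair
    return n_rules * chance_perfect_fit(p_fire, n_cases, n_controls)


# Hypothetical setup: 10,000 genes, pair rules of the form "A AND B",
# each of which fires on a quarter of samples if A and B are coin flips.
small_study = expected_spurious_pair_rules(10_000, 0.25, 4, 4)
larger_study = expected_spurious_pair_rules(10_000, 0.25, 40, 40)

print(f"4 vs 4:   {small_study:.3g} spurious perfect pair rules expected")
print(f"40 vs 40: {larger_study:.3g} spurious perfect pair rules expected")
```

Under these assumptions the 4-versus-4 design is expected to yield tens of thousands of pair rules that look perfect on noise alone, so any real combinational biomarker is buried among them, while with 40 versus 40 the expected number of chance fits is effectively zero.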

Ooops! I've been blogging too long. I promised Scheherazade I would go play with her and her guinea pigs. Well, in a way the guinea pigs are a relief after dealing with humans all day ... at least I don't expect them to understand anything. Guinea pigs are really nice. Maybe a superintelligent guinea pig would be the ultimate Friendly AI. I can't remember ever seeing a guinea pig do anything mean, though occasionally they can be a bit fearful and defensive....

3 comments:

Arthur said...

Somebody ought to put some tags on Artificial General Intelligence (Cognitive Technologies), just like with The Singularity Is Near and the AI4U textbook.

Joel said...

There seem to be many wide tracks of research that are open for people who can grok two or more fields and gain insights from combining them. I also think this is where a lot of innovation tends to come from, since truly original ideas are quite rare but many ideas have applications outside of the field in which they originated.

Patrick Dugan said...

You're certainly correct about the Novamente design being very complex, but in a sense you are an artist in your sole integrated understanding of it. As for marketing Novamente and simpler apps, a good exercise is to try to get the gist of the method and function into one easy phrase in the language of your target audience, something like "simulates genetic patterns of epidemiological iterations and tracks emerging problems statistically," which isn't that great actually, but it's a sell line biologists might be able to digest. Likewise, if I were to sell a groundbreaking game to the public entertainment market, I wouldn't talk about the algorithms that actually make it groundbreaking so much as the feel of the experience.

Sometimes it helps to bend the truth into a bite-sized pretzel.