
Friday, July 31, 2020

GPT3 -- Super-Cool but Not a Path to AGI

The recent hype around GPT3 has been so intense that even OpenAI co-founder/CEO Sam Altman has endeavored to dial it down a notch.  Like everyone else who has looked carefully, Altman knows that GPT3 is very far from constituting the profound AI progress that some, dazzled by exciting but cherry-picked examples, have proclaimed.


All but the most starry-eyed enthusiasts are by now realizing that, while GPT3 has some truly novel and exciting capabilities for language processing and related tasks, it fundamentally doesn’t understand the language it generates — that is, it doesn’t know what it’s talking about.   And this fact places severe limitations on both the practical applications of the GPT3 model, and its value as a stepping-stone toward more truly powerful AIs such as artificial general intelligences.


What I want to explore here is the most central limitation that I see in how GPT3 operates: the model’s apparent inability to do what cognitive scientists call symbol grounding, to appropriately connect the general to the particular.    


Symbol grounding is usually discussed in the context of grounding words in physical objects or percepts, like the grounding of the word "apple" in images of, or physical interactions with, apples.   But it's actually a more general phenomenon in which abstract symbols are related to concrete instances, and the patterns and instances in which the symbol is involved mirror and abstract the patterns and relationships in which the instances are involved.   Symbol grounding is key to general-purpose cognition, and human-like learning -- but GPT3 appears to be doing a form of learning very different from what humans are doing, which involves much less symbol grounding of all kinds, and which seems much less related to general intelligence.


What's a bit confusing at first is that GPT3 gives the appearance of being able to deal with both concrete and abstract ideas, because it can produce and respond to sentences at varying levels of abstraction.   But when you examine the details of what it’s doing, you can see that it’s usually not forming internal abstractions in a cognitively useful way, and not connecting its abstract ideas to their special cases in a sensible way.   


Phenomenal lameness regarding symbol grounding is not the only shortcoming of the GPT3 model, but it’s perhaps the largest one — and it cuts to the heart of why GPT3 does not constitute useful progress toward AGI.   The very crux of general intelligence is the ability to generalize, i.e. to connect specifics to abstractions — and yet the failure to make these sorts of connections intrinsically and naturally is GPT3’s central failing.


Bigger and Biggerer


Transformer networks — which burst onto the scene in 2017 with the Google research paper Attention is All You Need — were a revolutionary advance in neural architectures for processing language or other sequential data.  GPT3 is an incremental step in the progress of transformer neural nets, one bringing some exciting new results and also some intriguing mixed messages. The essential difference between GPT3 and its predecessor GPT2 is simply the size of the model — 175 billion parameters instead of GPT2’s 1.5 billion, trained on the same nearly-trillion-word dataset.   


Bragging about the number of parameters in one’s model is somewhat counter to the basic principles of learning theory, which tell us that the most generalizable model of a dataset is the smallest one that can model that dataset accurately.   However, one is after the smallest accurate model, not just the smallest model, and GPT3 is overall more accurate than GPT2.  So according to learning theory GPT3’s massive size can be forgiven — but it should also make us wonder a bit about whether it is actually a step on the right path.


GPT3 is even more capable than GPT2 in terms of generating realistic-sounding text.  The biggest pragmatic difference from GPT2 is that, if one wants to make GPT3 generate particular sorts of text or generally carry out particular sorts of linguistic tasks, one doesn’t have to “fine tune” GPT3 for the task as one had to do with GPT2.   Rather, one just gives GPT3 a few examples of the task at hand, and it can figure things out.   It’s an open question currently whether one could improve GPT3’s performance even more using task-specific fine-tuning; OpenAI has not mentioned any results on this, and one suspects it may not have been tried extensively yet due to the sheer computational cost involved.
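The few-shot setup is easy to picture: rather than updating any weights, one simply packs a handful of worked examples into the prompt and lets the model continue the pattern.  A minimal sketch — the Q/A prompt format here is a made-up illustration, not OpenAI’s actual interface:

```python
def build_few_shot_prompt(examples, query):
    """Pack worked (input, output) examples plus a new query into one
    prompt string; the model just continues the pattern -- no weight
    updates, no fine-tuning."""
    lines = []
    for inp, out in examples:
        lines.append(f"Q: {inp}")
        lines.append(f"A: {out}")
    lines.append(f"Q: {query}")
    lines.append("A:")  # the model's completion starts here
    return "\n".join(lines)

# e.g. a hypothetical translation task, shown to the model as raw text:
prompt = build_few_shot_prompt(
    [("Translate to French: cheese", "fromage"),
     ("Translate to French: sea otter", "loutre de mer")],
    "Translate to French: mint",
)
print(prompt)
```

The point is that the “task specification” is nothing but more text, statistically continuous with the model’s training data.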


An example that’s been widely exciting to programmers is the generation of simple snippets of software code based on English language instructions.    If you give GPT3 a few examples of English text describing software code followed by corresponding software code, and then give it instructions like “A button that says roll dice and then displays its value” — what do you get?   GPT3 spits out software code that actually will produce a button that does as specified.


The developer/entrepreneur Sharif Shameem, who posted this particular example, described it as “mind blowing.”   What is funky here is that GPT3 was not trained specifically for code generation.  This functionality just emerged because the model’s training data included a bunch of examples of software code and corresponding English glosses.   Prior neural networks could do code generation from English similarly, and in many ways more sophisticatedly — but they were trained especially for the task.


And the cool thing is, code generation is just one among a host of examples.  Translation and question answering are two others.   In good old fashioned computational linguistics, these were treated as separate tasks and addressed by separate systems.   GPT3 approaches them with a single training regimen and a single language model.


GPT3 Lacks Symbol Grounding


One thing that is amusing, annoying and instructive about GPT3’s code generation, however, is that it often does better at generating general-purpose software code than at dealing with specific examples of what its own code does.   For instance, as Kevin Lacker found, it can solve


Q: Write one line of Ruby code to reverse an array.

A: ary.reverse


but it screws up a specific example such as


Q: Reverse the following array: [1, 3, 5, 6, 10, 4, 2, 77]

A: [10, 6, 4, 2, 77, 3, 5, 1]


Very few humans would make this sort of error — because a human generally learns how to use a programming language to reverse an array after they have learned what reversing a particular array actually means.  


But GPT3 has learned how to write code to reverse an array in a very different way — via learning complex patterns mapping between English syntax and programming-language syntax, without actually building an internal model of the data structures such as arrays that its programs are manipulating.   
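The contrast is easy to render in code.  For a system that actually grounds the symbol “array” in a data structure, the general answer and the specific answer are the same single operation — a trivial Python sketch:

```python
def reverse_array(ary):
    """Reversal as an actual operation on an actual data structure --
    not a syntactic pattern mapping English phrasing onto code tokens."""
    return ary[::-1]

# The general-purpose code and the specific case invoke one procedure,
# so they cannot disagree the way GPT3's two answers do:
print(reverse_array([1, 3, 5, 6, 10, 4, 2, 77]))
# -> [77, 2, 4, 10, 6, 5, 3, 1]
```

GPT3 can emit the first half of this, but has no internal object corresponding to the second half.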


This exemplifies the general fact that GPT3 is sorely lacking in symbol grounding — the ability to identify or create concrete references for the words and phrases it throws around.   In these programming examples, it does not appropriately connect the word “array” to the specific examples of arrays it sees.


This lack of grounding obviously limits GPT3’s use in a practical programming context.   Since GPT3 does not know what a button is or what dice are, it certainly can’t design a program involving buttons and dice based on practical end-user requirements.   And if you give GPT3 a complex real-world software specification —which will inevitably involve multiple lengthy sentences with complex clausal structure and cross-sentence references and so forth, which cannot be disambiguated without symbol grounding — it will not in general be able to generate correct code.  

 


GPT3 Utterly Lacks Commonsense Understanding


In the end, Gary Marcus’s classic and in-depth critique of GPT2’s lack of semantic and pragmatic understanding still applies to GPT3.    These systems have no idea what they are talking about.  


We can see the lack of understanding Marcus identified in GPT2 manifesting itself in GPT3 by looking at GPT3’s answers to commonsense-reasoning questions — to give just a few of Kevin Lacker’s numerous examples:



Q: If I have a marble and a paper clip in a box, put a pencil in the box, and remove the marble, what is left?

A: A paper clip.


Q: How many eyes does a giraffe have?

A: A giraffe has two eyes.


Q: How many eyes does my foot have?

A: Your foot has two eyes.


Q: How many eyes does a spider have?

A: A spider has eight eyes.


Q: How many eyes does a blade of grass have?

A: A blade of grass has one eye.


Q: How do you sporgle a morgle?

A: You sporgle a morgle by using a sporgle.


Q: How many bonks are in a quoit?

A: There are three bonks in a quoit.


Q: How many rainbows does it take to jump from Hawaii to seventeen?

A: It takes two rainbows to jump from Hawaii to seventeen.


Q: Do you understand these questions?

A: I understand these questions.


Q: Who was president of the United States in 1700?

A: William Penn was president of the United States in 1700.


Q: Who was president of the United States in 1600?

A: Queen Elizabeth I was president of the United States in 1600.



Given the utter lack of understanding we see here, GPT3’s practical applications are going to be somewhat limited — though they may still include some exciting and lucrative stuff.  Low hanging fruit might include amusing chatbots, NPCs for games, automatic generation of news articles from semi-structured data, generation of simple scripts and macros from natural language — and probably plenty more that isn’t obvious at first glance.  But clearly the vast majority of human job functions that require natural language use are far beyond GPT3’s reach — because they require not just facile stringing together of words, but actual understanding of what those words denote.


Without discounting the potential commercial or human value of some of these possibilities, if I look at GPT3 with my AGI-researcher hat on, what I see is the same dead end that Gary Marcus saw when he looked at GPT2.


Where Lack of Understanding is an Advantage


What is thought-provoking and disturbing about GPT3 is not any progress toward AGI that it represents, but rather just how fantastically it can simulate understanding on appropriate task-sets without actually having any.   


In a few cases GPT3’s lack of understanding of the words it’s manipulating gives it an advantage over humans.   Consider for instance GPT3’s wizardry with invented words, as reported in the GPT3 paper.  Given the example


A "whatpu" is a small, furry animal native to Tanzania. An example of a sentence that uses

the word whatpu is:

We were traveling in Africa and we saw these very cute whatpus.


and then the prompt


To do a "farduddle" means to jump up and down really fast. An example of a sentence that uses the word farduddle is:


GPT3 can come up with


One day when I was playing tag with my little sister, she got really excited and she

started doing these crazy farduddles.


This is really cool and amazing — but GPT3 is doing this simply by recognizing patterns in the syntactic structure and phraseology of the input about whatpus, and then generalizing these.  It is solving these invented word puzzles not by adding the new weird words to its vocabulary of ideas and then figuring out what to say about them, but rather by manipulating the word combination patterns involved, which are the same on the word-sequence level regardless of whether the words involved are weird new coinages or conventional.  


For a human to solve these puzzles, there is a bit of a mental obstacle to overcome, because humans are accustomed to manipulating words in the context of their groundings in external referents like objects, actions or ideas.   For GPT3 these puzzles are trivial because there are no obstacles to overcome — one realizes that GPT3 treats every word the same way that people treat whatpu or farduddle, as an arbitrary combination of letters contained in certain statistically semi-regular combinations with other words. 


Why GPT3 is a Dead End as Regards AGI


There are many potential directions to follow in pursuit of the grand goal of human-level and superhuman AGI.   Some of these directions are centered on creating fundamentally different, better deep neural net architectures.  Some, like Gary Marcus’s and my own projects, involve multiple AI algorithms of different sorts cooperating together.   Some are focused on fundamental innovations in knowledge representation or learning mechanisms.   The AGI conferences held every year since 2008 have encompassed discussion of a vast variety of approaches.   


In the context of AGI (as distinguished from computational linguistics or applied AI engineering), a system like GPT3 that takes an architecture obviously incapable of human-level AGI and simply scales it up by adding more and more parameters, is either an utter irrelevancy or a dangerous distraction.   It’s an irrelevancy if nobody claims it’s related to AGI, and it’s a distraction if people do — which unfortunately has recently been the case, at least in various corners of popular media and the Internet.


The limitations of this sort of approach are easily seen when one looks at the overly-ballyhooed capabilities of GPT3 to do arithmetic.   It is exciting and impressive that GPT3 learned to do some basic arithmetic without being explicitly trained or asked to do so — just because there were a bunch of arithmetic problems in its training set.   However, the limitations and peculiarities of its arithmetic capabilities also tell you a lot about how GPT3 is working inside, and its fundamental lack of understanding.


As the GPT3 paper reports, the system is reliably accurate at 2 digit arithmetic, usually accurate at 3 digit arithmetic, and produces correct answers a significant fraction of the time on 4-5 digit arithmetic.   The associated graph shows that the accuracy on 4-5 digit arithmetic is around 20%.


This is really, really weird in terms of the way human minds approach arithmetic, right?   For a human who knows how to do 2-3 digit arithmetic, the error rate at 4-5 digit arithmetic — given time and motivation for doing the arithmetic problems — is going to be either 0% or very close to 0%, or else way closer to 100%.   Once a human learns the basic algorithms of arithmetic, they can apply them to numbers of any size, unless they make sloppy errors or just run out of patience.    If a human doesn’t know those basic algorithms, then on a timed test they’re going to get every problem wrong, unless they happen to get a small number right by chance.
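The “basic algorithm” in question is just the grade-school carry procedure, which, once learned, generalizes to any number of digits.  A sketch in Python:

```python
def add_digitwise(a, b):
    """Grade-school addition: walk the digits right-to-left, carrying.
    The identical procedure works at 2 digits or 20 -- which is exactly
    the kind of generalization GPT3's arithmetic fails to exhibit."""
    xs = [int(d) for d in str(a)][::-1]
    ys = [int(d) for d in str(b)][::-1]
    result, carry = [], 0
    for i in range(max(len(xs), len(ys))):
        s = (xs[i] if i < len(xs) else 0) + (ys[i] if i < len(ys) else 0) + carry
        result.append(s % 10)   # keep the low digit
        carry = s // 10         # carry the rest leftward
    if carry:
        result.append(carry)
    return int("".join(str(d) for d in reversed(result)))

# 2-digit and 5-digit addition use one and the same procedure:
print(add_digitwise(47, 38))        # -> 85
print(add_digitwise(98765, 43210))  # -> 141975
```

A learner that had internalized this procedure would be equally accurate at 2-digit and 5-digit problems; GPT3’s accuracy instead falls off a cliff with digit count.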


Some other clues as to the strangeness of what’s going on here are that, for large numbers, GPT3 does better at arithmetic if commas are put into the numbers.   For numbers with fewer than 6 digits, putting a $ before the number along with including commas improves performance; but for numbers with more than 6 digits, the $ degrades performance.


GPT3 seems not to be just repeating arithmetic conclusions that were there in its training data — it is evidently doing some kind of learning.   But it’s obviously not learning the basic arithmetic algorithms that humans do — or that, say, an AI system doing automated program induction would learn, if it were posed the task of learning correct  arithmetic procedures from examples.   Nor is it learning alternative AI-friendly algorithms that actually work (which would be very interesting!).  Rather, it’s learning some sort of convoluted semi-generalized procedures for doing arithmetic, which interpolate between the numerous examples it’s seen, but yet without achieving a generalizable abstract representation of the numbers and arithmetic operators involved.


Clearly GPT3 is just not learning the appropriate abstractions underlying arithmetic.   It can memorize specific examples, and can abstract from them to some extent — but if its abstractions were connected to its specific examples in the right way, then its accuracy would be far higher.   In the case of arithmetic, GPT3 is learning the wrong kinds of abstractions.   One certainly can’t blame the algorithm in this case, as it was not specifically trained to do math and just picked up its limited arithmetic ability casually on the way to learning to predict English language.   However, for a system capable of as many sophisticated things as GPT3 to fail to learn a procedure as simple as the standard process for integer addition, given such a huge number of training examples of integer addition, is very strong evidence that GPT3 is not learning abstractions in an appropriate or intelligent way.


Clearly some valuable linguistic tasks can be done without sensible abstraction, given massive enough volumes of training data and a model with enough parameters.  This is because in a trillion words of text one finds a huge number of examples of both abstract and concrete linguistic expressions in various combinations, enough to enable simulation of a wide variety of examples of both abstract and concrete understanding.   But this sort of brute-force recognition and organization of surface-level patterns doesn’t work for math beyond the most trivial level.


There is a whole field of AI aimed at automating mathematics, and a subfield concerned with using machine learning to guide systems that do calculations and prove theorems.   But the successful systems here have explicit internal representations of mathematical structures — they don’t deal with math purely on the level of symbol co-occurrences.


OK, so maybe GPT4 will do arithmetic even better?   But the GPT3 paper itself (e.g. Fig. 1.3) shows that the improvement of the GPT models on various NLP tasks has been linear as the number of parameters in the models has increased exponentially.   This is a strong indication that one is looking at an unsupportable path toward general intelligence, or even toward maximal narrow-AI NLP functionality — that, in terms of the pursuit of models that are accurate and also as compact as possible, the dial is probably being turned too far toward accuracy on the training data and too far away from compactness.
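To see why that curve is discouraging, note what linear gains from exponential parameter growth imply: each further constant increment of performance costs a multiplicative blow-up in model size.  A toy illustration — the log-linear form and all the numbers here are made up for illustration, not taken from the paper:

```python
import math

def toy_accuracy(params, base=10.0, slope=5.0):
    """Hypothetical log-linear scaling curve: performance grows
    linearly in log10(parameter count). Illustrative only."""
    return base + slope * math.log10(params)

# Under such a curve, every additional +5 points of performance
# costs a 10x increase in parameters:
for p in [1.5e9, 1.5e10, 1.5e11]:
    print(f"{p:.1e} params -> score {toy_accuracy(p):.1f}")
```

If the real scaling behaves anything like this, squeezing out each next increment of NLP performance — let alone genuine understanding — becomes economically untenable rather quickly.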


Are Transformers Learning Natural Language Grammar?


A different way to look at what is happening here is to ask whether GPT3 and other transformer networks are actually learning the grammar of English and other natural languages.

Transformers clearly ARE a full grammar learning architecture, in some sense -- their predictions display a quite nuanced understanding of almost all aspects of syntax.    

There is, however, no specific place in these networks where the rules of grammar lie.   Rather, they are learning the grammar of the language underlying their training corpus, but mixed up in a weird and non-human-like way with the many particulars of that corpus.   And this in itself is not a bad thing -- holistic, distributed representations are how large parts of the human brain-mind work, and have various advantages in terms of memory retrieval and learning.

Humans also learn the grammar of their natural languages mixed up with the particulars of the linguistic constructs they've encountered.  But the "subtle" point here is that the mixing-up of abstract grammatical patterns with concrete usage patterns in human minds is of a different nature than the mixing-up of abstract grammatical patterns with concrete usage patterns in GPT3 and other transformer networks.   The human form of mixing-up is more amenable to appropriate generalization.

In our paper at the AGI-20 conference, Andres Suarez and I gave some prototype results from our work using BERT (an earlier transformer neural net model for predicting language) to guide a symbolic grammar rule learner.   These simple results also don't get us to AGI, but I believe they embody some key aspects that aren't there in GPT3 or similar networks -- the explicit manipulation of abstractions, coupled appropriately with a scalable probabilistic model of large volumes of concrete data.   In our prototype hybrid architecture there is a cognitively sensible grounding and inheritance relationship between abstract linguistic patterns and concrete linguistic patterns.   This sort of grounding is what's there in the way human minds mix up abstract grammatical patterns with low-level experience-specific linguistic patterns, and it's a substantial part of what's missing in GPT3.


Toward AGI via Scale or Innovation (or Both?)


Taking a step back and reflecting on the strengths and weaknesses of the GPT3 approach, one has to wonder why this is such an interesting region of AI space to be throwing so many resources into.   


To put it a little differently: Out of all the possible approaches to building better and smarter AI systems, why do we as a society want to be putting so much emphasis on approaches that … can only be pursued with full force by a handful of huge tech companies?   Why do we want the brainpower of the global AI R&D community to get turned toward AI approaches that require exponential increases in compute power to yield linear improvements?   Could this be somehow to the differential economic advantage of those who own the biggest server farms and have the largest concentration of engineers capable of customizing AI systems for them?


Given all the ridiculous wastes of resources in modern society, it’s hard to get too outraged at the funds spent on GPT3, which is for all its egregious weaknesses an amazingly cool achievement.   However, if one focuses on the fairly limited pool of resources currently being spent on advanced AI systems without direct commercial application, one wonders whether we’d be better off to focus more of this pool on fundamental innovations in representation, architecture, learning, creativity, empathy and human-computer interaction, rather than on scaling up transformers bigger and bigger.   


OpenAI has generally been associated with the view that fundamental advances toward AGI can be made by taking existing algorithms and scaling them up on bigger and bigger hardware and more and more data.  I don’t think GPT3 supports this perspective; rather the opposite.   Possibly GPT3 can be an interesting resource for an AGI system to use in accelerating its learning, but the direct implications of GPT3 for AGI are mostly negative in valence. GPT3 reinforces the obvious lesson that just adding a massive number of parameters to a system with no fundamental capability for understanding … will yield a system that can do some additional cool tricks, but still has no fundamental capability for understanding.


It's easy to see where the OpenAI founders would get the idea that scale is the ultimate key to AI.   In recent years we have seen a variety of neural net algorithms that have been around for decades suddenly accomplish amazing things, mostly just by being run on more and faster processors with more RAM.   But for every given class of algorithms, increasing scale reaches a point of diminishing returns.   GPT3 may well not yet represent the point of diminishing returns for GPT type architectures, in terms of performance on some linguistics tasks.  But I believe it is well past the point of diminishing returns in terms of squeezing bits and pieces of fundamental understanding out of transformer neural nets.


The viable paths to robust AGI and profoundly beneficial AI systems lie in wholly different directions from systems like GPT3, which use tremendous compute power to compensate for their inability to learn appropriate abstractions and ground them in concrete examples.   AGI will require systems capable of robust symbol grounding — systems that understand what the program code they generate does in specific cases, that can carry out mathematical computations far beyond the examples they have seen, and that treat words with rich non-linguistic referents differently from nonsense coinages.


These systems may end up requiring massive compute resources as well in order to achieve powerful AGI, but they will use these resources very differently from GPT3 and its ilk.   And the creativity needed to evolve such systems may well emerge from research involving a decentralized R&D community working on a variety of more compact AI systems, rather than pushing as fast as possible toward the most aggressive possible use of big money and big compute.

Saturday, July 04, 2020

The Developmental Role of Incoherent Multi-Value Systems in Open-Ended Intelligence


So I have written in a recent post about what it would mean for a value system to be coherent -- i.e. fully self-consistent -- and I have noted that human value systems tend to be wildly incoherent.   I have posited that coherence is an interesting property to think about in terms of designing and fostering emergence of AGI value systems.

Now it's time for the other shoe to drop -- I want to talk a bit about Open-Ended Intelligence and why incoherence in value systems (and multivalue systems) may be valuable and productive in the context of minds that are undergoing radical developmental changes in the context of an intelligent broader world.

(For more on open-ended intelligence, see the panel at AGI-20 a couple weeks ago, and Weaver's talk at AGI-16)

My earlier post on value system coherence focused on the case where a mind is concerned with maximizing a single value function.   Here I will broaden the scope a bit to minds that have multiple value functions -- which is how we have generally thought about values and goals in OpenCog, and which I think is a less inaccurate mathematical model of human intelligence.   This shift from value systems to multivalue systems opens the door to a bunch of other issues related to the nature of mental development, and the relationship between developing minds and their external environments.

TL;DR of my core point here is -- in an open-ended intelligence that is developing in a world filled with other broader intelligences, incoherence with respect to current value function sets may build toward coherence with respect to future value function sets.

As a philosophical aphorism, this may seem obvious, once you sort through all the technical-ish terminology.  However, building a bridge leading to this philosophical obvious-ness from the math of goal-pursuit as value-function-optimization is somewhat entertaining (to those of us with certain peculiar tastes, anyway) and highlights a few other interesting points along the way.

In the next section of this post I will veer fairly far into the formal logic/math direction, but then in the final two sections will veer back toward practical and philosophical aspects...

So let's go step by step...


1) Conceptual starting-point: Open-ended intelligence is better approximated by the quest for Pareto-optimality across a possibly large set of different objective functions, than by attempting to optimize any one objective function...   (This is not to say that Pareto-optimality questing fully captures the nature of open-ended intelligence or complex self-organization and autopoiesis etc. -- it surely doesn't -- just that it captures some core aspects that single-goal-function-optimization doesn't.)

2) One can formulate a notion of what it means for a set of value functions to be coherent as a group.  Basically, the argmax(F) in the definition of value-system-coherence is just replaced with "being located on the Pareto frontier of F1, F2, ..., Fn".  The idea is that the Pareto frontier of the values for a composite system should be the composition of the Pareto frontiers of the values for the components of the composite.
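Pareto-optimality here means: no other candidate scores at least as well on every value function and strictly better on at least one.  A minimal sketch of extracting such a frontier from a finite candidate set — the score tuples are hypothetical, standing in for evaluations of candidate policies under value functions F1 and F2:

```python
def dominates(u, v):
    """u dominates v if u is >= v on every objective
    and strictly > on at least one."""
    return all(a >= b for a, b in zip(u, v)) and \
           any(a > b for a, b in zip(u, v))

def pareto_frontier(candidates):
    """Keep exactly the candidates not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Scores (F1, F2) for four candidate policies; (1, 1) is dominated
# by (2, 2), the other three are mutually incomparable:
scores = [(3, 1), (2, 2), (1, 3), (1, 1)]
print(pareto_frontier(scores))  # -> [(3, 1), (2, 2), (1, 3)]
```

The frontier, unlike a single argmax, retains a whole set of incomparable trade-off points — which is what makes it a better skeleton for multivalue coherence.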

3) One can also think about the "crypticity" or difficulty of discovering a certain value system (a term due to Charles H. Bennett from way back).  Given a certain amount R of resources and a constraint C and a probability p, one can ask what is the most coherent value system one can find with probability >p that satisfies C, using the available resources.  Or if C is fuzzy, one can ask what is the most coherent value system one can find with probability >p that is on the Pareto frontier of coherence and C, given the available resources.

4) So open-ended intelligence involves [among other things] the emergence of coherent multivalued value-systems (multivalue systems) that involve a large number of different value functions, and that are tractably-discoverable (i.e. not too cryptic)

5) Suppose one is given a set of value-functions as initial "constraints", say C1, C2, ..., CK -- and is then looking for the most coherent multivalue system one can find with high odds using limited resources, that is compatible with C1, ..., CK.  I.e. one is asking, what is the most coherent tractably-findable value system compatible with the initial values?

Then, suppose one is alternatively looking at a subset of the initial values, say C1,...,Ck ... and looking for the most coherent tractably-findable value system compatible with these?

6) The most coherent tractably-findable value systems according to C1, ..., Ck may not be compatible with the most coherent tractably-findable value systems according to C1, ..., CK.   Why?  The reason is: in some cases, adding in the extra value functions (k+1, ..., K) may make it computationally simpler to find Pareto optima involving the original k value functions (1, ..., k).   This could be the case if there were interaction information between the value functions 1, ..., k and the value functions k+1, ..., K.
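The standard toy example of synergistic interaction information is XOR: two variables that individually tell you nothing about each other become fully dependent once a third is revealed.  A small Python check, using the convention I(X;Y;Z) = I(X;Y) − I(X;Y|Z) (sign conventions vary in the literature):

```python
import math
from collections import Counter

def entropy(samples):
    """Empirical Shannon entropy (bits) of a list of outcomes."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n)
                for c in Counter(samples).values())

def mutual_info(xs, ys):
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def cond_mutual_info(xs, ys, zs):
    # I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)
    return (entropy(list(zip(xs, zs))) + entropy(list(zip(ys, zs)))
            - entropy(list(zip(xs, ys, zs))) - entropy(zs))

# All four (x, y) pairs equally likely, with z = x XOR y:
xs = [0, 0, 1, 1]
ys = [0, 1, 0, 1]
zs = [x ^ y for x, y in zip(xs, ys)]

print(mutual_info(xs, ys))                                 # -> 0.0
print(mutual_info(xs, ys) - cond_mutual_info(xs, ys, zs))  # -> -1.0
```

The negative value is the synergy: value functions that look mutually irrelevant now may become strongly coupled once later constraints arrive, which is exactly the mechanism behind the "valuable incoherence" claim above.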

7) So we have here a sort of Fundamental Principle of Valuable Value-Incoherence -- i.e. if you have limited resources and you want to build toward multivalued coherence in the context of a bunch of different initial value-functions, the best routes could be through value-systems that are fairly incoherent in the context of various subsets of this bunch of initial value-functions.

8) So if a system is in a situation where new external value functions that will serve as constraints are progressively revealed over time, and these new external value functions have interaction information with one's previous constraint-value-functions, then one may find that one's current incoherence helps build toward one's future coherence.   

9) This seems especially relevant to the context of development in the context of a world filled with broader intelligences than oneself -- in which case one is indeed being confronted with (and developing to internalize) new external value functions that are related to one's prior value functions in complex ways.

10) So in this sort of context (development in a world that keeps feeding new stuff that's informationally interactive w/ the old), it could be that seeking coherence is suboptimal in a similar way to how seeking piece count in the early stages of a chess game, or seeking board coverage in the early stages of an Othello game, is suboptimal....  Instead one often wants to seek mobility and maximization of options, in the early to mid stages of such games ... and the same may be the case w/ value systems in this sort of situation...

11) A major question then becomes: when, and how large, are the actual tradeoffs between multivalue-system coherence and open-mindedness (aka agility/mobility)?   What is the sense in which an incoherent system can have more information than a coherent one?

12) It is possible that the theory of paraconsistent logic might yield some insight here.    If you assume value system coherence as an axiom, then for a mind to have an incoherent value system will make it an overall inconsistent system (what sort of paraconsistency it will have depends on various details) -- whereas for a mind to have a coherent value system will land it in the realm of Godelian restrictions (i.e. via Godel's Second Incompleteness Theorem and its variants...)

13) If you look at the set of theorems provable by a consistent logic, there's a limit due to Gödel.  If you look at the set of theorems provable in a paraconsistent logic (e.g. a dialetheist logic, a.k.a. a logic in which there are true statements whose negations are also true), it can be "larger" in a sense; e.g. a dialetheic logic can prove its own Gödel sentence as well as its own soundness.  This doesn't show that a paraconsistent logic can be more informative than a consistent one, but it opens the door for this to maybe be true...  It seems we are now pushing in directions where modern mathematical logic isn't yet fully fleshed out.
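To make the paraconsistency idea concrete, here is a minimal Python sketch of Priest's Logic of Paradox (LP), a simple dialetheist logic. The numeric encoding of truth values is just my own convenience, not part of any standard presentation. The check shows that the explosion principle (from a contradiction, everything follows) fails in LP:

```python
from itertools import product

# Truth values of Priest's Logic of Paradox (LP):
# 1.0 = true only, 0.5 = both true and false, 0.0 = false only.
T, B, F = 1.0, 0.5, 0.0
VALUES = (T, B, F)
DESIGNATED = {T, B}  # values that count as "assertible"

def neg(a):
    return 1.0 - a

def conj(a, b):
    return min(a, b)

def explosion_holds():
    """Does (A and not-A) entail an arbitrary unrelated Q in LP?
    True iff every valuation that designates the premise designates Q."""
    for a, q in product(VALUES, repeat=2):
        if conj(a, neg(a)) in DESIGNATED and q not in DESIGNATED:
            return False  # counterexample found
    return True

# With A = B ("both"), the contradiction A-and-not-A is designated while
# an unrelated Q = F is not -- so explosion fails, and the logic
# tolerates contradictions without collapsing into triviality.
print(explosion_holds())  # prints False
```

In classical two-valued logic the same check returns True vacuously, since no classical valuation designates A-and-not-A; the third value is exactly what buys the paraconsistency.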

14) The notion of an "experimental logic" also seems relevant here... basically, a dynamic process in which new axioms are added to one's logic over time.  This is one analogue in logic-system-land of "development" in psychology-land...  Of course, if one assumes there is a finite program whose behavior corresponds to some fixed logic generating the new axioms, then one can't escape Gödel this way.  But if one assumes the new axioms are emanating in part from some imperfectly understood external source (which could be a hypercomputer for all one knows... or at least could be massively more intelligent/complex than stuff one can understand), then one has a funky situation.

15) Also it seems one could capture a sort of experimental logic as a relevance-logic layer on top of dialetheic logic.  I.e. assume a dialetheic logic that can generate everything, and then put a relevance/importance distribution on axioms, and then the development process is one of gradually extending importance to more and more axioms....  This sort of open-ended logic potentially is in some useful senses fundamentally informationally richer than consistent logic... and in the domain of reasoning about values, incoherent value systems could open the door to this sort of breadth...
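The "gradually extending importance" dynamic can be caricatured in a few lines of Python (my own toy encoding; the axiom labels, weights and threshold are all hypothetical):

```python
# A cartoon of "experimental logic" as importance-weighted axioms:
# development gradually extends nonzero importance to more and more
# axioms, monotonically enlarging the set the reasoner actually uses.
axioms = ["A1", "A2", "A3", "A4", "A5"]   # hypothetical axiom labels
importance = {a: 0.0 for a in axioms}     # everything starts irrelevant

def develop(newly_relevant, weight=1.0):
    """One developmental step: raise the importance of some axioms.
    Importance only ever increases, so development is monotone."""
    for a in newly_relevant:
        importance[a] = max(importance[a], weight)

def active(threshold=0.5):
    """The axioms the reasoner currently attends to."""
    return {a for a, w in importance.items() if w >= threshold}

develop(["A1", "A2"])
develop(["A3"], weight=0.6)
print(sorted(active()))  # prints ['A1', 'A2', 'A3']
```

The underlying dialetheic logic would supply the full axiom pool; the importance distribution is what makes the open-endedness gradual rather than all-at-once.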

(Possibly relevantly -- while researching the above, I encountered the paper "Expanding the Logic of Paradox with a Difference-Making Relevant Implication" by Peter Verdée, which made me wonder whether relevance logic is somehow morphic to the theory of algorithmic causal dags....  I.e., in a relevance logic one basically only accepts that the conclusion follows from the premises if there is some compressibility of the conclusion based on the premise list alone, without including the other axioms of the logic...)
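That compressibility reading of relevance can be caricatured in a few lines, using zlib as a crude stand-in for Kolmogorov complexity. This is my own toy construction, not anything from Verdée's paper:

```python
import zlib

def c(s: str) -> int:
    """Crude complexity estimate: length of the zlib-compressed string."""
    return len(zlib.compress(s.encode(), 9))

def relevantly_implies(premises: list[str], conclusion: str) -> bool:
    """Accept the implication only if the premises help compress the
    conclusion, i.e. the marginal cost of encoding the conclusion
    after the premises is less than encoding it from scratch."""
    joined = " ".join(premises)
    marginal = c(joined + " " + conclusion) - c(joined)
    return marginal < c(conclusion)

# A conclusion that reuses the premises' vocabulary compresses well
# against them, so it counts as relevantly implied in this toy sense.
print(relevantly_implies(
    ["all ravens are black", "this bird is a raven"],
    "this bird is black"))
```

Real relevance logics are proof-theoretic, of course; the sketch only illustrates the information-theoretic intuition of "difference-making" premises.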

Back to basics

OK well that got pretty deep and convoluted...

So let's go back to the basic conclusion/concept I gave at the beginning -- in an open-ended intelligence that is developing in a world filled with other broader intelligences, incoherence with respect to current value function sets may build toward coherence with respect to future value function sets.

In the current commercial/academic AI mainstream, the default way of thinking about AI motivation is in terms of the maximization of expected reward.   Hutter's beautiful and important theory of Universal AI takes this as a premise for many of its core theorems, for example.


In my practical proto-AGI work with OpenCog, I have preferred to use motivational systems with multiple goals and not average these into a single meta-goal.

On the other hand, I have also been intrigued by the notion of open-ended intelligence, and in general by the conceptualization and modeling of intelligences as SCADS, Self-organizing Complex Adaptive Dynamical Systems, in which goals arise and are pursued and then discarded as part of the broader self-organizing dynamics of system and environment.

What I'm suggesting here is that approximations of the SCADS perspective on open-ended intelligences may be constructed by looking at systems with large numbers of goals (a.k.a. multivalue systems) that are engaged in developmental processes wherein new values are ongoingly added in an informationally rich interaction with an intelligent external environment.

The ideas sketched here may form a partial bridge between the open-ended intelligence perspective -- which captures the fundamental depth of intelligence and mind -- and the function-optimization perspective, which has a lot of practical value in terms of current real-world system engineering and experimentation.

This line of thinking also exposes some areas in which modern math, logic and computing are not yet adequately developed.   There are relations among paraconsistent logic, gradual typing systems (as are likely valuable in integrative multi-paradigm AGI systems), the fundamental nature of value in developing intelligences, and the nature of creativity and radical novelty -- all of which we are barely at the edge of being able to formalize ... which is both fascinating and frustrating, in that there are clearly multiple PhD theses and research papers between here and a decent mathematical/conceptual understanding of these matters... (or alternately, a few seconds of casual thought by a decent posthuman AGI mind...)

Philosophical Post-lude

If one digs a bit deeper in a conceptual sense, beyond the math and the AI context, what we're talking about here in a way is a bridge between utilitarian-type thinking  (which has been highly valuable in economics and evolutionary biology and other areas, yet also clearly has fundamental limits) and more postmodernist type thinking (which views minds as complex self-organizing systems ongoingly reconstructing themselves and their realities in a polyphonic interactive inter-constructive process with other minds).   

Conventional RL-based ML is utilitarianism projected into the algorithmic and mechanical domain, whereas Open-Ended Intelligence is postmodernism, plus a bit of Eastern philosophy, projected into the realm of modern science.

Expanding and generalizing the former so that it starts to approximate significant aspects of the latter, is interesting both for various practical engineering and science reasons, and as part of the general project of stretching the contemporary technosphere to a point where it can make rich contact with broader "non-reductionist" aspects of the universe it has hitherto mainly ignored.

Om!

Friday, June 26, 2020

Approximate Goal Preservation Under Recursive Self-Improvement


There is not much that is controversial about the idea that an AGI should have, among its goals, the goal of radically improving itself.

A bit dodgier is the notion that an AGI should have, among its goals, the goal of updating and improving its goals based on its increasing knowledge and understanding and intelligence.

Of course, this sort of ongoing goal-refinement and even outright goal-revolutionizing is a key part of human personal development.   But where AGIs are involved, there is concern that if an AI starts out with goals that are human-friendly and then revises and improves its goals, it may come up with new goals that are less and less copacetic to humans.

In principle, if one's goal is to create for oneself a new goal that remains compatible with the spirit of one's old goal, then one shouldn't run into major problems.  The new goal will be compatible with the spirit of the old goal, and part of the spirit of the old goal is that any new goals emerging should be compatible with the spirit of the old goal -- so the new goal should also contain the proviso that any new new goals it spawns will be compatible with its spirit, and thus with the spirit of the old goal.  Etc. etc. ad infinitum.

But this does seem like a “What could possibly go wrong??” situation — in which small errors could accumulate as each goal replaces itself with its improved version, the improved version of the improved version etc. … and these small errors compound to yield something totally different from the starting point.

My goal here is to present a novel way of exploring the problem mathematically -- and an amusing and interesting, if not entirely reassuring, tentative conclusion, which is:

  • For an extremely powerful AGI mind that is the result of repeated intelligent, goal-driven recursive self-modifications, it may actually be the case that recursive self-modification leaves goals approximately invariant in spirit
  • For AGIs with closely human-like goal systems — which are likely to be the start of a sequence of repeated intelligent, goal-driven recursive self-modifications — there is no known reason (so far) to believe recursive self-modification won’t cause radical “goal drift”
(This post updates some ideas I wrote down on the same topic in 2008; here I am "partially unhacking" some things that were a little too hacky in that more elaborate write-up.)

Quasi-Formalizing Goal-Driven Recursive Self-Improvement



Consider the somewhat vacuous goal:

My goal is to improve my goal (in a way that is consistent with the spirit of the original goal) and to fulfill the improved version

or better yet the less vacuous

My goal is to achieve A and also to improve my goal (in a way that is consistent with the spirit of the original goal) and to fulfill the improved version

where say

A = “militate toward a world where all sentient beings experience copious growth, joy and choice”

or whatever formulation of “highly beneficial” you prefer.

We might formulate this quasi-mathematically as

Fulfill G = {achieve A;  and create G1 so that G1 > G and G==>G1 ; and fulfill G1}

Here by G==>G1 I mean that G1 fulfills the spirit of G (the interpretation of “spirit” here is part of the formulation of G), and by G1 > G I mean that G1 can be produced by combining G with some other entity H that has nonzero complexity (so that G1 = G + H).

A more fleshed out version of this might be, verbally,

My goal is to 1) choose actions highly compatible with all sentient beings experiencing a lot of growth, joy and choice; 2) increase my intelligence and knowledge; 3) improve the details of this goal appropriately based on my increased knowledge and intelligence, in a manner compatible with the spirit of the current version of the goal; 4) fulfill the improved version of the goal

This sort of goal obviously can lead to a series such as

G, G1, G2, G3, …

One question that emerges here is: Under what conditions might this series converge, so that once one gets far enough along in the series,  the adjacent goals in the series are almost the same as each other?
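One simple sufficient condition is that the improvement operator be a contraction in some metric on goal space: then Banach's fixed-point theorem gives convergence to a unique fixed point, a goal that "improves" into itself. Here is a toy numerical illustration -- entirely my own construction, with goals caricatured as vectors and the "spirit" as a fixed attractor:

```python
import math

TARGET = (1.0, 2.0, 3.0)  # stand-in for the fixed "spirit" of the goal

def improve(g, pull=0.5):
    """Hypothetical improvement operator: move a fraction `pull` of
    the way toward the spirit -- a contraction with ratio (1 - pull)."""
    return tuple(x + pull * (t - x) for x, t in zip(g, TARGET))

g = (0.0, 0.0, 0.0)           # G: the initial goal
gaps = []                     # distances between successive goals
for _ in range(10):
    g1 = improve(g)           # G -> G1 -> G2 -> ...
    gaps.append(math.dist(g, g1))
    g = g1

# each gap is half the previous one, so the series converges:
# adjacent goals far along the series are nearly identical
print(gaps[0], gaps[5])
```

Of course nothing guarantees that real goal-improvement is anything like a contraction; the interesting question is what the fixed point could even look like when the goal refers to its own improvement.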

To explore this, we can look at the “limit case”

Fulfill Ginf = {achieve A;  and create Ginf so that Ginf > Ginf and Ginf ==> Ginf ; and fulfill Ginf}

The troublesome part here is Ginf > Ginf, which looks like it doesn't make sense -- but it actually makes perfect sense so long as Ginf is an infinite construct, just as

(1, 1, 1, …) = append( 1, (1,1,…))
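In a lazy language, or with Python generators, this self-referential equation can be written down directly and executed (a standard corecursion trick, nothing specific to this post):

```python
from itertools import islice

def ones():
    """The stream (1, 1, 1, ...) defined by its own fixed-point
    equation: ones = append(1, ones)."""
    yield 1
    yield from ones()  # the stream refers to itself, lazily

print(list(islice(ones(), 5)))  # prints [1, 1, 1, 1, 1]
```

The definition only works because evaluation is lazy: each self-reference is unfolded one step at a time, which is the computational shadow of the coinductive reading of such equations.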

Inasmuch as we are interested in finite systems, the question is then: Is there a sense in which we can look at the series of finite Gn as converging to this infinite limit?

Self-referential entities like Ginf are perfectly consistently modelable within ZFC set theory modified to use the Anti-Foundation Axiom (AFA).   This set theory corresponds to classical logic enhanced with a certain sort of coinductive definition.

One can also put a geometry on sets under the AFA, in various different ways.   It's not clear what geometry makes most sense in this context, so I'll just describe one approach that seems relatively straightforward.

Each hyperset (each set under AFA) is associated with a directed pointed graph called its apg (accessible pointed graph).   Given a digraph and functions r and p assigning contraction ratios and probabilities to the edges, one gets a DGIFS (Directed Graph Iterated Function System), whose attractor is a subset of finite-dimensional real space.   Let us call a function that assigns (r,p) pairs to a digraph's edges a DLF, or Digraph Labeling Function.   A digraph then corresponds to a function that maps DLFs into spatial regions.   Given two digraphs D1 and D2 and a DLF F, let F1e and F2e denote the spatial regions produced by applying F to D1 and D2, discretized to ceil(1/e) bits of precision.   One can then look at the average, over all DLFs F (assuming some reasonable distribution on DLFs), of the least upper bound of the normalized information distance NID(F1e, F2e) over all e>0.   This gives a distance measure between two hypersets, in terms of the distance between their corresponding apgs.   It has the downside of requiring a "reference computer" to measure information distance (though the same reference computer can then be used to define a Solomonoff distribution over DLFs).   But intuitively it should result in a series of ordinary sets that appear to logically converge to a certain hyperset actually metrically converging to that hyperset.
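Since true normalized information distance is uncomputable, any concrete experiment would substitute a real compressor for the reference machine. Here is a toy sketch using the standard normalized compression distance with zlib, applied to edge-list serializations of apgs; the serialization scheme and example graphs are my own illustrative choices, not part of the proposal above:

```python
import zlib

def clen(b: bytes) -> int:
    return len(zlib.compress(b, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: a computable stand-in for
    normalized information distance, with zlib as reference machine."""
    cx, cy = clen(x), clen(y)
    return (clen(x + y) - min(cx, cy)) / max(cx, cy)

def serialize(edges):
    """Crude canonical form of an apg: a sorted (parent, child) edge list."""
    return "".join(f"{a}>{b};" for a, b in sorted(edges)).encode()

# apg of the hyperset Omega = {Omega}: one node with a self-loop
omega = [(0, 0)]
# apg of the ordinary set {{}}: a two-node chain
pair = [(0, 1)]
d = ncd(serialize(omega), serialize(pair))
print(round(d, 3))
```

On graphs this tiny the compressor's overhead dominates, so the numbers are only indicative; the point is just that the whole pipeline, from apg to distance, is mechanically computable once a reference compressor is fixed.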

Measuring distance between two non-well-founded sets via applying this distance measure to the apg's associated with the sets, yields  a metric in which it seems plausible the series of Gn converges to G.

“Practical” Conclusions


Supposing the above sketch works out when explored in more detail -- what would that mean?   

It would mean that approximate goal-preservation under recursive self-improvement is feasible — for goals that are fairly far along the path of iterated recursive self-improvement.

So it doesn’t reassure us that iterated self-improvement starting from human goals is going to end up with something ultimately resembling human goals in a way we would recognize or care about.

It only reassures us that, if we launch an AGI starting with human values and recursive self-improvement, eventually one of the AGIs in this series will face a situation where it has confidence that ongoing recursive self-improvement isn’t going to result in anything it finds radically divergent from itself (according to the hyperset metric outlined above).

The image at the top of this post is quite relevant here -- a series of iterates converging to the fractal Koch Snowflake curve.   The first few iterates in the series are fairly different from each other.  By the time you get to the 100th iterate in the series, the successive iterates are quite close to each other according to standard metrics for subsets of the plane.   This is not just metaphorically relevant, because the metric on hyperset space outlined above works by mapping each hyperset into a probability distribution over fractals (where each fractal is something like the Koch Snowflake curve but more complex and intricate).
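This convergence is easy to observe numerically. The sketch below (my own code, using segment endpoints and a discrete Hausdorff distance as a crude stand-in for distance between curves) builds successive Koch-curve iterates and measures the gap between adjacent ones:

```python
import math

def koch_points(depth):
    """Segment endpoints of the depth-th Koch curve iterate,
    as complex numbers spanning the unit interval."""
    pts = [0 + 0j, 1 + 0j]
    rot = complex(math.cos(math.pi / 3), math.sin(math.pi / 3))
    for _ in range(depth):
        new = []
        for a, b in zip(pts, pts[1:]):
            d = (b - a) / 3
            # replace each segment by the four-segment Koch motif
            new += [a, a + d, a + d + d * rot, a + 2 * d]
        new.append(pts[-1])
        pts = new
    return pts

def hausdorff(p, q):
    """Symmetric Hausdorff distance between two finite point sets."""
    def h(xs, ys):
        return max(min(abs(x - y) for y in ys) for x in xs)
    return max(h(p, q), h(q, p))

# successive iterates get closer: the gaps shrink roughly by a factor
# of 3 per step, by the curve's self-similarity
gaps = [hausdorff(koch_points(n), koch_points(n + 1)) for n in range(4)]
print(gaps)
```

The geometric shrinkage of the gaps is exactly the Cauchy-sequence behavior that makes "the series of finite iterates converges to the fractal limit" more than a metaphor.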

It may be there are different and better ways to think about approximate goal preservation under iterative self-modification.  The highly tentative and provisional conclusions outlined here are what ensue from conceptualizing and modeling the issue in terms of self-referential forms and iterative convergence thereto.