Friday, April 10, 2020

Can We "Discover" Semantic Primitives for Commonsense and Math via Semantic Relation Extraction from Corpora?

Once more wild-ish train of thought completely unrelated to anything immediately practical … I was thinking about Chalmers’ idea from Constructing the World  that the notion of universal semantic primitives underlying all human concepts might be rendered sensible by use of intensional logic … i.e. extensionally reducing all concepts to combinations of a few dozen primitives [plus raw perception/action primitives] is doomed to fail (as shown by eons of pedantic pickery in the analytical philosophy literature) but doing the reduction intensionally seems to basically work….

But in his book he argues why this is the case and gives lots of examples but doesn’t fully perform the reduction as that’s too big a job (there are a lot of concepts to intensionally reduce…)

So it occurred to me if we managed to do decent semantic-relation-extraction from large NL corpora, then if Chalmers is right, there would be a set of a few dozen concepts such that doing intensional-logic operations to combine them (plus perception/action primitives) would yield close approximations (small Intensional Difference) from any given concepts

In vector embedding space, it might mean that any concept can be expressed fairly closely via a combination of the embedding vectors from a few dozen concepts, using combinatory operators like vector sum and pointwise min …

As I recall it the intensional-combination operators used in Chalmer’s philosophical arguments don’t involve so much advanced quantifier-munging so basic fuzzy-propositional-logic operators might do it…

Now if we cross-correlate this with Lakoff and Nunez’s thoughts in “Where  Mathematics Comes From?” — where they argue that  math theorem proving is done largely by unconscious analogy to reasoning about everyday physical situations — then we get the idea that morphisms from common-sense domains to abstract domains guide math theorem-proving, and that these potential generators of the algebra of commonsense concepts, can be mapped into abstract math-patterns (e.g. math-domain-independent proof strategies/tactics) that serve as generators of proofs for human-friendly mathematics….

Which led me to wonder if one could form an interesting corpus from videos of math profs going thru proofs online at the whiteboard.  One would then capture the verbal explanations along with proofs, hopefully capturing some of the commonsense intuitions/analogies behind the proof steps… from such a corpus one might be able to mine some of the correspondences Lakoff and Nunez wrote about….

There won’t be a seq2seq model mapping mathematicians’ mutterings into full Mizar proofs, but there could be useful guidance for pruning theorem-prover activity in models of the conceptual flow in mathematician’s proof-accompanying verbalizations....

Can we direct proofs from premises to conclusions, via drawing a vector V pointing from the embedding vector for the premise to the embedding vector for the conclusion, and using say the midpoint of V as a subgoal for getting from premise to the conclusion ... and where the basis for the vector space is the primitive mathematical concepts that are the Lakoff-and-Nunez-ian morphic image of primitive everyday-human-world concepts?

Alas making this sort of thing work is 8 billion times harder than conceptualizing it.   But conceptualization is a start ;)