Amara Angelica pointed me to an article in IEEE Spectrum titled "MoNETA: A Mind Made from Memristors."

Fascinating indeed!

I'm often skeptical of hardware projects hyped as AI projects, but truth be told, I find this one an extremely exciting and promising project.

I think memristor technology is amazing and may well play a part in the coming AGI revolution.

Creating emulations of human brain microarchitecture is one fascinating application of memristors, though not the only one and not necessarily the most exciting one. Memristors can also be used to build many other AI architectures that are not closely modeled on the human brain.

[For instance, one could implement a semantic network or an OpenCog-style AtomSpace (weighted labeled hypergraph) via memristors, where each node in the network has both memory and processor resident in it ... this is a massively parallel network implemented via memristors, but the nodes in the network aren't anything like neurons...]
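
As a rough illustration of the bracketed idea above, here is a minimal Python sketch of a weighted labeled hypergraph in which every node carries its own state and its own update rule. This is not OpenCog's actual AtomSpace API, and it says nothing about real memristor hardware; the class names (`Node`, `Link`, `HyperGraph`), the decay constant, and the activation-spreading rule are all assumptions invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node that keeps its own state and its own update rule (hypothetical)."""
    name: str
    activation: float = 0.0
    inbox: float = 0.0  # input accumulated during the current tick

    def step(self, decay: float = 0.5) -> None:
        # Purely local update: uses only values stored in this node.
        self.activation = decay * self.activation + self.inbox
        self.inbox = 0.0


@dataclass
class Link:
    """A weighted, labeled hyperedge joining any number of nodes."""
    label: str
    weight: float
    members: tuple


@dataclass
class HyperGraph:
    nodes: dict = field(default_factory=dict)
    links: list = field(default_factory=list)

    def add_node(self, name, activation=0.0):
        self.nodes[name] = Node(name, activation)

    def add_link(self, label, weight, *members):
        self.links.append(Link(label, weight, members))

    def tick(self):
        # Phase 1: each link delivers weighted activation to its members.
        for link in self.links:
            for target in link.members:
                total = sum(self.nodes[m].activation
                            for m in link.members if m != target)
                self.nodes[target].inbox += link.weight * total
        # Phase 2: every node updates independently of all the others.
        for node in self.nodes.values():
            node.step()


g = HyperGraph()
g.add_node("cat", activation=1.0)
g.add_node("animal")
g.add_node("pet")
g.add_link("inherits", 0.9, "cat", "animal")        # ordinary two-node link
g.add_link("context", 0.5, "cat", "animal", "pet")  # three-node hyperedge
g.tick()
print({name: round(node.activation, 3) for name, node in g.nodes.items()})
```

The per-node `step` calls in phase 2 touch only local state, so in principle they could all run at once. That is the property a fabric with memory and processor resident in each node would exploit, and nothing in it resembles a neuron.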

And, though the memristors-for-AGI theme excites me, this other part of the article leaves me a bit more skeptical:

> By the middle of next year, our researchers will be working with thousands of candidate animats at once, all with slight variations in their brain architectures. Playing intelligent designers, we'll cull the best ones from the bunch and keep tweaking them until they unquestionably master tasks like the water maze and other, progressively harder experiments. We'll watch each of these simulated animats interacting with its environment and evolving like a natural organism. We expect to eventually find the "cocktail" of brain areas and connections that achieves autonomous intelligent behavior.

I think the stated research program places too much emphasis on brain microarchitecture and not enough on higher-level cognitive architecture. The idea that a good cognitive architecture can be made to emerge via some simple artificial-life type experiments seems very naive to me. I suspect that, even with the power of memristors, designing a workable cognitive architecture is going to be a significant enterprise. And I also think that many existing cognitive architectures, like my own OpenCog or Stan Franklin's LIDA or Hawkins' or Arel's deep learning architectures, could be implemented on a memristor fabric without changing their underlying concepts, high-level algorithms or dataflow.

So: memristors for AI, yay!

But: memristors as enablers of a simplistic Alife approach to AGI ... well, I don't think so.

## Tuesday, November 23, 2010

### The Psi Debate Continues (Goertzel on Wagenmakers et al on Bem on precognition)

A few weeks ago I wrote an article for H+ Magazine about the exciting precognition results obtained by Daryl Bem at Cornell University.

Recently, some psi skeptics (Wagenmakers et al) have written a technical article disputing the validity of Bem's analyses of his data.

In this blog post I'll give my reaction to the Wagenmakers et al (WM from here on) paper.

It's a frustrating paper, because it makes some valid points -- yet it also confuses the matter by inappropriately accusing Bem of committing "fallacies" and by arguing that the authors' preconceptions against psi should be used to bias the data analysis.

The paper makes 3 key points, which I will quote in the form summarized here and then respond to one by one.

POINT 1

> Bem has published his own research methodology and encourages the formulation of hypotheses after data analysis. This form of post-hoc analysis makes it very difficult to determine accurate statistical significance. It also explains why Bem offers specific hypotheses that seem odd a priori, such as erotic images having a greater precognitive effect. Constructing hypotheses from the same data range used to test those hypotheses is a classic example of the Texas sharpshooter fallacy

MY RESPONSE

As WM note in their paper, this is actually how science is ordinarily done; Bem is just being honest and direct about it. Scientists typically run many exploratory experiments before finding the ones with results interesting enough to publish.

It's a meaningful point, and a reminder that science as typically practiced does not match some of the more naive notions of "scientific methodology". But it would also be impossibly cumbersome and expensive to follow the naive notion of scientific methodology and avoid exploratory work altogether, in psi or any other domain.

Ultimately this complaint against Bem's results is just another version of the "file drawer effect" hypothesis, which has been analyzed in great detail in the psi literature via meta-analyses across many experiments. The file drawer effect argument seems somewhat compelling when you look at a single experiment-set like Bem's, and becomes much less compelling when you look across the scope of all psi experiments reported, because the conclusion becomes that you'd need a huge number of carefully-run, unreported experiments to explain the total body of data.
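
To make the file-drawer arithmetic concrete, here is a minimal Python sketch of one standard way to count the hidden experiments that would be needed: Rosenthal's "fail-safe N", which asks how many unreported studies averaging zero effect it would take to drag a Stouffer-combined z-score below one-tailed significance. The z-scores below are made-up numbers chosen only to show the scaling, not real psi data, and the function names are my own.

```python
import math
from statistics import NormalDist


def stouffer_z(z_scores):
    """Combined z-score for a set of independent studies (Stouffer's method)."""
    return sum(z_scores) / math.sqrt(len(z_scores))


def fail_safe_n(z_scores, alpha=0.05):
    """Rosenthal's fail-safe N: the number of unpublished null-result studies
    needed to pull the combined z below the one-tailed critical value."""
    z_crit = NormalDist().inv_cdf(1 - alpha)  # about 1.645 for alpha = 0.05
    return max(0.0, (sum(z_scores) / z_crit) ** 2 - len(z_scores))


# Hypothetical inputs, for illustration only.
single_set = [2.0, 1.8, 2.2]   # one small set of experiments
large_body = [1.5] * 50        # many modestly positive experiments

for label, zs in (("single set", single_set), ("large body", large_body)):
    print(label, "combined z =", round(stouffer_z(zs), 2),
          "fail-safe N =", round(fail_safe_n(zs)))
```

Because the fail-safe N grows with the square of the summed z-scores, a handful of hidden null results can undo one small experiment-set, while a large body of modestly positive studies requires a file drawer many times bigger than the published literature.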

BTW, the finding that erotic pictures give a stronger precognitive response than other random pictures doesn't seem terribly surprising, given the large role that sexuality plays in human psychology and evolution. If the finding were that pictures of cheese give more precognitive response than anything else, that would be more strange and surprising to me.

POINT 2

> The paper uses the fallacy of the transposed conditional to make the case for psi powers. Essentially mixing up the difference between the probability of data given a hypothesis versus the probability of a hypothesis given data.

MY RESPONSE

This is a pretty silly criticism, much less worthy than the other points raised in the WM paper. Basically, when you read the discussion backing up this claim, the authors are saying that one should take into account the low a priori probability of psi in analyzing the data. OK, well ... one could just as well argue for taking into account the high a priori probability of psi given the results of prior meta-analyses or anecdotal reports of psi. Blehh.

Using the term "fallacy" here makes it seem, to people who just skim the WM paper or read only the abstract, as if Bem made some basic reasoning mistake. Yet when you actually read the WM paper, that is not what is being claimed. Rather they admit that he is following ordinary scientific methodology.
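
For readers who want the distinction spelled out, here is a small Python sketch of Bayes' rule showing how the probability of the data given the hypothesis turns into the probability of the hypothesis given the data, and how much that conversion depends on the prior one starts from. The likelihoods and priors are hypothetical numbers picked for illustration; they are not taken from Bem's data or from the WM paper.

```python
def posterior(prior_h, p_data_given_h, p_data_given_not_h):
    """P(H | data) from P(data | H), P(data | not H) and the prior P(H)."""
    evidence = prior_h * p_data_given_h + (1 - prior_h) * p_data_given_not_h
    return prior_h * p_data_given_h / evidence


# Hypothetical: the observed data are 20 times more probable if H is true.
P_DATA_GIVEN_H = 0.20
P_DATA_GIVEN_NOT_H = 0.01

for prior in (1e-6, 0.01, 0.5):
    post = posterior(prior, P_DATA_GIVEN_H, P_DATA_GIVEN_NOT_H)
    print(f"prior P(H) = {prior:g} -> posterior P(H | data) = {post:.6f}")
```

The same data move a skeptic's tiny prior only a little and an agnostic's prior a lot, so the real disagreement is over which prior is reasonable, not over a reasoning error.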

POINT 3

> Wagenmakers' analysis of the data using a Bayesian t-test removes the significant effects claimed by Bem.

This is the most worthwhile point raised in the Wagenmakers et al paper.

Using a different sort of statistical test than Bem used, they re-analyze Bem's data and they find that, while the results are positive, they are not positive enough to pass the level of "statistical significance." They conclude that a somewhat larger sample size would be needed to conclude statistical significance using the test they used.

The question then becomes why to choose one statistical test over another. Indeed, it's common scientific practice to choose a statistical test that makes one's results appear significant, rather than others that do not. This is not peculiar to psi research; it's simply how science is typically done.

Near the end of their paper, WM point out that Bem's methodology is quite typical of scientific psychology research, and in fact more rigorous than most psychology papers published in good journals. What they don't note, but could have, is that the same sort of methodology is used in pretty much every area of science.

They then make a series of suggestions regarding how psi research should be conducted, which would indeed increase the rigor of the research, but which a) are not followed in any branch of science, and b) would make psi research sufficiently cumbersome and expensive as to be almost impossible to conduct.

I didn't dig into the statistics deeply enough to assess the appropriateness of the particular test that WM applied (leading to their conclusion that Bem's results don't show statistical significance, for most of his experiments).

However, I am quite sure that if one applied this same Bayesian t-test to a meta-analysis over the large body of published psi experiments, one would get highly significant results. But then WM would likely raise other issues with the meta-analysis (e.g. the file drawer effect again).

Conclusion

I'll be curious to see the next part of the discussion, in which a psi-friendly statistician like Jessica Utts (or a statistician with no bias on the matter, but unbiased individuals seem very hard to come by where psi is concerned) discusses the appropriateness of WM's re-analysis of the data.

But until then, let's be clear on what WM have done. Basically, they've:

- raised the tired old, oft-refuted spectre of the file drawer effect, using different verbiage than usual
- argued that one should analyze psi data using an a priori bias against it (and accused Bem of "fallacious" reasoning for not doing so)
- pointed out that if one uses a different statistical test than Bem did [though not questioning the validity of the statistical test Bem did use], one finds that his results, while positive, fall below the standard of statistical significance in most of his experiments

The practical consequence of their last point is that, if Bem's same experiments were done again with the same sort of results as obtained so far, then eventually a sufficient sample size would be accumulated to demonstrate significance according to WM's suggested test.
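
Here is a minimal Python sketch of that sample-size effect. To be clear about the substitution: this is not the Bayesian t-test WM used. It is a much simpler stand-in, a Bayes factor for binary hit/miss data comparing chance (hit probability exactly 0.5) against an alternative with a uniform prior on the hit probability. The 53% hit rate and the sample sizes are hypothetical values chosen for illustration.

```python
import math


def log_binom_pmf_fair(n, k):
    """log P(k hits in n trials) when the hit probability is exactly 0.5."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose - n * math.log(2)


def bayes_factor_10(n, k):
    """Bayes factor for H1 (hit probability uniform on [0, 1]) over H0 (p = 0.5).
    Under the uniform prior the marginal likelihood of any k is 1 / (n + 1)."""
    log_m1 = -math.log(n + 1)
    return math.exp(log_m1 - log_binom_pmf_fair(n, k))


HIT_RATE = 0.53  # hypothetical observed hit rate, held fixed as n grows

for n in (1000, 5000, 10000):
    k = round(HIT_RATE * n)
    print(f"n = {n:>6}, hits = {k:>5}, BF10 = {bayes_factor_10(n, k):.3g}")
```

With these made-up numbers the Bayes factor favors chance at 1,000 trials and favors the alternative overwhelmingly at 10,000 trials, even though the observed hit rate never changes.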

So when you peel away the rhetoric, what the WM critique really comes down to is: "Yes, his results look positive, but to pass the stricter statistical tests we suggest, one would need a larger sample size."

Of course, there is plenty of arbitrariness in our conventional criteria of significance anyway -- why do we like .05 so much, instead of .03 or .07?

So I really don't see too much meat in WM's criticism. Everyone wants to see replications of the experiments anyway, and no real invalidity in Bem's experiments, results or analyses was demonstrated.... The point made is merely that a stricter measure of significance would render these results (and an awful lot of other scientific results) insignificant until replication on a larger sample size was demonstrated. Which is an OK point -- but I'm still sorta curious to see a more careful, less obviously biased analysis of which is the best significance test to use in this case.
