Comment on "A Filmed experiment on telephone telepathy with the Nolan Sisters"
Written by Florent Tournus
Monday, 11 September 2006 15:08
Note: this article has been submitted to the Journal of the Society for Psychical Research (JSPR) for publication. It may therefore be modified as a result of the review process. In this article, written in English, Florent Tournus, of the Observatoire Zététique, offers a reasoned critique of the work of Rupert Sheldrake, the well-known English parapsychologist.

In a recent publication entitled “A Filmed experiment on telephone telepathy with the Nolan Sisters”, R. Sheldrake et al. (Sheldrake et al., 2004) report an experiment conducted with the Nolan Sisters in 2003 during a television show, designed to test the hypothesis of telepathy by telephone. On the basis of their statistical analysis of a small number of trials, the authors conclude “that the results support the hypothesis of telepathic communication”.

Let us briefly remind the reader of the experimental procedure described in the original publication (Sheldrake et al., 2004). Colleen, one of the five Nolan sisters, had to guess, just before picking up the phone, which of her four sisters was calling her. The experiment was filmed and, according to the authors, any “normal sensory clues” were ruled out. Twelve successive trials were planned, separated by five-minute intervals, the calling sister for each trial being randomly selected by the throw of a die. Colleen correctly guessed who was calling in 6 of the 12 trials, leading the authors to the above conclusion. This above-chance result was claimed to be significant at the p=0.05 level, and consequently the null hypothesis (i.e. that the results were obtained by chance) was rejected in favor of the highly “extraordinary” and controversial phenomenon of telepathy.

However, the exact binomial test is to be preferred here over other statistical tests, since under the null hypothesis the number of correct guesses follows exactly the well-known binomial law (12 independent trials, each with a chance success probability of 1/4, given four possible callers). With this test, the correct p value corresponding to a 6/12 score is in fact p=0.0544.
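
As a quick check of the figure above, the exact one-tailed binomial p value can be computed directly. The following is a minimal Python sketch, not taken from the original paper; the function name `binomial_tail` is a hypothetical helper, and the chance probability of 1/4 is an assumption that follows from there being four equally likely callers.

```python
from math import comb

def binomial_tail(successes: int, trials: int, p_chance: float) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p_chance) (one-tailed)."""
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance)**(trials - k)
        for k in range(successes, trials + 1)
    )

# Assumption: four equally likely callers, so chance success is 1/4 per trial.
p_value = binomial_tail(6, 12, 0.25)
print(f"{p_value:.4f}")  # 0.0544
```
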
Since p=0.0544 is strictly greater than 0.05, it is incorrect to conclude that “the results were significant at the p=0.05 level” and it is misleading to talk about “Colleen Nolan's above-chance success rate”.

Moreover, the authors are mistaken when they write “The positive result could simply have arisen by chance, but the odds against this explanation are 20:1”. The first point is that this value is incorrect, for two reasons. In probability theory, a probability p corresponds to odds of x:1, meaning x to 1 against, with x=(1-p)/p. Therefore, the rounded p value (p=0.05) given in the article would correspond to 19:1 and not to 20:1. In addition, the true p value must be used, which in fact gives odds against the chance explanation of about 17:1. Rejecting the null hypothesis because the odds against it are 20:1 is already “risky” (though it seems to be the standard in parapsychology...), but with odds of 17:1 it is even more venturesome!

The second point, which is more conceptual, is that the p value of a statistical test is not the probability that the null hypothesis (i.e. chance) is true. The p value gives the probability of obtaining, under the null hypothesis (by chance), a score greater than or equal to the one observed (for a one-tailed test, which is the case here). This is the only interpretation that can be made, even if, of course, the smaller the p value, the more reasonable it is to think that the null hypothesis is not the correct explanation.

Furthermore, Sheldrake et al. write in their article that “on two occasions Colleen picked up the phone before making her guess, contrary to her instructions”. This is clearly a violation of the experimental procedure and the authors agree that “these two trials should be excluded from the total”. Nevertheless, they do not reconsider the statistical analysis, arguing that with 10 or 12 trials the success rate remains the same: 5 out of 10 instead of 6 out of 12.
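
The conversion from a probability to odds against, x=(1-p)/p, discussed above can be sketched in a few lines of Python. The helper name `odds_against` is hypothetical, and 0.0544 is simply the p value quoted in the text for the 6/12 score.

```python
def odds_against(p: float) -> float:
    """Return x such that the odds against an event of probability p are x:1."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    return (1 - p) / p

print(round(odds_against(0.05)))    # 19, not 20
print(round(odds_against(0.0544)))  # 17
```

Note that these odds only restate the tail probability under chance; as argued above, they are not the odds that chance is the true explanation.
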
Using this “spurious argument” of an unchanged success rate, the authors stick to their conclusion that “the results were significant at the p=0.05 level” and do not abandon expressions like “positive result” or “above chance success rate”, as they should have done. It is indeed obvious that the same success rate can be significant for a given number of trials while being non-significant for another (smaller) number of trials! In this case, although a score of 6/12 was almost significant (p just over 0.05), a score of 5/10, corresponding to the same success rate, is far from being significant at the 0.05 level. Calculations using the exact binomial test show that for such a result the p value is p=0.0781. This value would have given odds against the chance explanation of 12:1 (leaving aside the fact that this interpretation of the p value is erroneous, as discussed above).

This shows that, with a more careful statistical analysis, Sheldrake et al. should have come to a conclusion of this kind, stated in a more rigorous way: “the odds against the fact that chance would give such a high score for this experiment are 12:1”. The reader could then have made up his or her own mind about the soundness of the conclusion that “the results support the hypothesis of a telepathic communication”.
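
To illustrate why an identical success rate does not imply identical significance, the sketch below compares the exact one-tailed p values for 6/12 and 5/10 in Python, using exact rational arithmetic. The helper `binomial_tail` and the constant `P_CHANCE` are hypothetical names, and the chance probability of 1/4 assumes four equally likely callers.

```python
from fractions import Fraction
from math import comb

def binomial_tail(successes: int, trials: int, p_chance: Fraction) -> Fraction:
    """Exact P(X >= successes) for X ~ Binomial(trials, p_chance)."""
    q = 1 - p_chance
    return sum(
        (comb(trials, k) * p_chance**k * q**(trials - k)
         for k in range(successes, trials + 1)),
        Fraction(0),
    )

P_CHANCE = Fraction(1, 4)  # assumption: four equally likely callers

for successes, trials in [(6, 12), (5, 10)]:
    p = binomial_tail(successes, trials, P_CHANCE)
    odds = (1 - p) / p  # odds against, expressed as x:1
    print(f"{successes}/{trials}: p = {float(p):.4f}, "
          f"odds against = {float(odds):.1f}:1")
# 6/12: p = 0.0544, odds against = 17.4:1
# 5/10: p = 0.0781, odds against = 11.8:1
```

Both scores correspond to a 50% success rate, yet dropping the two invalid trials moves the p value further above the 0.05 threshold.
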
