I was recently asked to co-guest-edit a special issue of Frontiers in Cognition on “failures to replicate.” I liked the idea of a special issue. I just didn’t think it had the right angle. What if someone had “successfully” replicated a study, would they not be allowed to submit? I was worried this would create a kind of reverse file drawer problem. Only if the replication was unsuccessful was it a candidate for publication. Others have expressed the same concern.
If you think about it, it makes sense. Nonreplications are in a superficial sense more informative than replications. Replications are like someone in the desert yelling: “Look over there, an oasis” followed by someone else yelling “Yes, I see it too.” A nonreplication is like the second person yelling “No, that’s not an oasis, it’s a mirage.”
At a deeper level, however, both are informative. The replication gives us greater confidence in the presence of an oasis. After all, how can we stake our lives on a single dehydrated member of our crew of explorers? The nonreplication decreases our confidence in the presence of an oasis and helps us avoid potentially wasting our resources (or even lives). Still, nonreplications seem sexier than replications. I fell for this myself when, in a previous post, I said “The most interesting effect occurred…” referring to the one nonreplication in the paper.
So how do we eliminate this inherent bias toward nonreplication? The highly useful Psychfiledrawer site lists replication attempts in psychology. Right now, there are about twice as many nonreplications as replications listed, but it is still early in the game, so this is hardly conclusive evidence of a nonreplication bias. On the contrary, curiously, the site reports a successful replication of Bem’s work on precognition (as well as an unsuccessful one). Moreover, we really have no idea what percentage of findings will replicate. The Reproducibility Project will give us an estimate for the 2008 volumes of three different journals.
Still, there is a way to avoid bias and that is to use pre-registration. The steps required are nicely outlined here. Researchers register their replication attempt beforehand. They indicate why it is important to replicate a certain study, they perform power analyses, and they specify the research plan. This proposal is reviewed and, if it checks out, the paper is provisionally accepted, regardless of the results. Provisionally accepted studies are then carried out and the results are included in the paper. The full paper is reviewed again to make sure the authors have delivered what they promised, that the methods are sound, and that the discussion is fair. The outcome of the experiment plays no role at any point in the evaluation process.
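As an aside, the power-analysis step mentioned above has a simple quantitative core: given the effect size reported in the original study, how many participants does the replication need to have a good chance of detecting it? Here is a minimal sketch in Python, assuming a two-sample design and using the standard normal approximation for the required sample size per group; the helper names `norm_ppf` and `n_per_group` are my own, not from any proposal.

```python
import math

def norm_ppf(p):
    """Inverse of the standard normal CDF, found by bisection on math.erf."""
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = (lo + hi) / 2
        # Standard normal CDF at mid
        if 0.5 * (1 + math.erf(mid / math.sqrt(2))) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Sample size per group for a two-sample comparison (normal approximation).

    effect_size is Cohen's d from the original study; alpha is the two-sided
    significance level; power is the desired probability of detecting d.
    """
    z_alpha = norm_ppf(1 - alpha / 2)
    z_beta = norm_ppf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# To detect a medium effect (d = 0.5) with 80% power:
print(n_per_group(0.5))  # ~63 per group under this approximation
```

The normal approximation slightly underestimates the exact t-test requirement (which would give about 64 per group here), but it makes the logic of the step concrete: smaller original effects demand considerably larger replication samples.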
The editors of Frontiers in Cognition liked our plan and so we are going to go ahead with it. I will provide more information and a call for proposals in my next post.
To close off with an anecdote, here is the labyrinthine route toward nonreplication that we once took. We discussed an interesting paper outside of our research area during a lab meeting. We developed ideas on how to tweak the paradigm described in the paper for our own studies on language.
Our first experiment, titled “Object 1” (maybe we had the precognition that this was the first in a series) was an abysmal failure. Not a failure to replicate—we weren’t even trying to replicate—just a bad experiment. Object 2 was not much better and then we realized we should probably move closer to the original experiment. This is what we did in successive steps in Object 3 through Object 12. By now we were pretty close to the original experiment. Object 13 was our final attempt: a very close replication. Again no effect. We gave up. Apparently, this paradigm was beyond our capabilities.
I discussed our failed attempts with a colleague at a conference. He said he had also failed repeatedly to get the effect and had then contacted the author (which we should have done as well, of course). He found out there was a critical aspect of the manipulation that was not mentioned in the paper. With this component in place, the effect proved reproducible.
The authors can be faulted for not including this component in the paper. It wasted a lot of our time, our colleague’s time, and probably that of many other people. But maybe the authors had simply forgotten to mention this critical detail, or they were not aware of its critical role. This just goes to show that no detail is too trivial to mention in a method section.
There is another point, and maybe it doesn’t reflect well on us. We went about it bass ackwards. Rather than taking the paradigm and running with it, we should have sat down and tried an exact replication of the original finding first— Object 1 in this alternate universe. If we hadn’t been able to replicate the original finding, there probably would not have been alternate Objects 2 through 13, and we would have had a lot of alternate time to run other experiments.