The Many Labs enterprise is on a roll. This week, a manuscript reporting Many Labs 3 materialized on the already invaluable Open Science Framework. The manuscript reports a large-scale investigation, involving 20 American and Canadian research teams, into the “end-of-semester effect.”
The lore among researchers is that subjects run at the end of the semester provide useless data. Effects that are found at the beginning of the semester somehow disappear or become smaller at the end. Often this is attributed to the notion that less-motivated/less-intelligent students procrastinate and postpone participation in experiments until the very last moment. Many Labs 3 notes that there is very little empirical evidence pertaining to the end-of-semester effect.
To address this shortcoming in the literature, Many Labs 3 set out to conduct 10 replications of known effects to examine the end-of-semester effect. Each experiment was performed twice by each of the 20 participating teams: once at the beginning of the semester and once at the end of the semester, each time with different subjects, of course.
It must have been a disappointment to the researchers involved that only 3 of the 10 effects replicated (maybe more about this in a later post), but Many Labs 3 remained undeterred and went ahead to examine the evidence for an end-of-semester effect. Long story short, there was none. Or in the words of the researchers:
It is possible that there are some conditions under which the time of semester impacts observed effects. However, it is unknown whether that impact is ever big enough to be meaningful.
This made me wonder about the reasons for expecting an end-of-semester effect in the first place. Isn’t this just a fallacy born out of research practices that most of us now frown upon: running small samples, shelving studies with null effects, and optional stopping?
New projects are usually started at the beginning of a semester. Suppose the first (underpowered) study produces a significant effect. There are several possible explanations for this:
(1) the effect is genuine;
(2) the researchers stopped when the effect was significant;
(3) the researchers massaged the data such that the effect was significant;
(4) it was a lucky shot;
(5) any combination of the above.
[Figure: How the end-of-semester effect might come about]
But what if the first study does not produce a significant effect? The authors probably conclude that the idea is not worth pursuing after all, shelve the study, and move on to a new idea. If it’s still early in the semester, they could run a study to test the new idea and the process might repeat itself.
Now suppose the first study did produce a significant effect, and the researchers run a follow-up study later in the semester. Let’s assume this second study yields a null effect, certainly not a remote possibility. At this juncture, the authors are the proud owners of a Study 1 with an effect but are saddled with a Study 2 without an effect. How did they get this lemon? Well, of course because of those good-for-nothing numbskulled students who wait until the end of the semester before signing up for an experiment! And thus the “end-of-semester fallacy” is born.
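To make the argument concrete, here is a rough simulation sketch of the scenario. It is my own illustration, not anything from Many Labs 3, and every number in it is an assumption: a small true effect (`true_d=0.2`), 20 subjects per group, and a z-test with known standard deviation standing in for the usual t-test. It models only one of the practices mentioned above, shelving null results; optional stopping and data massaging are left out. The key point is that the early and late studies draw from exactly the same population, so there is no real end-of-semester effect in the simulated world.

```python
import random
import statistics


def observed_effect(rng, true_d, n):
    """Simulate one two-group study and return the observed mean difference (SD = 1)."""
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treatment = [rng.gauss(true_d, 1.0) for _ in range(n)]
    return statistics.fmean(treatment) - statistics.fmean(control)


def is_significant(diff, n, z_crit=1.96):
    """Two-sided z-test with known SD = 1 (a simplification of the usual t-test)."""
    standard_error = (2.0 / n) ** 0.5
    return abs(diff) / standard_error > z_crit


def simulate(true_d=0.2, n=20, labs=20000, seed=1):
    """Hypothetical labs run an early study and follow up only if it 'worked'."""
    rng = random.Random(seed)
    study1, study2 = [], []
    for _ in range(labs):
        early = observed_effect(rng, true_d, n)
        if not (early > 0 and is_significant(early, n)):
            continue  # null result: shelved, no follow-up study
        # The follow-up samples the same population with the same true effect.
        late = observed_effect(rng, true_d, n)
        study1.append(early)
        study2.append(late)
    replicated = sum(d > 0 and is_significant(d, n) for d in study2) / len(study2)
    return statistics.fmean(study1), statistics.fmean(study2), replicated


early_mean, late_mean, replication_rate = simulate()
print(f"mean effect in Study 1 (significant early studies): {early_mean:.2f}")
print(f"mean effect in Study 2 (late follow-ups):           {late_mean:.2f}")
print(f"share of follow-ups that are significant:           {replication_rate:.2f}")
```

Under these assumptions, the early studies that survive the significance filter overestimate the effect, while the follow-ups hover around the true value and mostly fail to reach significance. The apparent decline from early to late in the semester comes entirely from selecting on significance, not from the students.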