I've already said a lot about the theorizing and experimenting in this type of priming research, so this time I want to keep it simple and concentrate on something that is currently under fire in the literature, even on the pages of Psychological Science itself: the p-value.
As the abstract indicates, there are four experiments. In each experiment, the key prediction is that exposure to the "time prime" causes people to cheat less, and each prediction is evaluated on the basis of a p-value. In Experiment 1, the prediction was that subjects would cheat less in the time-prime condition than in the control condition. (There was also a money-prime condition, but this was not germane to the key hypothesis.) I've highlighted the key result.
In Experiment 2, the key hypothesis was that if priming time decreases cheating by making people reflect on who they are, then cheating would not differ between participants primed with money and those primed with time when the task itself already induced self-reflection. Participants who were instead told that the game was a test of intelligence would show the same effect observed in Experiment 1. So the authors predicted an interaction between reflection condition (reflection vs. no reflection) and type of prime (time vs. money). Here are the results.
In Experiment 3, the authors manipulated self-reflection in a literal way: subjects either were or were not seated in front of a mirror, and this manipulation was crossed with prime condition (money vs. time). Again, the key prediction involved an interaction.
Finally, in Experiment 4 the three priming conditions of Experiment 1 were used (money, time, control), which produced the following results.
So we have four experiments, each with its key prediction supported by a p-value between .04 and .05. How likely are these results?
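To see why this pattern is surprising, here is a small simulation of my own (not from the paper, and using a simplified z-test rather than the authors' actual analyses). Under the null hypothesis, p-values are uniformly distributed, so any single study lands in the narrow .04–.05 window only 1% of the time; under a real effect, p-values instead pile up near zero:

```python
# Back-of-the-envelope sketch: distribution of p-values under the null
# (effect = 0) vs. a real effect (effect = 0.5), using a two-sample
# z-test with known sigma = 1 and n = 40 per group. These effect and
# sample sizes are illustrative assumptions, not values from the paper.
import math
import random

random.seed(1)

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def simulate_p(effect, n=40, reps=20000):
    """Two-sided p-values for the difference of two group means.
    The mean difference is Normal(effect, sqrt(2/n)) when sigma = 1."""
    se = math.sqrt(2 / n)
    ps = []
    for _ in range(reps):
        z = random.gauss(effect, se) / se
        ps.append(2 * (1 - phi(abs(z))))
    return ps

for d in (0.0, 0.5):
    p = simulate_p(d)
    in_window = sum(.04 < x < .05 for x in p) / len(p)
    sig = [x for x in p if x < .05]
    very_low = sum(x < .01 for x in sig) / len(sig)
    print(f"effect = {d}: P(.04 < p < .05) ~ {in_window:.3f}, "
          f"P(p < .01 | p < .05) ~ {very_low:.2f}")
```

The point of the sketch: when an effect is real, most significant p-values are well below .01, so four results crowding the .04–.05 boundary is exactly the opposite of what true effects tend to produce.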
This question can be answered with a method developed by Simonsohn, Simmons, and Nelson (in press). To quote from the abstract: "Because scientists tend to report only studies (publication bias) or analyses (p-hacking) that 'work', readers must ask, 'Are these effects true, or do they merely reflect selective reporting?' We introduce p-curve as a way to answer this question. P-curve is the distribution of statistically significant p-values for a set of studies (ps < .05)."
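The core logic of p-curve can be sketched in a few lines. This is my own simplification of one of the diagnostics from the Simonsohn, Simmons, and Nelson paper (a binomial test on the significant p-values), not their actual app, and the four p-values below are placeholders standing in for the paper's reported values, which all fall between .04 and .05:

```python
# Minimal p-curve-style diagnostic: under a uniform (null) p-curve,
# half of the significant p-values should fall below .025. True effects
# produce a right-skewed curve (most ps below .025); p-hacking tends to
# produce a left-skewed curve (ps bunched just under .05).
from math import comb

def pcurve_binomial(ps):
    """One-sided binomial test for too FEW low p-values among the
    significant ones: P(X <= observed) under Binomial(n, 0.5)."""
    sig = [p for p in ps if p < .05]
    low = sum(p < .025 for p in sig)
    n = len(sig)
    return sum(comb(n, k) for k in range(low + 1)) / 2 ** n

# Four illustrative p-values in the .04-.05 range (placeholders):
print(pcurve_binomial([.041, .044, .047, .049]))  # -> 0.0625
```

With four p-values and none below .025, the binomial probability is (1/2)^4 = .0625: even four studies are enough to make a flat-to-left-skewed p-curve look suspicious, which is the intuition behind the app's verdict discussed below.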
Simonsohn and colleagues have developed a web app that makes it very easy to compute p-curves. I used that app to compute the p-curve for the four experiments, using the p-values for the key hypotheses.
So if I did everything correctly, the app concludes that the experiments in this study had no evidential value and were intensely p-hacked.
It is somewhat ironic that the second author of the Psych Science paper and the first author of the p-curve paper are at the same institution. This illustrates the methodological flux our field is currently in: radically different views of what constitutes evidence co-exist within institutions and journals (e.g., Psychological Science).