Wednesday, May 7, 2014

Are we Intuitively Cooperative (or are we Moving the Goalposts)?

Are we an intuitively cooperative species? A study published a few years ago in Nature suggests that our initial inclination is indeed to cooperate with others; we are selfish only when we are allowed to reflect.

How did the researchers obtain these (perhaps counterintuitive) results? Subjects were given an amount of money and had to decide how much of it, if any, they wanted to contribute to a common project. They were told that they were collaborating on this project with three other, unknown players whose contributions were not known to them, and that each of the four players would receive a bonus calculated as follows: (additional money – own contribution) + 2*(sum of the contributions)/4.

So you get the highest personal payoff by being selfish and contributing nothing to the common good, regardless of the total contribution of the other three players. A random half of the subjects were required to decide on the amount of their contribution within 10 seconds, whereas the other half had to think and reflect for at least 10 seconds before making their contribution.
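The dominance of the selfish choice is easier to see in code. Here is a minimal sketch, under one reading of the formula in which "additional money" is the subject's endowment; the $0.40 endowment, the function name, and the example contributions are all illustrative assumptions, not values from the paper:

```python
# Hypothetical sketch of the payoff scheme quoted above. "Additional money"
# is read here as the subject's endowment; the $0.40 value and all example
# contributions are made-up illustrations, not data from the study.

def payoff(own, others, endowment=0.40):
    """Bonus for one player: (endowment - own contribution)
    plus 2 * (sum of all four contributions) / 4."""
    return (endowment - own) + 2 * (own + sum(others)) / 4

# Contributing nothing maximizes your personal payoff no matter what the
# other three players do: each cent you put in comes back to you as only
# half a cent (2/4 of it).
others = [0.40, 0.20, 0.00]   # the three other players' contributions
print(payoff(0.00, others))   # keep everything for yourself
print(payoff(0.40, others))   # contribute everything
assert payoff(0.00, others) > payoff(0.40, others)
```

Contributing an amount x lowers the first term by x but raises the second by only x/2, so keeping everything always wins individually, even though the group as a whole is best off when everyone contributes fully.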

The experiments showed an intuitive-cooperation effect. The mean contribution was significantly larger in the intuition condition than in the reflection condition. Hence the conclusion that we are selfish when given the opportunity to deliberate but cooperative when responding intuitively.

Enter my colleagues Peter Verkoeijen and Samantha Bouwmeester. (I wrote about another study by them in a previous post. Basically, the story is this: I have to walk past their office several times a day on my way to the coffee machine, and when they have a paper coming out they won’t let me pass unless I promise to write a blog post about them.) They were surprised by these findings and decided to replicate them. They conducted several experiments but found no support for the intuitive-cooperation effect.

What did they do find? First of all, it turned out that only 10% of the subjects understood the payoff scheme. (Did you understand it right away?) This makes an interpretation of the original findings difficult. How can we say anything meaningful when the vast majority of subjects misunderstand the experiment?

"Wait a minute!" you might say. Perhaps the original study was run with a different subject pool. This is not the case, however. One of the two original experiments that found the effect was run on Mechanical Turk. The replication attempts by my colleagues were also run on Mechanical Turk.

Verkoeijen and Bouwmeester were unable to find evidence for intuitive cooperation in several experiments, even ones that were very close to the originals. An initial version of their manuscript was reviewed, and an anonymous reviewer pointed out that the authors of the original paper, David Rand and his colleagues, had in the meantime coincidentally conducted studies in which they, too, were unable to replicate their own finding.

Rand and his colleagues had an interesting explanation for this. Mechanical Turk subjects have become familiar with this type of experiment and now will no longer act naively. The entire pool of subjects is now contaminated. There is no hope of finding the intuitive cooperation effect ever again in that crowdsourcing version of Chernobyl. Fortunately, the effect is still there if naïve subjects are used because the effect is moderated by naïveté.

To address the Chernobyl criticism, my colleagues conducted additional experiments. However, they found no evidence for the newfangled naïveté hypothesis. Turkers who classified themselves as not having participated in public-goods experiments before (they were told prior participation would not preclude them from getting paid this time around as well) showed no intuitive cooperation effect.

An anonymous reviewer of the second version of Verkoeijen and Bouwmeester’s manuscript moved the goalposts a little further. The reviewer (was it the same one as before?) claimed that it is likely that the Turkers lied about having no experience with the experiment. Not only are the Turkers a heavily polluted bunch, they are also inveterate liars.

So in addition to the naïveté hypothesis, we now have the mendacity hypothesis. Such a line of reasoning opens the door to non-falsifiability, of course. Whenever you find the effect, the subjects must have been naïve, and when you don’t, they must have been lying about having no experience. The editor at PLoS ONE had the good sense not to let this concern block publication of Verkoeijen and Bouwmeester’s article.

The article includes a meta-analysis of the reported experiments. This analysis produced no evidence for the intuitive cooperation hypothesis. In fact, the aggregate effect is going in the opposite direction. In addition, there are several other unsuccessful replications of the intuitive cooperation effect performed by a Swedish group.



It looks like the discussion on intuitive cooperation has reached an impasse with some initial experiments by one group showing an effect while subsequent experiments from several groups have produced nonreplications. Where do we go from here?

Peter Verkoeijen and Samantha Bouwmeester have initiated a Registered Replication Report with Perspectives on Psychological Science. A number of labs will independently test the intuitive cooperation hypothesis according to a strict protocol to be developed in collaboration with the original authors. I cannot think of a better way to resolve the discussion and stop the goalposts from moving. And, more importantly, I will be able to make it to the coffee machine again.

Hello, old friend

10 comments:

  1. It's worth noting that Rand and colleagues have a very recent paper that provides more support for intuitive cooperation overall, as well as support for the naiveté hypothesis.

    http://static.squarespace.com/static/51ed234ae4b0867e2385d879/t/5356700be4b0b8b008782885/1398173707675/social-heuristics-shape-intuitive-cooperation.pdf

    ReplyDelete
  2. A couple of mturk-related notes:

    - You can’t rely on self-reported previous participation in a study if you want a reliable measure of previous participation. We find this in a multiple-wave study we conducted recently, and we also find that the effects of non-naivety in our study are equally strong for people who did vs. did not correctly report previous participation.

    - There’s no reason to equate misreported previous participation with lying. Especially for people who are very active on mturk (ceteris paribus, those who are more likely to have previously participated in a given study), it can be hard to remember each and every single study they participated in.

    ReplyDelete
  3. Glad that you're interested in the relationship between intuition and cooperation! There are a few things I'd like to add to this post:

    1. We have a paper that came out a few weeks ago (Rand et al 2014 Nature Communications) in which we DO replicate the overall intuitive cooperation effect in a 15-study meta-analysis of close to 7000 decisions.

    2. In this paper we also show the effect is robust to excluding people who did not understand the game payoffs.

    3. We further show that between 2011 and 2013, the size of the effect on Mturk got smaller and finally disappeared (cooperation under time delay did not change, while cooperation under time pressure steadily decreased).

    4. This pattern is what our mechanism, based on Social Heuristics, predicts, given how much more popular economic game experiments became on MTurk over that period. If intuitive cooperation stems from misapplication of typically advantageous strategies to less typical settings, then extensive experience with one-shot anonymous games should undermine the effect.

    5. We then directly demonstrate the moderating effect of experience, reproducing the pattern seen over two years in a single study using an experience moderator.

    6. We have also replicated the moderating effect of previous experience with econ games two other times:
    Rand et al 2012 Nature study 9, and Rand & Kraft-Todd 2013 SSRN "Reflection does not undermine self interested prosociality"

    7. A key difference between these three studies and the new PLoS ONE paper is that we assessed previous experience AFTER the study was done (at which point subjects have no incentive to misrepresent), whereas the PLoS ONE paper asks at the beginning and makes participation in the experiment contingent on being naive, which creates a clear incentive to say you are naive regardless of your true level of experience.

    8. Lastly, I would like to remind readers that our original 2012 Nature paper explicitly said that intuition would NOT favor cooperation for all people, and already there demonstrated two moderators. We more fully develop our theory, used to predict moderators, in the new 2014 Nature Communications paper.

    Thanks again for your interest, and I hope this comment is informative. (Also, apologies for any typos, I am on vacation and typing on my phone)

    David Rand

    ReplyDelete
    Replies
    1. In reaction to David Rand's points, we have a few things to add as well:

      Ad 1. With about 7000 observations, statistical significance is hardly informative. Instead, effect size should be taken into account to assess whether a replication attempt is successful (see, for example, Uri Simonsohn’s SSRN paper "Small Telescopes: Detectability and the Evaluation of Replication Results"). Our guess would be that this overall effect size (without the original study) is considerably smaller than the effect size in the original study.

      Ad 3. The R-squared of the model pertaining to the crucial interaction effect is 0.003, so the entire model can account for only 0.3% of the variance in contributions. Of course, the R-squared of the entire model, as well as the predictors in the model, are statistically significant. Yet there are few results that would not be statistically significant with more than 5000 observations.

      Ad 4. In their Nature Communications paper (see Table 3 and Figure 3), date (i.e., the date on which a participant made his/her decision, expressed as the number of days since Rand and colleagues' initial study, published as Study 6 in their 2012 Nature paper) is used to measure a participant's experience with public goods games. We ask the readers of this blog to reflect for a moment on how valid date is as a measure of experience.

      Ad 5 & 6. We acknowledge the evidence pertaining to the experience hypothesis in our PLoS One paper. However, we have serious concerns, which can be found in the General Discussion of our paper.

      Ad 7. This is actually not a correct representation of the procedure. We informed MTurkers that the experiment (Experiment 3) involved a study in which participants were to choose how much to keep for themselves versus contribute to benefit others. Subsequently, they were informed that they were only allowed to take part in the experiment if they had never participated in studies like this before. After they finished the experiment, participants were again asked to indicate whether they had ever participated in studies like this one before. To prevent participants from being dishonest, they were informed that they would receive their participation fee plus bonus regardless of their answer. Considering there were no negative consequences associated with being honest, it seems unlikely that participants would lie in response to the post-experiment question. In fact, some participants were excluded from the experiment because they indicated after the experiment that they were not naïve.

      To conclude, we would advise the readers of this blog to read the following papers and to form their own judgment about the robustness of the intuitive-cooperation effect and the experience hypothesis.

      Rand DG, Greene JD, Nowak MA (2012) Spontaneous giving and calculated greed. Nature 489: 427–430. doi: 10.1038/nature11467

      Rand DG, Peysakhovich A, Kraft-Todd GT, Newman GE, Wurzbacher O, et al. (2014) Social heuristics shape intuitive cooperation. Nat. Commun. 5:3677 doi: 10.1038/ncomms4677

      Tinghög G, Andersson D, Bonn C, Böttiger H, Josephson C, et al. (2013) Intuition and cooperation reconsidered. Nature 498. http://dx.doi.org/10.1038/nature12194. Note: this paper contains a series of replication attempts that failed to show the intuitive-cooperation effect.

      Verkoeijen PPJL, Bouwmeester S (2014) Does Intuition Cause Cooperation? PLoS ONE 9(5): e96654. doi: 10.1371/journal.pone.0096654

      Peter Verkoeijen & Samantha Bouwmeester

      Delete
    2. A couple of brief responses:

      i) I agree that effect sizes are very important. In our 2014 Nature Communications paper, we show that across 15 studies, the average effect size is a 20% increase in cooperation under time pressure relative to time delay. To me, that seems like a sizable effect. And that is INCLUDING the numerous later studies on MTurk where the effect has been shown to have entirely disappeared.

      ii) It is a red herring to discuss how 'robust' the intuitive cooperation effect is. From the very first, in our 2012 Nature paper, we explicitly said the effect was NOT robust, in that it did not apply to all people; and we have presented a theory based on social heuristics that predicts when intuition should and should not promote cooperation. Specifically, in our 2012 paper, we showed that the effect did not apply to people who distrust their daily-life interaction partners, or who had previous experience with economic game experiments. We have subsequently replicated both of these moderators. So the effect is, by definition, not robust.

      I also encourage readers to read the four papers Peter and Samantha listed above, as well as these additional papers, each of which provides additional evidence for the role of intuition in cooperative decision-making:

      Rand DG, Greene JD, Nowak MA (2013) Reply to ‘Intuition and cooperation reconsidered.’ Nature 498 E2–E3.

      Rand DG, Kraft-Todd, GT. Reflection does not undermine self-interested prosociality: Support for the Social Heuristics Hypothesis. Available at SSRN: http://ssrn.com/abstract=2297828

      Rand DG, Epstein ZG. Risking Your Life Without a Second Thought: Intuitive Decision-Making and Extreme Altruism. Available at SSRN: http://ssrn.com/abstract=2424036.

      Rand DG, Gruber J. Positive Emotion and (dis)Inhibition Interact to Predict Cooperative Behavior. Available at SSRN: http://ssrn.com/abstract=2429787

      Peysakhovich A, Rand DG. Habits of virtue: Creating cultures of cooperation and defection in the laboratory. Available at SSRN: http://ssrn.com/abstract=2294242

      Delete
  4. I'd just like to comment that the formula is nearly impossible to understand for a person who hasn't read the paper. What is "additional money"? I (who studied math at the Ph.D. level) spent five minutes examining that formula and still am not sure what it means.

    Regards, Bill

    ReplyDelete
  5. Hi Rolf, nice post. I agree with you that it seems very problematic, interpretation-wise, that so few participants (10%) indicate that they actually understood the payoff scheme. This also seems to invalidate the "getting experienced makes the effect go away" hypothesis, doesn't it? I mean if participants were really getting experienced, you would at least expect them to understand what they were making a decision about. Otherwise it would be hard to claim that they were cooperating with anything.

    Then again, the findings do seem in line with the way your typical robber would go about a heist. "You got 10 seconds to start filling up these bags with money..." instead of "Please wait 10 seconds and then start filling up these bags with money..." Perhaps robbers are simply aware of the fact that humans are intuitively cooperative. That is, when they are at gunpoint (which might of course also be a potential moderator of cooperativeness, but I'll leave that to the experts to figure out).

    ReplyDelete
    Replies
    1. I agree that the interpretation is problematic. I just want to add a note concerning the experience-understanding relation: it is quite possible to know from experience what the PGG (public goods game) is about (i.e., that your best option is to contribute nothing), even without understanding the exact payoff scheme.
      Some econ. experiments quiz their participants and do not let them continue until they answer correctly about payoffs in stylized scenarios, but that is like making everyone "experienced".

      Delete
    2. Hi Vranka, thanks for your reply. I'm not that familiar with the PGG paradigm, and I think you might be right that experienced participants could just select what is optimal (in terms of payoff) without understanding why it is optimal. However, if this were the case, then the questions given after the experiment to assess comprehension become very uninformative. One example comprehension question in the supplementary methods section of the 2014 Rand et al. paper literally asks participants to indicate which decision will result in the best outcome in terms of payoff. If this question can be answered correctly based solely on experience (and without actually understanding the payoff scheme), then I don't see how it can be used to say anything about comprehension.

      In my opinion, the disappearance of the intuitive cooperation effect is a nice example of the decline effect (perhaps due to regression to the mean). I look forward to the results of the registered replication. Assuming the replications will be run in different labs with naive subjects (and not on Mturk), the results should shed more light on the actual size of the effect.

      Delete
    3. Hi Mario,

      Thanks for your continued interest in this! With regards to the decline over time of the intuition effect on MTurk, it's very clearly NOT regression to the mean for two reasons:

      1) as we say in the paper "Critically, the initial [Rand et al 2012 Nature] experiment is not the only study to get a large effect [or the largest effect size study], as shown in Fig. 1; therefore this decrease in effect size is not simply a series of null results following the initial success of [Rand et al 2012], and the effect we are describing is not regression to the mean."

      2) we can get back to a large effect size (similar in size to the initial effect size) by restricting to naive subjects using an ex-post measure of experience (i.e. Figure 4).

      Delete