Tuesday, April 23, 2013

Social Priming: In Theory

You are walking into a room. There is a man sitting behind a table. You sit down across from him. The man sits higher than you, which makes you feel relatively powerless. But he gives you a mug of hot coffee. The warm mug makes you like the man a little more. You warm to him, so to speak. He asks you about your relationship with your significant other. You lean on the table. It is wobbly, so you say that your relationship is very stable. You take a sip from the coffee. It is bitter. Now you think the man is a jerk for having asked you about your personal life. Then the man hands you the test. It is attached to a heavy clipboard, which makes you think the test is important. You’re probably not going to do well, because the cover sheet is red. But wait—what a relief!—on the first page is a picture of Einstein! Now you are going to ace the test. If only there weren’t that lingering smell of the cleaning fluid that was used to sanitize the room. It makes you want to clean the crumbs, which must have been left by a previous test-taker, from the tabletop. You need to focus. Fortunately, there is a ray of sunlight coming through the window. It leaves a bright spot on the floor. At last you can concentrate on the test. The final question of the test asks you to form a sentence that includes the words gray, Florida, bingo, and pension. You leave the room, walking slowly…

These are just some findings that have been reported in the literature (well, most of them are; I made one up, guess which one) on social priming. But I don’t want to focus on the findings themselves in this post. What I want to do is find out what the theory behind them is. The picture suggested by social priming research is that we are constantly bombarded with a cacophony of cues in all sensory domains that push our behavior around in various ways. This cannot be true.

In a 2006 paper, John Bargh, by all accounts the major player in the area of social priming, arrived at very much the same conclusion. “What have we been priming all these years?” he asks. To address the cacophony problem, Bargh suggests that not all cues are created equal. Cues related to goals trump other cues. For example (this is my example, not his), you may be walking slowly out of the room after having just formed a sentence that includes gray, Florida, bingo, and pension, but as soon as someone yells FIRE!, you are bound to make a dash for the nearest exit. Your self-preservation goal has trumped whatever priming you may have received from the sentence-unscrambling task.

This makes sense. Bargh also provides a useful overview of the history of priming. Although priming is a concept from cognitive psychology, Bargh is right in criticizing classical cognitive science for its treatment of priming. Classical cognitive science has mostly been interested in priming words. For example, you recognize the word doctor faster after having just seen nurse than after having just seen bread. Although this has provided useful insights into the organization of memory, inference generation, false memories, speech errors, and so on, there is no clear behavioral component. The behavior on the subject’s part is limited to pressing a button. Bargh does not think this counts as real behavior. And who can blame him? His goal is to examine how priming affects not just thinking but also action, a goal that has also been adopted in contemporary cognitive psychology and cognitive neuroscience.

Bargh observes another difference between classical cognitive psychology and social cognition. The classical priming experiment examines words as primes and as targets (the recognition of a word is often the dependent measure). In social cognition complex conceptual structures are primed that have action components associated with them. Whereas a classical priming experiment may want to investigate whether gray primes old, a social priming experiment wants to know whether priming with gray and old will influence the speed of subsequent action. Despite its strong points, the Bargh article is rather low on specifics regarding the mechanisms of priming and the representations that are involved.

Enter a recent Psychological Review paper by Schröder and Thagard. They provide a computational model of social priming. A key concept in their model is constraint satisfaction. To illustrate this, let me introduce you to your long-lost cousin Lars from Sweden. He used to live on a small island in the middle of a lake that is frozen over much of the year. His close relatives live in villages all around the lake. Did I tell you he died? How sad, you just learned you had a lost relative and now you find out he’s already dead. Among Lars’ possessions was a very expensive grand piano, which is coveted by all of his relatives. They’ve put the piano on the ice and are now each trying to push the valuable musical instrument to their side of the lake. Björn and Bennie are very interested in the piano but being musicians, they are not very strong and they cannot get the piano to move in their preferred direction (if only they had Agneta and Frida to help them!). Their cousin Knut is a hockey player and is pushing the piano in a different direction. Other relatives are pushing in yet other directions. Which way will the piano go? It is basically the sum of all the force vectors. Because people will not be able to apply constant force, the piano’s path will not be a straight line—until the weakest relatives get tired. And then, slowly but surely, the piano will move in the direction of Knut’s log cabin.

That’s how constraint satisfaction works. Each relative constrains the path of the piano just like each cue constrains the course of action. Some cues will be stronger than others. And some cues will have longer-lasting effects than others. The system handles the cacophony of cues through constraint satisfaction. Sometimes a cue is so strong that it wins out immediately over all the others, as when someone yells FIRE! The cousin-Lars analogy of this would be if someone donned an Iron Man suit and then started pushing the piano. The others might as well give up right away.
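The piano-on-the-ice analogy can be sketched in a few lines of code. This is my own toy illustration of the force-vector idea, not the Psychological Review model itself; all the names, vectors, and stamina values are made up.

```python
# Toy sketch of constraint satisfaction as a sum of force vectors:
# each relative pushes with a force vector, the piano moves along the
# sum of the active vectors, and weaker pushers drop out as they tire.

def net_push(forces):
    """Sum the active force vectors (fx, fy) into one net vector."""
    return (sum(fx for fx, _ in forces), sum(fy for _, fy in forces))

# Hypothetical pushers: (name, force vector, rounds before tiring)
relatives = [
    ("Bjorn and Bennie", (1.0, 0.0), 2),    # weak musicians
    ("Knut",             (0.0, 3.0), 10),   # strong hockey player
    ("other relatives",  (-1.5, 0.5), 4),
]

for round_number in range(1, 6):
    active = [force for _, force, stamina in relatives
              if stamina >= round_number]
    print(round_number, net_push(active))
# Early on the net push wobbles; once only Knut is left pushing,
# the piano heads straight for his log cabin.
```

The point of the sketch is only that no single pusher determines the path: the outcome at every moment is the resultant of all the constraints still in force.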

In line with Bargh’s notion of priming, the model assumes that primed concepts activate holistic representations of situations, which have psychological, cultural, and biological components. These layers mutually constrain each other. The way in which they do this is acquired during socialization. Because concepts have affective meanings, they can generate responses automatically. Affective meanings are organized in culturally-shared structures (meaning that responses will be similar across individuals). Members of a culture will try to maintain these structures, which produces a set of constraints.

So how does this cause behavior? Priming activates neural populations that act as “semantic pointers” to underlying sensorimotor and emotional representations. This is the biological component of the model. The idea is very much in the vein of Damasio’s convergence zones.

So the model is an account of how a simple prime in one modality may give rise to a range of responses, some purely cognitive, some emotional, and some behavioral.

I realize that I’m not doing the model much justice in this brief description but my point is that it apparently is possible to come up with a plausible model of social priming that is relatively detailed in parts and is consistent with current models of memory and action.

It is of course ironic that the model has been developed to explain findings that have proven so difficult to replicate and have raised so much controversy. Nevertheless, the model provides an interesting and rather detailed account of how social priming works in theory. Now it would be interesting to see if it can generate novel predictions that can be tested in rigorous experiments.

Wednesday, April 17, 2013

Should Diederik Stapel write a blog?

When I mentioned in yesterday’s post that Diederik Stapel was contemplating writing a blog, the response was not as negative as he and I (as well as one other person I had mentioned this to) had imagined.

It is clear that people are still angry at Stapel. But anger is not a very useful emotion at this point. It’s a little bit like “Praying for Boston.” Praying is not really helping anyone (as Richard Dawkins likes to remind us) but it may make people feel good about themselves. By the same token, righteous indignation about Stapel’s fraudulent behavior may make us feel good about ourselves (why, aren’t we ethical?) but is not helping to improve the way we do science.

Like everyone else, I was initially angry at Stapel. Well, first I was bewildered at the depth and the scope of his fraud and then came the anger. Because everyone else around me was also angry, I noticed that I was starting to get a little less angry. (Maybe there is a social psychological theory that can explain this phenomenon?)

I remember being asked by my university’s newspaper if I thought there were more cases like Stapel’s. I said I didn’t (because I didn’t). But only a month later, I found myself chairing a committee charged with investigating alleged fraud committed by Dirk Smeesters.

In my role of committee chair, anger was definitely not a useful emotion. The experience of serving on the Smeesters committee gradually made me take a more analytical perspective on the Stapel case. And what residual anger I might have felt was vented in lame jokes on Twitter at Stapel’s expense.

So back to the question at hand: Should Stapel write a blog? Various people who have responded to my post—via social media or via email—have said they’d be interested to hear what he has to say.

If Stapel uses the blog to justify his prior actions, this would again fan the flames and rightfully so. Obviously, presenting “empirical” studies would also be an astoundingly bad idea.

But there are other things Stapel might write about. The cognitive psychologist Dermot Lynott (@DermotLynott) suggested a good analogy on Twitter. Stapel’s role could be similar to that of Frank Abagnale, a convicted fraudster turned government consultant. An interesting thought.

Tuesday, April 16, 2013

My conversation with Diederik Stapel

On April 1, I received an email from “Diederik Stapel” sent from a gmail address. Because of the date, I thought a former graduate student was pulling a prank on me. I had played a joke on her last year by using a fake gmail address and pretending to be an editor (well, I am an editor but I pretended to be a different one). I thought this was her getting back at me so my response was “Nice try, Lisa. What’s the date again?”

I received a response that showed some irritation about my lack of seriousness and responsiveness. Still thinking that it was a prank and that my former student was getting desperate because I wasn’t buying it, I then decided to wait until after April 1 to see what would happen. If the messages were somehow not from my student, then I would surely receive another one after April 1.

On April 3 I did indeed receive another email from Diederik Stapel to which I wrote a serious response. Stapel wanted to meet with me because I had “a refreshing perspective” on the current methodological crisis in psychology. I was curious to hear what he had to say. We agreed to meet in a café in Utrecht on April 11.

We had a very pleasant and interesting two-hour conversation. Of course there were some barbs traded. Not unlike many other social psychologists, he thought that I was biased against social psychology (I disagreed) and that my jokes on Twitter at his expense were lame (guilty as charged). I, for my part, was not letting him off the hook about his past transgressions (although he clearly wasn’t letting himself off the hook either).

Our free-flowing discussion centered on a few topics, some of which I will describe here. It is important to mention that Stapel has read this text and agrees with my posting it.

One topic we discussed was social priming research. I told Stapel I am genuinely puzzled by the large effects that are often reported in the literature combined with the far-fetchedness of some of the manipulations. Stapel seemed to resonate with this and gave a humorous caricature (I hope) of an experiment that involves priming people with lamppost and then assessing whether this influences the amount of light shining out of their eyes. On the other hand, Stapel also gave an interesting and compelling theoretical defense of the notion of social priming.

Another topic we discussed was the current publication culture. Stapel readily admitted that he really wasn’t under external pressure to submit lots of papers; he was the victim of his own ambition and vanity. We agreed that the publication culture and reward structure should be changed and that there should be a move toward a more qualitative assessment of research contributions rather than a mostly quantitative assessment (which in my experience is even more common in the Netherlands than in the United States).

A third topic was the role of experimentation. I mentioned to Stapel that many of his studies were interesting theoretical observations and might have been fine in their own right had they not been marred by the need or desire to present “empirical data.” This led to a discussion of whether an experimental approach is suited to address all theoretical issues in social psychology. Perhaps a hermeneutic approach would be preferable in some cases over (far-fetched) experiments. We did not end up with strong opinions on this but thought it was an issue worth debating.

A fourth topic was whether Stapel could contribute in any way to the current discussion. On the one hand, it is obvious that his credibility as an empirical researcher is non-existent at the moment. On the other hand, he is widely read and nobody questions his theoretical knowledge in the domain of social psychology. Furthermore, as he noted, he has first-hand experience with committing fraud and might have important insights to offer with regard to fraud prevention.

Stapel is entertaining the notion of starting a blog on a variety of topics. He expressed concern that people may not want to hear from him or that he might be further exposed to verbal aggression.  Many people will harbor resentment towards him. But there is always freedom of expression of course.

As Stapel well knows, the merits of what he has to say will solely reside in the persuasiveness of his arguments and not in the compelling force of “empirical data.”

After finishing our final cappuccino and espresso we said goodbye, agreeing to keep in touch.

I thank Diederik Stapel for feedback on a previous version of this post.

Saturday, April 13, 2013

Pre-publication Posting and Post-publication Review

There has been much discussion recently about the role of pre-publication posting and post-publication review. Do they have any roles to play in scientific communication and, if so, what roles precisely?

Let’s start with pre-publication posting. It is becoming more and more common for researchers to post papers online before they are published. There are even repositories for this. Some researchers post unpublished experiments on their own website. To be sure, like everything, pre-review posting has its downsides, as Brian Nosek recently found out when he encountered one of his own unpublished experiments (which he had posted on his own website) in a questionable open-access journal with four Pakistani researchers, not himself, listed as authors. But the pros may outweigh the cons.

In my latest two posts I described a replication attempt we performed of a study by Vohs and Schooler (2008). Tania Lombrozo commented on my posts, calling them an example of pre-publication science gone wild: Zwaan's blog-reported findings might leave people wondering what to believe, especially if they don't appreciate the expert scrutiny that unpublished studies have yet to undergo. (It is too ironic not to mention that Lombrozo had declined to peer review a manuscript for the journal I am editing the week before her post.)

The questions Lombrozo poses in her post are legitimate ones. Is it appropriate to publish pre-review findings and how should these be weighed against published findings? There are two responses.

First, it is totally legitimate to report findings pre-publication. We do this all the time at conferences and colloquia. Pre-review posting is useful for the researcher because it is a fast way of receiving feedback that may strengthen the eventual submission to a journal and may lead to the correction of some errors. Of course, not every comment on a blog post is helpful but many are.

The second response is a question. Can we trust peer-reviewed findings? Are the original studies reported fully and correctly? Lombrozo seems to think so. She is wrong.

Let’s take the article by Vohs and Schooler as an example. For one, as I note in my post, the review process did not uncover, as I did in my replication attempt, that the first experiment in that paper used practicing Mormons as subjects. The article simply reports that the subjects were psychology undergraduates. This is potentially problematic because practicing Mormons are not your typical undergraduates and may have specific ideas about free will (see my previous post). The original article also did not report a lot of other things that I mention in my post (and do report about our own experiment).

But there is more, as I found out recently. The article also contains errors in the reporting of Experiment 2. Because the study was published in the 2008 volume of Psychological Science, it is part of the reproducibility project in which various researchers are performing replication attempts of findings published in the 2008 volumes of three journals, which include Psychological Science. The report about the replication attempt of the Vohs and Schooler study is currently being written but some results are already online. We learn, for example, that the original effect was not replicated, just as in our own study. But my attention was drawn by the following note (in cell BE46): “The original author informed me that Study 2 had been analyzed incorrectly in the printed article, which had been corrected by a reader. The corrected analysis made the effect size smaller than stated…”

Clearly, the reviewers for the journal must have missed this error; it was detected post-publication by “a reader.” The note says the error was corrected, but there is no record of this that I am aware of. Researchers trying to replicate Study 2 from Vohs and Schooler are likely to base their power analyses on the wrong information, thinking they need fewer subjects than they actually would to have sufficient power.
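To get a sense of what that wrong information costs, here is a quick power calculation. The effect sizes below are hypothetical, not the actual Vohs and Schooler numbers; the point is only that a somewhat smaller true effect can more than double the number of subjects needed per group.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sample t-test (normal approximation)."""
    z = NormalDist().inv_cdf
    return math.ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 / d ** 2)

# Hypothetical effect sizes, purely for illustration: planning on a
# published (inflated) d of 0.80 versus a corrected d of 0.50.
print(n_per_group(0.80))  # about 25 subjects per group
print(n_per_group(0.50))  # about 63 subjects per group
```

A replicator who plans for the published effect would thus run a study with well under half the subjects actually needed for 80% power, virtually guaranteeing an underpowered replication attempt.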

This is just one example showing that the review process is not a 100% reliable filter. I am the first to be thankful for all the hard work that reviewers put in—I rely on hundreds of them each year to make editorial decisions—but I do not think they can be expected to catch all errors in a manuscript.

So if we ask how pre-review findings should be evaluated relative to peer-reviewed findings, the answer is not so clear-cut. Peer-review evidently is no safeguard against crucial errors.

Here is another example, which is also discussed in a recent blogpost by Sanjay Srivastava. A recent article in PLoS ONE titled Does Science make you Moral? reported that priming with concepts related to science prompted more imagined and actual moral behavior. This (self-congratulatory) conclusion was based on four experiments. Because I am genuinely puzzled by the large effects in social priming studies (they use between-subjects designs and relatively few subjects per condition), I tend to read such papers with a specific focus, just like Srivastava did. When I computed the effect size for Study 2 (which was not reported), it turned out to be beyond any belief (even for this type of study). I then noticed that the effect size did not correspond to the F and p values reported in the paper.

I was about to write a comment only to notice that someone had already done so: Sanjay Srivastava. He had noticed the same problem I did as well as several others. The paper’s first author responded to the comment explaining that she had confused standard errors with standard deviations. The standard deviations reported in the paper were actually standard errors. Moreover, on her personal website she wrote that she had discovered she had made the same mistake in two other papers that were published in Psychological Science and the Journal of Personality and Social Psychology.
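The inflation that results from this mix-up is easy to illustrate with made-up numbers (nothing below comes from the actual paper): dividing a mean difference by the standard error instead of the standard deviation inflates Cohen's d by a factor of the square root of the sample size.

```python
import math

def cohens_d(mean_diff, spread):
    """Standardized mean difference: the mean difference over the (pooled) SD."""
    return mean_diff / spread

# Made-up numbers, purely for illustration
mean_diff, sd, n = 1.0, 2.0, 16
se = sd / math.sqrt(n)  # the standard error is sqrt(n) times smaller than the SD

print(cohens_d(mean_diff, sd))  # correct d: 0.5
print(cohens_d(mean_diff, se))  # SE mistaken for SD: 2.0, implausibly large
```

This is also why the mistake is detectable from the outside: an effect size recomputed from the reported means and "standard deviations" will not line up with the reported F and p values, which is exactly the mismatch Srivastava and I both noticed.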

There are three observations to make here. (1) The correction by Sanjay Srivastava is a model of politeness in post-publication review. (2) It is a good thing that PLoS ONE allows for rapid corrections. (3) There should be some way to have the correction feature prominently in the original paper rather than in a sidebar. If not, the error and not the correct information will be propagated through the literature.

Back to the question of what should be believed: the peer-reviewed results or the pre-peer reviewed ones? As the two cases I just described demonstrate, we cannot fully trust the peer-reviewed results; Sanjay Srivastava makes very much the same point. A recent critical review of peer reviews can be found here.

It is foolish to view the published result as the only thing that counts simply because it was published. Science is not like soccer. In soccer a match result stands even if it is the product of a blatantly wrong referee call (e.g., a decision not to award a goal even though the ball was completely past the goal line). Science doesn’t work this way. We need a solid foundation for our scientific knowledge. We simply cannot say that once a paper is “in,” the results ought to be believed. Post-publication review is important, as is illustrated by the discussion in this blog.

Can we dispense with traditional peer-review in the future? I think we might. We are probably in a transitional phase right now. Community-based evaluation is where we are heading.

This leaves open the question of what to make of the published results that currently exist in the literature. Because community-based evaluation is essentially open-ended—unlike traditional peer review—the foundation upon which we build our science may be solid in some places but weak—or weakening—in other places. Replication and community-based review are two tools at our disposal for continuously checking the structural integrity of our foundation. But this also means the numbers will keep changing.

What we need now is some reliable way to continuously gauge and keep a citable record of the current state of research findings as they are going through the mills of community review and replication. Giving prominence to findings as they were originally reported and published is clearly a mistake.

Update May 10, 2013: this post was reposted here.