How I Review: 2020 Edition

In 2020 I have decided to refine my reviews. The impetus is that I now have greater clarity about what my role as a reviewer should be, or, more correctly, what it is not.

My wife once had an internship at a publishing company. Her job was to go through the bin of unsolicited submissions and be ruthless. The company could only publish a set number of books a year, and they solicited most of the books they published. Thus her role was to reject almost all submissions. I think many reviewers think they have this job too. Many reviewers also believe they are the defender of the purity of science. This is a role I used to play. I believed that my field was a disaster and only I could fix it by standing in the way of as many articles as I could. My aim was to expunge the various sins I saw my field committing. What hubris!

I no longer believe that is my role. Ultimately, I think the role of a reviewer is a) to detect fatal flaws (flaws that no amount of revision could fix); b) to identify any fundamental issue that should prevent publication of any kind (e.g., plagiarism); and c) to determine whether the article would look out of place among other articles in the field.

Ultimately, the role of a reviewer is to catch malfeasance and monsters.

The role of judging whether an article is important, impactful, or paradigm shifting belongs to readers.

With this refined sense of what a reviewer should be, I have aimed to introduce the following to my own reviews:

  1. My review distribution will become increasingly bimodal, focused on either outright rejection or acceptance/minor conditional acceptance.
  2. When I reject, my reviews are short. I outline what I think the fatal flaw was and nothing more. If an article is unsalvageable, advising on how certain paragraphs should be phrased or how APA styling should have been handled is a waste of time and confusing to the authors. The language here should be clear. There is no “I think the authors should consider…” or “Have the authors thought of…”. I am also clear in the first sentence that I do not think the article should be accepted and that I do not believe a revision could resolve the fundamental flaws I see in the paper.
  3. If I give a recommendation of conditional acceptance, I am careful to distinguish between the few things I believe are the conditions of acceptance and the areas I think might improve the article. I am clear that the latter are suggestions and the authors are free to ignore them. I then try to phrase these points as questions rather than commands.
  4. If an author refuses to adjust their article in relation to something I think they should adjust, and they give reasons that are not preposterous, I let it go. You have likely received a review from me if you read “I don’t agree with the authors’ position on this issue, but my job is not to make authors write the paper how I want it written. I suggest the paper move forward to publication.”

The Track Change Jerk

I was a track change jerk last week. Someone did something minor that I didn’t like. So I showed my distaste via the comment function in Word. I know better. 

Like most people, when working on a collaboration with a big team I eagerly await people’s comments on my comments. And while most of you won’t admit it, I too get a thrill out of seeing some change I made via track changes accepted by the lead author.

This means collaborative track changes are not low stakes. And yet we treat them like they are. I add comments on papers that are as dismissive as they are uninformative (“awkward sentence”, “this makes no sense”). I change whole sentences or paragraphs without once explaining why I thought what they had was wrong. I treat the comments section as if it is a conversation between me and the person without acknowledging that these comments will be visible to the whole team.

This is not a blog post aimed at self-flagellation. It is more a call for discussion. Do we need track change etiquette? And if we do, what should it be? A few thoughts:

  1. Acknowledge that track changes in big teams are public documents and that it doesn’t hurt to be nice.
  2. Acknowledge that you are not a professional proofreader (yes, that means you). So if you change something, add a comment explaining why. A great colleague this week pointed out a split infinitive via comments but also acknowledged that he was not sure whether split infinitives matter anymore.
  3. Point out the superb bits. The academic mentality is so optimised for criticism that we find it really hard to acknowledge good work. Recognising good work is as much a critical skill as recognising bad work.
  4. If you have something controversial or sensitive to say, do it in person, via Skype, or—if you really really have to—via email. Don’t do it in track changes or comments.
  5. Before changing something ask “is this just my personal preference?”.

My commitment is to be a better track change colleague from here out.

The P Word

I came across a tweet last week by an academic at a conference (I can’t remember who). They were indignant that presenters were using the word ‘predict’ to describe correlations. My first reaction was to sigh. Prediction has no causal connotation. When you go to the fair and some huckster offers to guess your age for a price, they are making a prediction based on your physical appearance. This prediction does not require a belief that your physical appearance caused your age. Such a belief is absurd. Yet prediction is still the right word.

This was my first reaction. My second was to reflect on my own use of prediction in reporting research results. While I believe ‘to predict’ requires no causal beliefs, it does imply a certain level of accuracy. I have used the word predict to describe a correlation of .20. On reflection this seems wrong. Not because I am implying causation. I am not. But because I am implying a level of accuracy in predicting y from x. The implication is that by knowing a person’s value on x I can make a good guess at their value of y. But a correlation of .20, or even .40, as the basis of such a prediction would be atrociously inaccurate. Reporting such weak results using the word predict leads the public, who rightly read ‘predict’ as ‘predict accurately’, to vastly overstate the significance of the finding.
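
To make the point concrete, here is a minimal sketch in R of how little a correlation of .20 buys you. The data are simulated purely for illustration; nothing here comes from any real study.

```r
# A minimal sketch of how little a correlation of .20 buys you in prediction.
# The variables and sample size are simulated for illustration only.
set.seed(1)
n <- 10000
x <- rnorm(n)
y <- 0.20 * x + sqrt(1 - 0.20^2) * rnorm(n)  # population correlation of about .20

fit <- lm(y ~ x)

# Prediction error when using x versus simply guessing the mean of y
rmse_model    <- sqrt(mean((y - fitted(fit))^2))
rmse_baseline <- sqrt(mean((y - mean(y))^2))
round(c(model = rmse_model, baseline = rmse_baseline), 3)
# The model's error is only about 2% smaller than knowing nothing at all.
```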

The social sciences are well known for being terrible at prediction.

Predictive accuracy is often woeful, even on the data that the researchers used to build the statistical model (see here for example). The social sciences often seem not to know about the importance of, let alone test, a model’s predictive accuracy on unseen data, which is really the only metric of predictive accuracy that matters.
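
For anyone who has not done this before, the idea is simple. Here is a rough sketch using a plain hold-out split on simulated data (the variables are hypothetical): fit the model on one portion of the data and judge it only on the portion it never saw.

```r
# A minimal sketch of judging a model by its accuracy on unseen data,
# using a simple hold-out split; the data are simulated for illustration.
set.seed(1)
dat <- data.frame(x1 = rnorm(500), x2 = rnorm(500))
dat$y <- 0.3 * dat$x1 + 0.2 * dat$x2 + rnorm(500)

train_rows <- sample(nrow(dat), size = 0.7 * nrow(dat))
train <- dat[train_rows, ]
test  <- dat[-train_rows, ]

fit <- lm(y ~ x1 + x2, data = train)

rmse_train <- sqrt(mean((train$y - fitted(fit))^2))                   # error on the data used to fit
rmse_test  <- sqrt(mean((test$y - predict(fit, newdata = test))^2))   # error on data the model never saw
round(c(train = rmse_train, test = rmse_test), 3)
```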

In his fantastic paper, Shmueli argues that the social sciences have neglected the prediction component of science in favor of a complete focus on explanation. Mostly this is because of the mistaken belief that explanation is synonymous with prediction. 

And here lies the problem. The social sciences are scathing toward anyone who uses the word prediction outside of RCT research. But this fit of pique is misdirected. The willing skeptic at the fair may say “I have $10 with your name on it, if you can guess my age to within a year”. So too we should call authors on their use of “predict” when their models are scarcely better than chance.

Use Causal Language: Ignore the Haters

Correlation is not causation. So comes the inevitable refrain in response to anyone who presents a correlational study as evidence in a debate. There is good reason for this. People have long extrapolated from correlation to causation. Bad science, and often bad policy, follows. But a healthy respect for what we can claim about causality has given way to an abject fear of any language that even hints at causality.

There is no danger in being overly cautious, I hear you say.

But the barring of causal language has had unintended consequences. First, few social scientists now understand much about causality, mistakenly thinking it is simply that which comes from an RCT. Second, theory has become sloppy. Why waste time constructing a detailed theory of why x leads to y when a reviewer will make you tear it up?

Evidence that something has gone wrong

The clearest evidence I see that something is amiss is how reviewers and writers now interact. It is not uncommon for a reviewer to demand that a writer remove all causal language from their manuscript. I have seen this include purging the word ‘effect’ from a manuscript entirely; even named theories are not immune (the Big-Fish-Little-Pond effect becomes the Big-Fish-Little-Pond association). But authors should advance causal theories in introductions!

Reviewers also display a lack of understanding about causation when they claim that only an RCT can provide evidence of causality. RCTs neither provide definitive evidence of causation nor are they the only way of providing evidence of causality.

Writers also make mistakes. Writers of papers I have reviewed refuse to explain how x leads to y because they didn’t do an RCT. One wonders, if they think this way, why they bothered to do the study at all. And if they are so scared of advancing a causal explanation why use a regression model that so strongly suggests that x leads to y?

Setting the record straight

In Hunting Causes and Using Them, Nancy Cartwright emphasizes that causality is not a single thing (her book Evidence-Based Policy is also worth reading). So heterogeneous are the things we call causality that we might do better to abandon the term entirely. We likely need to match method and evidence to the type of causality we are chasing.

In The Book of Why, Judea Pearl argues that science would become worse, not better, if we were to believe that only RCTs have the power to provide evidence of causation. In such a world, how would we know that smoking causes cancer?

The danger in the current social science landscape comes from a belief that causation is a dichotomy. If you did an RCT, you can advance causal claims. If you didn’t, you can’t. But causality is not a dichotomy. RCTs often can’t provide evidence of causation and sometimes provide poor evidence. RCTs are critical, but we need to be both more conservative (RCTs provide some evidence, sometimes) and more liberal in allowing other designs (regression discontinuity, instrumental variables, Granger causality) to provide evidence of causality.
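
As one concrete taste of the designs just listed, here is a toy Granger causality test in R using lmtest::grangertest(). The two series are simulated so that y depends on the previous value of x; treat it as an illustration of the logic (does the past of x improve prediction of y beyond y’s own past?) rather than a template for real analysis.

```r
# A toy sketch of one non-RCT design named above: a Granger causality test.
# The series are simulated so that y depends on the previous value of x.
library(lmtest)

set.seed(1)
n <- 200
x <- rnorm(n)
y <- numeric(n)
for (t in 2:n) y[t] <- 0.5 * x[t - 1] + rnorm(1)

d <- data.frame(x = x, y = y)
grangertest(y ~ x, order = 1, data = d)  # does past x help predict y beyond past y?
```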

What to do about it

  1. Treat causality as a spectrum, where researchers can marshal evidence that pushes the needle toward a causal interpretation or away from it.
  2. View no single piece of evidence as incontrovertible evidence of causality.
  3. Write about clear and simple causal mechanisms in introductions and literature reviews.
  4. In the method section, social scientists should include a section on the degree to which their results can provide evidence of causation, and perhaps the type of causation they have in mind. This should include a discussion of design, context, strength of theory, and methodology. In other words, researchers should have to make a case for their specific research rather than relying on general social science tropes.
  5. As Cartwright suggests, we should replace general terms like cause with more specific terms like repel, pull, excite, or suppress that give a better idea of what is being claimed.

Person-Centered Analysis: Where Are All the People?

I hate academic conferences. What seems like a chance for free travel to an exotic location turns out to be an endless bore in a stuffy room. For an introvert, the need to be constantly ‘on’ when talking to students, peers, and that big name you are desperate to collaborate with is tiring. The point being, I am not usually in the best of moods when at conferences. Which is probably why I found a particular presentation so irksome.

Why so much Person-centered Research Seems so Hollow to Me

The presenter at this hot and stuffy conference gets up and smugly states that previous, crappy, social science has used a variable-centered approach to research. He, however, would use a person-centered approach. The motivation was, I confess, solid.

Person-centered analysis starts with the assumption that within any large group of people there are likely smaller distinct groups (within a school there are jocks, goths, nerds, etc.). Too much research treats humans as a bunch of mini clones who are driven by the same processes and differ only in degree. I can get behind this sentiment.

I was surprised, then, not to hear mention of a single person for the rest of the presentation. No explanation was given of how people in the different groups think, believe, feel, or act differently from each other. Nor was there discussion of whether people chose to be members of their group or whether they were forced into it. Did they jump or were they pushed? Instead, the entire presentation focused on various configurations of variables. This was not, to me at least, person-centered.

This is a disturbing trend in person-centered research: the almost total absence of people.

The overall impression I get from most person-centered analysis is that researchers believe human diversity has been ill-served by regression-like approaches. But many assume that by applying cluster analysis or something similar they will magically fix this problem. In my experience, researchers do not seem to put much thought into how these approaches better represent real people or what the results really say about them. They tend not to describe a prototypical human from each of their groups, and they apply little imagination to what drives the people in different groups.

Greater attention to this could truly transform the social sciences. A truly person-centered ontology and epistemology could serve disadvantaged groups better. Researchers could better acknowledge that the experience of, say, an Indigenous girl is qualitatively different from that of a South East Asian Australian boy. But to do this, person-centeredness needs to be about more than methods. And it needs to be motivated less by an appeal to what it isn’t (e.g., “Unlike previous research we use person-centered approaches” is not a convincing rationale).

Give me person-centered person-centered analysis

A move in the right direction would be to consider what Rob Brockman and I have recently called the four S’s of person-centered analysis:

  1. Specificity. Once you have your groups, can you describe what makes these groups distinct? By this I don’t mean a profile graph of the variables used to create the groups. I mean a deeper insight into what these groups of people are like. What do their members do? How do they think? What do they want?
  2. Selectivity. How do people end up in these groups? By what process does a person end up in group A and not group B? Were people born into different groups? Did some person, institution, or cultural practice push them into their group? Or is their group membership their choice?
  3. Sensitivity. Do these same groups occur in different samples? If not, why not? Do differences in groupings across—for example—countries illuminate how people’s context shapes grouping, or do differences just reflect unreliable research findings?
  4. Superiority. The beauty of cluster analysis is that it will always return the number of groups you asked for (see the small sketch below this list). And like a Rorschach test, it is easy to make something out of whatever the computer gives you. Researchers should attempt to show that their groups tell us something we did not already know. And researchers need to show us that the groups really differ from each other qualitatively rather than merely quantitatively.
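
To see what I mean by point 4, here is a tiny sketch in R: k-means run on pure noise, which contains no groups at all, will still dutifully hand back however many groups you request. The data here are simulated purely for illustration.

```r
# A tiny sketch of the point in (4): k-means returns whatever number of
# groups you ask for, even when the data are pure noise with no groups at all.
set.seed(1)
no_groups <- data.frame(x = rnorm(300), y = rnorm(300))

kmeans(no_groups, centers = 4)$size  # four "groups", as requested
kmeans(no_groups, centers = 7)$size  # or seven, if that is what you asked for
```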

STEM Gender Gaps in Motivation, Interest, and Self-belief are Huge, Right?

We recently had a meta-analysis on STEM gender differences in motivation, interest, and self-belief in Educational Psychology Review. We could not be more thrilled. And a big thank you to my former PhD student Brooke for all her work on this. The results are summarised in the poster download below. But first, some context for why there is a download in the first place.

I have been thinking about using Kudos for new papers, and this seemed like a good paper to give it a try. I spent longer than I would like setting up a design brief for this. But now that it is done, I have an InDesign template I can use for all new papers, as well as themes for ggplot and a standard color palette. My design choices were:

  1. Use of only three colors, all blues. I think this is elegant, but it is also advantageous for me as I am color blind.
  2. For plots, I have modified the Economist white theme from ggthemes, so from here on out all my plots will be consistent (a rough sketch of the idea follows this list).
  3. I used a combination of serif and sans-serif fonts that work nicely together. I chose Avenir Book and EB Garamond. I am not super happy with these, but I don’t like the idea of paying $400 for the fonts I really want. I may swap out EB Garamond for Nanum Myeong to get a crisper feel. Not sure yet.
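
For the curious, the ggplot side of this looks roughly like the sketch below. It is only an illustration: theme_economist_white() and scale_colour_manual() are real ggthemes/ggplot2 functions, but the hex codes and theme tweaks are placeholders, not the exact values in my template.

```r
# A rough sketch of the plot theme described above: ggthemes' Economist white
# theme plus a three-blue palette. Hex codes and tweaks are illustrative only.
library(ggplot2)
library(ggthemes)

poster_blues <- c("#1f4e79", "#2e75b6", "#9dc3e6")  # placeholder three-blue palette

theme_poster <- theme_economist_white(gray_bg = FALSE) +
  theme(legend.position = "bottom",
        plot.title = element_text(face = "bold"))

ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point(size = 2) +
  scale_colour_manual(values = poster_blues) +
  theme_poster
```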

Anyway, you can see the result here:

Comments welcome; particularly on the fonts, general look, and plot theme, as I will want to roll these out for other papers. I still need to do a lot of work on distilling the message of my papers down to 100 or so sticky words. And my InDesign skills are weak (though I think I am getting better with my R to Illustrator workflow).

Motivating Research

I enjoy being a reviewer. It is my chance to be anonymously self-righteous. One of my pet peeves is researchers who motivate their writing by academic circle jerking. This includes opening sentences that start with “researchers have yet to consider”, “we aim to resolve a tension in the literature”, “we are the first to”, or “we aim to integrate”. Such openings almost guarantee that the rest of the paper will focus on esoteric issues, with precious little of substance on how actual people think, feel, or behave.

So you can imagine my surprise when a reviewer proclaimed that this is exactly what I was doing. On reflection, they were right. I had concentrated my whole opening on winning theoretical points: researchers were focusing on the wrong thing and making false assumptions, and I would put them right. This was interesting to me. But it wasn’t person-centered, nor do I think it would be interesting to more than maybe a handful of people. My focus was on proving researchers wrong, rather than on the main issues:

  1. Scientists, and thus policy makers and not-for-profits, assume that poor kids are deficient in academic motivation, interest, and self-belief. They make policy and develop interventions based on this assumption.
  2. A whole pile of money is being wasted on running motivation, interest, and self-belief interventions for disadvantaged children. This is money that could be spent on advocating for better educational policy that really serves poor children.

This was a good reminder that applied research should always start with why. But that ‘why’ should be for a broad audience—people that could use the research in practical and theoretical ways. In my case, my ‘why’ should have been focused on policy makers. Policy makers need empirical evidence to guide them when deciding how to use a limited budget to create an education system that works for all. They need to know what to focus on. But equally, they need research that tells them what to avoid if they want to make best use of their limited resources. I should have written my research with that as the most important concern.

User Stories


I presented my blog to my writing circle last week. The feedback: who is this blog actually for? They challenged me to write a set of user stories to make this clear. After much procrastination I realised that the blog was, more or less, for me. A chance to yell at the clouds. And there is little point to that. But I think I have something to say, and I think there are people who might find what I have to say useful, maybe even interesting. Here I present to you, dear reader, my user stories.

Brief Interlude: What Are You Talking About?

But first, as this is academia and not software development, a brief interlude on what user stories are. There is a movement in software development called agile, or scrum. I won’t go into the messy details here other than to say this is how we run many of our teams at the Institute for Positive Psychology and Education. The bit I want to talk about is the dedicated focus on the end users of the content we produce. To do this, we write short (1–2 sentence) stories about a particular person and the problem they would like solved. The team then sets about solving that problem. For example, we might consider the problem of an education minister who is unsure whether to increase the number of selective high schools. We then go about conducting research that could inform that decision.

My User Stories

  1. My reader is a social scientist who worries they aren’t smart enough. They read their boss’s impenetrable prose and worry that their own simple writing will never achieve this level of ‘elevation’ (good, I hope it never does!).
  2. My reader wants their work to impact people. They want to do research that people can use, not research that merely sits in some journal few will ever read.

This is me. My writing still drips with false complexity, with affected sophistication. And I wonder if any of the people I research could read what I write and apply it to their life in some tangible way. Maybe as I try to wake up from this social science stupor, I might have something interesting to share with you along the way.