How tough should reviewers be?

December 18, 2014

By Justin Esarey

At lunch with two colleagues the other day, an interesting question came up: how often should we as reviewers aim to give favorable reviews (conditional acceptances and strong revise-and-resubmit recommendations) to articles at selective, high-prestige journals?

It’s a complicated question, and I’m not aiming to cover every possible angle. Rather, I’m assuming as part of this question that reviewers and journal editors are aiming to publish a certain proportion of submitted articles that represent the best research being produced at that time. For example, the American Political Science Review might aim to publish 8-10% of the articles that it receives (presumably the best 8-10%!). To start off, I’m also assuming that unanimous support from three reviewers is necessary and sufficient to receive an invitation to revise and resubmit; I’ll relax this assumption later. For ease of interpretation, I assume that all articles invited to R&R are published.

What I want to know is: if reviewers agree with the journal editor’s target, how often should they grant strong reviews to articles?

The answer is surprising to me: presuming that reviewer opinions are less-than-perfectly correlated and that unanimous reviewer approval is required for acceptance, reviewers should probably be giving positive reviews 25% of the time or more in order to achieve an overall acceptance rate of about 10%. 
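A quick independence benchmark helps show why (this back-of-the-envelope arithmetic is my own illustration, not part of the replication code): if each of three reviewers approves independently with probability p, unanimous approval occurs with probability p^3, so a 10% overall rate would require each reviewer to approve roughly 46% of submissions; correlation between reviewers is what pulls that requirement down toward 25%.

0.10^(1/3)   # per-reviewer approval rate needed for a 10% unanimity rate under independence: about 0.46
0.25^3       # with independent reviewers each approving 25% of the time, only about 1.6% pass unanimously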

How did I arrive at this answer? Using R, I simulated a review process in which three reviews are generated for each paper. Each reviewer grants a favorable review with probability pr.accept; the reviews are correlated with coefficient rho, which I vary between 0 and 0.98. I generated review outcomes for 2,000 papers under this process, then calculated the proportion of papers accepted under the unanimity rule. The code looks like this (the entire replication code base is here):

library(copula)
 
# grid of reviewer-opinion correlations and the individual approval probability
rho <- seq(from=0, to=0.98, by=0.02)
pr.accept <- 0.5   # re-run with 0.25 and 0.1 for the other curves
 
accept <- c()
for(k in 1:length(rho)){
  # three correlated uniform draws per paper; a draw below pr.accept is a favorable review
  reviews <- rCopula(2000, normalCopula(param=c(rho[k], rho[k], rho[k]), dim=3, dispstr="un")) < pr.accept
  # unanimity rule: a paper is accepted only if all three reviews are favorable
  decisions <- apply(X=reviews, MARGIN=1, FUN=min)
 
  # acceptance rate
  accept[k] <- sum(decisions)/length(decisions)
}
 
plot(accept ~ rho, ylim=c(0, 0.45), col=gray(0.5), ylab = "proportion of accepted manuscripts", xlab = "correlation of reviewer opinions", main=c("How Tough Should Reviewers Be?", "3 Reviewers, Unanimity Approval Needed"))
y.fit <- predict(loess(accept~rho))
lines(y.fit~rho, lty=1)

I plot the outcome below for three individual reviewer pr.accept values: 50%, 25%, and 10%.
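The replication file linked above presumably produces this overlay; here is a sketch of one way it might be done, re-running the simulation above for each pr.accept value and adding a smoothed line to the same axes (the line types and legend placement are my own choices):

pr.accept.values <- c(0.50, 0.25, 0.10)
plot(NULL, xlim=c(0, 0.98), ylim=c(0, 0.45), ylab="proportion of accepted manuscripts", xlab="correlation of reviewer opinions", main=c("How Tough Should Reviewers Be?", "3 Reviewers, Unanimity Approval Needed"))
for(j in 1:length(pr.accept.values)){
  accept <- c()
  for(k in 1:length(rho)){
    reviews <- rCopula(2000, normalCopula(param=c(rho[k], rho[k], rho[k]), dim=3, dispstr="un")) < pr.accept.values[j]
    accept[k] <- mean(apply(X=reviews, MARGIN=1, FUN=min))
  }
  fit <- predict(loess(accept ~ rho))
  lines(fit ~ rho, lty=j)
}
legend("topleft", legend=paste("pr.accept =", pr.accept.values), lty=1:length(pr.accept.values))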

[Figure: proportion of accepted manuscripts against the correlation of reviewer opinions, three reviewers with unanimity required; one curve per pr.accept value.]

What’s most interesting is that the less correlated reviewer opinions are, the more frequently individual reviewers should be inclined to grant a positive review in order to hit the overall publication target. If reviewer opinions are not at all correlated, then only a little more than 10% of articles will actually receive an invitation to revise and resubmit even when reviewers recommend R&R 50% of the time. If reviewer opinions are correlated at 0.6, an individual reviewer approval rate of 25% corresponds to an overall publication rate of a little under 10%.

What if the editor is more actively involved in the process, and unanimity is not required? I added a fourth reviewer (the editor) to the simulation, and required that 3 out of the 4 reviews be positive in order for an article to be invited to R&R. This means that the editor and two reviewers, or all three reviewers, have to like an article in order for it to be published.
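A sketch of this variant, adapting the unanimity code above (the exchangeable four-dimensional copula and the counting rule are my reconstruction of the setup described here, not the original replication code):

pr.accept <- 0.25
accept.34 <- c()
for(k in 1:length(rho)){
  # four correlated opinions per paper: three reviewers plus the editor
  reviews <- rCopula(2000, normalCopula(param=rho[k], dim=4, dispstr="ex")) < pr.accept
  # an article is invited to R&R if at least 3 of the 4 opinions are favorable
  decisions <- apply(X=reviews, MARGIN=1, FUN=sum) >= 3
  accept.34[k] <- mean(decisions)
}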

The results are below. As you can see, acceptances do go up: now, if reviewer opinions are correlated at 0.6, slightly over 10% of papers are eventually published.

[Figure: proportion of accepted manuscripts against the correlation of reviewer opinions, an editor plus three reviewers with 3-of-4 approval required.]

I think the conclusion to draw from this analysis is that individual reviewers need not be extremely demanding in order for the system as a whole to be quite stringent. If a reviewer aims to accept 10% of papers on the theory that the journal wishes to accept 10% of papers, probably all s/he accomplishes is ensuring that his/her area is underrepresented in the overall distribution of publications.

Make no mistake: I don’t think my analysis indicates that reviewers should be less critical in the substantive evaluation of a manuscript, or that review standards should be lowered in some sense. Rather, I think that reviewers should recognize that achieving even majority support for a paper is quite challenging, and they should be individually more willing to give papers with scholarly merit a chance to be published even if they don’t believe the paper is in their personal top 10% of publications. It might be better if reviewers instead aimed to accept papers in their personal top 25%, recognizing that the process as a whole will still filter out a great many of these papers.