The papers and web pages linked below are largely non-technical discussions of statistical and methodological issues.  If you have a 'must read' suggestion for us, email

Scientists rise up against statistical significance: Valentin Amrhein, Sander Greenland, Blake McShane, Nature, March 20, 2019.

"When was the last time you heard a seminar speaker claim there was ‘no difference’ between two groups because the difference was ‘statistically non-significant’? ..."

Statistics Notes: Absence of evidence is not evidence of absence: Douglas Altman. BMJ 1995; 311.

  • In one of the most-cited entries in his BMJ Statistics Notes series, Douglas Altman distinguishes the two scenarios in the title and explains what we can conclude when we fail to find the evidence we were looking for.
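Altman's point can be made concrete with a small sketch (the effect estimate and standard error below are hypothetical, and a normal approximation is assumed): a 'non-significant' p-value can coexist with a confidence interval wide enough to include both no effect and a clinically important one, so the study provides no evidence of absence.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function (stdlib only)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical trial result: estimated treatment effect 10 units, SE 8.
effect, se = 10.0, 8.0
z = effect / se
p = 2.0 * (1.0 - normal_cdf(abs(z)))            # two-sided p-value, ~0.21
ci = (effect - 1.96 * se, effect + 1.96 * se)   # 95% CI, ~(-5.7, 25.7)

# p > 0.05, yet the interval is compatible with no effect AND with a
# large benefit -- "absence of evidence", not "evidence of absence".
print(f"p = {p:.2f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
```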

An Applied Statistician's Creed:  Marks R. Nester. Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 45, No. 4 (1996), pp. 401-410

  • Abstract: Hypothesis testing, as performed in the applied sciences, is criticized. The assumptions that the author believes should be axiomatic in all statistical analyses are listed. These assumptions render many hypothesis tests superfluous. The author argues that the image of statisticians will not improve until the nexus between hypothesis testing and statistics is broken.

The Difference Between “Significant” and “Not Significant” is not Itself Statistically Significant: Andrew Gelman and Hal Stern. The American Statistician, November 2006, Vol. 60, No. 4  

  • From the introduction: It is common to summarize statistical comparisons by declarations of statistical significance or nonsignificance. Here we discuss one problem with such declarations, namely that changes in statistical significance are often not themselves statistically significant. 
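The problem Gelman and Stern describe can be reproduced in a few lines (hypothetical estimates in the spirit of the paper's example; independence of the two studies and a normal approximation are assumed):

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a z statistic (normal approximation)."""
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

# Study A: effect 25, SE 10 -> z = 2.5, "statistically significant".
# Study B: effect 10, SE 10 -> z = 1.0, "not significant".
est_a, se_a = 25.0, 10.0
est_b, se_b = 10.0, 10.0
p_a = two_sided_p(est_a / se_a)   # ~0.012
p_b = two_sided_p(est_b / se_b)   # ~0.317

# But the comparison BETWEEN the studies (assuming independence):
diff = est_a - est_b                      # 15
se_diff = math.sqrt(se_a**2 + se_b**2)    # ~14.1
p_diff = two_sided_p(diff / se_diff)      # ~0.29 -- the difference between
                                          # "significant" and "not significant"
                                          # is itself far from significant
```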

The ASA's Statement on p-Values: Context, Process, and Purpose, Ronald L. Wasserstein & Nicole A. Lazar (2016) The American Statistician, 70:2, 129-133.

  • The motivation for this statement from The American Statistical Association comes from the introduction to the paper: 

    In February 2014, George Cobb, Professor Emeritus of Mathematics and Statistics at Mount Holyoke College, posed these questions to an ASA discussion forum: 

    Q: Why do so many colleges and grad schools teach p = 0.05?

    A: Because that's still what the scientific community and journal editors use.

    Q: Why do so many people still use p = 0.05?

    A: Because that's what they were taught in college or grad school.

    Cobb's concern was a long-worrisome circularity in the sociology of science based on the use of bright lines such as p < 0.05: “We teach it because it's what we do; we do it because it's what we teach.” This concern was brought to the attention of the ASA Board.

The Immortal Time Bias, James Hanley and Bethany Foster, International Journal of Epidemiology, Volume 43, Issue 3, 1 June 2014, Pages 949–961.

  • This example is taken from the opening to the paper:
    • Some time ago, while conducting research on U.S. presidents, I noticed that those who became president at earlier ages tended to die younger. This informal observation led me to scattered sources that provided occasional empirical parallels and some possibilities for the theoretical underpinning of what I have come to call the precocity-longevity hypothesis. Simply stated, the hypothesis is that those who reach career peaks earlier tend to have shorter lives. (Stewart JH McCann. Personality and Social Psychology Bulletin 2001;27:1429–39)
  • If you are not sure what methodological blunder this entails, then click on the link above and learn. 
  • One of the BRU biostatisticians - Ella Huszti - has written about this in reply to a rather famous paper: Do Oscar Winners Live Longer than Less Successful Peers? A Reanalysis of the Evidence
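A minimal simulation makes the blunder visible (the lifespan and peak-age distributions below are made up purely for illustration): even when age at career peak has no effect whatsoever on lifespan, conditioning on having survived to one's peak manufactures the precocity-longevity pattern, because late peakers must have lived long enough to peak late.

```python
import random
import statistics

random.seed(42)

lifespans_early, lifespans_late = [], []
for _ in range(100_000):
    lifespan = random.gauss(75, 10)     # lifespan independent of career timing
    peak_age = random.uniform(30, 70)   # age at career peak
    if peak_age >= lifespan:
        continue                        # can only be observed peaking while
                                        # alive -- this conditioning is the bias
    if peak_age < 45:
        lifespans_early.append(lifespan)
    else:
        lifespans_late.append(lifespan)

mean_early = statistics.mean(lifespans_early)
mean_late = statistics.mean(lifespans_late)
# mean_early < mean_late: early peakers appear to "die younger" even though,
# by construction, peak age has no causal effect on lifespan.
```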

How to randomize, Andrew J. Vickers, Journal of the Society for Integrative Oncology, 2006; 4(4): 194–198.

  • Abstract:  Randomized trials are an important method for deciding whether integrative oncology therapies do more good than harm. Many investigators do not pay sufficient attention to randomization procedures and several studies have shown that only a fraction of trial reports describe randomization adequately ...

  • This short paper reviews the aims of randomization in clinical trials and concludes that the only safe way to ensure that these aims are met is through computer-based randomization.
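One common computer-based scheme is permuted-block randomization, sketched below (the function name, block size, arm labels, and fixed seed are illustrative choices, not taken from the paper):

```python
import random

def permuted_block_randomization(n_participants, block_size=4, seed=2024):
    """Assign participants to arms 'A' or 'B' using randomly shuffled,
    balanced blocks. The fixed seed is only for reproducibility here."""
    assert block_size % 2 == 0, "block size must be even for 1:1 allocation"
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_participants:
        block = ["A"] * (block_size // 2) + ["B"] * (block_size // 2)
        rng.shuffle(block)   # each block is balanced; its order is random
        assignments.extend(block)
    return assignments[:n_participants]

alloc = permuted_block_randomization(12)
# Every complete block of 4 contains exactly two A's and two B's, so the
# arm sizes can never drift far apart during recruitment.
```

In practice the sequence would be generated once, concealed from recruiting clinicians, and applied strictly in order - concealment, not just the algorithm, is what the paper stresses.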

In "Redefine Statistical Significance", Daniel J. Benjamin, James O. Berger, and 70 more authors make a proposal that can be summarized in one sentence: "We propose to change the default P-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005. "

In "Justify Your Alpha", Daniel Lakens, Federico G. Adolfi, and 86 more authors rebut the previous paper and argue that  no single alpha value suits all purposes and that researchers should transparently report all study design choices, including choice of alpha.