4.5 Problems with the concept of statistical significance
The approach of summarizing by statistical significance has five pitfalls:
Statistical significance is not the same as practical importance.
Non-significance is not the same as zero.
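The difference between ‘significant’ and ‘not significant’ is not itself statistically significant. (E.g., 25 ± 10 is significant with p-value ≈ 0.012, while 10 ± 10 is not significant, with p-value ≈ 0.32. Yet the difference between the two estimates, 15 with standard error ≈ 14 if they are independent, is far from significant: p-value ≈ 0.29.)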
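The statistical significance filter: in small studies of small effects, standard errors are large, so any estimate that reaches significance must be big, and will therefore tend to overestimate the true effect.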
Researcher degrees of freedom, p-hacking, and forking paths.
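The arithmetic behind the third and fourth pitfalls can be checked with a short Python sketch. It assumes normal (z-test) p-values and independent estimates. The helper name two_sided_p and the simulation settings (true effect 2, standard error 10, 100,000 replications) are illustrative choices, not values from the text.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def two_sided_p(estimate, se):
    """Two-sided p-value for a normal z-test of estimate against zero."""
    z = abs(estimate) / se
    return 2 * (1 - NormalDist().cdf(z))

# Pitfall 3: compare the two estimates directly.
p_a = two_sided_p(25, 10)
p_b = two_sided_p(10, 10)
se_diff = sqrt(10**2 + 10**2)  # assumes the two estimates are independent
p_diff = two_sided_p(25 - 10, se_diff)
print(f"25 +/- 10:          p = {p_a:.3f}")
print(f"10 +/- 10:          p = {p_b:.3f}")
print(f"difference 15 +/- {se_diff:.1f}: p = {p_diff:.3f}")

# Pitfall 4: the significance filter, by simulation.
true_effect, se = 2.0, 10.0  # hypothetical small effect, noisy study
rng = random.Random(0)       # fixed seed so the run is reproducible
estimates = [rng.gauss(true_effect, se) for _ in range(100_000)]
# Keep only the replications that would be declared significant at 5%.
significant = [e for e in estimates if abs(e) > 1.96 * se]
mean_abs_significant = mean(abs(e) for e in significant)
print(f"share significant: {len(significant) / len(estimates):.3f}")
print(f"mean |estimate| among significant results: {mean_abs_significant:.1f}")
print(f"true effect: {true_effect}")
```

Every estimate that passes the filter has magnitude above 1.96 × 10 = 19.6, so under these assumed settings the significant results exaggerate a true effect of 2 by roughly a factor of ten or more.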
P-hacking is when the researcher intentionally fishes for ‘significance’ by trying multiple analysis approaches. However, even a well-intentioned researcher who computes a single test can be unintentionally ‘fishing’ in the ‘garden of forking paths’: the choice of that test, and other choices made along the way in the analysis, would likely have been different had the realized data been different. One scientific hypothesis can thus lead to many statistical hypotheses. Gelman and Loken (2014) is worth reading for more on this topic.
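A rough sense of how much multiple analysis choices inflate false positives can be had from a small simulation. This is a simplified sketch of explicit p-hacking, not of forking paths proper: it assumes there is no true effect, that each alternative analysis yields an independent standard-normal z-score (real alternative analyses of one data set are correlated), and that the researcher reports success if any analysis reaches the 5% level. The function name any_significant_rate and its parameters are hypothetical.

```python
import random

def any_significant_rate(n_analyses, n_sims=20_000, z_crit=1.96, seed=0):
    """Share of null simulations in which at least one of n_analyses
    independent z-tests exceeds the two-sided critical value."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(n_sims):
        # Under the null, each analysis gives z ~ N(0, 1).
        if any(abs(rng.gauss(0.0, 1.0)) > z_crit for _ in range(n_analyses)):
            hits += 1
    return hits / n_sims

rate_one = any_significant_rate(1)
rate_ten = any_significant_rate(10)
print(f"one analysis:  false positive rate ~ {rate_one:.3f}")
print(f"ten analyses:  false positive rate ~ {rate_ten:.3f}")
print(f"theory for ten independent tests: {1 - 0.95**10:.3f}")
```

With a single pre-specified analysis the false positive rate stays near the nominal 5%, while ten independent attempts push it to about 40%. With forking paths only one test is actually run, but because that test was chosen after seeing the data, its nominal p-value can be similarly misleading.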