The Statistical Crisis in Science and What We Can Do About It
There is a statistical crisis in science, particularly in psychology, where many celebrated findings have failed to replicate and where careful analysis has revealed that many celebrated research projects were fatally flawed in their design, in the sense that they never had sufficiently accurate data to answer the questions they were attempting to resolve. The statistical methods that revolutionized science in the 1930s-1950s no longer seem to work in the 21st century. How can this be? It turns out that when effects are small and highly variable, the classical approach of black-box inference from randomized experiments or observational studies no longer works as advertised. We discuss the conceptual barriers that have allowed researchers to avoid confronting these issues, which arise not just in psychology but also in policy research, public health, and other fields. To do better, we recommend three steps: (a) designing studies from a perspective of realism rather than gambling or hope, (b) collecting higher quality data, and (c) analyzing data in ways that combine multiple sources of information.
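The point about small, highly variable effects can be illustrated with a quick simulation (an illustrative sketch, not from the talk; the effect size and noise level are arbitrary assumptions): when a study's standard error is large relative to the true effect, the estimates that happen to reach statistical significance must be far from the truth, so the published, "significant" results systematically exaggerate the effect.

```python
import random
import statistics

random.seed(1)
true_effect = 0.1   # assumed small true effect
se = 1.0            # assumed standard error of each study's estimate

# Simulate many noisy studies and keep only the "statistically
# significant" ones (|estimate| > 1.96 standard errors).
significant = []
for _ in range(100_000):
    estimate = random.gauss(true_effect, se)
    if abs(estimate) > 1.96 * se:
        significant.append(abs(estimate))

# Average exaggeration among significant results: how many times
# larger the reported effect is than the true effect.
exaggeration = statistics.mean(significant) / true_effect
print(f"Significant estimates overstate the true effect by ~{exaggeration:.0f}x")
```

Under these assumed numbers, conditioning on significance inflates the estimated effect by an order of magnitude, which is one concrete sense in which black-box inference "no longer works as advertised" in low-power settings.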
Some of the material in the talk appears in our recent papers, “The failure of null hypothesis significance testing when studying incremental changes, and what to do about it” (http://www.stat.columbia.edu/~gelman/research/published/incrementalism_3.pdf) and “Some natural solutions to the p-value communication problem—and why they won’t work” (http://www.stat.columbia.edu/~gelman/research/published/jasa_signif_2.pdf).