The Insignificance of Statistical Significance Testing


Replication is a cornerstone of science. If results from a study cannot be reproduced, they have no credibility. Scale is important here. Conducting the same study at the same time but at several different sites and getting comparable results is reassuring, but not nearly so convincing as having different investigators achieve similar results using different methods in different areas at different times. R. A. Fisher's idea of solid knowledge was not a single extremely significant result, but rather the ability to repeatedly obtain results significant at 5% (Tukey 1969). Shaver (1993:304) observed that "The question of interest is whether an effect size of a magnitude judged to be important has been consistently obtained across valid replications. Whether any or all of the results are statistically significant is irrelevant." Replicated results automatically make statistical significance testing unnecessary (Bauernfeind 1968).

Individual studies rarely contain sufficient information to support a final conclusion about the truth or value of a hypothesis (Schmidt and Hunter 1997). Studies differ in design, measurement devices, samples included, weather conditions, and in many other ways. This variability among studies is more pervasive in ecological situations than in, for example, the physical sciences (Ellison 1996). To have generality, results should be consistent under a wide variety of circumstances. Meta-analysis provides some tools for combining information from repeated studies (e.g., Hedges and Olkin 1985) and can reduce dependence on significance testing by examining replicated studies (Schmidt and Hunter 1997). Meta-analysis can be dangerously misleading, however, if nonsignificant results or results that do not conform to the conventional wisdom are less likely to have been published.
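As a rough illustration of both points, the sketch below pools effect sizes from five replications using inverse-variance (fixed-effect) weighting, one common meta-analytic tool, and then repeats the calculation using only the studies that reached significance at 5%. The effect sizes, standard errors, and the function name pooled_effect are hypothetical and chosen only for illustration; they are not taken from any study cited here, and a real analysis would also need to consider heterogeneity among studies.

```python
import math


def pooled_effect(effects, std_errors):
    """Fixed-effect (inverse-variance) pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]  # more precise studies weigh more
    total = sum(weights)
    estimate = sum(w * d for w, d in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)


# Hypothetical effect sizes and standard errors from five replications (assumed values).
effects = [0.10, 0.45, 0.30, 0.05, 0.60]
std_errors = [0.20, 0.20, 0.20, 0.20, 0.20]

# Pool every replication, significant or not.
all_est, all_se = pooled_effect(effects, std_errors)

# Mimic selective publication: keep only studies with |z| > 1.96.
published = [(d, se) for d, se in zip(effects, std_errors) if abs(d / se) > 1.96]
pub_est, pub_se = pooled_effect(*zip(*published))

print(f"all replications: {all_est:.3f} (SE {all_se:.3f})")
print(f"'significant' only: {pub_est:.3f} (SE {pub_se:.3f})")
```

With these assumed numbers, only two of the five replications pass the 5% filter, and pooling those two alone gives a noticeably larger estimate than pooling all five. This is the sense in which a meta-analysis of a selectively published literature can mislead.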

