Statistics for Wildlifers: How much and what kind?
What should a wildlifer know about statistics?
Statistical training should leave a wildlifer with certain basic knowledge. We
illustrate some of it here, not as an exhaustive list, but simply to exemplify
key points. Again, these examples are based on our own experiences and represent
common sources of confusion among biologists. Some technical facts that should
be known are:
What a confidence interval is, and is not
The difference between an estimator and an estimate
What a least-squares estimator is, what a maximum-likelihood estimator is,
and that the two are sometimes the same, but not always
The meaning of bias, precision, and accuracy
That some biased estimators may actually be better than unbiased estimators,
for example by having a smaller mean squared error
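The points about bias, precision, and accuracy, and about biased estimators sometimes being preferable, can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data are normal with a known variance of 1, samples have 10 observations, and accuracy is measured by mean squared error (MSE). The function name `compare_variance_estimators` and all of its parameters are made up for this example. It compares the usual unbiased variance estimator (dividing by n - 1) with the maximum-likelihood estimator under normality (dividing by n), which is biased low.

```python
import random
import statistics

def compare_variance_estimators(n=10, reps=20000, sigma=1.0, seed=1):
    """Simulate bias and mean squared error (MSE) of two variance estimators.

    Assumption for this sketch: data are normal with mean 0 and known sigma.
    """
    rng = random.Random(seed)
    true_var = sigma ** 2
    unbiased, mle = [], []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        ss = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations
        unbiased.append(ss / (n - 1))  # usual unbiased estimator
        mle.append(ss / n)             # maximum-likelihood estimator (biased low)

    def summarize(estimates):
        # Bias: average estimate minus the truth.
        bias = statistics.fmean(estimates) - true_var
        # MSE: average squared distance from the truth (bias^2 + variance).
        mse = statistics.fmean([(e - true_var) ** 2 for e in estimates])
        return bias, mse

    return {"unbiased": summarize(unbiased), "mle": summarize(mle)}

results = compare_variance_estimators()
for name, (bias, mse) in results.items():
    print(f"{name:9s} bias = {bias:+.3f}   MSE = {mse:.3f}")
```

In theory, for normal data the divide-by-n estimator has the smaller MSE even though it is biased, because its reduced variability more than offsets the bias. The simulation should reflect that, although the exact numbers depend on the seed and settings chosen.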
Some facts that reflect an understanding, often based on experience, are:
That most data are not distributed normally
That, nonetheless, most means are distributed nearly normally, even with
modest sample sizes
That parametric tests do not always require that data be normal
That, in most situations, estimation is more useful and appropriate than
hypothesis testing
That determining valid sample units is sometimes challenging: which units
are really independent, and what distinguishes true replication from
pseudoreplication (Hurlbert 1984)?
That sophisticated methods are not always better than simpler ones; complexity
can confuse more than it clarifies
That a random sample may not be representative of the population from which
the sample was drawn
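The first two facts in this list, that data are often far from normal while means are nearly normal, can be checked by simulation. The following is a minimal sketch, assuming strongly skewed data drawn from an exponential distribution and samples of 30 observations. Both choices, and the helper names `skewness` and `simulate_means`, are ours for illustration only. Skewness is used as a rough gauge of departure from normality; it is 0 for a normal distribution.

```python
import math
import random
import statistics

def skewness(values):
    """Sample skewness: mean cubed deviation divided by the cubed standard deviation."""
    m = statistics.fmean(values)
    sd = math.sqrt(statistics.fmean([(v - m) ** 2 for v in values]))
    return statistics.fmean([(v - m) ** 3 for v in values]) / sd ** 3

def simulate_means(n=30, reps=5000, seed=2):
    """Draw `reps` samples of size `n` from a skewed (exponential) distribution.

    Returns all raw observations and the list of sample means.
    """
    rng = random.Random(seed)
    raw, means = [], []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        raw.extend(sample)
        means.append(statistics.fmean(sample))
    return raw, means

raw, means = simulate_means()
print(f"skewness of raw observations: {skewness(raw):.2f}")
print(f"skewness of sample means:     {skewness(means):.2f}")
```

In theory the exponential distribution has skewness 2, and the mean of 30 such observations has skewness 2 / sqrt(30), or about 0.37, so the means should look much closer to normal than the data themselves.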
Some caveats about the practice of statistics are worthwhile to consider:
Results from unfamiliar statistical packages or statistical procedures
should be viewed with a certain amount of skepticism. That holds for familiar
packages too, but it is especially important for unfamiliar ones. When
using new software or methods, it is often worthwhile to make up some data
with known properties, or find trusted data with known properties, and use
them to test the new tools.
Data dredging can be dangerous (Anderson et al. 2001). Avoid "beating
the data till they confess." Think of the questions you want to ask before
looking at the data. If you find some new and unexpected patterns in the
data, that is great, but use that occasion to develop a question to ask
of a fresh set of data rather than testing to see if the pattern is "significant"
with the data already in hand.
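The first caveat above, testing new tools on made-up data with known properties, might look like the following sketch. Everything here is hypothetical: `fit_line` stands in for whatever unfamiliar procedure is being checked (here a plain least-squares line fit), and `make_known_data` builds data with an intercept of 2 and a slope of 3 that we chose arbitrarily. The idea is simply to see whether the tool recovers values we already know.

```python
import random

def fit_line(xs, ys):
    """Hypothetical stand-in for the tool under test: least-squares line fit."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def make_known_data(intercept=2.0, slope=3.0, noise_sd=1.0, n=500, seed=3):
    """Made-up data with known properties: a straight line plus normal noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

xs, ys = make_known_data()
est_intercept, est_slope = fit_line(xs, ys)
print(f"true intercept 2.0, estimated {est_intercept:.2f}")
print(f"true slope     3.0, estimated {est_slope:.2f}")
```

If the estimates land far from the known values, or if a noise-free data set is not reproduced exactly, that is a signal to distrust either the tool or one's understanding of how to use it.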
Some notions are simply good science:
Focus on analyzing the problem, not the data.
There is likely no single right analysis, or single right model, to use
(Burnham and Anderson 1998). Different analyses, or different models, may
appropriately be used in any situation. If different analyses give substantially
the same results, one can have greater confidence in those results. This is
especially true if the analyses are based on different sets of assumptions.
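As one possible illustration of comparing analyses that rest on different assumptions, the sketch below computes two 95% confidence intervals for a mean from the same made-up data. One interval relies on normal theory and the other on a percentile bootstrap, which makes no normality assumption. The data (40 draws from a skewed gamma distribution), the function names `normal_ci` and `bootstrap_ci`, and the number of resamples are all assumptions chosen for the example.

```python
import random
import statistics

def normal_ci(data, level=0.95):
    """Normal-theory interval: mean +/- z * standard error."""
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    mean = statistics.fmean(data)
    se = statistics.stdev(data) / len(data) ** 0.5
    return mean - z * se, mean + z * se

def bootstrap_ci(data, level=0.95, resamples=2000, seed=4):
    """Percentile bootstrap interval: resample the data with replacement."""
    rng = random.Random(seed)
    n = len(data)
    boot_means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(resamples)
    )
    # Approximate positions of the lower and upper percentiles.
    lo_index = int((1 - level) / 2 * resamples)
    hi_index = int((1 + level) / 2 * resamples) - 1
    return boot_means[lo_index], boot_means[hi_index]

# Made-up, skewed data for illustration only.
data_rng = random.Random(5)
data = [data_rng.gammavariate(2.0, 3.0) for _ in range(40)]

n_lo, n_hi = normal_ci(data)
b_lo, b_hi = bootstrap_ci(data)
print(f"normal-theory interval: ({n_lo:.2f}, {n_hi:.2f})")
print(f"bootstrap interval:     ({b_lo:.2f}, {b_hi:.2f})")
```

When the two intervals roughly agree, as one would expect for a mean at this sample size, the conclusion does not hinge on the normality assumption. A marked disagreement would be a reason to look more closely at the data and at both methods.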