USGS - science for a changing world

Northern Prairie Wildlife Research Center


Using Known Populations of Pronghorn to Evaluate
Sampling Plans and Estimators


North Dakota Game and Fish Department personnel counted pronghorn in the 2 study areas by flying east-west linear strip transects that extended the length of the study area and were 0.8 km apart. Transects were searched 0.4 km on each side of the aircraft. Observers and pilots were experienced in surveys of pronghorn. Transects were 2.4-41.6 km long in the Bowman area and 1.6-64.0 km long in the Slope area. A Piper Super Cub was flown 96-128 km/hour at an altitude of 100-115 m. When the pilot or an observer sighted pronghorn, the aircraft circled the herd so that all pronghorn in the herd could be counted. Where pronghorn detectability might be lower due to heterogeneous habitat, areas were searched thoroughly at an altitude of 25 m. We recorded the number of pronghorn counted in each quarter section (0.65 km²) on field maps.

We used 2 surveys of the Bowman area, 1 in July 1979 and the other in July 1987, in which 201 and 630 pronghorn were seen, respectively, and a single July 1986 survey of the Slope area, in which 350 pronghorn were seen. We believe that counts were virtually exact because of open terrain, narrow transect width, high visibility of pronghorn, and careful searching methods (Pojar et al. 1995). Nonetheless, because we could not determine visibility bias for the surveys, our results are conditional on the observed distribution of pronghorn.

Sampling Plans

A sampling plan involves defining and selecting the sampling unit, choosing a sample size, and deciding on stratification. In addition, a population estimator must be selected. We selected combinations of sampling plans and estimators on the basis of previous use, suggestions by other researchers, or potential for producing valid estimates.

The sampling unit was a 0.8-km-wide linear transect of variable length (Table 1), depending on the size and shape of the study area or stratum. We examined 3 methods for selecting sampling units: (1) simple random sampling without replacement (SRS) (Cochran 1977:18), (2) probability proportional to size with replacement (PPS), and (3) systematic sampling (SYS). Under SRS, each sampling unit had an equal chance of being selected. With PPS sampling, the probability of choosing a sampling unit was proportional to the area of the sampling unit. With SYS, units were numbered 1 to M, where the total number of sampling units was M = mp, m was the sample size selected from M units, and p was the number of possible systematic samples. The first unit was randomly chosen from among the first p units, and then every pth unit thereafter was selected.
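The three selection methods can be sketched in a few lines of Python. This is only an illustration of the definitions above, not the code used in the study: the function names, the random seed, and the transect lengths are hypothetical, and only the count of 48 transects and the 0.8-km strip width come from the text.

```python
import random

def srs(units, m, rng):
    """Simple random sampling without replacement: equal chance per unit."""
    return rng.sample(units, m)

def pps(units, areas, m, rng):
    """Sampling with replacement, probability proportional to unit area."""
    return rng.choices(units, weights=areas, k=m)

def systematic(units, p, rng):
    """Random start among the first p units, then every pth unit after it."""
    start = rng.randrange(p)
    return units[start::p]

rng = random.Random(1)                               # fixed seed, for repeatability only
units = list(range(1, 49))                           # M = 48 transects
lengths_km = [2.4 + 0.8 * i for i in range(48)]      # hypothetical transect lengths
areas_km2 = [0.8 * length for length in lengths_km]  # 0.8-km-wide strips

m, p = 16, 3                                         # M = m * p = 48
srs_sample = srs(units, m, rng)
pps_sample = pps(units, areas_km2, m, rng)
sys_sample = systematic(units, p, rng)

# With M = mp there are exactly p possible systematic samples.
all_sys_samples = [units[s::p] for s in range(p)]
```

Because PPS draws with replacement, `pps_sample` may contain the same transect more than once, whereas `srs_sample` never does.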

Table 1. Size of study areas, number of sampling units (M), total count (N) of pronghorn, and variance of N for study areas with and without stratification in Bowman (1979 and 1987) and Slope counties (1986), North Dakota.

                                                                  First survey (a)       Second survey (b)
Study area             Area (km²)    M    Transect lengths (km)    N     Variance (c)     N     Variance
Bowman
   Total                  1,242     48    2.4-41.6                201       51.4         630      373.5
   Grassland stratum        486     30    7.2-27.3                185       61.4         355      311.1
   Mixed stratum            756     48    2.4-36.9                 16        2.0         275       95.8
Slope
   Total                  2,387     62    1.6-64.0                350       62.5
   Grassland stratum      1,690     48    18.0-48.9               343       69.5
   Mixed stratum            697     76    1.3-28.9                  7        0.1

(a) Jul 1979 for Bowman area, Jul 1986 for Slope area.
(b) Jul 1987 for Bowman area only.
(c) Population variance, $\sigma^2 = \sum_{i=1}^{M} (n_i - \mu)^2 / M$, where $n_i$ is the count on unit i, $\mu$ is the population mean, and M is the total no. of transects.

We considered 3 levels of sampling intensity: 16, 33, and 50% of the total number of sampling units. Except in the stratified Slope area, the percentage of the area sampled was within 2% of sampling intensity.

We considered stratification and no stratification of study areas. On the basis of 1974 LANDSAT data, we stratified each study area into 2 vegetational types, a grassland stratum and a mixed stratum, thought to correspond to areas of high and low use, respectively, by pronghorn. The grassland stratum contained extensive grassland; the mixed stratum was composed of cultivated lands, badlands, and a small amount (10-14%) of grassland. We used the same stratification for both years in the Bowman area. The grassland stratum was smaller than the mixed stratum in the Bowman area, but the reverse was true for the Slope area (Table 1).

Estimators of Abundance

Depending on the selection method, we evaluated 1-4 estimators of abundance: simple (Cochran 1977:22-26, 207, 224), probability proportional to size (pps; note use of lower case to distinguish the estimator from PPS sampling) (Cochran 1977:253-254), separate ratio, and combined ratio estimators (Cochran 1977:150-162). We used the area of the sampling unit as the auxiliary variable for the pps and ratio estimators. When the surveyed area was stratified, an abundance estimate ($\hat{N}_j$) and its variance were calculated independently in each stratum. Estimated overall abundance ($\hat{N}$) and its variance were obtained by summing estimates across strata.
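For readers who want to see the arithmetic, the unstratified estimators can be sketched as follows. This is a minimal illustration using the standard textbook forms (expansion estimator for SRS, with-replacement pps estimator, and ratio-to-area estimator); the function names and the four-transect example data are hypothetical and do not come from the study.

```python
from statistics import mean, variance

def simple_estimate(counts, M):
    """SRS expansion estimator and its estimated variance (with finite
    population correction). counts: animals seen on each sampled unit."""
    m = len(counts)
    n_hat = M * mean(counts)
    var_hat = M**2 * (1 - m / M) * variance(counts) / m
    return n_hat, var_hat

def pps_estimate(counts, areas, total_area):
    """pps estimator for sampling with replacement: each count is expanded
    by the inverse of its selection probability (area / total_area)."""
    m = len(counts)
    expanded = [y * total_area / a for y, a in zip(counts, areas)]
    n_hat = mean(expanded)
    var_hat = variance(expanded) / m
    return n_hat, var_hat

def ratio_estimate(counts, areas, total_area):
    """Ratio estimator: sampled density (animals per km^2) times total area."""
    return total_area * sum(counts) / sum(areas)

# Hypothetical sample of m = 4 transects from M = 12 covering 240 km^2.
counts = [0, 4, 10, 6]
areas = [8, 16, 32, 24]
simple_n, simple_var = simple_estimate(counts, M=12)
pps_n, pps_var = pps_estimate(counts, areas, total_area=240)
ratio_n = ratio_estimate(counts, areas, total_area=240)
```

Under stratification, the same functions would be applied within each stratum and the stratum estimates and variances summed, as described above. The combined ratio estimator, which pools strata before forming the ratio, is not shown.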

Once a sample size, m, had been selected, the number of sampling units chosen from each stratum could be determined in many ways. Stratum sample sizes, $m_j$, could be allocated to minimize the variance of the estimate, but this optimal allocation depended on the selection method and estimator used and on unknown population parameters (Cochran 1977:172). Optimal allocation with SRS using the simple estimator required that the population variance of the count in each stratum be known (Cochran 1977:97-98). We tested an approximation of an optimal allocation:

Equation 1

where $\hat{P}_j$ was the estimated proportion of pronghorn in stratum j, and $M_j$ was the total number of sampling units in stratum j. This method was optimal if sampling was SRS with the simple estimator and $\hat{P}_j$ (or equivalently $\hat{N}_j$) was proportional to the population variance of the count in the jth stratum. The method was similar to that used by Siniff and Skoog (1964) and placed greater sampling intensity where abundance was thought to be greater. For our evaluations, we asked a biologist familiar with western North Dakota, but who had not seen the pronghorn data, to estimate the proportion of pronghorn in each stratum. We used the same allocation method for all combinations of sampling plans and estimators and were able to compare our calculated sample sizes with the true optimal sample sizes because we had a known distribution of counts.
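As a point of reference only, the exact optimum for SRS with the simple estimator (Neyman allocation; Cochran 1977:97-98) can be written in the notation of this section, together with the form it takes under the proportionality condition stated above. Here $S_j^2$ (the population variance of the count in stratum j) and the index k are notation introduced for this sketch. The second expression is an inference from the stated condition and is not necessarily the form of Equation 1.

```latex
m_j = m \, \frac{M_j S_j}{\sum_k M_k S_k}
\qquad\text{and, if } S_j^2 \propto \hat{P}_j, \qquad
m_j = m \, \frac{M_j \sqrt{\hat{P}_j}}{\sum_k M_k \sqrt{\hat{P}_k}}
```

Applying the first expression to the 1979 Bowman values in Table 1 (30 grassland units with variance 61.4, 48 mixed units with variance 2.0) would place about 78% of the sample in the grassland stratum. This ignores the small difference between the divisor M used in Table 1 and the divisor M - 1 in Cochran's definition.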

Evaluation of Sampling Plans and Estimators

For each of the 3 known population distributions (Bowman area, 1979, 1987; Slope area, 1986), we drew 1,000 random samples of the specified size according to the specified selection method. For example, there were 48 transects in the Bowman area; for a simple random sample of 33% intensity, we randomly drew 16 transects with equal probability and without replacement. For systematic sampling, we drew all possible samples.

We compared combinations of sampling plans and estimators on the basis of 3 criteria: accuracy of the estimator, confidence interval coverage, and cost. Accuracy of the estimator, $\hat{N}$, was of primary importance for estimating abundance, N. A useful measurement of accuracy is the mean square error (MSE), which is the variance of the estimator plus the squared bias. For all simulations, the percent difference between MSE and variance was <1%, so MSE approximated variance. If variance was equal to MSE, then there was no bias and accuracy was the same as precision. We used the CV

Equation 2

as a measure of precision, facilitating comparisons across study areas and years. The smaller the CV, the more precise the estimator. For the simple and pps estimators, we could calculate the exact CV, but for the ratio estimators we used the estimated CV

Equation 3

where r was the number of repetitions of the simulation, and $\widehat{\mathrm{Var}}(\hat{N}_i)$ was the estimated variance of the population estimate for the ith simulation.
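The relationship between bias, variance, MSE, and CV can be checked with a small simulation. The sketch below assumes SRS with the simple estimator on a hypothetical set of 48 transect counts. It takes the CV as the standard error divided by the known abundance N, and reads the estimated CV as the square root of the mean estimated variance divided by N. Both are plausible readings of the definitions above rather than a reproduction of Equations 2 and 3.

```python
import random
from math import sqrt
from statistics import mean, pvariance, variance

rng = random.Random(42)                  # fixed seed, for repeatability only
counts = [0, 0, 0, 2, 5, 12] * 8         # hypothetical counts on M = 48 transects
M, m, r = len(counts), 16, 1000          # 33% intensity, 1,000 repetitions
N = sum(counts)                          # known abundance

estimates, var_hats = [], []
for _ in range(r):
    sample = rng.sample(counts, m)       # SRS without replacement
    estimates.append(float(M * mean(sample)))
    var_hats.append(M**2 * (1 - m / M) * variance(sample) / m)

bias = mean(estimates) - N
var = pvariance(estimates)               # variance of the estimator across runs
mse = mean((e - N) ** 2 for e in estimates)   # equals var + bias**2

sim_cv = sqrt(var) / N                   # CV from the simulated estimates
exact_cv = sqrt(M**2 * (1 - m / M) * variance(counts) / m) / N
est_cv = sqrt(mean(var_hats)) / N        # assumed form of the estimated CV
```

For an unbiased estimator such as this one, `bias` is close to zero, `mse` is close to `var`, and the three CV values agree closely.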

Coverage of the usual 95% confidence intervals was also an important criterion. For each simulation, we constructed nominal 95% confidence intervals:

Equation 4

where t was the 97.5th percentile of Student's t distribution with $m - 1$ df with no stratification and $m_1 + m_2 - 2$ df with stratification. For each combination of sampling plan and estimator, we calculated the confidence interval coverage as the percentage of confidence intervals containing N.
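Coverage can be estimated by counting how often the interval contains the known N. The sketch below assumes the usual interval form (estimate plus or minus t times the estimated standard error) for unstratified SRS with the simple estimator. The counts are hypothetical, and the t value for 15 df is hard-coded to keep the example free of third-party libraries.

```python
import random
from math import sqrt
from statistics import mean, variance

T_975_DF15 = 2.131                       # 97.5th percentile of t with m - 1 = 15 df

rng = random.Random(7)                   # fixed seed, for repeatability only
counts = [0, 0, 0, 2, 5, 12] * 8         # hypothetical counts on M = 48 transects
M, m, r = len(counts), 16, 1000
N = sum(counts)                          # known abundance

covered = 0
for _ in range(r):
    sample = rng.sample(counts, m)
    n_hat = M * mean(sample)
    se = sqrt(M**2 * (1 - m / M) * variance(sample) / m)
    if n_hat - T_975_DF15 * se <= N <= n_hat + T_975_DF15 * se:
        covered += 1

coverage = 100 * covered / r             # percentage of intervals containing N
```

With skewed counts and a small sample, `coverage` can fall below the nominal 95%, which is why it is treated as a separate criterion from precision.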

For simplicity, we calculated cost for each simulated survey as the sum of the lengths of the transects and the travel distances between transects. These costs were averaged across simulations under a particular sampling plan to get the cost for that plan.
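A cost function of this kind might look like the sketch below. It assumes that transects are indexed north to south at 0.8-km spacing, that the aircraft visits selected transects in index order, that travel between two transects equals their north-south separation, and that a transect drawn more than once under PPS is flown only once. None of these details is stated in the text, and the function name and example lengths are hypothetical.

```python
def survey_cost(selected, lengths_km, spacing_km=0.8):
    """Distance flown (km): selected transect lengths plus travel between them.

    selected:   indices of chosen transects (duplicates are flown once)
    lengths_km: length of every transect in the study area, by index
    """
    order = sorted(set(selected))
    on_transect = sum(lengths_km[i] for i in order)
    between = sum((b - a) * spacing_km for a, b in zip(order, order[1:]))
    return on_transect + between

# Hypothetical study area of 5 transects; transects 0, 2, and 4 selected.
lengths_km = [10.0, 20.0, 30.0, 40.0, 50.0]
cost = survey_cost([0, 2, 4], lengths_km)   # 90 km on transect + 3.2 km between
```

Averaging `survey_cost` over all simulated samples for a plan would give the plan-level cost described above.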

The large number of simulations we used ensured repeatability of results. To measure the performance of simulations, we calculated the CV of estimates of CV, coverage, and cost for a number of sampling plans and estimators. We did not perform significance tests because all comparisons would have been significant (P < 0.001) due to the large number of simulations.

Page Last Modified: Saturday, 02-Feb-2013 05:55:07 EST