USGS - science for a changing world

Northern Prairie Wildlife Research Center


A Comprehensive Review of Observational and Site Evaluation Data of Migrant Whooping Cranes in the United States, 1943-99


Data Set Processing

Data were originally stored as dBase IV files, with separate files for each state. We imported files into Excel and appended files from each state to create 1 file of observations and 1 file of site evaluations. We used SAS (SAS Institute, Inc. 1990) to search for and correct errors in identification codes such as non-matching years or seasons, duplicate main and sub-observation codes, and missing records in either observation or site evaluation files. Cranes that had arrived at Aransas NWR in fall traditionally were categorized as "winter" observations, including those arrivals in October and November. We categorized as "winter" observations a few cranes that were sighted in late December and January in areas away from Aransas NWR. Similarly, we categorized a few records occurring after 15 June in North Dakota as "summer" observations. For this report, we ignored those observations determined to be in "winter" or "summer."

Observation Number System

All sightings of whooping cranes reported to the USFWS Grand Island office were uniquely coded using year (e.g., 79) and season (A = spring [January-June] and B = fall [July-December]), in sequence of their reporting within the calendar year; this code is referred to as the "main observation" code. Confirmed, probable, and unconfirmed sightings were assigned such codes. If multiple sightings of a single crane or group of cranes occurred within an area and period (e.g., the same bird[s] were recorded during 1 or more days at several locations), an alphabetic character denoting "sub-observation" was added at the end of the main observation code to identify these additional, associated sighting locations (e.g., 98A-25A, 98A-25B, 98A-25C, etc.). Up to 12 such sub-observations occurred for any 1 main observation.
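The coding scheme can be illustrated with a short Python sketch that splits a code such as 98A-25B into its parts. The pattern is inferred only from the examples given here; the function name, the returned field names, and the assumption that all two-digit years fall in the 1900s (the data span 1943-99) are ours, not part of the original database.

```python
import re

# Pattern assumed from the examples in the text: two-digit year, season letter
# (A or B), hyphen, sequence number, optional sub-observation letter.
OBS_CODE = re.compile(r"^(\d{2})([AB])-(\d+)([A-Z]?)$")


def parse_observation_code(code):
    """Split a main/sub-observation code into its components (hypothetical helper)."""
    match = OBS_CODE.match(code.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized observation code: {code!r}")
    yy, season, sequence, sub = match.groups()
    return {
        "year": 1900 + int(yy),  # assumption: every record is from 19xx
        "season": "spring" if season == "A" else "fall",
        "main_number": int(sequence),
        "sub_observation": sub or None,  # None when no sub-observation letter
    }
```

A code that lacks the season letter is rejected rather than guessed at, which is one way such identification errors could be surfaced for manual checking.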

Only confirmed sightings were included in the 2 subsequent data sets developed from incidental sightings of whooping cranes: the observation data set (OBSERVATIONS) and site evaluations data set (EVALUATIONS). The observation data set includes basic information about whooping crane sightings for the period 1943-99, including dates, locations (description and legal system description), and numbers of adults and juveniles. The site evaluation data set reflects a subset of locations from the observation data, where biologists collected specific habitat information during the period 1977-99. See Appendices for detailed information about the variables within each data set.

We merged OBSERVATIONS and EVALUATIONS by state, main observation, and sub-observation codes and corrected problems noted during the merging process. Some inconsistencies that occurred in recording were checked and corrected as necessary (e.g., wrong season descriptor, difference in county recorded for observation vs. site evaluation data). We also added some specific records so that all unique locations, based on legal descriptions in each original database (e.g., another section or quarter-section), were represented. This expanded the merged SAS database (ALL.SD2) to a total of 1876 records. We used SAS to examine the data set for unusual records, missing data, etc. We found mostly minor errors in the data sets (e.g., variable entered off by 1 field, mistyped, errors in wetland size class). Overall, the data sets were quite clean considering their size and period of development.
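The merge was done in SAS; the following Python sketch only illustrates the kind of key-based join and consistency check described above. The field names (state, main_obs, sub_obs, county) and the function name are hypothetical, and the real files contain many more variables.

```python
def merge_records(observations, evaluations):
    """Join evaluations to observations on (state, main_obs, sub_obs).

    Returns the merged pairs, the keys of evaluations with no matching
    observation, and the keys whose county differs between the two files.
    """
    def key(rec):
        return (rec["state"], rec["main_obs"], rec["sub_obs"])

    obs_by_key = {key(rec): rec for rec in observations}
    merged, unmatched, county_conflicts = [], [], []
    for ev in evaluations:
        obs = obs_by_key.get(key(ev))
        if obs is None:
            unmatched.append(key(ev))  # evaluation without an observation record
            continue
        if obs.get("county") != ev.get("county"):
            county_conflicts.append(key(ev))  # flag for manual checking
        merged.append({"key": key(ev), "observation": obs, "evaluation": ev})
    return merged, unmatched, county_conflicts
```

Records flagged in the second and third lists would then be checked against the original forms and corrected by hand, as was done for the problems noted during merging.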

We created new codes for numeric observation and site identification based on year, season (spring, fall), main observation number, site (separate area for the same group of birds), and location (based on legal description). For example: 199710101001 includes year (columns 1-4), season (column 5; 1 = spring, 2 = fall), observation number within that year (maintaining original main observation number; columns 6-8), sub-observation number (previously recorded as A,B,C, etc.; columns 9-10), and location number (where there were multiple locations for that main observation and sub-observation code; columns 11-12). This code, referred to as LOCAT_ID, was developed to provide a consistent code format for referencing information at various levels (year, main observation, sub-observation, location).
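The fixed-width layout of LOCAT_ID lends itself to simple slicing. The sketch below, in Python rather than the SAS used for the report, decodes and rebuilds the example value from the column positions given above; the function and key names are illustrative assumptions.

```python
def decode_locat_id(locat_id):
    """Split a 12-digit LOCAT_ID into its fields using the column layout in the text."""
    s = str(locat_id)
    if len(s) != 12 or not s.isdigit():
        raise ValueError(f"LOCAT_ID must be 12 digits: {locat_id!r}")
    season_code = int(s[4])
    if season_code not in (1, 2):
        raise ValueError(f"season code must be 1 or 2: {season_code}")
    return {
        "year": int(s[0:4]),                 # columns 1-4
        "season": {1: "spring", 2: "fall"}[season_code],  # column 5
        "main_observation": int(s[5:8]),     # columns 6-8
        "sub_observation": int(s[8:10]),     # columns 9-10
        "location": int(s[10:12]),           # columns 11-12
    }


def encode_locat_id(year, season_code, main_obs, sub_obs, location):
    """Build a LOCAT_ID string, zero-padding each field to its column width."""
    return f"{year:04d}{season_code:1d}{main_obs:03d}{sub_obs:02d}{location:02d}"
```

For the example in the text, decode_locat_id("199710101001") yields year 1997, spring, main observation 10, sub-observation 10, and location 1.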


Locations had been recorded using the legal description system (county, township, range, section, quarter, quarter-of-quarter) to the detail possible with the information provided. All records had a county noted, but earlier records often lacked township-range-section information because of imprecise descriptions and absence of maps. We used copies of all maps that had been submitted with the observations or site evaluations to proof legal-system locations. We found some errors, but these occurred at a relatively low rate (Table 1). Most errors were due to inconsistencies in recording quarters within quarter-sections (e.g., NW quarter of SE quarter) in the data set. Normally, one reports such third-order locations as "section 1, NW quarter of SE quarter." However, this standard did not readily work in the database format. Most records in the database had data only for first- (SECTION) or second-order (QUARTER) information; for these records, the QUARTER column appropriately represented quarter-section information. However, when third-order locations were recorded, the third-order was recorded in the QUARTER column and the second-order location was recorded in QUARTER-OF-QUARTER column. We corrected these to ensure consistent order level within each column.

Table 1.  Occurrence of errors in legal descriptions of site locations, by state, as determined by comparison to mapped crane locations. N = number of location records having mapped information.

                          Occurrence of errors
State            N    Township   Range   Section   Quarter
Kansas          90        2        3        3         2
Montana          7        0        1        0         0
Nebraska       356        7        0        4        59
North Dakota    51        0        0        0         1
Oklahoma        31        0        2        3         0
South Dakota    53        1        1        2         1
Texas            3        0        0        0         0
All States     591       10

Sightings recorded during early years (1943-74) and in some geographic areas had less detailed information or no location information at all. Most reporting forms included driving directions or written descriptions (e.g., distance from a town, bridge, or named lake or feature). We used these descriptions to verify or add location information to the best of our ability, given the information provided. In some cases, we were only able to narrow the location down to a general township/range area (scale of location = 93.2 km² [36 mi²]). By this method, we were able to add quite a few locations to the database. In order to differentiate locations based on our confidence of the original data source, we added a variable (SCALE) to denote the accuracy or scale (2.6 km² [1 mi²] for the most accurate locations, 93.2 km² for those located somewhere within a township) of each location.
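The two SCALE values follow directly from the public land survey grid: a section is 1 mi² and a township contains 36 sections. A minimal Python check of the arithmetic, with a hypothetical helper name and a boolean argument standing in for whether section-level information was available:

```python
KM2_PER_MI2 = 1.609344 ** 2  # 1 mile = 1.609344 km exactly


def scale_km2(located_to_section):
    """Return the SCALE value (km^2) for a section-level or township-level location."""
    square_miles = 1 if located_to_section else 36  # 36 sections per township
    return round(square_miles * KM2_PER_MI2, 1)
```

Here scale_km2(True) gives 2.6 and scale_km2(False) gives 93.2, matching the values reported above.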

We used the corrected legal descriptions to convert the locations to x and y coordinates using the Albers equal area map projection; this is the most commonly used projection for U.S. Geological Survey (USGS) mapping data at scales less than 1:100,000. We used electronic data files available from public land survey system data files from USGS to convert legal descriptions to x and y coordinates for North Dakota, South Dakota, Nebraska, Kansas, and Oklahoma. For other states, we determined latitude and longitude from paper maps (to the nearest half degree) and converted these to x and y coordinates using MicroImages "Map Calculator" function (MicroImages, Inc., Lincoln, NE). We used these x and y coordinates for all mapping and graphics.
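For readers unfamiliar with the projection, the following Python sketch shows the spherical form of the Albers equal-area conic forward equations. It is not the software used for the report. The default standard parallels (29.5° and 45.5° N), origin (23° N, 96° W), and sphere radius are the values commonly used for conterminous U.S. maps and are assumptions here, because the report does not state its projection parameters; production work would normally use an ellipsoidal implementation.

```python
import math


def albers_xy(lat_deg, lon_deg, lat1=29.5, lat2=45.5, lat0=23.0, lon0=-96.0,
              radius_m=6370997.0):
    """Project latitude/longitude (degrees) to Albers x, y in meters (spherical form).

    All default parameters are assumed, not taken from the report.
    """
    phi, lam = math.radians(lat_deg), math.radians(lon_deg)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    phi0, lam0 = math.radians(lat0), math.radians(lon0)

    n = (math.sin(phi1) + math.sin(phi2)) / 2.0          # cone constant
    c = math.cos(phi1) ** 2 + 2.0 * n * math.sin(phi1)
    rho = radius_m * math.sqrt(c - 2.0 * n * math.sin(phi)) / n
    rho0 = radius_m * math.sqrt(c - 2.0 * n * math.sin(phi0)) / n
    theta = n * (lam - lam0)

    x = rho * math.sin(theta)          # easting from the central meridian
    y = rho0 - rho * math.cos(theta)   # northing from the latitude of origin
    return x, y
```

Points on the central meridian map to x = 0, points west of it to negative x, and the projection origin to (0, 0).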

Site Evaluation Data and Codes

For characterizing habitats from site evaluation data, we used only those records from 1977-99. We split the site evaluation data into 3 files: 1 for feeding sites (FEEDEVAL), 1 for roosting sites (ROSTEVAL), and 1 for sites used for both feeding and roosting or where site use was unknown (DUALEVAL), which we refer to hereafter as dual-use sites. Some data were summarized for all site uses combined, but most data summarizations were conducted separately on ROSTEVAL, FEEDEVAL, and DUALEVAL.

A number of variables included entries with multiple numbers or alphanumeric characters within a single data field (e.g., feeding site description: "21,11X,Z"). We used SAS programs, and in some cases record-by-record hand editing, to divide such information into new habitat variables which could then be separately summarized. Specifically, we extracted the following variables from the description: water depth, wetland size, roost site description, feeding site description, adjacent habitat, primary potential food sources, and foods observed eaten. See sections below for details for each variable.
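A rough Python sketch of this kind of field splitting is shown below. It assumes that each comma-separated token is an optional numeric habitat code followed by optional alphabetic crop codes, which is consistent with the example "21,11X,Z" but is not documented in the original files; the function name is hypothetical, and the actual processing was done with SAS programs and hand editing.

```python
import re

# Assumed token shape: optional digits followed by optional letters.
TOKEN = re.compile(r"^(\d*)([A-Za-z]*)$")


def split_codes(field):
    """Return (numeric habitat codes, alphabetic crop codes) found in one data field."""
    habitats, crops = [], []
    for raw in field.split(","):
        token = raw.strip()
        if not token:
            continue  # skip empty entries such as a trailing comma
        match = TOKEN.match(token)
        if match is None:
            raise ValueError(f"unexpected token {token!r}; needs hand editing")
        digits, letters = match.groups()
        if digits:
            habitats.append(int(digits))
        crops.extend(letters.upper())  # one entry per letter
    return habitats, crops
```

With this sketch, split_codes("21,11X,Z") returns ([21, 11], ["X", "Z"]). Tokens that do not fit the assumed shape raise an error so they can be resolved record by record.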

Howe (1987) reported on the habitat use, survival, and behavior of 27 whooping cranes (9 radio-marked and others associated with them) that were tracked between Wood Buffalo National Park and Aransas National Wildlife Refuge during 1981-84. Sixty-seven observations of these marked cranes were included in OBSERVATIONS (OBS_TYPE = "RADIO") and 9 site evaluations were recorded in EVALUATIONS (SOURCE = 3 [Howe 1987]: 2 in MT, fall 1981; 1 in NE, fall 1981; 3 in KS, fall 1981; 1 in SD, spring 1982; and 2 in KS, spring 1983). Apparently, only the sightings of these marked cranes that were reported by citizens (and other chance observations) were used in the site evaluation data sets. There was no overlap in site evaluation data between Howe's (1987) data and the data used here (W. Jobman, USFWS, Grand Island, NE, personal communication); therefore, results from EVALUATIONS are independent of those in Howe (1987).

Multiple sub-observations for given main observation:  In a number of cases, multiple observations (2-12 records) existed for the same bird(s) observed in an area. We believed that these multiple observations were similar to repeated measures and thus could bias some measures of habitats used. Therefore, we sought to limit our analyses to only 1 record for each main observation. We assumed a discrete set of observations in an area was denoted by a combination of main observation and sub-observation codes (e.g., 88A-5A, 88A-5B, 88A-5C). In most cases, the multiple records were due to recording a number of different feeding habitats, different locations (e.g., different quarter-sections), or different roost sites. Because we conducted most analyses on each site use separately, we excluded multiple observations within each site-use data set. We selected only the first record where site use = "feeding" to include in the feeding habitat assessments (FEEDEVAL), the first record where site-use = "roost" to include in the roosting habitat assessment (ROSTEVAL), and the first record where site use = "roost and feed" or "unknown" for dual-use sites (DUALEVAL). Selection of the first record may minimize any effects of observer disturbance on habitat use by cranes. We believe roost-only data provide the most conservative assessment of habitats used for roosting, whereas those sites used for both roosting and feeding likely provide a better assessment of sites (nearly all wetlands) used for both purposes. All further summaries were conducted on these 3 data sets unless otherwise noted. We will evaluate habitat-use patterns described by the multiple observations in a separate report at a later time.
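The selection rule can be sketched in Python as follows. The record layout (main_obs, sub_obs, and site_use keys) and the function name are assumptions for illustration, and the records are assumed to be already sorted in their original sub-observation order.

```python
def first_record_per_main_obs(records, site_uses):
    """Keep only the first record of each main observation whose site use matches.

    records   -- iterable of dicts, assumed sorted in sub-observation order
    site_uses -- set of site-use labels to retain
    """
    selected = {}
    for rec in records:
        if rec["site_use"] not in site_uses:
            continue
        # setdefault stores the record only if this main observation is new
        selected.setdefault(rec["main_obs"], rec)
    return list(selected.values())
```

Under these assumptions, calling the function with {"feeding"}, {"roost"}, and {"roost and feed", "unknown"} would produce record sets analogous to FEEDEVAL, ROSTEVAL, and DUALEVAL, respectively.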

We did not conduct any statistical tests on the data because the observational data would violate several key statistical assumptions. First, we cannot verify that data are independent; it is impossible to know whether observations are from the same birds, or whether some cranes are more likely to be included in a series of observations. Second, statistical tests require that the probability of observation is the same among groups. With observational data, there is no way to determine if there is an increased likelihood of an observation in one habitat type over another. Therefore, we do not know if the data are representative of the target population. Our presentation of the data, therefore, is entirely descriptive.

Handling of specific variables and analyses:  For each variable in the data set, we describe below specific ways in which the data were examined. For those variables which were categorical, we determined frequency distributions for each category. Unless otherwise noted, data summaries were conducted for each site-use data set. See Appendix I for definitions of original variables.

Wetland classification system:  We recognized that wetland description or classification was of particular interest to whooping crane biologists and therefore we were sensitive to checking and clarifying this variable using all available information. We split out the original WETCLASS variable (recorded following Cowardin et al. 1979) into separate classification variables: SYSTEM (lacustrine, palustrine, riverine), SUBSYSTEM (e.g., lower or upper perennial, intermittent for riverine systems), CLASS (e.g., rock bottom, unconsolidated bottom, aquatic bed), and REGIME (special modifier for flooding or water regime). For the latter, we merged category 9d (intermittently/temporarily flooded) with 7d (intermittently flooded) because of the rarity of their occurrence and their similarity. A number of earlier records used only wetland classes from Circular 39 (I, II, III, IV, etc; Shaw and Fredine 1956); we converted these, as best possible, into REGIME, but often were unable to add the complete data for wetland classification system following Cowardin et al. (1979) format. We also used comments and information under roost or feeding site descriptions to refine our conversions for CLASS and REGIME. Specifically, we classified I (wet meadow) as emergent-temporary; II (fresh meadow) as emergent-saturated; III as emergent-seasonal (sometimes semipermanent); IV as emergent-semipermanent (sometimes as permanent); and V as aquatic bed-permanent. We pooled classes of wetland regime into 4 categories: permanent (permanent, intermittently exposed, and artificially flooded), semipermanent, seasonal, and temporary (saturated, temporary, and intermittently flooded).
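The default conversions and the regime pooling described above amount to two lookup tables, sketched here in Python. The table and function names are illustrative; the alternatives noted in the text ("sometimes semipermanent," "sometimes as permanent") depended on comments in individual records and are not captured by a simple lookup.

```python
# Default Circular 39 type -> (CLASS, REGIME), as listed in the text.
CIRCULAR39_TO_COWARDIN = {
    "I": ("emergent", "temporary"),
    "II": ("emergent", "saturated"),
    "III": ("emergent", "seasonal"),
    "IV": ("emergent", "semipermanent"),
    "V": ("aquatic bed", "permanent"),
}

# Water regime -> pooled category (4 categories), as listed in the text.
POOLED_REGIME = {
    "permanent": "permanent",
    "intermittently exposed": "permanent",
    "artificially flooded": "permanent",
    "semipermanent": "semipermanent",
    "seasonal": "seasonal",
    "saturated": "temporary",
    "temporary": "temporary",
    "intermittently flooded": "temporary",
}


def pooled_regime_for_circular39(wetland_type):
    """Return the pooled regime category for a Circular 39 type (default mapping only)."""
    _, regime = CIRCULAR39_TO_COWARDIN[wetland_type.strip().upper()]
    return POOLED_REGIME[regime]
```

For example, type II maps to emergent-saturated and therefore falls in the pooled "temporary" category.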

Water depth:  Observers recorded the range of depths for the entire wetland (RDEPTH) and range of depths at points within the wetland where cranes were observed (CDEPTH). Because both depth types were recorded in a single column, we separated the 2 depths into 2 variables. Because of the great range of depths given and because the range almost always included 0 or 2.5 cm, we considered only maximum depth for both parameters.

Quality:  Observers recorded water quality as clear, turbid, or saline. Because more than 1 category of water quality was sometimes recorded (e.g., clear and saline), the sum of percentages by type was sometimes greater than 100%. For each site use, we determined the frequency distribution by wetland system for each category of water quality.
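The reason the percentages can exceed 100% is that each record may contribute to more than one category while the denominator remains the number of records. A small Python sketch, using invented example records and a hypothetical function name:

```python
from collections import Counter


def percent_by_category(records):
    """Percentage of records in which each category occurs (multi-response data).

    records -- list of collections of category labels, one collection per record
    """
    counts = Counter()
    for categories in records:
        counts.update(set(categories))  # count each category once per record
    total = len(records)
    return {category: 100.0 * n / total for category, n in counts.items()}
```

With four invented records, [{"clear"}, {"clear", "saline"}, {"turbid"}, {"clear"}], the result is clear 75%, saline 25%, and turbid 25%, which sums to 125%.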

Substrate:  Wetland substrates were categorized as sand, soft mud, hard mud, or other. Although there were some records with more than 1 substrate category recorded (4 in ROSTEVAL, 1 in FEEDEVAL, and 8 in DUALEVAL), we used only the first category, assuming this was the dominant characteristic of that site. For each site use, we determined the frequency distribution for each category of substrate, and also examined their frequency by wetland system.

Slope of wetland:  Observers categorized the shoreline slope as <1%, 1 to <5%, 5-10%, >10%, not applicable, or other. For each site use, we determined the frequency distribution for each slope category.

Emergent vegetation type:  Vegetation types occurring in the wetland were classified as grass, sedge (Carex), cattail (Typha), rush (Juncus), smartweed (Polygonum), other, or none. Many records included multiple types of emergent vegetation; only category 7 (no vegetation) occurred without other types (with 1 exception). Because several vegetation types usually were present, the sum of percentages by type was often greater than 100%. We report the frequency distribution, by wetland system and site use, for each vegetation type.

Distribution pattern of vegetation:  Observers recorded the distribution of emergent vegetation as none, scattered, clumped, or choked; we found no specific definitions for these categories. This variable was originally referred to as "vegetation density," but it more appropriately describes the distribution of vegetation within a wetland. If >1 code was recorded, we used only the first code for summaries. We report the frequency distribution, by wetland system and site use, for each distribution category.

Roost site description:  Observers used 2 category lists to describe roost sites, 1 list of general habitat types and 1 list of crop types. Habitat types included flooded pasture, wooded creek or draw, flooded cropland, stock pond, reservoir, lake, marsh, river, salt marsh, tailwater pit, seasonally-flooded basin, cropland, pasture, wet meadow, hay meadow, woodland, or other; no definitions or descriptions were provided. Crop types included alfalfa, barley, corn, Conservation Reserve Program, rice, sunflower, fallow, milo, disked alfalfa, oat stubble, popcorn, green rye, soybean, bean stubble, sunflower (assumed to be stubble), winter wheat, wheat stubble, milo stubble, and corn stubble; no further definitions or descriptions were given. The recorded variable usually included only 1 numeric code denoting habitat type and infrequently had alphabetic modifiers denoting crop type; thus it was relatively simple to summarize. When >1 code was included (e.g., 11, 17), we used only the first numeric code and assumed that those codes most accurately described the main roost area. We did not examine frequency of alphabetic modifiers because they were rarely recorded.

Feeding site description:  Observers used the same list of habitat types and crop types to describe feeding sites. Unlike roost site data, however, the feeding site variable, as originally coded, was quite complex and included 1-5 numeric codes denoting habitat type and, for any 1 numeric code, 1-5 alphabetic codes denoting cropland type. We determined whether each habitat or cropland code occurred in a record and examined the frequency of occurrence of each type code in FEEDEVAL and DUALEVAL. We did not examine feeding site descriptors of ROSTEVAL because no feeding should have occurred during such site use, although a few records did have such information recorded. We pooled some habitat and crop types to facilitate comparison among seasons or site uses and, in particular, to pool appropriate types into a seasonal wetland type, permanent water type, and perennial upland cover (see Table 2). Cropland and woodland types were not pooled with other categories. Habitat classified as "Other" was very uncommon and thus ignored. We pooled crop types to facilitate comparisons among green crops, standing small grain or row crops, small grain or row-crop stubble, and other crop types. One crop type often dominated in the new descriptors: green cover was predominantly winter wheat, small grain stubble was predominantly wheat stubble, and row-crop stubble was predominantly corn stubble.

Table 2.  Pooled habitat and crop types for descriptions of feeding and adjacent habitats.

New descriptor; original code no. as listed
    Original description

Habitat type
  Seasonally flooded wetlands (WETSEAS); 9
    Flooded pasture
    Flooded cropland
    Seasonally-flooded basin
  Permanent water (WETPERM); 12
    Stock pond
    Salt marsh
    Tailwater pit
  Cropland (UPCROP); 21
    Cropland (see below for crop types)
  Upland perennial cover (UPCOVER); 22
    Wet meadow
    Hay meadow
  Upland woodland (UPWOOD); 25
    Woodland

Crop type
  Green crops (GREENS); A
    Green rye
    Winter wheat
  Small grain - standing (GRAIN-STD); B
    Spring wheat
  Small grain - stubble (GRAIN-STUB); O
    Oat stubble
    Barley stubble
    Wheat stubble
  Row-crop - standing (ROW-STD); C
  Row-crop - stubble (ROW-STUB); T
    Soybean stubble
    Sunflower stubble
    Milo stubble
    Corn stubble
  Other (OTHER); L
    Disked alfalfa
    Conservation Reserve Program cover
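
The pooling in Table 2 can be expressed as a lookup from original description to new descriptor. The Python sketch below includes only the descriptions listed in Table 2; the dictionary and function names are illustrative, and types not listed in the table return None rather than a guessed category.

```python
POOLED_HABITAT = {
    "flooded pasture": "WETSEAS",
    "flooded cropland": "WETSEAS",
    "seasonally-flooded basin": "WETSEAS",
    "stock pond": "WETPERM",
    "salt marsh": "WETPERM",
    "tailwater pit": "WETPERM",
    "cropland": "UPCROP",
    "wet meadow": "UPCOVER",
    "hay meadow": "UPCOVER",
    "woodland": "UPWOOD",
}

POOLED_CROP = {
    "green rye": "GREENS",
    "winter wheat": "GREENS",
    "spring wheat": "GRAIN-STD",
    "oat stubble": "GRAIN-STUB",
    "barley stubble": "GRAIN-STUB",
    "wheat stubble": "GRAIN-STUB",
    "soybean stubble": "ROW-STUB",
    "sunflower stubble": "ROW-STUB",
    "milo stubble": "ROW-STUB",
    "corn stubble": "ROW-STUB",
    "disked alfalfa": "OTHER",
    "conservation reserve program cover": "OTHER",
}


def pool(description, table):
    """Return the pooled descriptor for an original description, or None if unlisted."""
    return table.get(description.strip().lower())
```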

Adjacent habitat:  Observers used the same list of habitat and crop type variables as noted above to describe habitats adjacent to the site. As occurred for feeding sites, this variable usually had multiple numeric and alphabetic codes. Data were extracted and frequencies of occurrence determined for all 3 data sets similar to methods noted above for feeding site description.

Wetland size:  For lacustrine and palustrine systems, we used wetland size class, ignoring actual sizes that were sometimes provided. For some data comparison, we pooled the 6 wetland classes into 3 classes: A = <0.4-2 ha, B = 2 to <20 ha, and C = 20 to >40.5 ha. For riverine systems, we extracted river width data (continuous rather than class variable) from the original variable column and created a new variable.

Distance to feeding site:  Distance to feeding sites was categorized as <0.4 km, 0.4 to <0.8 km, 0.8 to <1.2 km, 1.2-1.6 km, >1.6 km, or not applicable. We determined the frequency distribution for each category for roost and dual-use sites.

Distance to human development:  Distance to nearest human development was categorized using the same distance categories as used above. No definition of human development was given for USFWS report forms, but the Nebraska reporting form (Report Form 6) listed paved and gravel road, single or urban (>3) dwellings, railroad, commercial development, recreational area, and bridge. We determined the frequency distribution for each distance category. Because this information was the same among site uses for a record, we used only 1 record for each main observation.

Primary potential food sources:  Observers categorized potential food sources available to cranes at the site as: grain (seed and plant material), tubers, insects and other invertebrates, molluscs, crustaceans, fish, frogs, other, and salamanders. Data in this variable originally were coded in a similar manner as feeding site and adjacent habitat descriptions. Data similarly were extracted and examined using frequencies of occurrence for FEEDEVAL and DUALEVAL. Because feeding site descriptions often included both upland and wetland codes, we did not separate the data between upland and wetland habitats.

Foods observed eaten:  Foods observed eaten by cranes were recorded using the same categories as above. These data were examined in the same manner as above, but observations were so few that we report only a simple list of foods observed for both FEEDEVAL and DUALEVAL.

Site security:  The security of the site was defined as the stability and security of the habitat and any nearby activities that could threaten the site or cranes there. Categories included stable, threatened, and unknown. We determined the frequency distribution for each category of site security by site use and site ownership.

Extent of similar habitat within 16-km (10-mi) radius:   Observers ranked the extent of habitat similar to that of the site within a 16-km radius as none, little, moderate or common, abundant, or unknown. We determined the frequency distribution for each category of similar habitat. Because this information was the same among site uses for a record, we used only 1 record for each main observation.

Site ownership:   Ownership of a site was categorized as private, federal, state, and other. Many records included multiple types of site ownership. Because several ownership categories could occur for 1 record, the sum of percentages by type often was greater than 100%. We report the frequency distribution, by site use, for each ownership category.

Visibility:  Visibility from the site to the nearest obstruction >1.4 m high was categorized as <91 m, 91-401 m, 402-805 m, >805 m, and "unlimited;" we pooled the latter 2 categories together. To assess how visibility might differ among main habitat types for ROSTEVAL, we summarized data for each wetland system. For FEEDEVAL and DUALEVAL, we used descriptors from the feeding habitat descriptions to define whether the cranes were in upland, wetland, or riverine habitat.

Distance to utility lines:  Distance of the site to power or phone lines was categorized using the same distance categories as visibility. We report the frequency distribution, by site use, for each distance category. Because this information was the same among site uses for a record, we used only 1 record for each main observation.

Crane groups:  We used 2 variables (number of adults, number of juveniles) from OBSERVATIONS to classify social group for each record; EVALUATIONS only indicated total number of whooping cranes present. There were only 7 discrepancies between total group size in the EVALUATIONS and OBSERVATIONS data sets; these discrepancies likely occurred when observers visited the original observation site 1-2 days later and noted a different number of adults or juveniles present. We classified cranes into 6 groups: 1) single adult, 2) single juvenile, 3) pair, consisting of 2 adults only, 4) single family group, consisting of 1-2 adults and 1-2 juveniles, 5) mixed group, consisting of a group with ≥1 adult and ≥1 juvenile, and 6) adult group, consisting of >2 adults and 0 juveniles. The number of juveniles often was missing (no data recorded), and sometimes the number of adults also was missing; we assumed that these were 0. We pooled records into 3 groups for some summaries: family groups (adults with at least 1 juvenile), nonfamily groups (adults with no juveniles), and single cranes (single adults and single juveniles).
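The 3-group pooling can be sketched in Python as below, treating missing counts as 0 as described. The function name and the "unclassified" fallback for combinations the text does not address (for example, no cranes recorded, or several juveniles with no adults) are our assumptions.

```python
def pooled_group(adults, juveniles):
    """Classify a record into the 3 pooled social groups.

    adults, juveniles -- counts, or None when no data were recorded
    """
    a = adults or 0      # missing counts are treated as 0, as in the text
    j = juveniles or 0
    if a + j == 1:
        return "single"        # single adult or single juvenile
    if a >= 1 and j >= 1:
        return "family"        # adults with at least 1 juvenile
    if a >= 2 and j == 0:
        return "nonfamily"     # adults with no juveniles
    return "unclassified"      # assumption: combinations not covered by the text
```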

Mapping and Distributional Analyses

The geographic information system (GIS) database consisted of whooping crane sightings, political boundaries, and physical features. We used data layers that were available for the entire flyway. Other suitable data were available in digital format for only portions of the flyway (i.e., NWI, refuge boundaries). State and county boundary data were used for indicating position of whooping crane locations and to identify state-wide trends. Both data layers were obtained from the ESRI ArcView v3.2 sample package (Environmental Systems Research Institute, Inc., Redlands, CA). Ecoregion (aquatic ecoregions of the conterminous United States; Omernik 1987) and stream (1:2,000,000-scale Digital Line Graph files of streams; Lanfear 1991) data were used to show relationships between migration movements and physical features of the landscapes. These data were obtained from the USGS list of spatial data sets for water (

Distribution patterns were displayed using ESRI ArcView v3.2 software. We projected data layers in the Albers equal-area conic projection. This was consistent with other data files with scales of at least 1:100,000 maintained at USGS offices. Use of Albers also minimized distortion when merging township/range information from the large 10-state area used by whooping cranes during migration. In addition, we used data layers to graphically display spatial trends. Because the Albers projection preserves area, display polygons are depicted with their true relative area.

All maps were made using only main observation data. All data layers were divided into 2 levels (flyway-states and state boundaries). The stream data layer was cut using a clip function and all other data layers were subset using the query tool. Maps were made by overlaying subset whooping crane data with the appropriate region-specific political or physical data.

Page Last Modified: Friday, 01-Feb-2013 20:05:05 EST