Small Area Estimation Overview

Surveys are usually designed to provide reliable estimates at some geographical level: for example, the nation, Census regions, states, or counties. We refer to an estimate for a particular geographical area as "direct" if it is based only on responses from that geographical area. Most surveys are not designed to produce direct estimates for counties: the county sample sizes are too small, so the resulting estimates are not reliable or stable.
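To make the link between sample size and reliability concrete, the following minimal Python sketch computes a direct estimate of a proportion and its standard error for two hypothetical areas. It assumes simple random sampling and ignores the survey weights and complex designs that real surveys use; the function name and the data are invented for illustration only.

```python
import math

def direct_estimate(responses):
    """Return (proportion, standard error) from 0/1 responses in ONE area.

    Assumption: simple random sampling. Real surveys need design-based
    variance estimation, which this sketch deliberately leaves out.
    """
    n = len(responses)
    if n == 0:
        raise ValueError("no respondents in this area; no direct estimate exists")
    p = sum(responses) / n
    se = math.sqrt(p * (1 - p) / n)
    return p, se

# Hypothetical data: both areas show the same 30% prevalence.
small_area = [1] * 6 + [0] * 14        # 20 respondents
large_area = [1] * 600 + [0] * 1400    # 2000 respondents

p_small, se_small = direct_estimate(small_area)
p_large, se_large = direct_estimate(large_area)

print(f"small area: p = {p_small:.2f}, SE = {se_small:.3f}")
print(f"large area: p = {p_large:.2f}, SE = {se_large:.3f}")
```

With 100 times fewer respondents, the standard error of the direct estimate is 10 times larger, which is why an area with a small sample cannot support a stable direct estimate.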

A geographical area is regarded as "small" if the area sample is insufficient to yield direct estimates with adequate precision and reliability. To produce estimates for small areas with adequate precision, it is standard to use indirect estimates, which borrow information from outside areas with characteristics similar to those of the area of interest. Generally, a statistical model is used to obtain these indirect estimates: the model incorporates information from respondents outside the geographical area, along with other geographical characteristics. Using a model decreases the variability of the small area estimate, but if the characteristics used in the model are not chosen properly, it may introduce bias into the estimates (see Rationale for detailed information about why model-based estimates are necessary to produce small area level estimates of cancer risk factors and screening behaviors).

This website provides model-based estimates for small areas based on combined information from two major health surveys: (1) the Behavioral Risk Factor Surveillance System (BRFSS), a large telephone survey conducted by every state, and (2) the National Health Interview Survey (NHIS), an area probability sample survey. The BRFSS is designed to provide reliable direct state estimates by year. The NHIS, a smaller in-person survey, is designed to provide reliable direct estimates for the nation and Census regions. Thus, states are not small for the BRFSS, but they may be small for the NHIS. Although the BRFSS is not designed to provide county estimates, some counties may have sample sizes large enough to yield reliable direct estimates by year; most counties do not (see Data Sources for more information about the surveys).

A new statistical methodology combines responses from the BRFSS and NHIS and uses a statistical model to yield estimates for smaller geographical areas than could be obtained from either survey directly (see Methodology for more information on the proposed combining model). The model-based estimates are expected to be better than the direct estimates on average, if the models used are appropriate; however, this does not mean that the model-based estimates are close to the true values for every area. When there are sufficient NHIS and BRFSS data for a specific geographic area, the combined estimates depend mainly on the available data from that geographic area. However, for areas with little or no NHIS and/or BRFSS sample, the estimates increasingly depend on the demographic model, which draws on areas from across the country with "similar" profiles in terms of their covariates (see Limitations and Uses for more information).
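The way an estimate shifts from local data toward the model as the sample shrinks can be illustrated with a generic composite (shrinkage) estimator. The sketch below is a textbook-style illustration in Python, not the combining model used on this website; the function name, the variance values, and the prevalences are all hypothetical.

```python
def composite_estimate(direct, sampling_var, synthetic, model_var):
    """Blend a direct estimate with a model-based (synthetic) prediction.

    weight = model_var / (model_var + sampling_var)
    A precise direct estimate (small sampling_var) gets a weight near 1;
    a noisy one is pulled toward the model prediction. All inputs are
    hypothetical; this is not the website's actual methodology.
    """
    if direct is None or sampling_var is None:
        # No local sample at all: rely entirely on the model prediction.
        return synthetic, 0.0
    weight = model_var / (model_var + sampling_var)
    return weight * direct + (1 - weight) * synthetic, weight

MODEL_VAR = 0.0004   # assumed between-area variance not explained by the model
SYNTHETIC = 0.25     # assumed model prediction for areas with this profile

# Same direct estimate (0.30), very different precision.
est_large, w_large = composite_estimate(0.30, 0.0001, SYNTHETIC, MODEL_VAR)
est_small, w_small = composite_estimate(0.30, 0.0100, SYNTHETIC, MODEL_VAR)
est_none, w_none = composite_estimate(None, None, SYNTHETIC, MODEL_VAR)

for label, est, w in [("well sampled", est_large, w_large),
                      ("sparsely sampled", est_small, w_small),
                      ("no sample", est_none, w_none)]:
    print(f"{label}: estimate = {est:.3f}, weight on direct = {w:.2f}")
```

Under these assumed values, the well-sampled area keeps most of its own direct estimate, the sparsely sampled area is pulled almost entirely to the model prediction, and the area with no sample receives the model prediction outright.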

Objectives in comprehensive cancer control plans often monitor cancer risk factors and screening behaviors (see National Comprehensive Cancer Control Program and Cancer Control P.L.A.N.E.T. for more information). The model-based estimates provided here may be useful in local and regional comprehensive cancer control planning.