U.S. National Institutes of Health National Cancer Institute

Detailed Asian/Pacific Islander Databases (2000-Centered)

In the 2000 census, individuals were able to indicate multiple race responses on the census form. These responses can be tabulated as two population values for detailed Asian and Pacific Islander (API) groups: the specific API group alone (counting those who self-identified with only one API group), and the specific API group alone or in combination with any other race group(s) (counting those who self-identified with either a single API group or with more than one race group, at least one of which was the specific API group of interest). Thus, the population counts for the specific API groups are not mutually exclusive. We have created separate analysis files linked to each way of tabulating the race-specific population denominators. All databases use 2000 populations for each data year (1998-2002), and their intended use is to produce average annual incidence and mortality rates for the combined five-year time period, similar to those published in the following article: Miller BA, Chu KC, Hankey BF, Ries LAG. Cancer incidence and mortality patterns among specific Asian and Pacific Islander populations in the U.S. Cancer Causes Control 2008 Apr;19(3):227-56.
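The two tabulations can be illustrated with a minimal Python sketch. It assumes each census response is represented as a set of race labels and that "alone" means exactly one label was selected; the function name `tabulate` and the sample responses are hypothetical and are not part of the census files or of SEER*Stat.

```python
from collections import Counter

def tabulate(responses, groups):
    """Count 'alone' (low) and 'alone or in combination' (high) populations.

    responses: iterable of sets of race labels, one set per respondent.
    groups: the specific API groups of interest.
    """
    low = Counter()   # respondent selected only this one group
    high = Counter()  # respondent selected this group, alone or with others
    for races in responses:
        for group in groups:
            if group in races:
                high[group] += 1
                if len(races) == 1:
                    low[group] += 1
    return low, high

# Invented responses, for illustration only.
responses = [
    {"Chinese"},
    {"Chinese", "White"},
    {"Chinese", "Filipino"},
    {"Filipino"},
    {"White"},
]
low, high = tabulate(responses, ["Chinese", "Filipino"])
print(dict(low))   # {'Chinese': 1, 'Filipino': 1}
print(dict(high))  # {'Chinese': 3, 'Filipino': 2}
```

In this example the high counts sum to 5 although only 4 respondents reported an API group, which shows why the group-specific counts are not mutually exclusive.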

Available Databases

The following four databases are available:

  • Incidence - Racial Ethnic Mono, SEER 18 excl AZ, Nov 2005 Sub for Detailed API Races (1998-2002) <Low 2000 Pops by 5>
  • Incidence - Racial Ethnic Mono, SEER 18 excl AZ, Nov 2005 Sub for Detailed API Races (1998-2002) <High 2000 Pops by 5>
  • Mortality - Racial Ethnic Mono, All COD, Total U.S. for Detailed API Races (1998-2002) <Low 2000 Pops by 5>
  • Mortality - Racial Ethnic Mono, All COD, Total U.S. for Detailed API Races (1998-2002) <High 2000 Pops by 5>

Obtaining the API databases requires the following:

  • A signed research data agreement to access the US mortality and SEER data through SEER*Stat.
  • Access to the data through SEER*Stat's Client-Server Mode. The data are not distributed on the SEER*Stat CD set.

Low Population Databases

Low populations include tabulations of respondents who selected a single race on the 2000 census form (e.g., Chinese alone). When working with the low populations, you must use the population-defining race variable in all SEER*Stat sessions. You can use the variable in a selection statement or as a table variable.

High Population Databases

High populations include respondents who selected the race of interest either alone or in combination with one or more other races. When working with the high populations, you must use the population-defining race variable in all SEER*Stat sessions. You can use the variable in a selection statement or as a table variable.

Race/Geography Available for Analysis and Used in NCI Publication

The following table shows the racial/ethnic group and geographic area combinations that are available for analysis with these databases.

Table 1. Geographic areas included in cancer incidence and mortality rates for each racial/ethnic group, 1998-2002.

The columns CA through Seattle-Puget Sound apply to incidence rates; the last column applies to mortality rates.

| Race/Ethnic Group | CA [a] | CT | HI | IA | KY | LA | NJ | NM | UT | Atlanta metro | Detroit metro | Seattle-Puget Sound [b] | Mortality rates: CA, HI, IL, NJ, NY, TX, WA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Asian Indian or Pakistani [c] | X | X | --- [d] | X | X | X | X | X | X | X | X | 2 | X |
| Chinese | X | X | X | X | X | X | X | X | X | X | X | 9 | X |
| Filipino | X | X | X | X | X | X | X | X | X | X | X | 11 | X |
| Guamanian | X | --- [e] | X | --- [e] | --- [e] | --- [e] | --- [e] | --- [e] | --- [e] | --- [d] | --- [d] | 6 | X |
| Native Hawaiian | | | X [f] | | | | | | | | | | X [f] |
| Japanese | X | X | X | X | X | X | X | X | X | X | X | 9 | X |
| Kampuchean | X | X | --- [e] | --- [e] | --- [e] | --- [e] | --- [e] | --- [d] | X | X | --- [e] | 6 | --- [g] |
| Korean | X | X | X | X | X | X | X | X | X | X | X | 10 | X |
| Laotian | X | X | X | X | --- [e] | X | --- [e] | --- [e] | X | X | X | 4 | --- [g] |
| Samoan | X | --- [d] | X | --- [e] | --- [e] | --- [e] | --- [e] | --- [e] | X | --- [d] | --- [e] | 5 | X |
| Tongan | X | --- [d] | X | --- [d] | --- [d] | --- [d] | --- [d] | --- [d] | X | --- [d] | --- [d] | 1 [e] | --- [g] |
| Vietnamese | X | X | X | X | X | X | X | X | X | X | X | 6 | X |

X Indicates area was included in rate calculations.
[a] Includes cancer registries for Los Angeles, San Francisco/Oakland, San Jose/Monterey, and all remaining areas in California combined.
[b] Indicates number of counties within the 11-county Seattle-Puget Sound area for which population estimates were NOT suppressed by the Census Bureau and thus could be included in the incidence analyses.
[c] Incidence rates calculated for combined group of Asian Indians & Pakistanis due to Surveillance, Epidemiology & End Results (SEER) program coding rules; mortality rates calculated only for Asian Indians due to National Center for Health Statistics (NCHS) coding rules.
[d] Area not included in rate calculation due to suppression of population data by Census Bureau.
[e] Area not included in rate calculation due to small population size based on 2000 census (<1,000).
[f] Native Hawaiian rates calculated only for the state of Hawaii.
[g] Mortality data not available from NCHS for these race/ethnic groups.

Census Bureau policy for Census 2000 data is to not disclose race/ethnicity-specific population counts below 100 for a particular geographic area [1]. Thus, we were unable to obtain comprehensive population denominators for some of the SEER reporting areas. When race/ethnicity-specific census population data were suppressed for an entire registry, the registry was excluded from rate calculations for that particular API group. However, when the census population data were suppressed for a subset of the counties within the Seattle-Puget Sound metropolitan area, we chose to calculate an incidence rate that included all remaining counties for which the race/ethnicity-specific population data were not suppressed. This resulted in the exclusion of selected counties in Seattle-Puget Sound from incidence rate calculations for each of the API groups, with the exception of Filipinos (Table 1).

In addition, when a specific API population group in a SEER registry coverage area was less than 1,000 (based on single race/ethnicity alone population data), the data for that area were excluded from the cancer incidence rate calculations for that group. The rationale for this exclusion was that incidence rates for specific API groups in these registries with small populations were generally low, suggesting that misclassification of API ethnic information in medical records may be a bigger problem in these areas. Using this population threshold limited the number of geographic areas for Guamanians, Kampucheans, Laotians, Samoans, and Tongans (Table 1), but excluded just 1–2% of the total number of cancer cases in these groups. Cancer incidence and mortality rates for Native Hawaiians are reported only for the State of Hawaii due to the extensive efforts at the Hawaii Tumor Registry to classify all cancer patients with any Native Hawaiian ancestry and because of the unique cultural and environmental characteristics of this group [2]. About 60% of the total U.S. Native Hawaiian population resides in Hawaii.
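The registry-level suppression and small-population rules described above can be expressed as a short Python sketch. It is only an illustration: the function name `area_status`, the use of `None` to mark a suppressed count, and the example populations are assumptions, and the sketch does not model the county-level handling of Seattle-Puget Sound or the Hawaii-only rule for Native Hawaiians.

```python
MIN_POPULATION = 1000  # threshold stated in the text, applied to the "alone" count

def area_status(alone_population):
    """Classify one registry area for one API group.

    alone_population: the 2000 census single-race count for the group,
    or None when the Census Bureau suppressed the figure (assumed encoding).
    """
    if alone_population is None:
        return "excluded: population suppressed"
    if alone_population < MIN_POPULATION:
        return "excluded: population under 1,000"
    return "included"

# Hypothetical counts, invented for illustration only.
example_areas = {"Area A": 25_000, "Area B": 640, "Area C": None}
statuses = {area: area_status(pop) for area, pop in example_areas.items()}
included = [area for area, status in statuses.items() if status == "included"]
print(statuses)
print(included)  # ['Area A']
```

The two exclusion reasons are kept distinct so that they can be mapped to the separate "suppressed" and "small population" footnotes of Table 1.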

Since specification of expanded API racial categories on death certificates is currently required for seven states only (California, Hawaii, Illinois, New Jersey, New York, Texas, and Washington) and for nine API ethnic groups [3], we restricted our mortality analyses to these areas and groups (Table 1). Thus, the geographic coverage differs between the incidence and mortality analyses.

Special Notes

In the incidence databases there is a racial group of "Asian Indian/Pakistani". This racial group is a single race in the incidence data, but is two racial groups in the population data. The low and high populations for this group were created by combining the populations for the two races (Asian Indian and Pakistani). Collapsing the populations results in low populations that are potentially a little lower than they should be and high populations that are potentially a little higher than they should be. The low populations are potentially lower because a person who identified as both Asian Indian and Pakistani, and as no other race, would not have selected either single race alone and so is not represented in the low populations. The high populations are potentially higher because a person who identified as both Asian Indian and Pakistani is counted in both populations and is therefore double counted in the combined population.
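A small Python sketch makes the direction of both biases concrete. It assumes responses are sets of race labels and compares the summed per-race tabulations with person-level counts; the function name `combined_counts` and the four sample responses are hypothetical.

```python
def combined_counts(responses, groups):
    """Compare summed per-race populations with person-level counts.

    Returns (summed_low, person_low, summed_high, person_high).
    """
    groups = set(groups)
    # Summed tabulations: add up the separate counts for each race.
    summed_low = sum(1 for r in responses if len(r) == 1 and r <= groups)
    summed_high = sum(len(r & groups) for r in responses)
    # Person-level counts: each person is counted at most once.
    person_low = sum(1 for r in responses if r and r <= groups)
    person_high = sum(1 for r in responses if r & groups)
    return summed_low, person_low, summed_high, person_high

# Invented responses, for illustration only.
responses = [
    {"Asian Indian"},
    {"Pakistani"},
    {"Asian Indian", "Pakistani"},  # missing from summed low, doubled in summed high
    {"Asian Indian", "White"},
]
print(combined_counts(responses, ["Asian Indian", "Pakistani"]))  # (2, 3, 5, 4)
```

Here the summed low population (2) falls below the 3 people who reported only these two races, and the summed high population (5) exceeds the 4 people who reported either race.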

Using the Data in SEER*Stat

When using these databases in a SEER*Stat Rate session, it is not possible to make selections based on the year of diagnosis or year of death variables. Statistics generated in a rate session include the years 1998-2002 combined.
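Because the 2000 population serves as the denominator for each of the five data years, a crude average annual rate reduces to total cases divided by five times the 2000 population. The Python sketch below shows only that arithmetic; the function name, the per-100,000 scaling, and the case counts are assumptions, and it does not reproduce SEER*Stat's own computation or any age adjustment.

```python
def average_annual_rate(cases_by_year, population_2000, per=100_000):
    """Crude average annual rate for the combined period.

    cases_by_year: mapping of data year to case (or death) count.
    population_2000: 2000 population, reused as the denominator for every year.
    """
    person_years = population_2000 * len(cases_by_year)
    return sum(cases_by_year.values()) * per / person_years

# Hypothetical counts, invented for illustration only.
cases = {1998: 40, 1999: 45, 2000: 50, 2001: 55, 2002: 60}
rate = average_annual_rate(cases, population_2000=100_000)
print(rate)  # 50.0 per 100,000 per year
```

Running the same calculation with the low and the high population for a group gives an upper and a lower rate, respectively, since the high denominator is at least as large as the low one.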

References

  1. U.S. Census Bureau (2007) Appendix H. Characteristic iterations. Census 2000 summary file 2 technical documentation, U.S. Census Bureau, 2001, p H-1. Available from URL: http://www.census.gov/prod/cen2000/doc/sf2.pdf. Accessed Feb 2007.
  2. Braun KL, Tsark JU, Santos L, Aitaoto N, Chong C (2006) Building Native Hawaiian capacity in cancer research and programming. Cancer 107(S8):2082-2090.
  3. CDC, NCHS, Division of Vital Statistics (2007) Instruction manual, part 4: classification and coding instructions for death records, 1999–2001, pp 29–31. Available from URL: http://www.cdc.gov/nchs/about/major/dvs/im.htm. Accessed Feb 2007.