Federal Committee on Statistical Methodology
Office of Management and Budget

 

   

Statistical Policy Working Paper 13 - Federal Longitudinal Surveys


 


 

 

 

 

 

MEMBERS OF THE FEDERAL COMMITTEE ON STATISTICAL METHODOLOGY
(November 1985)
Maria Elena Gonzalez (Chair), Office of Information and Regulatory Affairs (OMB)
Barbara A. Bailar, Bureau of the Census (Commerce)
Yvonne M. Bishop, Energy Information Administration (Energy)
Edwin J. Coleman, Bureau of Economic Analysis (Commerce)
John E. Cremeans, Business Analysis (Commerce)
Zahava D. Doering, Defense Manpower Data Center (Defense)
Daniel H. Carnick, Bureau of Economic Analysis (Commerce)
Terry Ireland, National Security Agency (Defense)
Charles D. Jones, Bureau of the Census (Commerce)
Daniel Kasprzyk, Bureau of the Census (Commerce)
William E. Kibler, Statistical Reporting Service (Agriculture)
David Pierce, Federal Reserve Board
Thomas Plewes, Bureau of Labor Statistics (Labor)
Jane Ross, Social Security Administration (Health and Human Services)
Fritz Scheuren, Internal Revenue Service (Treasury)
Monroe G. Sirken, National Center for Health Statistics (Health and Human Services)
Thomas G. Staple, Social Security Administration (Health and Human Services)
Robert D. Tortora, Statistical Reporting Service (Agriculture)

PREFACE

The Federal Committee on Statistical Methodology was organized by OMB in 1975 to investigate methodological issues in Federal statistics. Members of the committee, selected by OMB on the basis of their individual expertise and interest in statistical methods, serve in their personal capacity rather than as agency representatives. The committee carries out its work through subcommittees that are organized to study particular issues and that are open to any Federal employees who wish to participate in the studies. Working papers are prepared by the subcommittee members and reflect only their individual and collective views.

This working paper of the Subcommittee on Federal Longitudinal Surveys discusses the goals, management, operations, sample designs, estimation methods, and analysis of longitudinal surveys.
Conclusions are drawn about where to use longitudinal surveys, and about the need to have an evaluation component in these surveys. The Appendices contain twelve case studies of recent longitudinal surveys. The report is intended primarily to be useful to Federal agencies in choosing to do, and then in designing, carrying out, and analyzing data from, longitudinal surveys. The Federal Committee on Statistical Methodology intends to organize seminars to discuss the report with interested Federal agency staff members.

The Subcommittee on Federal Longitudinal Surveys was co-chaired by Barbara A. Bailar and Daniel Kasprzyk, Bureau of the Census, Department of Commerce.

MEMBERS OF THE SUBCOMMITTEE ON FEDERAL LONGITUDINAL SURVEYS

Barbara A. Bailar* (Co-chair), Bureau of the Census (Commerce)
Daniel Kasprzyk* (Co-chair), Bureau of the Census (Commerce)
Barry Bye, Social Security Administration (Health and Human Services)
Dennis Carroll, Center for Statistics (Education)
Robert Casady, National Center for Health Statistics (Health and Human Services)
Steven B. Cohen, National Center for Health Services Research (Health and Human Services)
Lawrence Ernst, Bureau of the Census (Commerce)
Maria E. Gonzalez* (ex officio), Office of Information and Regulatory Affairs (OMB)
Catherine Hines, Bureau of the Census (Commerce)
Curtis Jacobs, Bureau of Labor Statistics (Labor)
Inderjit Kundra, Energy Information Administration (Energy)
Bruce Taylor, Bureau of Justice Statistics (Justice)

ADDITIONAL CONTRIBUTOR TO THE REPORT

Lawrence Corder, Research Triangle Institute (previously National Center for Health Statistics)

*Member, Federal Committee on Statistical Methodology

ACKNOWLEDGEMENTS

This report is the result of collective work and many meetings of the Subcommittee on Federal Longitudinal Surveys.
Each chapter had a principal author (or authors), as noted below, but the final report, particularly the introduction and summary sections, reflects contributions from all of the Subcommittee members.

Many useful suggestions on content and organization were made by Maria Gonzalez, chairperson of the Federal Committee on Statistical Methodology (FCSM).

Barbara Bailar, Co-Chair of the Subcommittee, prepared the Introduction and the concluding Chapter, which embody the discussions held by the whole Subcommittee.

All of the FCSM members reviewed several drafts and made many important suggestions. The Subcommittee in particular wishes to recognize the valuable contributions made by the primary reviewers: Zahava Doering, Fritz Scheuren and especially Monroe Sirken, who read and commented on two drafts of the complete report.

The principal authors of each chapter of the report are:

Chapter One: Catherine Hines
Chapter Two: Lawrence Corder
Chapter Three: Bruce Taylor
Chapter Four: Daniel Kasprzyk and Lawrence Ernst
Chapter Five: Barry V. Bye

The Subcommittee also thanks the following persons, who were responsible for preparing the Case Studies that appear in the Appendix: Edith McArthur (SIPP), Curtis Jacobs (CPI), Steve Kaufman (ECI), Dennis Carroll (NLS-72, HS&B), Catherine Hines (NLS), Barry V. Bye (RHS, WIE), Steven B. Cohen (NMCES), Robert Casady (NMCUES), James L. Monahan (LED), John DiPaolo, Robert Wilson, and Peter J. Sailer (SOI).

Catherine Hines edited the report. Joanne Watson (Bureau of the Census) prepared each of the drafts, and the Subcommittee thanks her for her patience and accuracy.
GLOSSARY OF ABBREVIATIONS

AHS      American Housing Survey (formerly Annual Housing Survey)
CPI      Consumer Price Index
CPS      Current Population Survey
ECI      Employment Cost Index
HCFA     Health Care Financing Administration
HS&B     Longitudinal Survey of High School and Beyond
ISDP     Income Survey Development Program
ISR      Institute for Social Research (University of Michigan)
NCES     National Center for Education Statistics
NCHS     National Center for Health Statistics
NCS      National Crime Survey
NLS      National Longitudinal Surveys of Labor Market Experience
NLS-72   National Longitudinal Study of the High School Class of 1972
NMCES    National Medical Care Expenditure Survey
NMCUES   National Medical Care Utilization and Expenditure Survey
OSIRIS   Statistical Analysis software, Survey Research Center, U. Michigan
PSID     Panel Survey on Income Dynamics
RAMIS    Data base management system, Mathematical Research Inc., Princeton, N.J.
RAPID    Data base management system, Statistics Canada, Ottawa
RHS      Retirement History Study
SAS      Data base management system, SAS Institute, Cary, N.C.
SSA      Social Security Administration
SIPP     Survey of Income and Program Participation
SIR      Data base management system, SIR, Inc., Evanston, IL
SOI      Statistics of Income Program, IRS
WIE      Work Incentive Experiment, SSA

TABLE OF CONTENTS

GLOSSARY OF ABBREVIATIONS
INTRODUCTION
Chapter I: The Goals of Longitudinal Research
Chapter II: Managing Longitudinal Surveys
Chapter III: Longitudinal Survey Operations
Chapter IV: Sample Design and Estimation
Chapter V: Longitudinal Data Analysis
Chapter VI: Summary and Conclusions

APPENDIX:
Case Study 1: Survey of Income and Program Participation
Case Study 2: Consumer Price Index
Case Study 3: Employment Cost Index
Case Study 4: National Longitudinal Study of the High School Class of 1972
Case Study 5: High School and Beyond
Case Study 6: National Longitudinal Surveys of Labor Market Experience
Case Study 7: Social Security Administration's Retirement History Study
Case Study 8: Social Security Administration's Disability Program Work Incentive Experiments
Case Study 9: National Medical Care Expenditures Survey
Case Study 10: National Medical Care Utilization and Expenditures Survey
Case Study 11: Longitudinal Establishment Data File
Case Study 12: Statistics of Income Data Program

REFERENCES

INTRODUCTION

Since the 1960's, the Federal government has sponsored an increasing number of longitudinal surveys as vehicles for research on administrative and policy issues. The goal of the Federal Committee on Statistical Methodology's Subcommittee on Federal Longitudinal Surveys is to identify the strengths and limitations of longitudinal surveys, and to propose some guidelines for using them most effectively.

Beginning its work, the subcommittee found that there were multiple definitions of a longitudinal survey, so our first task was to define what this report would mean by the term.
The difficulty arises because there are two facets to the definition: design and analysis. To be absolutely clear, one must distinguish between a longitudinally designed survey and a survey with longitudinal analysis. We have elected to put these components together in our definition. The distinguishing features of a longitudinal survey are:

- repeated data collection for a sample of observational units over time;
- the linkage of data records for different time periods to create a longitudinal record for each observational unit; and
- analysis that is based on the longitudinal microdata and refers to data collected over time.

The essential feature is that, from the beginning, there is a plan to collect data in the future for each observational unit.

This definition excludes some surveys with longitudinal elements, such as the Current Population Survey (CPS). The Survey of Income and Program Participation (SIPP) is included here as a longitudinal survey, although there are as yet no longitudinal analyses of SIPP. Federal agencies also conduct surveys of establishments that have longitudinal elements, but these are not yet true longitudinal surveys either. There is an effort to create a longitudinal file for manufacturing firms at the Bureau of the Census. We included this program as a case study in this report because, although it does not meet our definition, it may be of interest to readers. Similarly, Federal agencies maintain longitudinal files of administrative records that do not meet our definition. Yet they may be used in ways that are similar to the analysis of longitudinal surveys, so we have included an example, the Statistics of Income Data Program, as a case study.

Rotating panel surveys* are often described as longitudinal surveys. They are not, but they may share many sampling, estimation, and analysis characteristics with longitudinal surveys.
In addition, there is a tendency for ongoing rotating panel surveys to be changed to make longitudinal analysis possible. Such a transition is currently being considered for the National Crime Survey (NCS), and one possible result of the current redesign activities will be to create a longitudinal NCS data file if the cost is not prohibitive. There is interest in moving in the same direction with both CPS and the American Housing Survey (AHS, formerly the Annual Housing Survey). We should anticipate that eventually more rotating panel surveys will be modified, or designed from the beginning, to make longitudinal analysis possible. At this time, however, many rotating panels lack longitudinal data files, and many longitudinal surveys are designed without rotating panels.

The subcommittee members examined in detail 12 recent longitudinal surveys sponsored by the Federal Government, as examples and illustrations. These are: (1) the Survey of Income and Program Participation (SIPP); (2) the Consumer Price Index (CPI); (3) the Employment Cost Index Survey (ECI); (4) the National Longitudinal Study of the High School Class of 1972 (NLS-72); (5) High School and Beyond (HS&B); (6) the National Longitudinal Surveys of Labor Market Experience (NLS); (7) the Social Security Administration's Retirement History Study (RHS); (8) the Social Security Administration's Disability Program Work Incentive Experiments (WIE); (9) the National Medical Care Expenditure Survey (NMCES); (10) the National Medical Care Utilization and Expenditure Survey (NMCUES); (11) the Longitudinal Establishment Data File; and (12) the Statistics of Income Data Program (SOI). The surveys chosen for case study treatment were selected to represent a variety of sponsors, research questions and kinds of respondents. Each of the 12 case studies is described in the Appendix, and they are frequently cited to illustrate important points throughout the text.
We hope that the chapters of the text and the case studies in the Appendix will convince readers of four points that emerged from the subcommittee's review of longitudinal surveys. First, longitudinal survey designs are appropriate, and even required, for certain kinds of research. These include, but are not limited to, such topics as gross change, the causes of change, or the role of attitudes in change. However, many longitudinal surveys have not made full use of their longitudinal design in the analysis.

Second, longitudinal survey design, operation, and analysis techniques are still evolving. There are a number of important design issues that are not yet explored or understood. Examples are the optimal length of time between interviews and the number of interviews needed to achieve research objectives. To some extent the variations in survey design reflect the wide and legitimate differences between the research goals that each survey was designed to accomplish. This does not explain, however, all the existing variation in methods. Decisions about sample design and attrition, about selecting the best respondent or analytical units, about the best estimation, imputation or weighting schemes, or about the impact of varying personal, mail or telephone interviews over the course of a longitudinal survey, have not always been consistent.

___________________________
* A panel is a sample of persons selected to participate at a particular point in the longitudinal sequence. In a rotating panel survey the sample units have a fixed duration. As they leave the sample, they are replaced by new units which are introduced at specific points in time.

Third, the important question of the costs of longitudinal surveys compared to cross-sectional surveys has yet to be answered. There are conflicting reports about the relative costs of the two types of survey.
Costs are usually cited as higher for longitudinal surveys, but the costs being reported are confined to data collection and processing costs. Such comparisons omit the full range of survey costs, including quality costs, costs of analysis, and other elements that could, in the long run, change the picture of relative costs.

The fourth and final point that emerged from the subcommittee's review was that the surest method for learning answers to design, operational, and analysis issues is to build an evaluation component into a longitudinal survey. By this means a record of comparative performance is created which benefits others. The case studies presented in this report, in particular, show how progress occurs when evaluation is built into survey operations, and how forethought and planning, far more than additional expense, are needed to increase our knowledge about longitudinal survey design.

This report is presented in six chapters. The first chapter is a review of the kinds of research questions for which a longitudinal approach is appropriate, illustrated with examples. The second and third chapters describe some of the problems encountered in planning and managing longitudinal surveys. Chapter four discusses problems related to sample design and analytical units in longitudinal surveys, and special problems of estimation and weighting. Chapter five describes and evaluates major approaches to the analysis of longitudinal surveys. The final chapter, number six, summarizes some issues the subcommittee members recognized as important, and outlines the need for building an evaluation component into prospective longitudinal surveys, both to answer questions about the quality of data derived from each survey and to answer questions about optimal design for future longitudinal surveys.
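The three defining features given in this Introduction (repeated collection, linkage of records across time periods, and analysis of the linked microdata) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the unit identifiers, the single `employed` field, and the three small wave files are hypothetical and are not drawn from any survey discussed in this report.

```python
from collections import defaultdict

# Hypothetical wave files: one record per observational unit per wave.
# Unit C is missing from wave 2 (for example, a temporary nonresponse).
WAVE_FILES = {
    1: [{"unit_id": "A", "employed": True},
        {"unit_id": "B", "employed": False},
        {"unit_id": "C", "employed": True}],
    2: [{"unit_id": "A", "employed": False},
        {"unit_id": "B", "employed": False}],
    3: [{"unit_id": "A", "employed": True},
        {"unit_id": "B", "employed": True},
        {"unit_id": "C", "employed": False}],
}


def link_waves(wave_files):
    """Link records across waves into one longitudinal record per unit.

    Returns {unit_id: {wave_number: {field: value, ...}}}.
    """
    linked = defaultdict(dict)
    for wave, records in sorted(wave_files.items()):
        for record in records:
            fields = {k: v for k, v in record.items() if k != "unit_id"}
            linked[record["unit_id"]][wave] = fields
    return dict(linked)


def complete_cases(linked, n_waves):
    """Keep only units observed in every one of the n_waves waves."""
    return {uid: rec for uid, rec in linked.items() if len(rec) == n_waves}


linked = link_waves(WAVE_FILES)
print(sorted(complete_cases(linked, 3)))  # units with a record in all three waves
```

The distinction between `linked` and `complete_cases(linked, 3)` is one simple way to see why attrition and nonresponse matter for longitudinal analysis: a unit with a missing wave still has a longitudinal record, but that record cannot support every wave-to-wave comparison.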
CHAPTER 1

THE GOALS OF LONGITUDINAL RESEARCH

There are at least five distinctive advantages to using a longitudinal survey rather than a cross-sectional survey; some of these advantages are shared by rotating panel surveys.

1. A longitudinal sample reduces sampling variability in estimates of change. This is an advantage shared with rotating panel surveys such as CPS and NCS.

2. A matched longitudinal file provides a measure of individual gross change for each sample unit. This is an advantage shared to some extent by rotating panels, which can provide a measure of gross change, but not usually on an individual basis.

3. Longitudinal survey interviews usually have a shorter, bounded reference period that reduces recall bias in comparison to a retrospective interview with a long reference period. Rotating panels such as CPS and NCS also share this advantage. Longitudinal surveys with long intervals between interviews may lose this advantage.

4. Longitudinal data are collected in a time sequence that clarifies the direction as well as the magnitude of change among variables.

5. Longitudinal interviews reduce the respondent burden involved in creating a record that contains many variables. A single interview could not collect comparable detail without excessive respondent burden and fatigue. In addition, the quantity of data collected in a longitudinal survey is usually greater than that from several cross-sectional surveys because of the correlational structure of longitudinal data.

There are also some distinct disadvantages to longitudinal surveys. Some of these are:

1. The analysis of longitudinal surveys is dependent on the assembly of the microrecord data. The full advantage of compiling a detailed longitudinal record with many variables may not be available until years after the start of data collection.

2. 
Beginning refusal rates may be comparable to those of cross-sectional surveys, but the attrition suffered over time may create serious biases in the analysis.

3. A longitudinal survey, including several data collections, is more costly than a single retrospective cross-sectional survey. A longitudinal survey may be less costly than a series of cross-sectional surveys. It is speculative whether a longitudinal survey is more costly than a rotating panel survey.

4. The estimates of gross change derived from longitudinal surveys tend to be inflated over time by simple response variance. The combined or net effects of such influences as simple response variance, response bias and time-in-sample bias on longitudinal estimates of gross change are still poorly measured.

5. Longitudinal surveys are often improperly analyzed, not taking into account longitudinal characteristics or attrition.

For some research goals, the advantages clearly outweigh the disadvantages. For other research goals this may not be the case. Research goals that demand longitudinal surveys are described in this chapter.

Principal Author: Catherine Hines

A. Measuring Change

Both cross-sectional and longitudinal surveys can be used to measure change. The national monthly estimate of unemployment based on the CPS is always compared to the estimate for the previous month or the same month a year ago. Estimates of such things as crime victimizations, retail sales, housing starts, or health conditions are all compared to estimates from a previous time period. None of these data are currently based on longitudinal surveys.

Which measures of change need a longitudinal file structure? One example is the components of individual change.
These are measures of gross change for the observational units between points in time.* Longitudinal data are frequently displayed in a time-referenced table, showing the characteristics, attitudes, or beliefs of the sample at time 1, cross-tabulated by the same characteristics, attitudes, or beliefs at time 2. Another example is the average change for an observational unit. As pointed out by Duncan and Kalton (1985), if data are available for several time points for each observational unit, then a measure of average change or trend can be estimated. Finally, a longitudinal design permits the measurement of stability or lack of stability for each observational unit.

Measures of gross change are of interest in several of the case studies described in this report. Respondents are followed through employment and unemployment (NLS), training and the labor force (NLS-72, HS&B), into and out of poverty (SIPP), or between health, treatment, and disability (NMCES, NMCUES, RHS, WIE). The focus is sometimes on movement across an arbitrary threshold (such as poverty, defined by household composition and income), and sometimes on a continuous measure.

___________________________
* The observation periods in a longitudinal survey are commonly called waves. A wave describes one complete cycle of interviewing, from sampling to data collection, regardless of its duration.

In independent (i.e., cross-sectional) samples, sub-populations with very different gross-change patterns are indistinguishable if the sum of the changes is similar. This has been important to studies of employment. The NLS, for example, can distinguish a hypothetical population where 15% of the people are never employed from a population where at each interview a different 15% of respondents report unemployment. A cross-sectional survey could not make the same distinction, which is vital to the development of intervention policies. Another example can be cited from the field of social indicators research.
A series of variables, measured longitudinally, can be used to construct models that examine change over time with great elegance. (See Land, 1971, 1975.)

Young adults in the years after full-time school are frequent longitudinal survey subjects (NLS Youth Cohorts, NLS-72, HS&B) because individuals in these years are known to pass between statuses (employment and unemployment, school and training programs, in and out of the armed services, between households) rapidly and irregularly. Cross-sectional studies would miss all the individual reversals and repetitive change. To develop detailed models of the causes of change in these fluid populations, longitudinal measures are needed to capture the record of individual and gross change.

For example, cross-sectional studies of college enrollments have generally found relatively high stability over a number of years, whereas analysis of NLS-72 data identified frequent individual change occurring at a stable rate. A substantial percentage of the college students surveyed exhibited erratic enrollment patterns characterized by dropping out or transferring between 4-year and 2-year colleges. In light of these findings, student financial assistance (grants and loans) has changed. Legislation has shifted aid to channel the funds directly to the students, who choose the college they wish to attend, rather than channelling the funds to college officials, who decide how the funds are doled out to enrolled students.

Studying the relationship between attitudes and behavioral change poses particularly difficult problems in research design. The problems inherent in determining which variable in a pair changes first are present, and they are exacerbated by the problems encountered in surveys of subjective phenomena, such as attitudes. Using retrospective questions to ask respondents to reconstruct thoughts or feelings as they existed in the past has proved unreliable.
Prospective longitudinal surveys provide the most reliable data on change in knowledge or attitudes, because longitudinal measures are collected while the subjective states actually exist. This appears to reduce the bias frequently caused by suppression or distortion of respondent recall. In addition, unlike retrospective measures of attitudes, contemporary measures can sometimes be probed or even verified.

The longitudinal surveys of high school students (NLS-72 and HS&B) demonstrate the method's power to collect data on changing subjective states, and to study causation. These surveys have measured attitudes and expectations about employment, and subsequent employment experiences and behavior. The data, which could not have been collected cross-sectionally, can be analyzed to understand the formation of attitudes, as well as to evaluate the effects that attitudes have on subsequent behavior.

When the research goal is to measure a component of individual change, longitudinal surveys have strong advantages. They are the only method available to collect data on a recent occurrence basis over a long period of time. Although a retrospective cross-sectional survey could be used to attempt the same thing, the recall bias may be a strong argument against this decision. The bias from the attrition in a longitudinal survey has to be balanced against the bias or lack of information in a retrospective cross-sectional survey. The bias from attrition is usually preferred.

Price and wage changes are measured in longitudinal surveys (i.e., the CPI and ECI) because the longitudinal sample design holds other variables constant. The assumption can be made that whatever unknown sampling bias exists in later waves was also present in earlier waves, and can be dismissed as a possible source of the changes being measured.

B. 
Assembling Detailed Individual Records

Longitudinal surveys generally provide researchers with more detailed records for each individual than is practicable through a cross-sectional design. In a longitudinal design, an extremely detailed record can be accumulated for each subject without making any single observation period (i.e., interview or wave) excessively burdensome. By 1982, for example, records for the original respondents in the NLS contained up to 1,000 data items for each sample case. To create a record of comparable detail and complexity would have required a one-time questionnaire of extraordinary length. In addition, responses referring to earlier time periods would have been reconstructed from memory, reducing their reliability. In many instances, researchers are looking for cause-and-effect relationships that are more likely to be accurate if the data are compiled on a current rather than retrospective basis.

C. Collecting Data That is Hard to Recall

Some surveys ask questions that respondents have difficulty in answering precisely or objectively after much time has passed. These include questions that call for the kind of detail that people seldom recall clearly (such as complete records of expenditures, or health treatments), and questions that refer to events that respondents tend to telescope, embellish or suppress in their memories after time has passed (such as crime victimization, health problems, or visits to the doctor).

Questions such as these have been used successfully in longitudinal surveys, in which the previous interview provides a clear marker to bound respondent recall, and which are constructed with short reference periods between interviews. For example, the Consumer Expenditure Survey, conducted as part of the CPI program, collects detailed records of household spending patterns through longitudinal interviews. (See Case Study no. 2 in the appendix.)
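One way to picture how a previous interview can bound recall is sketched below in Python. This is only an illustration under stated assumptions, not the procedure of any survey named in this report: the record layout, the `bound_reports` function, and the rule of matching events on an (item, amount) pair are all hypothetical. The idea shown is that an event already recorded at the previous interview, if reported again, is treated as telescoped into the current reference period and set aside.

```python
def bound_reports(current_reports, previously_recorded):
    """Separate newly reported events from repeats of earlier reports.

    Both arguments are lists of dicts with hypothetical keys "item" and
    "amount". An event in current_reports that matches one recorded at the
    previous interview is treated as a telescoped duplicate.
    """
    seen = {(e["item"], e["amount"]) for e in previously_recorded}
    new_events, duplicates = [], []
    for event in current_reports:
        key = (event["item"], event["amount"])
        if key in seen:
            duplicates.append(event)   # already counted in the earlier wave
        else:
            new_events.append(event)   # belongs to the current reference period
    return new_events, duplicates


# Hypothetical example: the refrigerator purchase was recorded last wave.
previous = [{"item": "refrigerator", "amount": 450}]
current = [
    {"item": "refrigerator", "amount": 450},
    {"item": "doctor visit", "amount": 35},
]
new_events, duplicates = bound_reports(current, previous)
print(len(new_events), len(duplicates))
```

A real matching rule would have to tolerate differences in how the same event is described on two occasions; exact matching is used here only to keep the sketch short.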
A longitudinal survey with relatively short reference periods is one of the best methods for producing aggregated data for a longer time period, such as a year. For example, the primary goal of the NMCES and NMCUES programs was to develop estimates of medical expenditures for a calendar year. This was accomplished by obtaining medical expenditure data every 3 months and compiling an annual total. A similar example is the new continuing Consumer Expenditure Survey, which covers all consumer expenditures. The SIPP program employs a similar design, using interviews at 4-month intervals to produce annual aggregates. The relatively short, bounded reference periods for these longitudinal surveys improve reporting by eliciting events closer to the time they occur. This increases the completeness of aggregated estimates and reduces error.

D. Modelling Studies and Pilot Programs

The detailed case histories built up in longitudinal surveys are important in analyzing the impact of alternative policies or intervention strategies. The complex individual case records accumulated in a longitudinal panel survey provide a microcosm in which the impact of changes can be simulated. Questions can be answered about the probable impact of changing a program's eligibility criteria, for example, or about the benefits which specified classes of respondents might anticipate under various program changes. Intervention programs can be evaluated through longitudinal surveys to study their effect on respondents with known characteristics. A sufficiently detailed record makes it possible to simulate alternative interventions, and predict a range of effects. (See Case Study 8 on the WIE, for example.)

In some cases longitudinal surveys, pilot intervention programs and Federal policy experiments evolved together in the 1960's.
Several longitudinal surveys were authorized as components of pilot or experimental intervention programs to measure program effects and ensure that decision-making information would be available when it was needed. Longitudinal data collection components were built into pilot income maintenance programs, for example, administered temporarily in cities in New Jersey, Indiana, Colorado and Washington State.

In conclusion, two points about the periodicity of longitudinal research should be stressed. First, longitudinal data are never available immediately; any data that are based on the sequence of measures over time cannot be fully extracted until the final measures are collected. If information is needed at once, another research design has to be used which incorporates some alternative to a true longitudinal approach, such as retrospective measures or the use of administrative records. Even if the quality of data from a longitudinal survey would be clearly superior, that superiority is irrelevant if the schedule outweighs other considerations.

Second, longitudinal data can be used cross-sectionally to provide immediate data as long as the research focus is not specifically on changing measures over time. Each wave of a longitudinal survey can also be analyzed as a cross-sectional survey. Thus some data can always be made available immediately. Record data from ongoing longitudinal surveys can be analyzed quickly from a cross-sectional perspective to serve certain analytical purposes without delay. It is also possible to add questions to the current waves of a longitudinal survey to meet immediate data needs, using an existing longitudinal sample and base-line demographic data for maximum efficiency. In these ways a longitudinal design adds analytical strengths without sacrificing the potential for cross-sectional research.
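The contrast drawn in Section A between net and gross change, and the point just made that each wave can also be read cross-sectionally, can both be seen in a small Python sketch. It builds two hypothetical panels of 20 persons over four waves, following the 15% illustration in Section A but simplified: in one panel the same three persons are unemployed at every wave, and in the other a different three persons are unemployed at each wave. The panel sizes, status codes, and function names are assumptions made for this illustration.

```python
from collections import Counter

N_PERSONS, N_WAVES = 20, 4


def chronic_panel():
    """The same 3 of 20 persons (15%) are unemployed ("U") at every wave."""
    return {p: ["U" if p < 3 else "E" for _ in range(N_WAVES)]
            for p in range(N_PERSONS)}


def rotating_unemployment_panel():
    """A different 3 of 20 persons (15%) are unemployed at each wave."""
    return {p: ["U" if p // 3 == w else "E" for w in range(N_WAVES)]
            for p in range(N_PERSONS)}


def wave_rates(panel):
    """Cross-sectional view: the unemployment rate at each wave."""
    return [sum(h[w] == "U" for h in panel.values()) / len(panel)
            for w in range(N_WAVES)]


def turnover(panel, w1, w2):
    """Longitudinal view: counts of (status at w1, status at w2) pairs."""
    return Counter((h[w1], h[w2]) for h in panel.values())


def gross_change(panel, w1, w2):
    """Number of persons whose status differs between the two waves."""
    return sum(h[w1] != h[w2] for h in panel.values())


for name, panel in [("chronic", chronic_panel()),
                    ("rotating", rotating_unemployment_panel())]:
    print(name, wave_rates(panel), dict(turnover(panel, 0, 1)),
          gross_change(panel, 0, 1))
```

Both panels show an unemployment rate of 15% at every wave, so the wave-by-wave rates (the cross-sectional reading) cannot tell them apart. Only the turnover table and the gross-change count, which require linked records for the same persons, reveal that the first panel has no movement between statuses while the second has six status changes between the first two waves.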
CHAPTER 2

MANAGING LONGITUDINAL SURVEYS

As described in the previous chapter, prospective longitudinal surveys have proved to be an important research approach, but certain limitations have also emerged that must be considered when these surveys are planned. The problems related to staff and management of longitudinal research differ in kind as well as degree from those encountered in cross-sectional research.

The core of the problem in managing a longitudinal survey is a conflict between the need for long-term and for short-term resources. Plans and funding must be stable over many years, but the need for staff rises and falls over the course of a longitudinal survey. Most organizations sponsoring longitudinal surveys have solved the dilemma through some combination of permanent and temporary staff. Fluctuations in resources are less pronounced in longitudinal surveys that employ ongoing rotating panels (such as SIPP or, to some extent, the CPI) than they are in fixed panel surveys in which interviews are conducted at longer intervals (such as NLS, NLS-72, or HS&B).

The major difficulty faced in planning and managing a longitudinal survey is in maintaining a core group dedicated to the project, and maintaining consensus between this group and senior agency staff. These groups tend to view long-term commitment of staff and resources in different ways. The schedule, funding, and staff needs of a longitudinal survey are viewed differently by survey designers, by agency directors, and by those responsible for operations. It is a constant challenge to generate commitment to a long-term goal such as analysis of data, when senior staff with direct authority over the project often change before the survey is completed.

A. 
The Need for Long-Range Planning   The need for long-range planning and organization for a longitudinal survey should be brought to the attention of senior staff very early with a planning document that outlines the workload, survey tasks, and anticipated products over time. The planning document should be prepared in conjunction with an analysis plan, and the design of the instruments and procedures will then follow once all groups are in agreement with the planning document.   Long-range planning is vitally important to a longitudinal survey, because it promotes enduring support at a senior agency level, it widens the pool of sponsors and supporters, and it begins the process of documentation that ensures continuity of operations.   Principal Author: Lawrence Corder   A large-scale longitudinal Federal survey generally has at least nine principal management phases which may be briefly described as follows:   1. Budget Planning. Up to five years before data collection is to begin, a general plan must be conceived and provisions made to obtain continuing staff and funding resources throughout the longitudinal project.   2. Development of Position Papers. These are draft planning documents which discuss options, costs, and yields associated with various sampling plans, data collection designs, or questionnaires. These ensure widespread and enduring support for the longitudinal research.   3. Procuring Outside Assistance. If a contract is to be awarded, requests for proposals must be prepared, cleared and advertised, and responses must be evaluated before a contract is signed. This is a common approach to levelling out resource needs.   4. Final Research Plans. This stage includes final OMB clearance, conduct of field tests, revisions as necessary, and detailed agreements with any other cooperating agencies.   5. Data Collection. This refers to the full-scale field data collection. 
Longitudinal surveys (such as NLS) which have been extended beyond the original research period have repeated these five stages independently several times.   6. File Preparation. Development of the system for data entry, data base design, processing, etc., may also require systems for optical scanning of questionnaires, machine or manual edit steps, preparation of code books, the construction of composite variables, plans to preserve privacy in public data files, and numerous other activities. Each operation must be fully documented, to ensure comparability between waves.   7. Planning the Analysis. While the overall goals of the analysis must be planned in the early stages, some details cannot be finalized until the data are available on computer files and code books are completed. Also, as policies shift, new analytical priorities must be met. In all cases, this process requires plans which may include in-house analyses and contracts for analyses. Contracts require a repetition of the procurement process described in phase 3.   8. Conduct of Analyses. These may go on for several years. Cross-sectional analyses can be conducted as soon as one wave of interviews has taken place. Longitudinal analyses take place after some or all other waves are completed.   9. Publications. With in-house and professional peer reviews, these may continue for several years.   Each phase requires substantial time to complete, contains specific activities and results in the preparation of key documents. The final products of any longitudinal survey are usually public-use data files and reports.* Ideally, these should be supplemented by rapid preparation of in-house documents as part of the policy-making process. Schedule milestones and due dates are part of any longitudinal survey, and the ultimate success of the project and even the usefulness of the analytical results may be judged against their timeliness.   
It is not unusual for a longitudinal survey to consume a decade or more from inception to completion of the publication plan. The NMCES and NMCUES Studies, for example, both took 8 to 10 years to complete. While field operations and the period for analysis vary with each survey's objectives and resources, the successful pre-field period is probably very similar in each case. The planning period should be dedicated to achieving consensus internally, then to producing instruments and obtaining clearances and approvals (for contracts as well as for questionnaires). A typical schedule for completing pre-field activities alone (excluding budget planning) would frequently require 12 to 18 months.   Some of the most severe criticisms of longitudinal surveys have resulted from insufficient planning. It is not uncommon, for example, to omit thorough planning of the analysis. Then, at a production stage, it is discovered that people have different ideas on the tables and data to be produced and analyzed. It is also necessary to plan the linked files carefully so that the data needed for longitudinal analyses are readily available. Unfortunately, the planning of budgets and field work often takes precedence over the planning of processing and analysis, sometimes leading to delays, acrimony, and sometimes shifts in support.   B. Funding Longitudinal Research   The actual unit costs of doing longitudinal surveys may be no higher than for a series of cross-sectional surveys of comparable size and complexity (Wall & Williams:30). There is conflicting evidence on comparable costs, probably reflecting non-standard cost reporting on survey operations. Funds, however, must be committed over a number of fiscal years and budget plans are not easily altered. There is a trade-off to be made when errors are discovered or improvements can be implemented. Additional costs must be carefully considered, as well as the effect of changes in methodology on the longitudinal analysis. 
Errors, of course, should be corrected or, if too costly, an indication of their effects provided. Changes in methodology are different from changes necessitated by errors and must be thoroughly explored. Provision should be made to share information with analysts and data users on real change vs. methodologically-induced change. (The change to computer assisted telephone interviewing is one such change that needs careful exploration.) If errors or methodological changes result in higher costs, alternative methods of meeting those costs should be considered: higher funding, smaller sample size, more time between interviews, delayed processing, and so forth.   * Surveys of business or industrial establishments are often an exception to this rule, to protect the identity of large firms that dominate certain samples.   Inter-agency cooperation can help meet long-term funding needs. The Health Care Financing Administration (HCFA) and the National Center for Health Statistics (NCHS) chose this approach in conducting NMCUES. Inter-agency agreements frequently involve the Census Bureau for data collection and analysis, but they may also be used between other agencies with related research goals. Inter-agency cooperation in longitudinal surveys could take the form of joint sponsorship of a new longitudinal survey, or it could be in the form of using an existing longitudinal sample as a vehicle for research to save the cost of starting a new longitudinal survey.   The NLS-72 provides an example of a consortium approach: For the fifth follow-up interview in NLS-72, the National Science Foundation appended questions on math and science teachers, and the National Institute of Child Health and Human Development joined with the National Center for Education Statistics (NCES) to fund questions on child care and early childhood education issues. Longitudinal surveys are generally long-term projects with significant start-up costs. 
If a survey can be constructed to serve more than one agency through an inter-agency agreement, start-up costs may be shared and several agencies will be bound to multiple-year funding commitments.   When agencies select outside contractors to conduct longitudinal research, competitive procurement is required. The decision to use a contractor to conduct a survey increases the time needed to start a project, because approval of contracting plans must be added to other planning tasks. One advantage of contracting out the survey work is that it gives an agency access to additional staff support in cases where the agency has no authority to add permanent staff.   Contracting for data collection by an outside agency may or may not be more expensive than employing a government organization for this purpose. In comparing costs, NCES found that the first NLS-72 follow-up, conducted by the Census Bureau, cost slightly more than the second follow-up, conducted by Research Triangle Institute (RTI), despite inflation. Other longitudinal surveys, including NMCES and NMCUES, have had just the opposite experience. The most cost-effective mode of operation appears to depend on the kind of survey, not on the agency conducting it.   The duration of longitudinal surveys often requires periodic recompetition once a competitive award has been made. As a result, agencies have found themselves switching contractors part way through the data collection phase of a longitudinal survey. The competitive award of each data collection wave can, however, help control overall survey costs, because it provides contractors with an incentive to hold down their costs.   The possibility of changing contractors over the life of a longitudinal survey requires a detailed documentation of methods that goes far beyond what is needed for any one-time survey. 
This level of documentation was not anticipated when the original contract to collect data for NLS-72 passed from the Educational Testing Service to RTI, and the change in contractors caused difficulties. Based on this experience, NCES now builds a sub-contract to the previous contractor into any subsequent data collection awards. As a result, a later transfer of the NLS-72 contract from RTI to NORC was accomplished without problems.   C. Staff Needs   Staffing requirements for a longitudinal survey typically vary substantially, both by number and by type of staff, throughout the history of the project. Staffing is much more controlled in rotating sample surveys, whether they are longitudinal or cross-sectional. Funding and staff needs for a longitudinal survey are much greater during the data collection period than during any other phase. However, some of the types of people needed for data collection, such as interviewers, are not needed in later phases. Staff monitors for field work and data processing are in high demand at early stages as well as intermediate stages. Because of sporadic needs, the use of a core group of survey professionals in combination with temporary staff, or interagency agreements or outside contracts, can be the best method to ensure adequate staffing for the entire effort.   To distribute the costs of a contract more evenly over a longitudinal survey, NCES and NCHSR have used incrementally-funded contracts. During the longitudinal survey, separate contracts are awarded for each phase or wave. Each contract extends over two or more years. At any point, some survey tasks are being advertised for competition while others are being completed under contract. Looked at from the standpoint of each fiscal year, the total costs and level of effort remain more nearly constant. 
NCES has also found that giving agency survey analysts the responsibility for monitoring contract performance will help control variations in staffing patterns.   By employing temporary peripheral groups in addition to permanent staff groups, two problems are solved: Research staff needs are met without adding permanent personnel to an agency; and peak workload needs are met without jeopardizing tight survey schedules. Inter-agency agreements or contracts not only bind parties to a specified set of research goals, but they also permit the level of staff effort to rise and fall as needed.   D. Maintaining Core Staff   The duration of longitudinal research projects creates another management problem (which has been called a Methuselah effect by Herbert Parnes). Each phase of a longitudinal study, such as planning, data collection, or analysis, is frequently carried out by different individuals, who may not even be part of the same organization. The relative inflexibility of a longitudinal study plan is an analytical necessity, but it could also prevent interim analysis or refinements in the design. For these reasons, it has been suggested that on-going longitudinal surveys may hold little interest for the calibre of professional staff that is needed for management or analysis (Wall & Williams: 35).   NCES, however, has successfully attracted talented analysts to manage the agency's longitudinal surveys. To some extent this may be because NCES ensures that the agency's staff have challenging responsibilities for program analysis. Agencies which see only data collection as their primary mission may be more apt to encounter the staff problems recognized by Wall and Williams. In order to allow mid-course corrections and modifications of the survey plan, NCES uses a multi-phase sampling design (as in HS&B). This, too, contributes to the flexibility of the NCES longitudinal survey program.   E. 
Data Collection and Processing Schedules   Longitudinal surveys have become notorious for developing serious backlogs because data collection takes precedence over all other tasks. The schedule for observations is usually the least flexible aspect of the design, because each subject must have an identical record structure. As data collection continues, it creates an ever-growing backlog of other procedures, such as analysis. Uncompleted tasks tend to accumulate, becoming increasingly difficult to finish. To prevent backlogs and delays, a longitudinal survey must be well-organized and planned so that analysis and data release keep pace with data collection.   Data collection schedules are not the only factor in backlogs. Another factor is data processing, including file linkage. Survey organizations that are more accustomed to doing cross-sectional surveys or other non-longitudinal surveys often have difficulty recognizing the special processing needs of longitudinal surveys. Databases need specification, key variables need identification, and a policy on imputation needs to be thought through. Ideally, all this needs to be done when the survey questionnaire is designed, but this ideal is seldom, if ever, met.   F. Data Analysis   Data analysis is often looked on as the rewarding part of the job after the difficulties of data collection and data processing. Analytical interests often go beyond the agency conducting the study. Some agencies include analysis contracts in their contracting for services. Usually some analysis is done by agency personnel. One possibility to counter some of the delay caused by the time it takes to complete a longitudinal survey is to analyze each wave as if it were from a cross-sectional survey. This not only provides timely data, but raises questions to be answered at later stages, and generally whets the appetite for more data and more analysis. 
Recent data from on-going longitudinal programs can be analyzed relatively quickly to serve some analytical purposes without delay. It is also possible to add questions to the current data collections of a longitudinal survey to meet immediate data needs.   G. Release of Data   A principal goal of any longitudinal survey should be to produce public use data tapes and analytical reports rapidly, both for policy-makers and the interested public. If public use files are to be created, then procedures to protect confidentiality must be worked out in advance. File structure and documentation need to be readily available. Variance estimation must be provided for those using the file. The permanent survey staff should maintain a role in the preparation of files and reports, so that their expertise and interest are not lost.   In conclusion, longitudinal surveys, sometimes taking 5 years or more to complete, inevitably encounter staff changes. Two management approaches can minimize the loss of institutional memory. First, it is vital that every survey activity be documented. Interview instructions, edit specifications, variable definitions, file layouts, sampling, weighting and imputation methodologies, all instruments and procedures should be recorded and readily available. This task is very labor-intensive and, unfortunately, apt to be slighted when staff time is short. Second, inter-agency agreements or contracts may clearly lay out both the procedures to be used and the final products. It is also wise to specify key contractor staff persons who cannot be replaced without sponsor approval. These actions are important to minimize the effect of staff changes and to prevent errors and delays.   CHAPTER 3   LONGITUDINAL SURVEY OPERATIONS     The principal differences between field and processing operations in one-time surveys and in longitudinal surveys are created by the use of time as a significant factor in research. 
Longitudinal surveys typically encounter changing conditions, and survey designers have developed and evaluated a variety of methods for controlling the problems that can be caused by change in the sample or changes in the design or administration of the survey.   A. Sample Change Over Time   The composition of the sample may be expected to change across waves for a variety of reasons. Respondents may refuse to participate, they may die, they may move and cannot be found, or they may leave the sampling frame (e.g., by entering an institutional population or by moving abroad). The danger is that the sample becomes increasingly less representative of the target population as time passes. To minimize the effects of these problems, new observational units are routinely introduced into the samples of some continuing surveys as time passes.   1. Selection of new units into sample   For some longitudinal surveys, there are a number of concerns related to the length of time respondents are kept in sample. Respondent burden across several interviews may produce a decline in the quality of data gathered or may result in increasing refusal rates. Respondents may also leave the sampling frame, move and cannot be tracked, or die, thereby affecting the representativeness of the sample. For these reasons, it may be desirable to institute a rotating panel design, which regularly moves new respondents into the sample and retires other respondents after a fixed number of interviews or period of time.   The Survey of Income and Program Participation (SIPP), the National Crime Survey (NCS), the new Consumer Expenditure Survey (CE), and the Consumer Price Index (CPI) have all adopted rotating panels. SIPP introduces new respondents annually and retains them for 2-1/2 years (7 or 8 interviews) before rotating them out; NCS introduces new respondents monthly and interviews them for 3-1/2 years (7 interviews). 
The CE Survey introduces respondents monthly and interviews them five times on a quarterly basis, while the CPI introduces new respondents once every five years and interviews monthly or bimonthly.   Fienberg and Tanur (1983) note that rotating panel designs may create some problems of inference, according to conventional sample survey theory, in that random selections of respondents occur at different times for different respondents. They argue, however, that this is only important when date of selection is related to temporal changes in the phenomena the survey was designed to measure. The inferential difficulties which might result from a rotating panel design must be balanced against the reduction in attrition-related bias that such a design offers.   Principal Author: Bruce Taylor   2. Movers   Some respondents may be expected to move from originally sampled housing locations (or telephone numbers) during their time in sample. Depending on the purpose of the survey and procedures adopted to track movers, respondent mobility has varying implications for the representativeness of the sample over time. A number of factors may enter into decisions regarding whether, or how, to follow movers.   A crucial consideration is to determine the most important unit of observation for the survey. A longitudinal survey of persons may be designed to follow sample individuals or households, if the substantive goals of the survey would be served by retaining as many of the originally sampled respondents as possible. A number of surveys, such as SIPP and NLS, focus on individual and household economic data, which continue to be relevant to the purposes of the survey regardless of respondent mobility. Consequently, following movers is an appropriate means to maintain data quality over time for such surveys.   Following movers may create other problems, however. 
For instance, if there are ecological correlates for the phenomena of interest, such as crime or quality of housing, then following mobile respondents may result in deterioration of the geographic representativeness of the original sample, with a consequent potential for bias in some measures for later waves. A rotating panel design may minimize this problem, because newer respondents are more likely to reside in the originally sampled housing location.   Another reason for following movers is that respondents may move for reasons related to the substantive goals of the survey. This makes it important to know why they move. If this is the only reason for following movers, then collecting data for only one wave after a move may be enough. In NCS, for example, some respondents may move from a high-crime area to a safer neighborhood, and it would be important to determine the proportion of moves which were related to crime victimization. With data from only one wave after the move, that proportion can be measured, but not the future consequences of victimizations for such movers.   The SIPP is attempting to follow all individual movers. Because living arrangements vary according to economic circumstance, and affect eligibility for social welfare programs, a change in residence can be related to changes in income and program participation. Thus, for SIPP it is crucial not to lose data on movers. The CPI, on the other hand, follows only those movers who provide services, such as doctors or lawyers, since their expertise is the item being purchased. When a commodity outlet changes location, this move is considered a unit "death" and the CPI record is terminated.   The actual procedures developed for following movers are likely to reflect the field procedures of the organization conducting the survey, the collection mode used, the distance involved, and the costs associated with tracking movers. 
If the organization conducting the survey uses decentralized collection procedures, a respondent moving from the jurisdiction of one regional office to another may be more difficult and more expensive to track. Also, the costs of following movers may be greater if a face-to-face collection mode is used, rather than a telephone design, where tracking procedures may be limited to obtaining a new telephone number. Depending on the cost, administrative difficulty, and proportion of respondents who move far enough to create problems, it may not be desirable to follow all movers or to rely on standard collection modes. SIPP field procedures, for instance, indicate that personal interviews need not be administered if the respondent has moved beyond 100 miles from any sample PSU, and rules also differ for respondents younger than fifteen years of age. If survey procedures allow telephone interviews in lieu of face-to-face interviews, a phone contact may be a desirable alternative for movers who are difficult to reach.   The type of sample involved may also affect the ease with which movers may be located. For instance, it is usually easier to find a mover through neighbors or subsequent occupants of a sample housing unit if an area sample has been adopted than if a random digit dial sample has been used. Asking respondents to notify the field office with pre-printed cards when they move can be a partial solution, but this option relies heavily on the respondent's cooperation.   3. Attrition   When projected across waves of a longitudinal survey, levels of non-response that would be manageable in a cross-sectional survey can become significant sample attrition. The potential for attrition in a longitudinal survey sometimes limits sample definition. Tracing mobile respondents generally accounts for a large proportion of field problems as well as costs, and refusal rates are likely to grow over the life of the survey. 
Incomplete records and missing interviews create analytical complexities that are unparalleled in cross-sectional research. Attrition is most dangerous when it is correlated with the objectives of the survey. For example, there is evidence that sample attrition may be related to victim status in the NCS. To the extent that the sample loses victims at a faster rate than non-victims, estimates from later waves will be biased. Also, Fienberg and Tanur (p. 17) note that in social experiments disproportionate loss of respondents for different treatments may be a problem, because treatments often vary in their attractiveness to participants.   Sample attrition between observation periods may create the illusion of change when means are compared between waves, without adjusting for non-response. In a study focused on identifying change, there is a risk that changes are spurious, due to sample attrition. In addition, respondent participation that varies from panel to panel could produce the appearance of change even when aggregate non-response is stable. Attrition can affect estimates of central tendency (Cook & Alexander: 191). Mean test results from longitudinal panels of students taking ETS exams were compared to mean test results derived from a cross-sectional survey of the same population. The means were significantly different, which the analysts attributed to selective attrition in the longitudinal sample. Effects of attrition in demographic surveys have been harder to predict. Attrition does not necessarily create unmanageable bias in a longitudinal survey: The NLS was still contacting 92 percent of living respondents 3 years after the original contact, and still contacting 80 percent of eligible respondents 12 years after the study began (U.S. Department of Commerce:321). In the ISDP panels of 1978 and 1979, attrition did not climb steadily over the five or six interviews administered to respondents. Instead, it leveled off and then declined slightly over all waves (Ycas:150). 
Nonetheless, a combination of attrition and varying participation from wave to wave can create serious problems in creating complete records. In the 1979 ISDP panel, for instance, only two thirds of the original sample persons had complete interview records (Ycas:150).   Calculating the response rate in longitudinal surveys is itself difficult. The measures used in cross-sectional research are often not adequate for measuring non-response in complex records, as they do not reflect cumulative non-response across waves and do not take into account changes in the size of the eligible sample due to births, deaths, and the addition of new household members. To illustrate, non-response for entire housing units in the NCS is sometimes reported at 4 percent. However, when records for housing locations are linked to form a longitudinal file, it has been found that over half of the originally sampled housing units are missing at least one interview. This discrepancy is due to the fact that the former figure is a cross-sectional measure of unit non-response in a particular wave and does not account for the approximately 10% of sample housing units unoccupied at the time of interview (Fienberg & Tanur:14). This figure also does not cumulate non-response over time. While the lower figure is an appropriate measure for many cross-sectional uses of NCS data, it clearly is inadequate for reflecting the completeness of linked housing unit records.   The methods that have been developed for tracing respondents in longitudinal surveys have been successful, but they have also proven to be expensive. The Census Bureau has estimated that the cost of contacting each wave of an ISDP research panel increases by 8 percent over the previous wave, due to the costs of following movers and interviewing additional households (Fienberg & Tanur:11-12, White & Huang). 
However, NCES also found that per-unit tracing costs for the High School and Beyond (HS&B) Survey were approximately 20% less than the cost of base year sampling, which illustrates the economies which can be realized by mounting a longitudinal study, rather than separate cross-sectional studies. To control costs, as well as potential bias, each longitudinal survey must investigate the characteristics of respondents who move. Depending on empirical evidence about how atypical non-respondents are, a judgment can be made about the proper balance between the costs of tracing respondents and an acceptable level of non-response.   Sample definition offers another approach to limiting unscheduled attrition. The probability of becoming a non-respondent is not randomly distributed among the population. In longitudinal samples such factors as rural residence, interval since contact, and region of the U.S. affect the probability of maintaining contact (Artzrouni:21-24). Some longitudinal designs have therefore sought to minimize attrition by avoiding the respondent classes that are most susceptible to attrition.   Setting aside respondent classes to control attrition can conflict with attaining a sample that truly represents the reference population. However, a sample chosen without regard to eventual tracing difficulties may also gradually lose its representative power through attrition. Only empirical evidence can indicate the extent to which characteristics that predict attrition co-vary with the characteristics that the study is designed to investigate. A sampling design which sets aside respondent classes with potential attrition problems should be undertaken only after careful consideration of the relative magnitude of bias which could be introduced by such a strategy and other alternatives, such as imputation for missing data or performing analysis on the remaining sample cases of an initially representative sample.   
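The distinction drawn in this section between a per-wave (cross-sectional) non-response rate and the completeness of linked longitudinal records can be illustrated with a short sketch. The Python below is illustrative only: the interview outcomes are invented, the function names are hypothetical, and the closing calculation assumes that non-response is independent from wave to wave, which real panels generally violate.

```python
from typing import Dict, List

# Hypothetical interview outcomes: unit id -> one flag per wave
# (True = interviewed, False = missed). Invented data, not survey results.
outcomes: Dict[str, List[bool]] = {
    "u01": [True, True, True, True],
    "u02": [True, False, True, True],
    "u03": [True, True, True, False],
    "u04": [True, True, True, True],
    "u05": [False, True, True, True],
    "u06": [True, True, False, True],
    "u07": [True, True, True, True],
    "u08": [True, True, True, True],
    "u09": [True, True, True, True],
    "u10": [True, True, True, True],
}


def per_wave_nonresponse(data: Dict[str, List[bool]]) -> List[float]:
    """Cross-sectional measure: share of units missed in each wave."""
    n_units = len(data)
    n_waves = len(next(iter(data.values())))
    return [
        sum(not flags[w] for flags in data.values()) / n_units
        for w in range(n_waves)
    ]


def share_with_gap(data: Dict[str, List[bool]]) -> float:
    """Longitudinal measure: share of units missing at least one interview."""
    return sum(not all(flags) for flags in data.values()) / len(data)


def expected_complete_share(rate: float, waves: int) -> float:
    """Share of complete records if each wave is missed independently
    with probability `rate` (a simplifying assumption)."""
    return (1.0 - rate) ** waves


print(per_wave_nonresponse(outcomes))  # non-response in each wave
print(share_with_gap(outcomes))        # units with an incomplete record
print(round(expected_complete_share(0.04, 7), 3))
```

In this invented sample every wave shows 10 percent non-response, yet 40 percent of the units have an incomplete linked record, because a different unit is missed each time. Under the independence assumption, the 4 percent per-wave rate cited above would leave roughly three quarters of records complete after seven interviews; the larger shortfall reported for linked NCS housing unit records also reflects factors, such as unoccupied units, that this sketch does not model.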
In cohort or panel studies, which require measurement to begin and end at the same time for all respondents, implementation of a rotating panel design, which reduces the impact of attrition by replacing respondents over time, will clearly not serve the goals of the survey. One possible strategy for dealing with attrition in such studies is to impute missing data, based either on statistical models or on complete data from prior waves or from respondents with similar characteristics. Another possibility is to reweight the sample for each wave to reflect non-response for various demographic groups in the sample. (See Chapter 4.)   Duncan, Juster, and Morgan (1982) model such a procedure for the Panel Study of Income Dynamics (PSID), conducted by the Institute for Social Research (ISR) at the University of Michigan. They compare results for data gathered with persistent efforts to pursue respondents and for the data set which would have resulted if less intensive respondent contact strategies had been adopted. When the latter is reweighted to adjust for missing cases and compared with the first data set, there are minimal differences in outcome measures. While this procedure has promise for minimizing bias resulting from non-response across waves, it may also allow some relaxation in pursuing respondents, allowing cost reductions in survey administration. The authors do note, however, that reweighting entails some risk of covariation-related bias in multivariate estimates, especially for models that are not well specified, and that maintaining an adequate number of respondents in some key subsamples may remain a problem.   A reasonable precaution to minimize the deleterious effects of sample attrition is to minimize respondent burden, which has been variously described as the amount of time which an interview entails or as the complexity of the task required of respondents for successful completion of an interview. 
Under the Paperwork Reduction Act of 1980, each Federal statistical program is restricted to a limited number of hours available for data collection in a fiscal year, thereby encouraging reduction of the burden placed on respondents. In addition to the statutory reasons for limiting the length of Federally sponsored surveys, controlling respondent burden may also improve data quality for longitudinal surveys in a number of ways. An important aspect of this data quality enhancement is that respondent participation may be encouraged by reducing interview tedium, thereby reducing refusal rates and enhancing the representativeness of the sample over time.

Respondent burden hours may be reduced by a careful evaluation of the utility of collecting information in every wave. The SIPP, for example, minimizes respondent burden by dividing the survey into a core questionnaire administered at each interview, plus "topical modules" to collect data not required as regularly. Sometimes only a subsample of respondents need be asked about certain topics. Finally, lengthening and/or varying the intervals between waves should also be considered as a means for reducing respondent burden. The CPS, while not a longitudinal survey, adopts this strategy of varying time between interviews. Respondents are interviewed for four months in succession, not contacted for the following eight months, and then interviewed for a final four months.

4. Changes in Units of Observation

A slightly different sample of respondents participates in each wave of a longitudinal survey. Such changes in sample may result from scheduled introduction or retirement of sample units in a rotating panel design, from attrition, or from introducing new respondents when household composition changes. This variation causes difficulties related to defining the correct reference population, in weighting for item non-response, and in weighting respondents who enter and leave the sample.
In addition, the changing sample of respondents and aggregate units creates unique difficulties in analyzing data above the person level. A variety of approaches has been used to define units of analysis in longitudinal research, and each has specific problems and strengths. These are discussed in detail in Chapter 4. It should be noted here, however, that all weighting adjustments should be planned simultaneously. The problem of adjusting for non-response is the converse of problems created by persons entering the sample, and the adjustments for entrants and non-coverage, once selected, can be accomplished in a single operation.

Split and merged households present particular problems for sample comparability across waves. Such recomposition of households creates obvious difficulties for longitudinal matching, which will be discussed below. However, changes in household membership also raise questions about how to treat new members of split households who were not members of the originally sampled household but who came into sample because of their associations with original sample persons. Rules developed by the ISDP offer one method which seems generally applicable to a number of surveys: new household members were added to the sample, but if they left the household, or if this household subsequently split, only those members who were selected for the original sample were followed. This procedure avoids excessive growth of the panel, thus minimizing artifactual changes in aggregate panel statistics, but still collects relevant household data which correspond to data from "stable" households.

Whether a change in a household constitutes the birth or death of the sample unit depends on the goals of the survey. If the survey samples households and does not follow movers, then a complete turnover in the household occupants would indicate the birth of a new unit.
If housing locations are sampled, then such a turnover would not constitute a death as long as the housing unit remains occupied. The death of a member of the household, or even the head, does not constitute death of the unit for a household-based sample, but a divorce or separation often will be defined as termination of the unit. If an individual respondent leaves the sample, the reason for the departure should be determined. If the respondent has died, then the individual record should be terminated. However, if the respondent leaves the sampling frame for other reasons (e.g., entering the military or moving abroad), it is possible that he or she may return during the life of the panel, and the record should be retained.

Often the death of a unit can be determined by observation. For instance, when a housing unit is vacant or destroyed and the sample is location-based, termination of the record may be indicated. However, in other cases respondents must be queried regarding the status of the unit. If the unit of measurement is the household, occupants of the sample location must be asked whether they lived at the current address when the previous interview took place to determine whether they should be considered part of the sample. (Rules for this decision will vary between surveys.) If only part of the household has moved since the previous visit, it may be necessary to determine the reason for the departure to ascertain whether the movers remain in the sampling frame. In designs which do follow movers and which allow the formation of new households during the life of the sample, permanent departure of individuals to form new households will indicate the need to establish new household records. (See Chapter 4 for a fuller discussion of these issues.)

B. Changes Related to Respondents' Time in Sample

Varying sample participation is not the only change over time which complicates inference from longitudinal data.
A number of factors related to the time respondents remain in sample may produce changes in survey measures which are independent of any substantive changes in the phenomena under investigation. These factors include variation over time in the rules for interviewing particular respondents and changes in respondents' approach to the interview based on increased experience with the survey instrument as the sample matures.

1. Response Variability Due to Changes in Respondent

The manner in which a survey is administered may vary from respondent to respondent. "Proxy" interviews may be administered, in which adult household members complete interviews on behalf of younger respondents, or in which available household members supply data for other individuals in the household. (In some cases such proxies are restricted to household members who are not present, but, in other instances, one household member will supply personal data for all individuals in the household.) Respondent rules are also frequently needed for collecting household information if there is more than one respondent per household. A number of possibilities exist for respondent rules. For example, one respondent in a household may be selected to provide household data, while personal data are requested from each respondent individually. Alternatively, all respondents may be asked for household data. In the latter case, inconsistencies might be reconciled in the field, for instance, when respondents report conflicting details regarding a household crime incident. A computer edit or a post-weighting algorithm might also adjust for differences in reporting, when household measures are simply the sum of individual measures.

Respondent rules can affect longitudinal data over time. For instance, during a longitudinal survey, younger respondents may become eligible to complete an interview without proxy, and may begin to report information of which previous proxies were unaware.
There is also evidence that household-respondent status may affect the manner in which personal data are reported, particularly if the two types of information requested are related. Biderman, Cantor, and Reiss (1982), for example, find that respondents who report household data also report higher levels of personal crime victimization than do respondents who do not report household data. They also find that, if the household respondent changed between interviews, levels of personal victimization for the affected persons would also change. The authors hypothesize that the initial battery of household victimization items serves as a warm-up for personal items and aids recall for household respondents.

If the household respondent is allowed to change across waves, then two effects should be anticipated. First, the quality of personal data reported by a given respondent is likely to change over time, depending on whether he or she serves as the household respondent. Second, different household members will vary in their knowledge of the relevant data, so the quality of household data may also be expected to change over time and thereby bias transition estimates.

There are some obvious remedies for these problems. First, proxy interviews should be minimized, recognizing that obtaining certain information directly from younger respondents may be inappropriate or that there may be no other way to collect data for some respondents. Surveys vary in their reliance on data collected by proxy (e.g., about 60% for NCS, 40% for SIPP), and such a policy is likely to produce an improvement in data quality proportionate to the fraction of data currently collected in this manner. Second, care should be taken in assigning responsibility for answering questions about the household over time, either by consistently assigning this responsibility to the same respondent or by requesting these data of all respondents.
The latter procedure minimizes the effect of an unavoidable change in household respondent and makes any respondent effect consistent across all waves. However, because of mandated ceilings on response burden for federally sponsored data collections, the additional precision realized may not justify the substantial number of redundant questions which are required. It should also be noted that the reconciliation procedures or post-weighting that would be required may make such a strategy very difficult to use.

2. Panel Bias

A number of factors associated with respondents' time in sample may produce changes in survey measures over time and thereby complicate explanation. The impact of these factors has been described as a history effect, secular effect, maturation effect, rotation group bias, time-in-sample bias, or Heisenberg effect. These factors include the reactivity of respondents to survey measures, changes in the performance of the respondent role, the "conditioning" effect of multiple administrations of the survey instrument, the aging of the panel, interaction between interviewers and respondents, interviewers' perceptions of their role, and the correlation between variables of interest and the probability of response. Changes in survey measures due to such effects present a danger of bias in longitudinal estimation. Consequently it is important to consider the influence of such factors when designing a longitudinal survey and to minimize the potential for such changes. This is a difficult task, because the reasons for the phenomenon are not clearly understood.

Ideally, the process of measurement should itself produce no change in the phenomenon under investigation. Research methodology in experimental psychology, for example, often involves disguising the purposes of research, so that the subject will produce the behavior under investigation with minimal "contamination" by the research procedure.
In survey research, however, the respondent must not only understand the measures being collected but also must be led to appreciate the purposes and value of the research if response rates are to remain high. This is particularly important for longitudinal surveys, where retaining sample is a crucial goal. Consequently the danger of reactivity between survey interviewing and the phenomena under investigation is a particular problem.

Researchers studying labor market experience, for example, have speculated that repeated interviews asking about job mobility might cause some of the mobility reported (Parnes:15). Questions about mobility may in fact cause subjects to consider the possibility and act upon it. National Crime Survey data also indicate that proportionately fewer crime incidents are reported in successive waves. This finding may stem from respondents' heightened awareness of vulnerability to crime, caused by participation in the NCS, which results in increased precautions taken against crime victimization. It has been suggested that respondents in a longitudinal sample might exhibit non-typical behavior simply because repeated questioning regarding a topic may alter respondents' perceptions of the subject under investigation and change their behavior or attitudes accordingly.

For respondents who remain in sample, responses can change over time solely as a function of longevity in the panel. These temporal variations in response have implications for the quality of longitudinal data which are often unpredictable. In some cases, the quality of data may improve over time. Respondents may understand the respondent role better with repeated interviewing or pay greater attention on a day-to-day basis to the experiences being measured, with a consequent improvement in the richness or accuracy of the data gathered.
Alternatively, if respondents or interviewers find the interview tedious or burdensome, they may become less enthusiastic about the task over successive waves and avoid or give incomplete responses to survey items. One aspect of such a decline in data quality is the possibility that respondents may be "conditioned" by their participation over several waves to provide answers which produce artifactual changes over time. For instance, respondents may learn that a particular response will trigger a long battery of questions, which they may prefer to avoid in the future.

This is one alternative explanation for the decline in the rate of crime victimization reported in the NCS over successive waves. Respondents may learn that reporting a crime incident leads to an additional series of items for each incident reported, which results in a substantially longer interview. The Census Bureau's Current Population Survey (CPS), which is not strictly a longitudinal panel survey but which has many of the attributes of a longitudinal survey, exhibits a similar trend. Reporting unemployment triggers a battery of questions dealing with reasons for unemployment and activities directed towards looking for work. Reported unemployment invariably falls between the first and second waves of interviews in the CPS. This phenomenon in CPS could be related to several factors. One has to do with repeated interviewing and attrition. Williams and Mallows showed that, if the probability of response in a given wave of interviewing was correlated with variables of interest, then, even with no change in the variables, a spurious change would occur.

The passage of time can also produce unintended change between observations because of gradual shifts in the meaning of questions and answers. Even when questionnaires are not changed, there may be evolution in the way respondents perceive or answer questions, which produces the appearance of movement (Parnes:14).
This might be caused by events (including the survey itself), by maturation in the sample, or by non-response.

It is very difficult to determine whether a change across waves is real change or spurious change. Continuing validation research is necessary to identify panel bias in longitudinal data. Panel bias may be studied by comparing data collected in subsequent waves of a longitudinal survey to data collected in cross-sectional surveys (as in Cook & Alexander).

Although some conditioning or panel effects may be inevitable, several tactics can be used to minimize their impact. One option is to implement a rotating panel design to replace respondents after a predetermined number of interviews. This procedure affords two primary benefits. First, those respondents who have been in sample the longest are replaced with more "inexperienced" respondents. Second, the temporal overlap of old and new sample facilitates studies of time-in-sample effects. All respondents are administered the same instrument under the same conditions at the same time, which serves to test alternative hypotheses about panel effects.

Another possible means to attenuate or postpone the effects of panel bias is to minimize the respondent burden imposed by the interview. Careful construction of the instrument to minimize tedium and encourage respondent rapport should be central concerns in planning any survey but take on added importance in longitudinal data collections, because of the need to sustain the active participation of respondents over repeated interviews. The overall length of the instrument may play a role in the respondent's willingness to participate fully in successive contacts. However, design of the instrument to minimize tasks which the respondent is likely to find either tedious or particularly difficult is also an important consideration. Use of long follow-up batteries should also be minimized, to attenuate the effects of respondent conditioning.

C. Operations Change Over Time

Changes in the administration of a continuing survey are almost inevitable. Revisions to the instrument, redesign of the sample, introduction of new collection modes, and transfer of data collection responsibilities to another organization can all introduce changes in the data and compromise the validity of longitudinal comparisons. While a consistent time series may be difficult to maintain under such circumstances, means exist which allow the analyst to deal with the effects of such changes.

Eventually in most longitudinal research there is pressure to change the survey measures in response to changing hypotheses. In addition, later findings frequently indicate a need for measures of new variables. Particularly when longitudinal research is exploratory and designed to identify significant correlates of change, researchers may be inclined to collect large amounts of data to minimize future requirements for change in the questionnaire design. This aspect of longitudinal research may be costly, but it is an understandable precaution given the tendency for research hypotheses and/or policy aims to change over time.

To accommodate changing methods, a survey may be run under old and new procedures simultaneously for a period of time, to allow comparisons between data collected before and after the change. Ideally, both old and new designs should be implemented at full sample, in effect twice the usual sample size, but budget constraints will often make this impractical. The CPS has adopted this double-sample strategy to phase in new samples based on the 1980 Census. The CPI also used both old and new sample designs simultaneously for a six-month period in 1978, when the survey was revised.

Another strategy to consider when a questionnaire item is rewritten or a derived variable in a file is altered is to make changes in such a way that analysts may recode the revised variable to correspond to the original variable (and vice versa), or to retain old questionnaire items in the revised instrument for some time. NCES adopted the latter strategy for the HS&B survey when it adopted an "event history" approach to gathering employment and education data. In addition to the new items, the previous "point in time" activity item was continued, allowing calibration of new items to the old and providing a degree of comparability between versions.

To reduce field costs, many sponsor agencies have approved designs which permit data collection by telephone after the first visit. NMCES and NMCUES, for example, used phone contacts for follow-up interviews. The available evidence suggests that such changes in mode may not produce uncontrollable fluctuations in the measures obtained. Benus (1975) notes that data collected by telephone and by personal visit for the Panel Study of Income Dynamics (PSID) are quite similar. Groves and Kahn (1979) found overall that univariate distributions and bivariate relationships were not significantly different for 200 questions administered by telephone and in person. However, they note that telephone interviews elicited more rounded financial figures, less detailed responses to open-ended questions, and narrower distributions on some attitude items. They also indicate that respondents tend to perceive telephone interviews as longer than personal interviews of the same length. Findings that telephone respondents tend to give more "don't know" answers to filter questions triggering other questions may be related to this difference in perception of length. Telephone respondents may be more eager to bring the interview to a close.
Consequently minimizing respondent burden seems particularly crucial for interviews conducted by telephone.

While the research literature on the effects of interviewing mode on survey response is generally encouraging, there are enough examples of differences in respondent behavior to indicate that a mixed mode design should not be implemented without adequate pretesting and analysis of the effects. One danger is that a particular questionnaire design or questions about a certain subject area might trigger mode-related differences in respondent behavior. To facilitate measurement of such mode-related response variability, it is desirable to design shifts in mode of data collection so that the changes across waves are systematic, making the effects measurable. It is also important in surveys which do not require interviews with all household members to ensure that interviews are obtained from the same household members when the interviewing mode varies across waves, as respondent availability may vary by mode.

In conclusion, prospective longitudinal surveys require administrative and operational features that are different in kind as well as degree from those in cross-sectional research. The long-term analytical goals of the survey must be considered in planning every aspect of sample definition and weighting. Provisions should be made for validation studies to evaluate such factors as attrition and panel bias. Finally, changes in format, operations, and staff must be anticipated and managed in ways that ensure the comparability of measures from wave to wave.

In practice it is worth noting that only a limited number of organizations handle nearly all large-scale longitudinal surveys. Due to their experience, these organizations have a high level of expertise, and the continuity of experience contributes to successful planning and implementation.
However, the concentration of longitudinal research in such a small number of organizations increases the impact that any errors, such as limitations in the sampling frames most commonly used, would have on the representativeness of longitudinal research.

D. Processing

While the measures collected in longitudinal research may be similar to those collected in cross-sectional studies, there are special problems in controlling and interpreting them. The sheer size of the data files created in national longitudinal surveys creates special problems in processing and analysis. The massive files can be difficult, expensive, and slow to process, which has often limited their use to organizations with the staff, equipment, and often complex software capable of handling such data sets. As a result, data analysis has typically lagged behind the accumulation of data (Kalachek:17). Fortunately, this situation is changing with the advent of public use files for multivariate analysis and with the dissemination of more user-friendly "statistical data base" packages to facilitate data management and analysis.

In processing data from longitudinal surveys, difficulties are encountered related to cross-wave case matching, cross-wave data revisions, and preparation of data files for analysis. Often there is no single "best" procedure for processing, because ease of processing and analytical requirements are not always compatible goals.

Errors in individual record files can cause multiple problems. Often items which should remain consistent across waves (e.g., race and sex) or which should change only in predictable ways (like age and marital status) will exhibit changes due to respondent confusion, transcription error by interviewers, or keypunching errors by processing staff. Detecting these errors is important, not only because such items often define key demographic variables for analysis, but because such items are frequently needed to match cases.
Errors are also inevitably introduced when imputations are made for missing data.

Several procedures are possible to minimize errors. For SIPP, the field office staff immediately checks completed interviews to reconcile discrepancies, avoiding more costly correction of data after they have been keyed. Another possible procedure is to build computer edits into the processing system to detect inconsistencies between current and prior interviews. NLS-72 and HS&B use machine edits to identify and resolve inconsistencies for about thirty critical items. Another option, utilized by CPI, is to create a machine-generated control card, which avoids errors in transcription and which provides interviewers with prior-wave data necessary to reconcile discrepancies in the field. This latter procedure, however, can also lead to reduced reporting of actual change.

1. Cross-Wave Matching

In order to link data across waves, variables must be created to match records at the desired unit of analysis. A number of data management issues must be addressed, including the consistency of linking variables across waves, providing for longitudinal matching at multiple levels of analysis, and rules for matching merged and split households.

If longitudinal records are not matched correctly between waves, the effects can be similar to sample attrition or non-response. The records of one or more observations will be missing from a respondent's longitudinal file, giving the appearance of missing interviews. One possible consequence of matching errors is error in analysis, either because incomplete records are deleted, or because missing data are imputed. If records are linked incorrectly, longitudinal data are also likely to produce flawed results by showing false changes in status. Even cross-sectional analyses may be in error, if control card information or data from previous interviews are carried over onto the improperly matched record by the processing system.

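The kind of computer edit and cross-wave link discussed above can be sketched in a few lines. The following Python fragment is illustrative only: the record layout, the field names (person_id, sex, age), and the tolerance of two years for age change between waves are assumptions made for the example, not the procedure of any survey named in this chapter. It links two waves on an identifier, lists persons who appear to be missing from the later wave, and flags linked pairs in which an item that should be fixed, or should change only predictably, disagrees.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical person records; "person_id" is an assumed linking variable.
Record = Dict[str, Any]


def link_waves(wave1: List[Record], wave2: List[Record],
               max_age_gain: int = 2) -> Tuple[List[str], List[str], List[str]]:
    """Link two waves on person_id and run simple consistency edits.

    Returns (matched_ids, missing_in_wave2, flagged_ids).
    """
    by_id2 = {r["person_id"]: r for r in wave2}
    matched, missing, flagged = [], [], []
    for r1 in wave1:
        pid = r1["person_id"]
        r2 = by_id2.get(pid)
        if r2 is None:
            # Looks like attrition, but may equally be a linkage failure.
            missing.append(pid)
            continue
        matched.append(pid)
        sex_changed = r1["sex"] != r2["sex"]
        age_gain = r2["age"] - r1["age"]
        # Age should never fall, and should rise only modestly between waves.
        age_implausible = age_gain < 0 or age_gain > max_age_gain
        if sex_changed or age_implausible:
            flagged.append(pid)
    return matched, missing, flagged


wave1 = [
    {"person_id": "A01", "sex": "F", "age": 34},
    {"person_id": "A02", "sex": "M", "age": 36},
    {"person_id": "B01", "sex": "F", "age": 61},
]
wave2 = [
    {"person_id": "A01", "sex": "F", "age": 35},
    {"person_id": "A02", "sex": "F", "age": 12},  # probably a mismatched record
]

matched, missing, flagged = link_waves(wave1, wave2)
print(matched, missing, flagged)
```

In this toy run, person B01 would look like a missing interview and A02 would be referred for review. Whether a flagged pair reflects a true mismatch or a keying error still has to be settled by inspection or field reconciliation.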
A number of procedures are possible for linking units accurately from wave to wave, including matching of household and individual line numbers, or matching independent person and/or household identification numbers. Economy in the number of variables used for a match is generally a virtue, because the opportunity for mismatches due to transcription or coding errors increases with the number of variables used. So does the likelihood of missing data, which often leads the computer to assign a missing data code that hampers matching. Limited redundancy in linking variables can, however, provide some protection against false matches, in that such cases are more likely to be flagged in the matching process.

Validation procedures to detect longitudinal mismatches should be incorporated into the processing system and can often rely on demographic variables which either should not change over time (e.g., race, sex, or date of birth) or which can be expected to change in predictable fashion (e.g., marital status or age). Such methods are particularly useful when person-level matching is performed using the assigned line number of respondents within household. It is also useful to embed check digits in key linkage numbers, to detect miskeying. In addition to careful design of validation variables, immediate error checking by the field office of items important for matching and validation is likely to reduce the number of mismatches significantly.

Often, person records are linked across waves by matching on household ID and on the line number of an individual within the household record. This is usually cumbersome, and it makes linking individual data across waves extremely difficult if an individual moves out of the sampled household, if the household dissolves, or if the household merges with another household, all of which render the previously assigned household ID obsolete.
Consequently, for surveys which are intended to follow individuals, regardless of the duration of their association with a sampled household or household location, assignment of an independent person ID is highly desirable. This is not to argue that IDs at other levels of observation are not useful, as longitudinal analysis at household, person, or event level is often needed. The important consideration is that linking variables be designed so that changes in sample composition do not prevent record matches.

SIPP has implemented an ID which, while complex, illustrates the sort of linkage which is often desirable. (Cf. Jean & McArthur, 1984.) The ID consists of:

    PSU number - 3 digits
    Segment number - 4 digits
    Serial number - 2 digits
    Address ID - 2 digits
    Entry address ID - 2 digits
    Person number - 3 digits

Household ID consists of address ID, PSU, segment, and serial numbers. The latter three numbers are fixed once assigned. The entry address ID also does not change. The first digit of the address ID indicates the wave at which the household was interviewed at that address. The second digit sequentially numbers, by address, households resulting from a split into two or more households by original sample persons. The first digit of the person number indicates the wave at which the respondent entered the sample, and the last two digits sequentially number persons within the household. This ID also remains fixed.

Linking households or individuals with the SIPP system is fairly straightforward. Households whose composition does not change require the household ID, and individuals require the household ID and person number to provide a match. The inclusion of a fixed entry address ID also facilitates matching records for individuals or households who move, and for split households. Combining the person number and the entry address ID provides a person number which remains constant regardless of changes in address and household composition.
This provides a link to data collected for an individual across all waves, allows a match to the initial household, and permits the analyst to filter data for only the original survey respondents, if desired. This system remains adequate for multiple movers or for households which split a number of times.

In 1979 two waves of interviews from an ISDP panel were merged into a single longitudinal file using personal identification variables. Mismatching between records proved to be a significant problem, and there was evidence that additional matching errors were undetected (Kalton & Lepkowski:26). A second file was created using ID numbers rather than personal characteristics. This file had significantly fewer discrepancies during edit checks for such items as sex and age, indicating that fewer matching errors occurred with the use of the ID number for linking.

Sometimes the potential of longitudinal data has not been exploited because of the complexities involved in updating data with information collected in subsequent waves. For instance, a respondent may report a crime victimization or a health problem, but information on insurance coverage will remain incomplete, because the claim had not been settled at the time of the interview. It is frequently desirable to revise or add data during a later interview and to create an automated control system which would allow revision of the original record. One possibility is to provide a check item on the instrument for information which is frequently incomplete. The control system could then flag incomplete data during processing and direct the interviewer to follow up on this question in a later wave. Similar procedures were used in NMCES and NMCUES, which allowed validation of data collected on health care payments and insurance coverage during later interviews.

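A control system of the kind described above might work roughly as follows. This Python sketch is only an illustration under assumed names: the claim_settled check item, the event_id linking field, and the record layout are hypothetical and are not taken from NMCES or NMCUES. Events whose check item shows incomplete information are queued for follow-up, and a later-wave answer is posted back to the original record while the earlier value is retained.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


def followup_queue(events: List[Event]) -> List[str]:
    """Return IDs of events whose check item marks the data as incomplete."""
    return [e["event_id"] for e in events if not e["claim_settled"]]


def post_update(events: List[Event], event_id: str,
                insurance_paid: float, wave: int) -> None:
    """Revise the original event record using data from a later wave."""
    for e in events:
        if e["event_id"] == event_id:
            # Keep an audit trail of the value being replaced.
            e.setdefault("history", []).append(
                (e["reported_wave"], e["insurance_paid"]))
            e["insurance_paid"] = insurance_paid
            e["claim_settled"] = True
            e["updated_wave"] = wave
            return
    raise KeyError(f"no event with ID {event_id!r}")


events = [
    {"event_id": "H001-E1", "reported_wave": 1,
     "insurance_paid": 250.0, "claim_settled": True},
    {"event_id": "H002-E1", "reported_wave": 1,
     "insurance_paid": None, "claim_settled": False},
]

pending = followup_queue(events)      # events to ask about again in wave 2
post_update(events, "H002-E1", insurance_paid=410.0, wave=2)
remaining = followup_queue(events)
print(pending, remaining)
```

Keeping the superseded value in a history list is one design choice among several; it lets an analyst reconstruct the file as it stood after any given wave.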
Revising files obviously creates some complications, and there are trade-offs between ease of processing and ease of analyzing the revised records. One of the simplest procedures for processing is to reserve a field for follow-up data in the interview along with an incident or event ID which allows a match to the original record. This procedure unfortunately would make the analyst's task considerably more difficult, in that several files would have to be scanned to locate all updated material. The required matching and file restructuring routines would also be rather cumbersome and expensive to run, unless the data were released in a form compatible with a statistical data base which performed the matching. These complexities create potential for data management errors, particularly for inexperienced users accessing public use files.

The alternative is to correct the original records based on follow-up data and to release the updated files. A disadvantage of this procedure is that several versions of the same file would be in circulation.* Nonetheless this procedure appears to have greater potential for facilitating straightforward analysis and management of the data, particularly if early versions of a file are labeled as "preliminary."

2. Data Structures to Facilitate Analysis

A number of strategies may be used to create longitudinal data files. One is to create a separate fixed length record for each case at the smallest unit of analysis, with separate fields devoted to repeated measures of the same variable. Often this is not feasible, because this procedure entails a thorough revision of the file every time a new wave is completed. It is often preferable to produce a separate file for each completed wave (or even more frequently, if data collection extends over a lengthy period) and to include in the files a number of linking variables which remain constant for each case across waves.
Other than the size of the files produced, the main difference between these two approaches is in the processing system adopted: the former produces integrated longitudinal files, while the latter produces files resembling cross-sectional data sets which allow the analyst to link the records later.

Producing a file which uses the smallest unit of observation as the basis for a record is often not the most efficient structure for a data set. A number of surveys collect data on households, individuals within households, and discrete events experienced by the household in aggregate or by individual members. Given the implicit "nesting" of such data, creating a file based on the smallest unit will result in much redundant information for higher level units. The number of events recorded and the number of household members may also be expected to vary between households, and variable length records will result, necessitating extensive "padding" to create a rectangular file.

________________________________

*This is not as serious a problem for longitudinal files, the latest version of which can more easily be identified, as it is for cross-sectional files created from a particular wave.

A more efficient strategy in such cases is to produce hierarchical files with the data pertaining to each level of observation appearing in separate records and with variables appearing in more than one type of record to allow for linkage across levels. A number of software packages such as SAS and OSIRIS now exist which can process and analyze such files. In addition, a number of "statistical data base" packages are available, such as SIR, Canada's RAPID, and Mathematica Policy Research's RAMIS, which provide sophisticated capabilities for matching across waves and levels, and which thereby simplify the analyst's data management tasks in working with longitudinal files.
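The contrast between a hierarchical file and a padded rectangular file can be made concrete with a small sketch. The record layout is hypothetical: each record is assumed to carry a type code ("H" for household, "P" for person, "E" for event) and the linking variables `hh_id` and `person_id`, and records are assumed to arrive with each household before its persons and each person before that person's events.

```python
def build_hierarchy(records):
    """Nest person records under households and event records under persons."""
    households = {}
    for rec in records:
        if rec["type"] == "H":
            households[rec["hh_id"]] = {"data": rec, "persons": {}}
        elif rec["type"] == "P":
            persons = households[rec["hh_id"]]["persons"]
            persons[rec["person_id"]] = {"data": rec, "events": []}
        elif rec["type"] == "E":
            person = households[rec["hh_id"]]["persons"][rec["person_id"]]
            person["events"].append(rec)
    return households


def padded_slots(households):
    """Count the empty event slots a rectangular person-level file would need."""
    counts = [len(p["events"])
              for h in households.values()
              for p in h["persons"].values()]
    width = max(counts)  # every person record must hold this many event slots
    return sum(width - n for n in counts)
```

The count returned by `padded_slots` gives a rough sense of how much "padding" the rectangular alternative would carry for the same data.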
Decisions regarding the optimum structure for a longitudinal file also need to take into account the expected size of files. Limits on the number of records many software packages can process may be exceeded by the size of large federal data collections. Consequently, file structure options for facilitating analysis of longitudinal data may be constrained. Sponsors may find it necessary either to forego compatibility with some otherwise useful software packages or to release subsets of their data to provide compatibility with a wider range of software packages.

3. Confidentiality

Processing operations and data structures for analysis cannot be designed solely to reduce costs, complexity, or bias. They must also protect respondent privacy as far as possible. This is sometimes not compatible with maximum efficiency. Procedures for protecting confidentiality of paper records and of tape records must be thought through carefully.

The problem of maintaining respondent confidentiality is more difficult in longitudinal surveys than in cross-sectional surveys. In cross-sectional research, the confidentiality of a response can be protected by stripping responses of identifiers at an early stage in processing. In longitudinal surveys, response records must be linked to personal identifiers, sometimes for decades, until data collection and analysis are complete. Longitudinal records commonly contain multiple identifiers in order to facilitate tracing and to ensure that records can be matched after each wave, regardless of missing data. Name, address and Social Security number are often augmented with the name and address of family, neighbors, or friends who are to be contacted in tracing respondents who have moved. The large number of identifiers, plus their dispersion across records and across time, makes protecting confidentiality in a longitudinal survey far more difficult than in cross-sectional research.
However, most research organizations have learned over the years how to protect paper records.

An illustration of one solution to this problem is that adopted by NCES for the NLS-72 and HS&B: identifiers are stripped from the tape prepared by the contractor before it is turned over to the sponsor agency. The identifiers are maintained by the contractor but may only be used with the explicit approval of the sponsor. This arrangement provides a layered procedure which inhibits any unauthorized access by sponsor, contractor, or public users and provides protection similar to that of a cross-sectional study.

This example illustrates a number of the basic safeguards which should be integrated into any longitudinal data collection effort. First, identifiers should be used only to maintain the quality of the data, e.g., for tracing respondents or for matching purposes. Second, only staff performing these functions should be allowed access. Hardcopy media containing identifiable data should be stored in a secured area to limit access. Electronic files should be similarly secured and, when in use, access should be restricted by the operating system to authorized processing personnel only. Third, all privacy-relevant data should be stripped from public use tapes before release. Ideally, the collection agency should separate identifiers during processing and store them on a file separate from the substantive data. Finally, when data collection is complete, all copies of identifiers should be destroyed. Even when such measures are taken, agencies and research organizations must consider the possibility of confidentiality breaches. The quantity of information available about respondents creates the possibility that a series of rare responses can identify respondents. Current research in confidentiality is addressing this problem and should provide useful guidelines for enhanced security measures in the near future.
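The safeguard of storing identifiers on a file separate from the substantive data can be sketched as follows. This is an illustration only, not the procedure of any agency named above. The field names in `IDENTIFIER_FIELDS` and the randomly generated `link_id` are hypothetical, and the sketch does not address access control or the destruction of the identifier file.

```python
import secrets

# Hypothetical set of fields treated as personal identifiers.
IDENTIFIER_FIELDS = {"name", "address", "ssn"}


def split_identifiers(records):
    """Split each record into an identifier part and a substantive part.

    The two parts share only a random link ID, so the substantive file can
    be processed without exposing names or addresses.
    """
    identifiers, substantive = [], []
    for rec in records:
        link_id = secrets.token_hex(8)  # random, carries no personal information
        id_part = {k: v for k, v in rec.items() if k in IDENTIFIER_FIELDS}
        data_part = {k: v for k, v in rec.items() if k not in IDENTIFIER_FIELDS}
        identifiers.append({"link_id": link_id, **id_part})
        substantive.append({"link_id": link_id, **data_part})
    return identifiers, substantive
```

Only staff who perform tracing or matching would need the identifier file; destroying it at the end of data collection leaves the substantive file with no path back to the respondent other than the responses themselves.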
CHAPTER 4 SAMPLE DESIGN AND ESTIMATION

There are many issues in the design and estimation strategies for longitudinal surveys that are identical to those for cross-sectional surveys. Some issues, however, such as weighting and compensating for nonresponse, become more complicated with a longitudinal survey. Usually the complications arise because of the changing nature of the population, as discussed in Chapter 3. In this chapter, we discuss some of the major design and estimation problems, many of which need more research.

A. Defining a Longitudinal Universe

Defining the initial study universe for a longitudinal survey is no more complicated than defining the universe for a cross-sectional study. The initial universe is fixed at a specific point in time and is explicitly defined. Sample units can be selected and the only difficulties are related to the sampling frame itself. Time, however, gradually complicates the problem of defining a longitudinal universe.

The study universe usually does not remain constant over the period of the longitudinal survey, as was discussed earlier. The universe of individuals, households, families, or establishments changes over time. If a universe changes slowly along the critical dimensions of the survey, the problem of a longitudinal universe definition may be ignored. However, if changes in the universe over time are not trivial, a static universe definition may not be sufficient. The choice of definition for the longitudinal universe will have a direct effect on data collection and analysis.

Judkins et al. (1984) describe three methods for defining a longitudinal universe. These ideas are generalizable to any longitudinal study of persons or other units. One method for defining a longitudinal universe is to select a specific time during the course of the study as the point that defines the universe. If the universe is defined at the time of sample selection, it is called a cohort study.
Units in the sample are defined at the time of the first interview. At later waves of interviewing, data need be collected only from these units. All inferences and estimates refer only to the universe in existence at the time of the first interview. For example, for the CPI commodities and service sector, the universe is a set of cohort samples with attrition due to deaths. Births are introduced only when an entire cohort is replaced with a new sample.

Principal Authors: Daniel Kasprzyk and Lawrence R. Ernst

The longitudinal universe may also be defined at a time other than the time of sample selection. Under both scenarios, statistical, operational and methodological problems may arise because the sample was selected at one point in time and the analyses of the study universe reflect a different point in time. It is possible that elements of the study universe at the time of sample selection are no longer part of the longitudinal universe; it is also probable that elements of the longitudinal universe which exist at the time of definition were not in existence at the time the sample was drawn. This creates an operational problem -- whether to collect data from these "entrants" to the longitudinal universe -- and it creates a statistical issue, the development of estimation methods for this universe. For example, in the SIPP universe (the non-institutional population, and members of the military not living in barracks) individuals may leave the universe by moving outside the United States, to an institution, to military barracks, or by dying. At any time during the study period persons may enter the SIPP universe by returning from overseas, institutions, or military barracks, or through birth.

A second method of defining a longitudinal universe extends the first method by looking at more than one time point. Several time points are selected, each one defining a universe at that time.
Then the entire set of units defined by these different cross-sectional universes is included in the longitudinal universe. Thus, if a person entered a sample household by being born or returning from overseas sometime after the initial interview, that person would be included in the longitudinal universe. People can be added to the universe, and anyone who is in the universe for any of the time periods should be included in the estimation.

For analysis of aggregations of persons, such as households and families, some identification of aggregations at each time point is necessary. Since these aggregations can and do change over time, conceptual, operational and statistical difficulties occur. For further discussion of this subject, see the section on units of analysis in this chapter. This approach, however laden with difficulties, is the approach which best captures the dynamics of the longitudinal universe.

The third method for defining the longitudinal universe is also an extension of the first method, but instead of including all units that enter, leave or stay, this approach includes only those that are common to all the selected time periods. In this approach, one includes in the definition of the longitudinal universe only those elements which were members of all cross-sectional universes. This definition leads to a static universe containing only those elements which do not enter and exit the universe. For example, for households, families, and establishments the universe contains only those units in existence throughout the entire survey period.

As discussed above, defining the longitudinal universe can be a problem when it contains units which enter and leave the cross-sectional universe. When the units are establishments or a group of individuals, some decision concerning "rules of continuity" is necessary. The next section briefly reviews models for longitudinal household (family) units of analysis.
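The three universe definitions above correspond to simple set operations on the cross-sectional universes. The sketch below is illustrative only. It assumes each cross-sectional universe is available as a set of unit identifiers keyed by time point, and the function names are hypothetical.

```python
def point_in_time_universe(universes, t):
    """First method: the universe as it exists at one chosen time point."""
    return set(universes[t])


def union_universe(universes):
    """Second method: every unit in the universe at any selected time point."""
    return set().union(*universes.values())


def intersection_universe(universes):
    """Third method: only units present at all selected time points."""
    return set.intersection(*(set(u) for u in universes.values()))
```

With universes `{1: {"a", "b", "c"}, 2: {"b", "c", "d"}, 3: {"c", "d", "e"}}`, choosing time point 1 gives the cohort `{"a", "b", "c"}`, the second method gives all five units, and the third gives only `{"c"}`, the single unit that neither enters nor exits.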
B. Units of Analysis

Aggregations of persons, such as households and families, present difficult conceptual and practical problems in longitudinal surveys. Over time individuals enter and leave households, and set up new households. It is no longer obvious how a household or family should be defined when time becomes an integral part of the definition. McMillen and Herriot (1985) attempt to reduce the possible definitions to a reasonable number, in order to conduct an empirical evaluation of alternative concepts. They also provide a brief review of the historical basis for a longitudinal definition of households. Much of the discussion below is based on the McMillen and Herriot (1985) paper and one by Kasprzyk and Kalton (1983).

Three models have been used to describe households and/or families over time: 1) a static model; 2) an attribute model; and 3) a dynamic model. The static model of households (or families) classifies households at one point in time, and reflects a cross-sectional perspective. Households and their members are defined at one point and individual characteristics are aggregated over the survey period to provide summary statistics for aggregated analysis units. A critical, but false, assumption has to be made that the household composition remains fixed during the survey period. This definition is not truly longitudinal, because it ignores any changes that each unit may undergo. In this approach weighting the so-called longitudinal sample corresponds to weighting the cross-sectional sample. Note, however, that for the CPI or any Laspeyres-type index the assumption of fixed composition is what is desired, since the composition of sales is being held constant so that price change is the only thing measured.

The second model for defining households or families over time is the attribute model. In this model, the individual is the unit of analysis, and household and family characteristics are treated as individual attributes.
As a result, the problem of changing units over time is avoided. Results under this approach are expressed as "X% of persons live in households with attribute Y," rather than "X% of households have attribute Y." Household characteristics are, therefore, attributes of the individual. The attribute model has been used extensively by the Survey Research Center of the University of Michigan for the analysis of data from the University of Michigan's Panel Study of Income Dynamics.

Dynamic models, the third type, present the most difficult conceptual and operational problems. In these models, households (or other groups of individuals) are defined over time, not at one point in time, by a set of rules. These rules, often referred to as continuity rules, identify the initiation, continuation, and termination of the analytic unit. Three examples of continuity rules which have been proposed as dynamic definitions of households are presented in McMillen and Herriot (1985). It is not obvious that one set of rules is better than others; in fact, one concept may be more useful for certain kinds of analyses, but not for others. Little empirical work using alternative dynamic concepts has been published, although Citro (1985) has recently begun an investigation using data from the SIPP development program. It remains to be seen whether the dynamic concepts can be properly interpreted and employed to provide useful results for policy application.

C. Sample Design

For a longitudinal study with a static population, that is, one in which there are no additions over time, the need for longitudinal estimates presents no special difficulties in sample selection. It is only necessary to choose a single sample at the selected point in time, as if a one-time survey were being conducted, and then follow the sample units initially chosen. For such a study there is, in general, no ambiguity about the analytic units, and no additions are permitted to the population.
The longitudinal studies of the National Center for Education Statistics (NCES) are examples of this approach.

The populations for all the other longitudinal surveys described in this report are dynamic in nature. For these surveys initial sample selection presents no particular difficulties. It is only necessary that each unit in the population at the time the initial sample is chosen have a known probability of selection. Complications arise, however, because of the additions to the universe, and the care that must be taken in order to follow the sample units of analysis over time.

Ideally, provision should be made at the design stage to give additions to the universe a chance of entering the sample, or, failing that, to make adjustments for their absence at the estimation stage. For SIPP, the Employment Cost Index (ECI) and items in the CPI for which the Point of Purchase Survey (POPS) is the source, the problem of new units is partially alleviated by employing a rotating panel design. Thus, all additions to the universe will eventually be given a chance of selection, with the length of time between panels as the maximum lag. For the ECI and the CPI, because of the difficulty of identifying births quickly, this is the only provision made for additions at either the design or estimation stage. In general, additions to the universe in these surveys have no chance of affecting the estimates until the selection of the next sample or panel. This again is consistent with the Laspeyres concept of a fixed set of items and outlets for measuring price change only.

In contrast to the ECI and the CPI, the designs of NMCES, NMCUES and SIPP give individuals, families and households that are additions a chance of selection as soon as they enter the universe. At each round of interviewing in these surveys not only is the initial sample interviewed, but so are all individuals currently residing in a household with the original sample people.
Individuals joining the universe and moving into a household containing at least one person who was in the universe when the initial sample (or most recent sample) was chosen have a chance of entering the sample. So does any family or household joining the universe that contains at least one individual who was in the universe when the initial sample was chosen. Other individuals, families and households that join the universe have no chance of selection. To cite another example, the CPI rent survey samples building permits in order to identify new units quickly.

Care must be taken in the design of longitudinal surveys to assure that the analytic units used in the estimation process for a specific time interval are followed throughout that time interval. In general, this is not a serious problem with surveys such as the ECI and CPI, since the definitions of analytic units for these surveys generally include a fixed location such as an item at a specific outlet. Furthermore, in cohort studies such as the High School Class of 1972, which makes estimates only for individuals selected in the initial sample, there are no difficulties other than the operational problems associated with following people. However, for NMCES, NMCUES, and SIPP there are difficulties associated with following certain sample analytic units.

A key reason for these difficulties is that a household or family may continue to exist under most longitudinal definitions even though it no longer contains any individuals who were initially in the sample. Under the procedures established for each of these surveys, the household or family will no longer be followed. Ernst, Hubble, and Judkins (1984) discuss this problem in detail. Any individuals who are additions to the universe and who are to be used in the estimation process should also be followed. Provisions were made to do this in NMCES and NMCUES but not in SIPP.
In fact, it has not been decided whether additions will be used at all in SIPP for longitudinal person estimation. Judkins et al. (1984) discuss this question.

D. Weighting

There may be several stages of weighting a sample. One is to reflect the original universe; another is to adjust for nonresponse; a third may be to adjust for sample coverage. Longitudinal surveys have the usual weighting problems of cross-sectional surveys and at least one additional problem: providing a longitudinal weight to be used during analysis. In this section, we discuss simple unbiased weighting and adjustment to independent estimates. Nonresponse, since it can be handled either by weighting or imputation, is deferred to the next section.

1. Unbiased Weights

Typically, the unbiased or base weight for a sample unit is the reciprocal of its probability of selection. In longitudinal surveys, this has generally been the weight assigned to sample units which were in the universe at the time the sample was selected.

The development of base weights becomes more complicated in surveys such as NMCES, NMCUES, and SIPP which incorporate additions to the universe in the estimation process, since it is often not practical to compute selection probabilities for such analytic units. For example, NMCES and NMCUES families which are additions to the universe will generally be used in the estimation process if, and only if, at least one member of the new family had been a member of a sample family during the first round of interviews. It would be extremely difficult to determine the first round families for all the members in the new family, and then compute the probability that at least one of the first round families could have been selected. Fortunately, it is not necessary to know the probability of selection in order to obtain base weights which yield unbiased estimators. See Ernst, Hubble and Judkins (1984) for a description of this methodology.
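For units that were in the universe when the sample was selected, the base weight defined at the start of this subsection can be illustrated in a few lines. This sketch covers only the simple case in which selection probabilities are known. It does not implement the Ernst, Hubble and Judkins methodology for additions to the universe, and the function names are hypothetical.

```python
def base_weight(selection_probability):
    """Base weight: the reciprocal of the unit's probability of selection."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


def estimate_total(sample):
    """Weighted estimate of a population total.

    sample: iterable of (selection_probability, observed_value) pairs.
    """
    return sum(base_weight(p) * y for p, y in sample)
```

For example, a unit selected with probability 0.25 receives a weight of 4, so that it represents itself and three unsampled units in the estimated total.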
Several longitudinal weighting procedures will now be described. Since most of them will be defined in terms of cross-sectional weights, it is useful to define what is meant by the cross-sectional weight. The first round cross-sectional weight for a sample household is taken here to be the reciprocal of the probability of selection. For all nonsample households in the universe this weight is zero. For any time period after the first interview it is defined to be the mean of the first round cross-sectional household weights for all persons in the household who were in the universe during the first interview. This type of weighting procedure is currently used in SIPP to produce cross-sectional household and family estimates.

There appear to be only two precedents for the weighting of longitudinal households and families -- NMCES and NMCUES. For these surveys each family was assigned its cross-sectional weight at the date the family was first formed (see Whitmore, Cox, and Folsom (1982)). The only other survey where serious consideration is being given to the longitudinal household estimation issue is SIPP. Five alternative methods for obtaining unbiased longitudinal weights are discussed in Ernst, Hubble, and Judkins (1984):

1. The NMCES/NMCUES procedure, assigning each longitudinal household (family) its cross-sectional weight at the date the household (family) was first formed.

2. For any time interval, assigning each longitudinal household (family) its cross-sectional weight at the beginning of the time interval.

3. For any time interval, assigning each longitudinal household the average of the first round weights for all persons who remain members of the household throughout the time interval. If there are no such people, the longitudinal household weight is zero. This procedure generally has a slight bias.

4. For any time interval, assigning each longitudinal household the average of its monthly cross-sectional weights.

5. If a longitudinal household is defined as an attribute of a specific individual, such as the householder or principal person, then assigning the longitudinal household the first round weight for that specific individual.

The procedures listed apply to the restricted universe of all households in existence throughout the time interval of interest. Some modifications are necessary to apply these procedures to the unrestricted universe of all households in existence for a portion of the time interval of interest. There are advantages and disadvantages to each procedure. They differ, for example, in their need for data from longitudinal households which no longer contain any first round sample persons, or their need to ask retrospective questions in order to determine the appropriate weights.

Finally, we briefly discuss longitudinal person estimation. NMCES and NMCUES employ longitudinal person estimation that incorporates additions to the universe. Each additional person is associated with a first round family and then assigned the first round weight of that family. For SIPP, it has not been decided whether individuals who are additions to the universe will be used in the person estimation process or, if so, how they would be weighted. One procedure being considered is to assign to persons who join the universe the cross-sectional weight of the household that they are a member of at the time they join the universe.

2. Adjustments to Independent Estimates

As a final step in the weighting process for several longitudinal demographic surveys, the population is partitioned into demographic groups and individual weights are adjusted so that the sample estimates of the demographic subpopulations agree with independently derived estimates. In general, this estimation step reduces sampling variability and biases resulting from undercoverage.
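The adjustment step just described amounts to a ratio adjustment within each demographic group. The following is a minimal sketch under simplifying assumptions: a single adjustment at one point in time, one factor per group, and control totals supplied externally. The function name and data layout are hypothetical and do not reproduce the procedure of any particular survey.

```python
from collections import defaultdict


def adjust_to_controls(units, controls):
    """Scale weights so each group's weighted total equals its control total.

    units: list of (group, weight) pairs.
    controls: dict mapping group -> independently derived population total.
    Returns the adjusted weights in the same order as the input.
    """
    sample_totals = defaultdict(float)
    for group, weight in units:
        sample_totals[group] += weight
    # One adjustment factor per demographic group.
    factors = {g: controls[g] / total for g, total in sample_totals.items()}
    return [weight * factors[group] for group, weight in units]
```

If the weighted sample total for a group is 400 and the independent estimate is 500, every weight in that group is multiplied by 1.25.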
In the National Longitudinal Surveys (NLS) this adjustment was done for age-race-sex groups for the time of initial sample selection. The adjusted estimates of totals for each group were made to agree with independently derived Bureau of the Census estimates. The Census estimates are obtained by carrying forward the most recent census data to take account of subsequent aging of the population, mortality and migration between the United States and other countries. Since the same sample cases are followed throughout the life of the survey, no subsequent adjustments to independent estimates were made with the following exception: an annual adjustment was made for the cohort of young men (ages 14-29 in 1966) to maintain agreement with the independent estimates. This adjustment corrects population underestimates for men who were not represented in the original sample because they were in the Armed Forces at the time the sample was selected and who subsequently returned to the civilian population.

For annual data files from NMCES and NMCUES, family weights were adjusted so that the estimated number of families existing as of March 15 of the interview year agreed with counts from the March Current Population Survey. For each demographic group the adjustment factor used for sample families in existence on March 15 was also applied to families that did not exist on this date. This was done with the assumption that the rate of undercoverage and nonresponse was the same for all families in a demographic group, irrespective of whether or not the families existed on March 15. Details of this procedure are given in Whitmore, Cox and Folsom (1982).

For person estimation in the NMCES and NMCUES annual data files, the adjusted family weights for each sample individual's first round family were further adjusted separately for each individual to produce agreement with independently derived age-race-sex estimates.
The adjustment factor applied to each sample individual in a group was such that the average of the adjusted sample estimates of numbers of individuals in each group at four times during the year agreed with the average of the independent estimates at the same four times. Details are provided by Jones (1982).

No decision has been made yet on how longitudinal weights for SIPP will be adjusted to agree with independent estimates. One possibility is to use procedures similar to the NMCES and NMCUES procedures. A potential drawback to that approach is that survey estimates will agree with the independent estimates at only one point in time. If agreement is required at other points in a time interval, then adjustment procedures could be modified so that the adjustment factor is not the same for each sample unit of analysis within a demographic group, but instead is also a function of the starting and ending date of that sample unit. This modified approach to adjustment has several disadvantages, such as possibly requiring some weighting factors to be very large.

E. Nonresponse in a Panel Survey

Nonresponse in longitudinal surveys can be treated from either the cross-sectional or longitudinal perspective. References concerning the treatment of nonresponse in panel surveys are in Kalton, Kasprzyk and Santos (1980), Kalton, Lepkowski and Santos (1981), Kalton and Lepkowski (1983), Marini, Olsen and Rubin (1980), David, Little, and McMillen (1983), and Little (1984, 1985). Assuming the data requirements for the survey mandate a longitudinal analysis, the longitudinal perspective is clearly the more desirable, since it reflects the survey design.

If nonresponse in a longitudinal survey is treated from a cross-sectional perspective, each wave is treated as a separate survey. This has practical advantages in that the release of wave data may occur more quickly than if the separate waves were first linked, and linkage problems resolved.
A disadvantage is that records with imputed data will be inconsistent from wave to wave because data processing and estimation procedures are implemented independently from one time to the next. Despite the inconsistencies at the micro-record level, changes in aggregates from one wave to another can be investigated. From a longitudinal perspective, nonresponse in a longitudinal survey is viewed not as nonresponse in a set of unrelated observations but as nonresponse in a set of variables with some logical dependency between two or more points in time. For example, in the CPI missing prices at time t are imputed based on prices obtained at time t-1, and on current average price movement for the item. This view adds considerable information to the data set for the treatment of nonresponse. However, it raises issues concerning the treatment of nonresponse which have not been addressed from the cross-sectional perspective.

Longitudinal surveys can be treated as cross-sectional to generate point-in-time estimates. Because of the repeated interviews, however, indicator variables can measure status over time, thus providing better information on patterns of behavior, transitions from one state to another, and the length of time in a particular status. The importance of obtaining this kind of information justifies linking the waves as quickly as possible and treating nonresponse from a longitudinal perspective.

The treatment of nonresponse in longitudinal surveys is in many ways no different than in cross-sectional surveys. The above discussion attempts to provide some indication of the similarities and differences in the two approaches. The time dimension adds a level of complexity for all decisions related to the treatment of nonresponse. First is the problem of longitudinal data base construction; efforts need to be made to construct longitudinal files which allow analysts to use the panel aspect of the survey.
This includes, at a minimum, ensuring that sample units in one wave are linked to sample units in other waves and that critical data items remain consistent from one interview to the next. Second is the problem of selecting imputation or weighting to handle nonresponse on one or more waves. Third is the problem of timing for release of data. Cross-sectional imputation offers the practical convenience of releasing data as soon as each wave's data are available. However, not all data useful for good imputation are available this way. Imputed values are likely to be better when a combined data set is used. Fourth, in spite of the fact that longitudinal imputation is frequently more effective than cross-sectional imputation, a back-up system is necessary to handle cases where values needed for longitudinal imputation are missing.*

1. Types of Nonresponse

Three types of nonresponse occur in surveys: noncoverage, unit nonresponse, and item nonresponse. Noncoverage is the failure to include some units of the survey population in the sampling frame, which means they have no chance of appearing in the sample. This may occur, for example, because of incomplete listings at the final stage of selection. Unit nonresponse occurs when no information is collected from the designated sample unit. It can occur because of a refusal, because of a failure to contact the unit (no one at home), or because the unit is unable to cooperate (language difficulties).

Item nonresponse occurs when a unit participates in a survey, but does not provide answers to all the questions. It may occur because:

1. the respondent does not know the answer to the questions;

2. the respondent refuses to answer the questions;

3. the interviewer fails to ask or record the answer to the question;

4. the response is rejected during an edit check (e.g., because it is inconsistent with another response).

The distinction between noncoverage and total and item nonresponse is important because it affects the type of compensation procedure adopted. With noncoverage, the survey can provide no information other than that available on the sample frame. Compensating for noncoverage is usually carried out by using sources external to the survey to produce some form of weighting adjustment, as described in the last section.

___________________________

* The following sections describe imputation and reweighting to handle item and unit nonresponse in connection with improving finite population estimates. Imputation and reweighting strategies are not used, however, when estimating mathematical models of an underlying random mechanism or process. Since such analyses focus on estimation of model parameters, neither assigning values to individual cases nor adjusting to independent estimates is appropriate. Instead, methods of model estimation are used to account for the missing data under the assumption that the same model applies to all sample cases, even though some cases provide more complete histories than others. Model estimation by the method of maximum likelihood is the most common approach (Tuma and Hannan (1984), chapter 5). The contribution of each sample case to the likelihood function is derived; and if the observations are statistically independent, then the likelihood function is, in most cases, the product of the individual contributions.

Noncoverage in a longitudinal survey can be problematic depending on the population which is to be measured. If the population is approximately static (that is, the amount of change in the population over the life of the panel is not substantial), then the treatment of noncoverage from the longitudinal perspective is not any different than from the cross-sectional perspective. To be precise, however, changes in the survey population should be reflected in later waves of the panel.
Often this does not occur because of operational reasons or because such a small proportion of the population is involved.   For example, in SIPP the person population does not change greatly over the life of a panel. The principal changes are children who reach adulthood during the life of the panel, deaths, immigrants, emigrants, and persons returning from military barracks and institutions. The survey design captures information about new adults, deaths, and emigrants; however the design does not cover new entrants to the population who live in households which do not include adults eligible for initial sample selection, such as households in which all members are from the following sectors:   1. U.S. citizens returning from abroad;   2. immigrants who move into the U.S. after the first wave of interviewing; and   3. persons who return from military barracks or institutions.   The different approaches suggested for treating total and item nonresponse illustrate a concern for the kind and amount of data available for use in compensation procedures. Total nonresponse is typically treated by some form of weighting adjustment, using data available from the sample frame in addition to observations obtained by the interviewer. With item nonresponse, the responses to other survey questions may provide information. To use other responses effectively, item nonresponse is usually treated with some form of imputation (that is, by assigning values for missing responses based on responses from respondents with similar characteristics) rather than with weighting procedures.   From the longitudinal perspective, the issue of unit and item nonresponse is not very well defined. From this perspective, a unit's record consists of all information collected on the unit over the life of the panel. This suggests, however, that data missing for one or more waves of a panel can, in fact, be treated as item nonresponse. 
Nonresponse on one or more waves of the panel may logically be treated as item nonresponse for all variables that should have been recorded for that wave or waves. The distinction between unit and item nonresponse is not obvious, and often, in the interest of simplicity, a judgment must be made identifying the appropriate level of response necessary to treat a case for item nonresponse rather than unit nonresponse. Ultimately, these issues are best resolved after empirical research on the nature, extent, and patterns of the missing information. This, along with knowledge of the uses of the data, will help determine a strategy for handling nonresponse in a panel survey.

2. Total Nonresponse

Total nonresponse in a cross-sectional survey means that no one at the household responded for one reason or another. It is often called unit nonresponse in cross-sectional surveys. It is generally handled by weighting adjustments, using data available on the sample frame such as region, city, block, type of area; or available from interviewer observation, such as race of householder. Usually the data available for weighting adjustment are quite limited.

In a longitudinal survey the concept of total nonresponse can take on a different meaning, including units which provided information for some, but not all, of the waves of the panel. Thus, viewing the entire longitudinal record as complete response, and responses at one or more waves as partial responses, the definition of total nonresponse can be reconstructed to include units which participate in the survey some part of the time. These units, despite having provided more data than "true" total nonrespondents, can be treated as total nonrespondents. In NMCUES, for example, total nonresponse is defined to include units (individuals) responding in fewer than one-third of the waves for which they were eligible for interview. (See Cox and Bonham, 1983, and Cox and Cohen, 1985.)

3. 
Unit Nonresponse

For the purpose of this discussion, unit nonresponse will refer to individual or person nonresponse to one or more interviews in a longitudinal survey. The length of a longitudinal survey increases a) the amount of data available for nonresponse adjustments and b) the complexity of nonresponse compensation procedures. Each individual's microdata record does not consist of unrelated, independent observations taken at different points in time, even though the data may be collected in that manner. Many variables reflect the same measure at different points in time. The status of a variable, such as income, at one point is frequently related to its status at a previous point. In a cross-sectional survey only two response categories exist, response and no response. In a longitudinal survey of n waves there exist 2^n possible patterns of response. For example, in a 3 wave study there are eight possible response patterns, illustrated as follows (where NR refers to nonresponse and R refers to response):

1. R R R
2. R R NR
3. R NR R
4. NR R R
5. R NR NR
6. NR NR R
7. NR R NR
8. NR NR NR

Response patterns are usually classified as forming a "nested" pattern of nonresponses (i.e., variables from early waves of the survey are observed more often than variables from later waves), or as "non-nested". Attrition is a form of nested nonresponse, and estimators for dealing with nested nonresponse have been discussed in the incomplete data literature. (See Anderson (1957), Rubin (1974), or Marini, Olsen and Rubin (1980).)

The three wave study example illustrates the kind of difficulty which can occur when one or more waves of data are missing. Case 1 is an example of total response -- an interview is obtained in each wave of the panel. Cases 2 and 5 illustrate attrition and nested nonresponse.
Cases 3 and 4 illustrate non-nested patterns of response (two out of three interviews obtained) and cases 6 and 7 illustrate different non-nested patterns of nonresponse with only one of three interviews obtained. Case 8 is an example of total nonresponse. The difficult decisions about nonresponse which must be made for a three wave study are indicative of problems with surveys of more than three waves.

One way of treating unit nonresponse in a panel survey is to define the level of response necessary for a unit to be considered a "responding" unit. All units which exceed this response level would be treated as if they were present in all waves of the panel and their missing interview data regarded as a form of item nonresponse; units with a response level below the standard would be treated like total nonresponse.

Underlying these alternative strategies for handling wave nonresponse is the issue of whether it is better to use imputation or weighting to adjust for wave nonresponse. The weighting procedure simultaneously compensates for all data items of a nonrespondent, but reduces the sample size available for analysis. Weighting adjustment procedures also typically incorporate many fewer control variables than an imputation procedure, although David and Little (1983) suggest a model-based approach which increases the number of variables used in the adjustment.

Imputation, whether it be cross-sectional or longitudinal, fabricates data. The uninitiated user may not understand this and may attribute greater precision to the estimates than is warranted. Imputation techniques by their nature may fail to retain the covariance structure of the data. However, by identifying critical data items in advance, an imputation procedure can be developed to control key covariances. In practice, a twofold strategy of using both weighting and imputation procedures may often be the best solution (David and Little (1983)).
A more detailed discussion of the weighting versus imputation issue for wave nonresponse can be found in Kalton (1985) and in Kalton, Lepkowski and Lin (1985).

4. Item Nonresponse

In the previous discussion it was noted that one way of treating unit nonresponse was to consider it a "form of item nonresponse" in a longitudinal record and use imputation techniques. That is, in a longitudinal survey, unit nonresponse can be treated conceptually as item nonresponse. Item nonresponse, because it typically refers to missing data item(s) in an otherwise completed interview, provides a good illustration of the fact that there is nothing theoretically special about longitudinal imputation. As Kalton, Lepkowski, and Santos (1981) have stated, longitudinal imputation for item nonresponse is simply imputation for item nonresponse using auxiliary data from a larger data base, including using longitudinal data elements as well as cross-sectional ones. The principal distinction is the availability of data which are highly correlated with the missing data, usually the same variable measured at different points in time. For example, the imputation in CPI is done from this perspective.

Theoretically, a decision concerning cross-sectional versus longitudinal imputation in a longitudinal survey is obvious. The longitudinal approach can certainly do no worse than the cross-sectional approach. The longitudinal approach can use any of the variables measured on a wave, but in addition it can use variables from other waves. As Kalton and Lepkowski (1983) point out, if response on an item is highly correlated over time, then the value from a previous interview will be a good predictor of the missing value at the current interview.

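The idea of using a previous interview's value as a predictor can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure taken from any of the surveys discussed: the function name impute_wave, the unit labels, and the income values are hypothetical, and the mean of current-wave respondents stands in for whatever cross-sectional method a real system would use when no prior-wave value exists.

```python
from statistics import mean


def impute_wave(current, previous):
    """Fill missing (None) current-wave values for a single item.

    Assumed rule: use the same unit's previous-wave value when one exists;
    otherwise fall back to the mean of current-wave respondents.
    """
    respondents = [v for v in current.values() if v is not None]
    fallback = mean(respondents) if respondents else None
    imputed, source = {}, {}
    for unit, value in current.items():
        if value is not None:
            imputed[unit], source[unit] = value, "reported"
        elif previous.get(unit) is not None:
            imputed[unit], source[unit] = previous[unit], "prior wave"
        else:
            imputed[unit], source[unit] = fallback, "cross-sectional"
    return imputed, source


# Hypothetical income reports for four units in two waves (None = missing).
wave1 = {"a": 100, "b": 200, "c": None, "d": 400}
wave2 = {"a": 110, "b": None, "c": None, "d": 390}

values, flags = impute_wave(wave2, wave1)
for unit in wave2:
    print(unit, values[unit], flags[unit])
```

Unit "b" receives its wave 1 value, while unit "c", missing in both waves, can only be handled by the cross-sectional fallback. Keeping a source flag for every value lets an analyst separate reported from imputed data later.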
Two limitations on the advantage of the longitudinal approach should be noted: (1) the predictor variable must be reported at more than one point in time; and (2) the variables used in a cross-sectional imputation system are known to be poor predictors of the missing value and thus would likely be poor predictors in a longitudinal system. The two limitations are important because they suggest that empirical analysis of cross-wave data is necessary before developing a cross-wave imputation system. They also point out that, in addition to an imputation system using two or more waves of data, a fallback cross-sectional method is often needed to compensate for items which are missing in every wave of the panel.

Using cross-wave measures as auxiliary variables in an imputation scheme has special significance when individual changes will be analyzed. Obviously, if imputed values are assigned without conditioning on the previous wave's value, measures of change are very likely to be distorted. In this case, modeling state-to-state transitions becomes extremely important in developing an imputation system.

Some methods for longitudinal imputation are discussed by Kalton and Lepkowski (1983). These methods make use of the stability a variable may have between successive waves of a panel, and they include:

1. direct substitution

2. cross-wave hot deck imputation

3. cross-wave hot deck imputations of change

4. deterministic imputations of change

A simulation to compare results using these four approaches is also described in the same source.

CHAPTER 5

LONGITUDINAL DATA ANALYSIS

INTRODUCTION

In the past, much longitudinal analysis has been done cross-sectionally, with each wave of a survey analyzed independently. The linked records were often difficult to use and discouraging to analysts.
With improved data bases and the use of statistical techniques to analyze transitions, trends, and change, longitudinal surveys are now showing their distinct analytical advantages.

A. Determinants of Longitudinal Analysis Methods

Longitudinal analyses study the change in some unit -- a person, a family, a business, and so on -- over time. The focus is not on a description of the current status of the unit. Rather, interest is usually directed at the underlying process that determines any observed change.

The methods employed in the analysis of longitudinal surveys depend on four factors: (1) the nature of the process being studied, (2) the type of variables being measured, (3) the analytic objectives, and (4) the method of data collection. These factors taken together determine the kind of mathematical models of the process that are appropriate and estimable.

1. The Nature of the Process Being Studied

Many processes can be represented as the flow of a unit between some set of categories (states), such as the change in a person's employment status from employed to unemployed. Such a representation requires an enumeration of the possible categories and a probabilistic description of how movement takes place from one category to another. The flow of the process may be discrete or continuous in time. In a discrete time process, change of state occurs only at a fixed set of points. For example, eligibility for many government benefit programs is a discrete time process. Social Security Administration old age and disability programs, AFDC, and many State welfare programs all pay monthly benefits. Eligibility for benefits changes only at discrete points in time, spaced one month apart. Other processes, such as change in health status, changes in price level, death, change in attitudes, or employment, can change state at any point in time and are therefore continuous in time.

The process under study may be time stationary or time nonstationary.
A process is time stationary if its probabilistic structure and its governing parameters are not themselves changing over time. Processes which are not stationary in time are the most common. The payment of benefits under government programs often undergoes structural changes as the result of legislative and administrative actions. Morbidity is continuously affected by advances in medical science, and individual labor force decisions are in part determined by changes in the national economy.

Principal Author: Barry V. Bye

2. The Type of Variables

A process may be described by variables which are discrete or continuous; and variables may be either observable or unobservable. Labor force status -- employed, not employed, out of the labor force -- is a discrete observable variable with three mutually exclusive and exhaustive states. Variables such as well being and satisfaction, on the other hand, are often taken to be continuous variables that are usually measured only imperfectly by a set of indicator variables.

3. The Analytic Objectives

The analysis of longitudinal data may have several objectives. Descriptive analyses are concerned with the regularities of the process under study. Such analyses often use cross tabulations at two or more points to show gross and net change of the units. There are other descriptive statistics: the number of times that a certain state has been entered since the last measurement, the average length of time spent in a given state, the distribution of probabilities for the next transition, and the derivation of calendar period estimates not based on retrospective reports. Hypothesis tests often deal with differences in these statistics among several subpopulations.

Researchers interested in causal analyses tend to focus on the underlying structure which governs the process.
Mathematical models of the transition from one state to the next become prominent in causal analyses, and the estimation of the parameters becomes the primary statistical goal. The signs and statistical significance of the estimated parameters are usually interpreted in the context of some higher level generalization or theory.

Sometimes longitudinal analyses are designed to project a process into the future. Projection is of primary concern in evaluating changes in government programs or the results of field experiments, particularly when the full effects of the changes have not yet been realized. Projection usually requires a mathematical model of the process. The parameters are then estimated from longitudinal data.

4. The Method of Data Collection

Two major strategies are used in gathering longitudinal data. In the first approach a complete history of the process is obtained. This approach is the event history method. Measures include the sequences of states occupied by the individual units, and the times when changes in state occur. The second approach is the multi-wave method. In this approach the current status of the units is obtained at two or more points in time, but information is often lost on the duration and sequence of events, and on the possibility of multiple changes between measurements. Information on the duration of events may not even be collected in the multi-wave method. For example, at the initial interview, there may be no data concerning the initial status of the process. At the final interview, there are no data concerning the next state of the process.

To summarize, the appropriate data collection strategy for a longitudinal survey is chosen by assessing the nature of the process, the variables, and the research objectives. For example, structural analyses of discrete, observable processes will require event histories (see Tuma & Hannan, 1984).
On the other hand, unobservable variables such as attitudes can only be measured in a multi-wave panel context, because the best one can do is measure the current status at any fixed point. Multi-wave methods have been used in most large scale surveys even when the focus is on observable processes. The resulting loss of information often severely restricts the analyst's ability to recover the underlying parameters and to discriminate between competing mathematical models. (See Coleman, 1981, and Singer & Spilerman, 1976.)

B. Analysis Strategies for Longitudinal Data

Many of the approaches that are used for the analysis of cross-sectional data are applied to longitudinal data as well (see Dunteman and Peng, 1977). There are two ways to use longitudinal data in these analyses. In some cases, variables are measured repeatedly over time. In other cases, longitudinal data are used to establish the temporal sequence of a set of variables. Establishing the correct temporal sequence of a set of variables is important for assessing causal linkages within the set.

Categorical data are collected in longitudinal surveys as well as cross-sectional surveys. These data can be arrayed in cross tabulations showing the relationship between antecedent and outcome variables. When the status of a particular variable is measured at more than one point in time, cross tabulations can be constructed that describe the change in status of the sample units over time. When longitudinal data are placed in cross tabular form, the statistical techniques used to analyze cross-sectional data may be applied. These contingency table analysis techniques include the general testing of hypotheses about the structure of the table (Landis & Koch), the use of log-linear modeling (Bishop et al, 1975, Dunteman & Peng, 1977, and Hauser, 1978), and the development of certain classes of latent structure models (Clogg, 1979).

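A cross tabulation of status at two points in time can be built directly from linked records. The Python sketch below is an illustration only: the function names, the two-state coding ("E" for employed, "N" for not employed), and the six linked records are hypothetical, and the counts are unweighted, which a real complex-sample analysis would not be.

```python
from collections import Counter


def turnover_table(wave1, wave2, states):
    """Rows are the wave 1 state, columns the wave 2 state (linked units only)."""
    counts = Counter((wave1[u], wave2[u]) for u in wave1 if u in wave2)
    return [[counts[(s1, s2)] for s2 in states] for s1 in states]


def gross_and_net_change(table):
    """Gross change counts all off-diagonal units; net change compares margins."""
    n = len(table)
    gross = sum(table[i][j] for i in range(n) for j in range(n) if i != j)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(n)) for j in range(n)]
    net = [col_totals[k] - row_totals[k] for k in range(n)]
    return gross, net


# Hypothetical linked records for six sample units.
states = ["E", "N"]
wave1 = {"u1": "E", "u2": "E", "u3": "E", "u4": "N", "u5": "N", "u6": "E"}
wave2 = {"u1": "E", "u2": "N", "u3": "E", "u4": "E", "u5": "N", "u6": "N"}

table = turnover_table(wave1, wave2, states)
gross, net = gross_and_net_change(table)
print(table)  # [[2, 2], [1, 1]]
print(gross)  # 3 units changed state
print(net)    # [-1, 1]: the employed margin fell by one
```

Three units changed state, yet the margins moved by only one. Two independent cross-sections would show the net figure alone; the gross flows are visible only because the records are linked.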
In longitudinal studies where the outcome variable is continuous, a number of cross-sectional analysis models have been applied. These models fall within the realm of regression analysis and the analyses of variance and covariance. One of these methods, path analysis (see Blaylock, 1970), involves estimating a sequence of regression equations where all endogenous variables, ordered in time, are regressed upon all preceding variables. Path analysis methods have been extended by Jöreskog and Sörbom (1979) to cases where the outcome and predictor variables are in principle unobservable (latent) and can only be measured imperfectly by a set of indicator variables. When such variables are measured at several points, Jöreskog's methods can be used to determine whether the nature of the construct is changing over time and which predictor variables account for the changes.

While cross-sectional analysis is often adequate for describing changes in status and identifying determinants, these methods are usually unsuitable for the analysis of the underlying process that generated the data. Social processes are often better represented by discrete-state, continuous-time stochastic models. The first step in constructing this kind of model is to specify rates of transition between states. A number of researchers (see Coleman, 1981, Ginsberg, 1972a and 1972b, and Tuma, 1976) have shown that regression analysis -- usually specified in terms of linear or logistic equations with the outcome as the dependent variable -- can supply information about the rates of transition only for a severely limited class of models. In those cases where regression is useful, the process must have run a sufficiently long time that the observed proportions in the outcome categories are not themselves changing over time. Even when cross tabulations show status change between two (or more) points, model identification can be problematic.
The data are often equally compatible with more than one model.

Because of the problems encountered when applying cross-sectional analysis methods to longitudinal data, current analysis strategies focus directly on the rates of transition from one state to the next. In the biological sciences these investigations fall under the rubric of survival analysis (see Elandt-Johnson & Johnson, 1980). In the social sciences, general theories of stochastic processes are applied (see Bartholomew, 1973 and Tuma & Hannan, 1984). While these new methods permit a richness of analysis not possible with cross-sectional methods, they can have a significant impact on sample design and data collection issues. Many of the techniques require event history data rather than multiwave panel data. In those cases where only longitudinal data are obtainable, observations at unequally spaced survey dates are often required. Many of the new approaches utilize non-parametric methods or rely on maximum likelihood techniques for the estimation of model parameters. Applying these techniques properly to the complex sample designs found in longitudinal surveys remains a largely unexplored area in statistical research.

C. Examples of Longitudinal Analysis

Because there is such a wide variety of methods, the flavor of longitudinal analysis is best captured through examples. Two Social Security Administration projects will be discussed. The first is the Social Security Administration Retirement History Study (RHS), for which some examples of the more familiar cross-sectional approaches are presented. The second is the Social Security Disability Program Work Incentive Experiments (WIE), which provide examples of some current analytic strategies.

1. 
Social Security Administration Retirement History Study

The Social Security Administration's Retirement History Study (RHS) is a multiwave survey designed to address a number of policy questions relating to the causes and consequences of retirement. Among these questions are: Why do individuals retire before age 65? How well does income in retirement replace preretirement earnings? What happens to the standard of living after retirement? The original sample of 12,549 persons was a multi-stage area probability sample selected from members of households in 19 retired rotation groups from the Current Population Survey. The sample was nationally representative of persons 58 through 63 years old in 1969. Initial interviews were conducted in the spring of 1969 and then in alternate years through 1979. Data collected during this period provide detailed information on work history, sources of income, expenditures, health, and attitudes toward and expectations for retirement. Results from the RHS have been reported in a number of Social Security Administration research reports (listed in SSA publication #73-11700). The data have also been analyzed by researchers outside the government via public use tapes.

An interesting variety of cross-sectional analytic methods suitable for multi-wave data have been used with the RHS data. One example is a two-wave descriptive analysis of the change in income between 1968 and 1972 using simple turnover tables (Fox, 1976). The second example is a three-wave structural equation model of income satisfaction (Campbell and Mutran, 1982).

a. Analysis of income change

Fox examined income level and change between 1968 and 1972 by constructing simple turnover tables. One of these tables (table 1 on pages 59 and 60) classified respondents or couples by their income position in 1968 and 1972.
The table shows the marginal distributions each year and the joint probability of change separately for married couples, unmarried men, and unmarried women, crossed by work status in 1968 and 1972. The table indicates some increase in income over time for persons either employed or not employed in both years, and, as expected, a substantial decrease in income for persons employed in 1968 but not employed in 1972. Among this latter group, Fox (1976) noted that income loss for unmarried men appeared greater than for unmarried women.

Fox's findings are examples of general questions that can be answered by the analysis of turnover tables.

1. Are income changes between the two points different for different subpopulations?

2. Are there differences in marginal income distributions between subpopulations at a given time?

A number of authors (Bishop et al, 1975, Hauser, 1978, Landis & Koch, n.d., and Singer, 1983) have shown that hypotheses involving marginal distributions and attribute-by-time interactions can be specified and tested using existing methods for the analysis of categorical data. For example, testing whether income changes vary by subpopulation is the same as testing for a 3 (or higher) way interaction between income level at time one, income level at time two, and subpopulation characteristics. The weighted least squares approach (Landis et al., 1976) would be an appropriate methodological approach for testing this kind of hypothesis, especially for complex sample designs. Given a consistent estimate of the sampling covariance matrix for the table cells, appropriate test statistics for a wide variety of hypotheses can be computed.

Fox's analysis also illustrates two additional methodological issues. We are informed in the technical note to his report that only 63 percent of the sample respondents had usable income data in both 1968 and 1972 due to the "very conservative editing" of income response.
In both years, respondents had to give usable answers to about 20 different income components (twice that, if married). An inadequate response to any one of these components was enough to cause a nonresponse for the entire set. Three questions immediately arise. What is the effect of response error for individual income items on the analysis of the turnover tables? Would imputing missing income items affect the analysis? How did analyzing only the partial data set affect the analysis?

Response errors are likely to result in an overestimate of change in income class, because some of the observed change is due to reporting error rather than to real change over time. Generally, in order to separate real change from classification error, an observation at a third point is required. This third observation could be a reinterview, taken soon after one of the regular waves, designed to measure reporting error directly. However, under certain modeling assumptions, three widely spaced observations can also provide estimates of real change and classification error (see Bye & Schecter, 1980 and 1983). A second problem resulting from classification error arises when attempting to measure differences among various subpopulations. There may not be real change at all; the analyses may simply reflect differences in the propensity for response error among the subpopulations, leading to incorrect interpretations.

The effect of imputation on the analysis of turnover tables will depend on the specific imputation scheme. If, for an individual, responses from other waves are used to impute missing values for a particular wave, real change may be understated. If, on the other hand, the imputations are carried out separately for each wave, real change will most likely be overstated. Particular care must also be given to substantive interpretations when the same attributes are used both for imputation and for substantive analysis.

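The direction of these two effects can be seen in a small Python sketch. Everything in it is an assumption made for illustration: the eight records were written by hand so that the effect is visible, the "true" wave 2 values would never be known in practice, and a sequential hot deck that ignores wave 1 stands in for imputation carried out separately for each wave. It demonstrates the mechanism and is not evidence about its size in any real survey.

```python
def change_rate(wave1, wave2):
    """Share of units whose income class differs between the two waves."""
    return sum(a != b for a, b in zip(wave1, wave2)) / len(wave1)


def carry_forward(wave1, wave2):
    """Cross-wave imputation: reuse the unit's own wave 1 value when wave 2 is missing."""
    return [a if b is None else b for a, b in zip(wave1, wave2)]


def sequential_hot_deck(wave2):
    """Wave-by-wave imputation: copy the most recent respondent in file order.

    Assumes the first record in the file is a respondent.
    """
    filled, donor = [], None
    for value in wave2:
        if value is not None:
            donor = value
        filled.append(donor)
    return filled


# Hypothetical income classes for eight units; None marks a missing wave 2 report.
wave1     = ["low", "high", "high", "low", "low",  "high", "low", "high"]
truth2    = ["low", "high", "high", "low", "high", "high", "low", "low"]
observed2 = ["low", None,   "high", None,  "high", None,   "low", None]

true_rate = change_rate(wave1, truth2)
cross_wave_rate = change_rate(wave1, carry_forward(wave1, observed2))
separate_rate = change_rate(wave1, sequential_hot_deck(observed2))
print(true_rate, cross_wave_rate, separate_rate)  # 0.25 0.125 0.5
```

With these invented records, carrying the wave 1 value forward halves the measured change, while the hot deck that ignores wave 1 doubles it.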
Analyzing partial data sets requires an assumption that the nonrespondents are like the respondents. Usually no studies have been carried out to support that assumption. To the extent that nonrespondents are different, as they frequently are in health and income studies, the data set is biased and the interpretation is inadequate.

b. Stability of income satisfaction

Campbell and Mutran (1982) present an analysis of the stability of income satisfaction over time using data from three waves of the RHS -- 1969, 1971 and 1973. They assume that income satisfaction is an unobserved continuous variable measured imperfectly by two indicator variables. The two indicators are "satisfaction with the way one is living" (SAT), and "ability to get along on income" (GET). Figure C (page 61) presents a path diagram for one of the models estimated by Campbell & Mutran (1982). (The estimated covariance matrix of the observed variables is shown in Table 2, page 62.)

Campbell and Mutran posit that income satisfaction is in turn a function of health status (an unobserved variable with three indicators), of actual income level in 1969, and of the number of times in the hospital in 1970. The authors note that this path model is significantly underspecified but provides an interesting example of the use of LISREL methodology (Jöreskog & Sörbom, 1978 and 1979).

LISREL unites factor analysis and structural equation modeling for a wide variety of recursive and nonrecursive models with and without measurement errors (see Jöreskog & Sörbom, 1976). The LISREL approach assumes that both measurement and structural equations are linear in the unknown parameters and that all variables are normally distributed.

2. Social Security Administration Disability Program Work Incentive Experiments

Under the provisions of the Disability Insurance Amendments of 1980, the Secretary of Health and Human Services was directed to develop and carry out experiments and demonstration projects designed to encourage disabled beneficiaries to return to work and leave the benefit rolls. The primary objective of the experiments is to save trust fund monies. The bill itself contains several examples of the kind of change in the post-entitlement program that Congress had in mind. These include changing entitlement provisions for Medicare benefits, lengthening the trial work period, and modifying treatment of post-entitlement earnings, such as the application of a benefit offset based on earnings.

Congress imposed important constraints on the experiments: they must be of sufficient scope and size that results are generalizable to the future operation of the disability program, and no beneficiary may be disadvantaged by the experiments as compared to the existing law.

Eight treatment groups and a control group have been proposed (see SSA, 1982, for details). Each treatment group represents an alternative to the current post-entitlement program, reflecting a change in the law, in administrative practice, or in both. A two-stage stratified cluster sample of 31,000 newly awarded beneficiaries was planned for the experiments. The sample would be representative of all beneficiaries under age 60 at the time of award. The sample beneficiaries would be assigned at random to one of the nine experimental groups in such a way that the full experimental design is replicated in each geographic cluster. The total sample size in each treatment group would be 3,000, and there are to be 7,000 in the control group.

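One way to replicate the full design within each geographic cluster is to allocate each cluster's beneficiaries to the nine groups in proportion to the planned totals and then shuffle the labels. The sketch below is only an illustration of that idea, not the assignment procedure SSA used; the group labels, the cluster size of 310, and the beneficiary identifiers are hypothetical.

```python
import random
from collections import Counter

# Planned totals from the text: eight treatment groups of 3,000 each
# and a control group of 7,000 (31,000 in all). Group names are made up.
TARGETS = {f"T{i}": 3000 for i in range(1, 9)}
TARGETS["control"] = 7000


def assign_within_cluster(beneficiary_ids, rng):
    """Randomly assign one cluster's beneficiaries in proportion to TARGETS."""
    groups = list(TARGETS)
    total = sum(TARGETS.values())
    n = len(beneficiary_ids)
    # Largest-remainder rounding so that the group counts sum exactly to n.
    quotas = {g: n * TARGETS[g] / total for g in groups}
    counts = {g: int(quotas[g]) for g in groups}
    leftover = n - sum(counts.values())
    by_remainder = sorted(groups, key=lambda g: quotas[g] - counts[g], reverse=True)
    for g in by_remainder[:leftover]:
        counts[g] += 1
    labels = [g for g in groups for _ in range(counts[g])]
    rng.shuffle(labels)  # random order gives each person a random group
    return dict(zip(beneficiary_ids, labels))


rng = random.Random(0)
cluster = [f"B{i:04d}" for i in range(310)]  # hypothetical cluster
assignment = assign_within_cluster(cluster, rng)
print(Counter(assignment.values()))
```

With a cluster of 310, this allocation places 30 beneficiaries in each treatment group and 70 in the control group, mirroring the 3,000 to 7,000 ratio of the overall design.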
Under the current disability program, a beneficiary who returns to work despite continuing severe impairment is granted a 24-month period in which to make a work attempt while remaining on the benefit rolls (the first 12 months with full benefits, the second 12 months with benefits in suspense). Workers are expected to need 1 or 2 years to return to work and 2 or 3 years to complete the trial work period and be terminated from the rolls. Thus an observation period of 4 to 5 years is required to track beneficiaries through the shortest of the post-entitlement outcomes. Observed short-run labor force response will provide some information about the effects of the treatments, but trust fund savings will be significant only if employment is sustained in some groups. Thus, sustained work is the key labor force parameter in the evaluation of the work incentive experiments.

At the same time, the analysis of short run labor force outcomes, commencing about 2 years after the experiments begin, is a necessary first step in gauging trust fund effects. The data available for the short run analysis will consist of a voluntary baseline questionnaire (face-to-face interview) covering socioeconomic and demographic background items plus a series of mandatory quarterly reports (mail with telephone followup) showing the beginning and end of work attempts and monthly earnings for each month of the quarter. The response to the quarterly reports is mandatory because work reports and monthly earnings are required for administrative purposes.

a. Short run longitudinal analysis of return to work

The first step in the analysis of return to work will compare the proportion of beneficiaries who have made a work attempt among treatment and control groups. However, short run differences could be misleading if the full effect of the treatment has not been realized. Consider the hypothetical outcomes in Figures A and B below.

[Figures A and B: graphic not reproduced]

In Figure A the difference between treatment and control is small for the first two years, but becomes large afterwards. In Figure B, the short run difference appears large at first but then becomes smaller. Clearly, change over time in the proportion of beneficiaries who return to work is most important in determining the experimental effect. The rate of change of this proportion over time for beneficiaries who have not yet returned to work is called the hazard rate function (or hazard function). A short run evaluation of return to work will focus on differences in rates of return to work among treatment and control groups.

Using individual observations of the time of return to work, the first analysis of return to work will be to estimate and graph the cumulative hazards of return to work for treatment and control groups and test the difference between the hazards.

If there are differences between treatment and control groups, the graphical displays of the cumulative hazards should provide a useful guide. These can then be used to project long run differences in the probability of return to work among the experimental groups. Introduction of covariates from the baseline questionnaire might also improve the accuracy of these predictions (see Hennessey, 1982).

b. Structural Models of Duration -- Testing a Sociological Theory

It has been suggested that the longer a beneficiary remains on the disability rolls, the less likely he or she is to return to work. The reason given is that the beneficiary makes the necessary social and psychological adjustments to continue in the role of a disabled person. The fact that population rates of return to work for disabled beneficiaries decline over time is often taken as evidence supporting this theory. However, one can show that population heterogeneity can account for an apparent decline in population transition rates over time, even if the individual rates are constant or increasing.
(See Heckman & Singer, 1982, for example.) Therefore any assessment of the apparent negative duration dependence must account for population heterogeneity.

One way to examine this issue is to specify and estimate a structural model for the hazard function for return to work. The parameters are usually estimated by maximum likelihood methods, incorporating the likelihoods for sample cases moving from nonwork to work at time t, and for sample cases which have not yet moved by time t (which, in this case, is the end of the observation period).

c. Estimating long run trust fund effects

The Disability Amendments mandate that the primary evaluation of the experiments be in terms of trust fund effects. In general, the cost to the trust funds of an individual beneficiary is the sum of the expected costs to the Disability and Medicare funds between initial entitlement and the termination of benefits or the attainment of age 65. The cost to the disability trust fund can be further broken down into the sum of the cash benefit payments plus the cost of vocational rehabilitation (if applicable) minus the payback of FICA contributions (if the beneficiary returns to work) during this period. The estimation of long run effects requires the projection over time of the probability of receiving cash benefits for disability, the expected amount of those benefits, the probability of working, and the expected earnings level.

An analysis plan for the WIE is being developed which is based on a continuous-time stochastic model. The state space for the process admits four possibilities:

E.1: Recovered

E.2: Deceased

E.3: Nonworking Beneficiary

E.4: Working Beneficiary

At the time benefits are awarded the beneficiary is assumed to be in state E.3. The beneficiary can switch between states E.3 and E.4 until he or she reaches state E.1 or E.2 (which are taken to be absorbing states) or reaches age 65 (and is automatically converted to the old age program).

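The four-state process described above can be sketched as a simple simulation of individual histories. The sketch assumes constant (exponential) transition hazards purely for illustration; the hazard values, the state labels, and the function names are hypothetical and do not come from the WIE analysis plan. Duration-dependent or covariate-dependent hazards would replace the exponential draw in a fuller treatment.

```python
import random

# Hypothetical annual transition hazards out of the two transient states.
# "nonworking" is E.3, "working" is E.4, "recovered" is E.1, "deceased" is E.2.
HAZARDS = {
    "nonworking": {"working": 0.08, "recovered": 0.03, "deceased": 0.04},
    "working": {"nonworking": 0.30, "recovered": 0.15, "deceased": 0.02},
}
ABSORBING = {"recovered", "deceased"}


def simulate_history(age_at_award, rng, max_age=65.0):
    """Return a list of (state, age_at_entry) pairs for one beneficiary."""
    state, age = "nonworking", age_at_award  # every history starts in E.3
    history = [(state, age)]
    while state not in ABSORBING:
        rates = HAZARDS[state]
        # Time to the next transition under competing constant hazards.
        age += rng.expovariate(sum(rates.values()))
        if age >= max_age:
            # Conversion to the old age program ends the history at age 65.
            history.append(("converted_at_65", max_age))
            break
        # Destination is chosen with probability proportional to its hazard.
        state = rng.choices(list(rates), weights=list(rates.values()))[0]
        history.append((state, age))
    return history


def years_on_rolls(history):
    """Total years spent in states E.3 and E.4 combined."""
    total = 0.0
    for (state, start), (_, end) in zip(history, history[1:]):
        if state in HAZARDS:
            total += end - start
    return total


rng = random.Random(42)
histories = [simulate_history(45.0, rng) for _ in range(5000)]
mean_years = sum(years_on_rolls(h) for h in histories) / len(histories)
print(f"mean years on the rolls: {mean_years:.2f}")
```

Simulated histories of this kind, combined with estimates of benefit and earnings levels, are the raw material for projecting costs for each experimental group.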
A semi-Markov model is proposed to link the various work and non-work episodes over time. This model assumes that each work and non-work period is independent of prior work history (but might depend on age and other exogenous factors which can be incorporated into the hazard functions). Although it is unlikely that this sort of independence does in fact exist, the short observation period effectively precludes the ability to detect the real dependencies.

In conclusion, once the hazard functions are estimated separately for each experimental group, future work and benefit status histories will be simulated. These histories together with estimates of earnings and benefit levels will allow the estimation of long run trust fund costs for each experimental group.

Using four years of administrative data, Hennessey (1982) found that semi-Markov models of work and benefit status for male beneficiaries can accurately predict the histories three years hence. His results provide encouragement for this overall analysis strategy.

[Graphic not reproduced]

[Graphic not reproduced]

[Figure C: graphic not reproduced]

DEFINITIONS OF VARIABLES

A. Satisfaction with Income is an unmeasured construct with three indicators:

1. SAT69, SAT71, SAT73: Are you satisfied with the way you are living?
   4 = More than satisfied
   3 = Satisfied
   2 = Less than satisfied
   1 = Very unsatisfied

2. GET69, GET71, GET73: Ability to get along on income
   4 = Always have money left over
   3 = Have enough with a little left over sometimes
   2 = Have just enough, no more
   1 = Can't make ends meet

B. Health is an unmeasured construct with three indicators:

1. LIM69, LIM71, LIM73: Does health limit the kind of work you do?
   2 = No
   1 = Yes

2. OUT69, OUT71, OUT73: Are you able to leave the house without help?
   3 = No limitation
   2 = Yes, though health limits work
   1 = No

C. Number of times in hospital (HOS7 is measured with one indicator

D. 1969 household income (INC69) is a single indicator of log income from all sources

From Campbell and Mutran, 1982. Reprinted with permission.

[Table 2: graphic not reproduced]

CHAPTER 6

SUMMARY AND CONCLUSIONS

In developing the working paper on longitudinal surveys, the subcommittee found that few of the issues were simple. For each question that was raised there were multiple and sometimes contradictory conclusions encountered in the literature, or in the experience of the subcommittee members. This complicated the task of drawing conclusions about when or how to use longitudinal surveys; what is clear is that anyone considering a longitudinal survey should remember four general points. These points could apply equally well either to longitudinal or to cross-sectional surveys, but certain aspects are especially important in longitudinal surveys.

First, research goals should be clearly stated and alternative kinds of data collection should be evaluated. Cross-sectional research is not automatically less expensive, and certain research goals cannot be attained with one-time surveys. The evidence seems to indicate that longitudinal surveys are not intrinsically more costly than one-time surveys of comparable scope. In many cases, one longitudinal survey will be more efficient than a series of one-time surveys. However, cost considerations may dictate that neither a longitudinal survey nor a series of one-time surveys could be carried out. Compromises are often made on frequency of interview or sample size to permit some longitudinal data collection.

- For certain research goals, such as identifying the frequency or duration of change, or the causes of change (as in longitudinal surveys of labor force status), only a longitudinal survey will work.
For topics that are difficult for respondents to recall, such as attitudes or detailed behavior (as in longitudinal surveys of retirement, or health treatments, or household income), a prospective longitudinal survey is the best choice.

- All other things being equal, a longitudinal survey achieves a given level of precision for measures of change with a somewhat smaller sample than is possible in a series of one-time surveys. In addition, the cost of maintaining contact with a longitudinal sample may be no higher than the cost of selecting and contacting a one-time sample.

- Timing of results plays an important part in the decision to select a longitudinal survey. If early results are needed, then a longitudinal survey is not appropriate. If early waves of a longitudinal survey can be analyzed quickly and provide useful information, then some of the timing problem is dissipated. If the research needs can only be met by a longitudinal survey and those waiting for results clearly understand the timing, longitudinal surveys are clearly superior.

Second, once the decision has been made to conduct a longitudinal survey, the subcommittee recommends that a greater emphasis be placed on the early formulation of clear and specific analysis objectives as the next step in research planning. The failure to formulate detailed analysis plans early enough explains some of the disappointments that some organizations have experienced with longitudinal surveys.

- As the simplest example, when research objectives are not clearly stated or understood, the longitudinal nature of the data has not always been fully exploited in analysis.

- Many of the operational features of longitudinal surveys should only be selected after the development of clear and specific plans for analysis. Even such seemingly unrelated factors as the interval between interviews may be determined by analysis plans.
For example, discrimination between some simple stochastic models is ruled out if data collection intervals are constant. Other examples are given in Singer and Spilerman's study of longitudinal analysis (1976).

- A clear statement of specific research goals, including analysis plans, reduces the likelihood that a project will require unanticipated funding extensions or auxiliary sponsors for completion. Comprehensive planning ensures that a survey will appeal to a wide constituency, and reflect the research goals of an adequate sponsorship base.

- Fully developed research objectives make it less likely that a need for different -- or additional -- data will become apparent part way through the survey.

Third, longitudinal surveys can easily incorporate features that facilitate the evaluation of internal data quality, and that compare the effectiveness or cost of alternative methods. Repeated data collection makes this possible in ways that are beyond the scope of a one-time survey.

- Any longitudinal survey that varies data collection mode while maintaining a constant questionnaire can be a vehicle for studying the impact of mode of interview on response. Evaluations have indicated that the NLS obtained comparable results by using personal or telephone interviews after the first interview, for example.

- Data from longitudinal surveys can be used to understand the impact of nonresponse on the representativeness of a sample. The characteristics of nonrespondents in a later wave can be studied through what is known about them from the first interview, or from later follow-ups in which they do respond. In the NLS, each extension of the survey has been preceded by evaluations of the impact of attrition through comparisons with population controls developed in the first wave of interviews.

- The effect of continued participation on response can be evaluated each time new persons are brought into the sample or interviewed for the first time.
The original HS+B survey program, for example, provided for an additional sample: a group from the original sample to be interviewed only in the later waves, specifically to evaluate panel effects.

- Alternative methods for simulating complete response from incomplete data (such as imputing from other cases, or from what was reported in another interview, or by increasing the weight of completed interviews) can be evaluated using a longitudinal file. The final comparisons have to wait until all the waves of a longitudinal survey are completed, but preliminary results can be used in earlier waves, and a variety of procedures can be compared at the end of the program in order to select the most effective method.

- Data from longitudinal questionnaires can and should be compared to the results from comparable questions asked of similar respondents in one-time surveys. The results of NLS labor force questions were constantly evaluated against cross-sectional labor force surveys. This provides ongoing information on sampling error, and on the impact of questionnaire design on response.

- Data from a longitudinal survey, from related administrative records, and from comparable surveys of one-time samples can be compared to estimate the impact of recall periods, or the interval between interviews, or the effect of bounding interviews. The Income Survey Development Program, which accompanied planning for SIPP, demonstrated the importance of just such an exhaustive testing program.

- The costs of alternative data collection strategies should be recorded, along with the operational considerations and the impact on data quality. This information will be invaluable when the most efficient methods must be chosen for other surveys.

- The costs and effects of alternative data processing strategies should be recorded to allow comparisons, such as the costs and benefits of matching longitudinal records through characteristics or through unique identification codes for sample persons and households. Early tests such as these led to the development of the case-linking strategy selected for SIPP.

These and many other comparisons are possible with longitudinal surveys, because so many materials, respondents and operations vary throughout the course of the survey. With minimal additional efforts toward record-keeping and control, most longitudinal operations can provide important data for evaluating internal data quality and for guiding future survey designers.

Fourth, there are many measurement error problems that exist with any kind of survey, some of which are exacerbated by a longitudinal design. So far, the research on many of these methodological problems has not been definitive, so choices are made based on cost and intuition. There is a rich field for investigations and those seeking to do longitudinal surveys should strive to include some methodological elements. Some of this kind of research has been carried out, as described above, but more is needed.

- Time-in-sample bias permeates every survey that requires repeated interviewing. It is not limited to one particular kind of variable or one mode of data collection. As a result there is a systematic bias in the data that shows up when data are compared by the number of interviews a respondent has had. No one knows which set of data is more accurate, those from earlier or those from later interviews. People make judgments based on little or no data, and the topic needs careful investigation.

- Response errors have the effect of exaggerating change. People do forget and change their minds, and different household respondents give different answers to the same questions.
The length of time between interviews also influences answers. More work needs to be done to separate real from spurious change.

- Attrition is a serious problem in longitudinal surveys. Many longitudinal surveys are able to keep 90 to 95 percent of their respondents on each interviewing wave, but even low nonresponse mounts over time. Although compensation strategies look promising, it is troublesome to realize that for some variables, a quarter to one-half of the data are not given by respondents.

- There has been little research on the best length of time to allow between interviews. Decisions are based mainly on cost, yet we know that the longer the interval, the less that is reported, and the more that is reported in the wrong time periods. Work needs to continue on this aspect.

- It is known that the questions on a survey are not processed one by one by respondents. The presence of questions on other topics affects responses to questions on variables of interest. This happens whether the additional questions precede or follow the main questions. However, the tendency is to keep adding new topics. We may be causing a deterioration of data quality by doing this.

Longitudinal surveys are increasingly being used as the basis for policy decisions by the Federal government. In our review, we have become convinced that for some research goals there is no alternative to longitudinal data collection. However, before agencies make the decision to conduct a longitudinal survey, they should carefully consider the important operational, management, and statistical problems associated with them.

CASE STUDY 1

SURVEY OF INCOME AND PROGRAM PARTICIPATION

I. Purpose of the survey

In October 1983, the Bureau of the Census conducted the first interviews of the Survey of Income and Program Participation (SIPP).
The SIPP is a nationally representative household survey intended to provide detailed information on all sources of cash and noncash income, eligibility and participation in various government transfer programs, disability, labor force status, assets and liabilities, pension coverage, taxes, and many other items. Data from the survey will provide a multiyear perspective on changes in income, and their relationship to participation in government programs, changes in household composition, and so forth. In general, the SIPP data system is designed to measure elements of the federal tax and transfer system in a comprehensive data base.

SIPP began in response to the recognition that the principal source of information on the distribution of household and personal income in the United States -- the March Income Supplement of the Current Population Survey (CPS) -- had limitations which could only be rectified by making substantial changes in the survey instrument and procedures. For example, the CPS does not provide monthly income, monthly household composition or detailed asset information. These deficiencies became especially serious when the scope of policy analyses was broadened during the 1960's and early 1970's as public assistance programs were expanded and reorganized. Model-builders were forced to make many assumptions and impute intrayear data using CPS data to carry out their activities. In this environment, with analysts requiring more detailed data and improved measures of cash and noncash income, the Income Survey Development Program (ISDP) was established.

The purpose of the ISDP, authorized in 1975, was to design and prepare for a major new survey, the Survey of Income and Program Participation (SIPP).
The ISDP developed methods intended to overcome the three principal shortcomings of the CPS for analyses of income: 1) the underreporting of property income and other irregular sources of income; 2) the underreporting and misclassification of participation in major income security programs and other types of information that people generally find difficult to report accurately (for example, monthly detail on income earned during the year); and 3) the lack of information necessary to analyze program participation and eligibility (annual income estimates were available, but eligibility for most Federal programs is based on a monthly accounting period).

Four experimental field tests were conducted to examine different concepts, procedures, questionnaires, and recall periods. Two of the tests were restricted to a small number of geographic sites; the other two were nationwide. The largest test, conducted in 1979, was also the most complex. Although used primarily for methodological purposes, the nationally representative sample of 8,200 households was sufficiently large to provide reliable national estimates of many characteristics. More detailed discussions of the ISDP and its activities are provided in Ycas and Lininger (1981) and David (1983).

Because the ISDP was the predecessor to SIPP, it is not surprising that many characteristics of the ISDP are reflected in the SIPP design, including many elements of the survey's design, content, and questionnaire format.

II. Sponsors

The ISDP development effort was directed by the Office of the Assistant Secretary for Planning and Evaluation in the Department of Health and Human Services and was carried out jointly with the Bureau of the Census, which assisted in the planning and carried out the field work, and the Social Security Administration (SSA), which administers the major cash income security programs.
In late 1981 virtually all funding for ISDP research and planning for the ongoing SIPP program was deleted from the budget of the Social Security Administration. The loss of funding for fiscal year 1981 brought all work on the new survey to a halt. Then in fiscal year 1983, money for the initiation of the new survey was allotted in the budget of the Bureau of the Census.

In planning the content, procedures, and products of the SIPP, the Census Bureau works closely with a SIPP Interagency Advisory Committee, established and chaired by the Office of Management and Budget (OMB). The committee consists of individuals representing the following departments and agencies: the Departments of Labor, Education, Defense, Commerce, Agriculture, Health and Human Services, Treasury, Housing and Urban Development, and Justice; Energy Information Administration; National Science Foundation; Council of Economic Advisors; Congressional Budget Office; Bureau of Labor Statistics; Bureau of Economic Analysis; Veterans Administration; Internal Revenue Service; and the Office of Management and Budget.

III. Sample Design

SIPP started in October 1983 as an ongoing survey program of the Bureau of the Census with one sample panel of approximately 21,000 households in 174 primary sample units (PSU's) 1/ selected to represent the noninstitutional population of the United States. The sample design is self-weighting; that is, each unit selected in the sample has the same probability of selection.

In February 1985 and every February thereafter, a new, slightly smaller panel of 15,000 households is introduced. This design allows cross-sectional estimates to be produced from the combined sample from both panels. The overlapping panel design enhances the estimates of change, particularly year-to-year change.
Since portions of the sample are the same from one year to the next, year-to-year change estimates can be based in part on a direct comparison across 2 years for the same group of households.

To facilitate field operations, the sample is divided into four approximately equal subsamples, called rotation groups; one rotation group is interviewed in a given month. Thus, one cycle or "wave" of interviewing takes 4 consecutive months. This design creates manageable interviewing and processing workloads each month instead of one large workload every 4 months; however, it results in each rotation group using a different reference period.

Data collection operations are managed through the Census Bureau's 12 permanent regional offices. Interviewers assigned to these offices conduct one personal visit interview with each sampled household every 4 months. At the time of the interviewer's visit, each person 15 years old or older who is present is asked to provide information about himself/herself; a proxy respondent is asked to provide information for those who are not available. The average length of the interview is about 30 minutes. Telephone interviewing is permitted only to obtain missing information or to interview persons who will not or cannot participate otherwise.

An important design feature of SIPP is that all persons in a sampled household at the time of the first interview remain in the sample even if they move to a new address. For cost and operational reasons, personal-visit interviews are only conducted at new addresses that are in or within 100 miles of a SIPP primary sampling unit (persons moving outside that limit are contacted by telephone if possible). After the first interview, the SIPP sample is a person-based sample, consisting of all individuals who were living in the sample unit at the time of the first interview -- these people are labelled "original sample persons".
Individuals aged 15 and over who subsequently share living quarters with the original sample persons are also interviewed in order to provide the overall economic context of the original sample persons. Changes in household composition caused by persons who join or leave the household after the first interview are also recorded. These individuals are interviewed as long as they reside with an original sample person. More information about these procedures can be found in Jean and McArthur (1984).

IV. Survey Design and Content

Each person in the SIPP sample is interviewed once every 4 months for 2 2/3 years to produce sufficient data for longitudinal analyses while providing a relatively short recall period for reporting monthly income. The reference period for the principal survey items is the 4 months preceding the interview. For example, in October, the reference period is June through September; when the household is interviewed again in February, it is October through January. This interviewing plan will result in eight interviews per household.

An important design feature of SIPP is the assignment of an individual identification number. Each sample person is assigned a unique fourteen-digit identification (ID) number at the time he/she enters the sample; an additional two-digit code is assigned if the person moves to a new address. A master list of identification numbers is used by the regional offices to monitor the status of interviewing each month after Wave 1. The regional offices keep track of each number on the list representing all the persons assigned for interview in a month; each must be accounted for with a completed questionnaire or a reason for noninterview. The list is updated regularly to account for persons who are added to or deleted from the sample.

The ID helps to link information about an individual across time; it identifies which household each person is a member of at any point in the panel.
Through the ID system, data can be linked from all persons ever associated with a given household throughout the 2 2/3-year duration of a panel.   The survey consists of three major components: (1) the control card, (2) the core data, and (3) topical data. The control card is used to obtain and maintain information on the basic characteristics associated with households and all household members and to record information for operational control purposes. These data include the age, race, ethnic origin, sex, marital status, and educational level of each member of the household, as well as information on the housing unit and the relationship of the householder to other members. A household respondent provides this information, which is updated at each interview. The control card is also used to keep track of when and why persons enter and leave the household, thereby providing enough information to compose monthly household and family groups. There is also space to record information that will improve the interviewer's ability to follow persons who move during the survey. In addition, after each visit, data on employment, income, and other information are transcribed from the core questionnaire to the control card so the data can be used in the next interview as a reference for the interviewer and thus shorten succeeding interviews.   A questionnaire is filled out for each household member who is 15 years or older. The questionnaire consists of a "core" of labor force and income questions asked during each interview and a set of topical modules which are scheduled during the life of the panel. The core labor force and income questions are designed to measure the economic situation of persons in the United States. These questions expand the data currently available on the distribution of cash and noncash income and are repeated at each interviewing wave. SIPP core data build an income profile of each person aged 15 and over in a sample household.
The profile is developed by determining the labor force participation status of each person in the sample and asking specific questions about the types of income received, including transfer payments and noncash benefits from various programs for each month of the reference period. A few questions on private health insurance coverage are also included in the core.   Persons employed at any time during the 4-month reference period are asked to report on jobs held or businesses owned, number of hours and weeks worked, hourly rate of pay, amount of earnings received, and weeks without a job or business. In addition to questions about labor force activity and the earnings from a job, self-employment, or farm, the core includes questions related to nearly 50 other types of income as well as the ownership of assets which produce income.   The SIPP has been designed to provide a broader context for analysis by adding a series of questions on a variety of topics not covered in the core section. These questions are labelled "topical modules" and are assigned to particular interviewing waves of the survey. If more than one observation is needed, a topical module may be repeated in a later wave.   The survey design allows for the inclusion of these special modules because less time is required in later waves to update the core information collected in the first interview. The subjects covered do not require repeated measurement at each interview and, therefore, may use a reference period longer than the period used for the core information. Examples of topical modules include health and disability, work history, assets and liabilities, pension plan coverage, tax-related information, marital history, fertility, migration, household relationships, and child care arrangements. For more information about the SIPP design refer to Nelson, McMillen, and Kasprzyk (1984).   V. 
Survey Response Rates   The first SIPP interviews were conducted in October 1983. At this time, cumulative household noninterview rates are available for the first six waves of SIPP, that is, through August 1985. Sample loss through the sixth wave has been 19 percent, of which 15 percent was due to refusals and other situations in which the interviewer was unable to make contact with the household, and 4 percent was due to movers that the interviewer was not able to contact again.   Survey nonresponse rates for persons are discussed in McArthur and Short (1985). In this work they characterize the population that is leaving the sample, comparing these persons' characteristics to those of persons continuing to be interviewed in the survey. At the end of the third wave of interviewing, combining all reasons for noninterview -- including refusals, institutionalization, moves to unknown addresses, persons who were temporarily absent, and so on -- 10.5 percent of all persons who were interviewed during the first wave had left the sample. There is some indication that those noninterviewed persons are different from persons who continue to be interviewed. Noninterviewed persons are more likely to be renters rather than homeowners, to live in large urban areas, and to have reported their marital status to be single or separated.   Coder and Feldman (1984) found that imputation for a selected group of items was quite small. In this analysis item nonresponse rates on labor force, income recipiency, and income amounts are examined. They also discussed the impact of self or proxy respondents on nonresponse rates. Lamas and McNeil (1984) discussed the quality of data measuring household wealth in the survey. The nonresponse rate was low for all asset types (1.4 percent) for all persons asked about asset ownership.
They found that nonresponse rates varied by type of asset -- lowest for rental property and highest for certificates of deposit -- and by age and education levels of the respondents -- higher nonresponse for older persons and higher nonresponse with greater educational attainment. McMillen and Kasprzyk (1985) used counts of imputations made for each person as the measure of item response rates. The maximum number of imputations that could have been made for an individual was 83. They found that in the first two waves of interviews, 86 percent of the persons had no imputation at all. In Waves 1 and 2, respectively, 87 percent and 92 percent of the cases with some imputation had no more than 3 items imputed. More work planned to study nonresponse is discussed in the research section.   VI. Survey Evaluation Work   SIPP evaluation work is in an early stage; the Census Bureau and other users of the SIPP data are developing appropriate methods of evaluation. For example, research is being carried on for three types of nonresponse -- unit nonresponse defined as nonresponse to all waves of the survey, wave nonresponse defined as nonresponse to a particular wave interview, and item nonresponse defined as nonresponse to a particular item -- and their patterns of occurrence.   Another area of useful evaluation work combines survey data with administrative record data. The SIPP was developed as an integrated data system in order to use combined information sources to validate and supplement information collected in the survey. An internal Census Bureau committee is assessing the potential uses of administrative data linkages and identifying content and availability of administrative record systems for use in demonstration studies. One record linkage project which is currently under development will match SIPP survey data for individuals to their administrative records at the state level.
Various federal record systems which may also be brought into this project are also being investigated. At this time both the number of states and the number of record systems involved are limited.   A discussion of the quality of the income data collected as of each wave of the SIPP is contained in an appendix to each SIPP quarterly report (U.S. Bureau of the Census). The appendix supplies information on the nonresponse rates for selected income questions, the average amounts of income reported in the survey or assigned in the imputation of missing responses, and the extent to which the survey figures underestimate numbers of income recipients and amounts of income received. For example, in the report for the third quarter of 1983 (P70, no. 1) nonresponse rates range from a low of about 3 percent for Aid to Families with Dependent Children (AFDC) and food stamp allotments, to about 13 percent for self-employment income. The report states that survey underestimates of income recipients ranged from about 21 percent for AFDC to about 1 percent for Social Security recipients, and the survey estimate of persons receiving state unemployment compensation payments was about 103 percent of the independent estimate. The underreporting for AFDC is related to misclassification of this income type as other types of public assistance or welfare.   Evaluation of the ISDP is relevant to work in the SIPP. For example, because of its design, SIPP has a potential for missing and inconsistent data problems from wave to wave. One area of current research is the phenomenon of significant income changes and program turnover occurring between waves more often than within waves. Some analysis of this phenomenon using data from the 1979 ISDP Panel is presented in Moore and Kasprzyk (1984). Continuing this area of research using data from SIPP, Burkhead and Coder (1985) looked at gross changes in income recipiency from month to month over a period of one year, the first three waves of SIPP.
Their examination indicated that change in recipiency statuses was significantly higher for the months that spanned successive interviewing reference periods, that is, between the last reference month for one interview and the first reference month for the next interview. Vaughan, Whiteman, and Lininger (1984) also discussed the quality of income and program data in the ISDP. They discuss numbers of income recipients and program participants, and amounts of income and benefits in comparison to independent sources and the CPS. Other relevant studies are: Ferber and Frankel (1981), studying the reliability of the net worth data in the 1979 panel of the ISDP; Feldman, Nelson and Coder (1980), evaluating the quality of wage and salary income reporting in the 1978 ISDP; and U.S. Bureau of the Census (1982).   VII. Survey Data Products and Research Activities   A number of publications and public-use data files are being generated from the information collected in SIPP. Both publications and data files are identified by whether they are cross-sectional or longitudinal. Two types of cross-sectional reports are planned by the Census Bureau: 1) a set of quarterly reports that focus on core information; and 2) periodic or one-time reports that use the detailed data from the topical modules.   The quarterly cross-sectional reports show average monthly labor force activities, income, and program participation statistics. The first quarterly report was issued in fall 1984 (U.S. Bureau of the Census, 1984) and contains data referring to the Third Quarter of 1983. The report covering the Fourth Quarter of 1984 was released in November 1985. The periodic and single-time reports will use the detailed data from the topical modules (for example, disability and earnings, health insurance coverage and household net worth). These reports may also use a combination of the core and topical module data.
Plans for longitudinal data reports are under discussion, but they are expected to concentrate on data that can be used to examine trends and changes over time. This may include analyses of the dynamic aspects of the labor force or the effect of changes in household composition on economic status and program participation. Examples of reports under consideration in this series are: economic profile reports, presenting yearly aggregates of monthly data on individuals; comparative profile reports, presenting annual comparisons of the economic activity of individuals; transition reports, providing changes in income and program participation status between two points in time; longitudinal family and unrelated individual reports, presenting the characteristics of longitudinal family units defined in SIPP (see McMillen and Herriot (1984) for more information on this topic); and special event reports, providing data preceding and/or following a particular event, such as marriage, divorce, separation, the birth of a child, a return to school, a move to a new address, or a job change.   SIPP cross-sectional data files are issued on a wave-by-wave basis. 2/ Each file includes person, family, and household information collected in the survey wave. Virtually all data obtained on the core questionnaire are included on the files; certain summary income recodes are also included. Data that might disclose the identity of a person are excluded or recoded in accordance with standard Census Bureau confidentiality restrictions. Wave files are edited, imputed, and weighted in a manner consistent with their use for cross-sectional analysis. A unique identification number is included to allow users to merge two or more SIPP files. However, since the processing of wave files is independent, wave-to-wave data inconsistencies will occur and the user must be prepared to resolve them.
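As an illustration of the merging step just described, the following Python sketch links two wave files on the person identification number and lists the fields whose values disagree between waves. It is a minimal sketch under stated assumptions, not the Census Bureau's procedure: the record layout, the field names (person_id, sex, birth_year), and the sample values are all hypothetical.

```python
def merge_waves(wave_a, wave_b, check_fields):
    """Link two wave files on person ID and flag fields that disagree.

    wave_a, wave_b: lists of dicts, one per person (hypothetical layout).
    check_fields: fields expected to be stable from wave to wave.
    """
    by_id_b = {rec["person_id"]: rec for rec in wave_b}
    linked, conflicts = [], []
    for rec_a in wave_a:
        rec_b = by_id_b.get(rec_a["person_id"])
        if rec_b is None:
            continue  # person has no record in the second file
        linked.append((rec_a, rec_b))
        for field in check_fields:
            if rec_a[field] != rec_b[field]:
                conflicts.append(
                    (rec_a["person_id"], field, rec_a[field], rec_b[field])
                )
    return linked, conflicts


# Hypothetical records; real files use a fourteen-digit ID and many more items.
wave1 = [
    {"person_id": "A01", "sex": "F", "birth_year": 1950},
    {"person_id": "A02", "sex": "M", "birth_year": 1962},
    {"person_id": "A03", "sex": "F", "birth_year": 1971},
]
wave2 = [
    {"person_id": "A01", "sex": "F", "birth_year": 1950},
    {"person_id": "A02", "sex": "M", "birth_year": 1963},  # disagrees with wave 1
    {"person_id": "A04", "sex": "M", "birth_year": 1948},  # entered after wave 1
]

linked, conflicts = merge_waves(wave1, wave2, ["sex", "birth_year"])
```

The sketch only reports disagreements; how to resolve them (for example, which wave's value to keep) is left to the user, as the text indicates.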
Data files containing topical module information will be released together with the core data that were collected at the same time. Identifiers will be included on the file to allow linkage to other topical module files.   Plans for producing public-use files designed for longitudinal analysis are now under discussion. The first longitudinal file for SIPP will be a research file containing twelve months of core income data; this is essentially the first three SIPP interviews.   A SIPP working paper series has been established as a mechanism to provide timely and widespread access to information developed as part of the SIPP. Papers in the series will cover a broad range of topics including: procedural information on the collection and processing of data; survey methodology research; and preliminary substantive results, such as the measurement of household composition change over time.   The 1984 and 1985 meetings of the American Statistical Association were used to bring the research community up to date on a variety of SIPP-related research issues. A wide range of topics, both methodological and substantive, were covered in sessions organized under the auspices of the Social Statistics and Survey Research Methods Sections. Papers presented in 1984 have been compiled by Kasprzyk and Frankel (1985) and the 1985 papers have been compiled by Frankel (1985).   A number of other research projects are underway at the Census Bureau and at independent research centers such as the Survey Research Center/University of Michigan. These projects are vital to the understanding, use, and future development of the SIPP. This work includes studies of longitudinal imputation and weighting strategies; characteristics of persons who become nonrespondents; composite estimation; potential for use of data base management systems; and linkage of administrative records and economic data from other census files to SIPP results (see Sater, 1985).
The American Statistical Association (ASA)-National Science Foundation (NSF)-Census research fellow program has been expanded to identify explicitly SIPP-related research activities.   _____________________________   1/ A primary sampling unit consists of a county or a group of contiguous counties. 2/ For information about the SIPP public use files, please call the Data Users Services Division at (301) 763-4100 and ask for the "Data Developments" for SIPP.   CASE STUDY 2   CONSUMER PRICE INDEX   I. Purpose   The Consumer Price Index (CPI) is a measure of price change for a fixed quantity and quality of goods and services purchased by consumers. The CPI is used most widely as an index of price change. During periods of price increases, it is an index of inflation and serves as an indicator of the effectiveness of Government economic policy.   The CPI is used also as a deflator of other economic series, that is, to adjust other series for price changes and to translate these series into inflation-free dollars. These series include retail sales, hourly and weekly earnings, and some personal consumption expenditures used to calculate the gross national product (GNP) - all important indicators of economic performance.   A third major use of the CPI is to adjust income payments. More than 8.5 million workers are covered by collective bargaining contracts which provide for increases in wage rates based on increases in the CPI. In addition to workers whose wages or pensions are adjusted according to changes in the CPI, the index now affects the income of more than 50 million persons, largely as a result of statutory action: almost 31 million social security beneficiaries, about 2½ million retired military and Federal Civil Service employees and survivors, and about 20 million food stamp recipients. Changes in the CPI also affect the 25 million children who eat lunch at school.
Under the National School Lunch Act and the Child Nutrition Act, national average payments for those lunches and breakfasts are adjusted semi-annually by the Secretary of Agriculture on the basis of the change in the CPI series, "Food away from home".   Also, the official poverty threshold estimate, which is the basis of eligibility for many health and welfare programs of Federal, state and local governments, is updated periodically to keep in step with the CPI. Under the Comprehensive Employment and Training Act of 1973, the "low income" criterion for distribution of revenue-sharing funds is kept current through adjustments based on the index.   In addition, the Economic Recovery Tax Act of 1981 provides for adjustments to the income tax structure based on the change in the CPI in order to prevent inflation-induced tax rate increases. These adjustments, designed to offset the phenomenon called "bracket creep", are to be calculated initially in 1984 and reflected in the 1985 tax schedules.   II. Sponsors   The CPI is collected, analyzed and published monthly by the Bureau of Labor Statistics. The Census Bureau, under contract to BLS, conducts two surveys, the expenditure survey and the Point of Purchase Survey, which are used to construct sampling frames for selecting the item and outlet sample for the CPI.   III. Sample Design - General   The most recent major revision of the CPI was completed in 1978. This revision introduced probability sampling procedures at all levels of sampling, including within-outlet selection of items. It incorporated new expenditure weights from the 1972-73 Consumer Expenditure Survey, new retail outlet samples from the 1974 Point of Purchase Survey, and population data from the 1970 census.
It also introduced a second index, the more broadly based CPI for All Urban Consumers (CPI-U), which took into account the buying patterns of professional and salaried workers, part-time workers, the self-employed, the unemployed, and retired people, in addition to wage earners and clerical workers. The two indexes differ chiefly in the weighting used.   In January 1983, the BLS changed the way in which homeownership costs are measured. A rental equivalence method replaced the asset price approach to homeownership costs for the CPI-U. In January 1985 the same change will be made in the more narrowly defined index constructed for wage earners and clerical workers (CPI-W). The central purpose of the change was to separate shelter costs and the investment component of homeownership so that the index would reflect only the cost of shelter services provided by owner-occupied homes.   Several key concepts indicate the nature of the Consumer Price Index and guide the way in which it is calculated.   1. Prices and Living Costs. The CPI is based on the prices of food, clothing, shelter and fuels, transportation fares, medical services, and the other goods and services that people buy for day-to-day living. It is constructed in accord with statistical methods that make it representative of the prices of all goods and services purchased by consumers in urban areas of the United States. Price change is measured by repricing essentially the same market basket of goods and services on monthly or bimonthly time intervals and comparing aggregate costs with the costs of the same market basket in a selected base period. The longitudinal aspect of the survey is the month-to-month linkage of the sample of item/outlet specifications (quotes) and their price, size and quantity for the given quote.   2. Weights and relative importance.
The weight of an item in the index is derived from a survey of consumers which provides data about the dollar amount spent for consumer items during the survey year. In a fixed weight index, such as the CPI, the implicit quantity of any item used in calculating the index remains the same from month to month (for example, the number of gallons of gasoline). This should not be taken to mean that the relative importance of gasoline in the average consumer's budget remains the same. Relative importances change over time because they reflect the effect of price change on expenditures. Items whose prices rise faster than the average become relatively more important.   3. Sampling. Since it is impossible to obtain prices for all expenditures by all consumers, the CPI is constructed from a set of samples, not all of which are longitudinal in nature:   a. A sample of areas selected from all U.S. urban areas.   b. A sample of families within each sample area for expenditures of consumers; this sample need not be longitudinal, but linkage of records from a series of interviews was used.   c. A sample of outlets from which these families purchase goods and services. A household survey which is used to identify and construct the sampling frame of outlets is not longitudinal; however, the sample of outlets selected from this frame is longitudinal.   d. A sample of items for the goods and services purchased by these families. This is the primary longitudinal component of the CPI.   It is from these samples that weights are developed and data are obtained for the monthly calculation of the index. Specifics for each sample or sampling stage are described as follows:   A. CPI Area Design   Pricing for the CPI is conducted in 87 sample geographic areas. Eighty-five strata were defined by combining similar PSU's according to the following 1970 Census characteristics:   1. region, population size, SMSA versus non-SMSA 2. percent population increase from 1960 to 1970 3. 
major industry 4. percent nonwhite 5. percent urban   This area design resulted in 29 strata with one pricing area per stratum and 58 non-self-representing strata. Twelve publication areas consisting of three city sizes (non-self-representing SMSA's of over 388,000 population, SMSA's less than 388,000 population, and non-SMSA urban areas) crossed by four Census regions were defined along with the 29 local areas to provide estimated indexes for all urban areas of the country. Each of the twelve region-by-city-size publication areas contained four, six or eight strata. In addition, special supplementation was made to support publication for Denver.   B. Expenditure Survey Sample Design   In 1972-73 two household surveys, a Diary and an Interview Survey, were conducted by the Census Bureau for BLS to collect expenditure information for consumer units. The sampling unit for these surveys was a housing unit. The reporting unit was a consumer unit, which was defined to be (1) a group of two or more persons, usually living together, who pool their income and draw from a common fund for their major items of expense, or (2) a person living alone or sharing a household with others, or living as a roomer in a private home, lodging house, or hotel, but who is financially independent, that is, whose income and expenditures are not pooled with those of other residents. Never-married children living with parents always were considered members of the consumer unit. The eligible population included the civilian noninstitutional population of the United States as well as that portion of doctors' and nurses' quarters of general hospitals. Armed forces personnel living outside military installations were included in the coverage while armed forces personnel living on post were excluded. Also excluded from eligibility were persons living in college dormitories, fraternity or sorority houses, prisons, monasteries, aboard ships, or in other quarters containing five or more unrelated persons.
The first component was a Diary Survey completed by respondents for two consecutive one-week periods. The objective of the Diary Survey was to obtain expenditure data on small, frequently purchased items which are normally difficult to recall. These items include expenditures for food and beverages, natural gas and electricity, gasoline, housekeeping supplies, non-prescription drugs, medical supplies, and personal care products and services. Consumer units were asked to list all expenses during the survey period. Data on income and family characteristics also were collected. The sample of housing units was balanced across areas and time of year. The records of the two consecutive one-week periods for each consumer unit were linked to create two-week levels of expenditure.   The second component of the CE, called the Interview Survey, was a panel survey in which each consumer unit in the sample was interviewed every three months over a fifteen-month period. This survey was designed to collect information on major items of expense as well as on income and family characteristics. Items reported on the interview survey included expenditures for the following: housing, household equipment, house furnishings, vehicles, subscriptions, insurance, educational expenses, clothing, repair and maintenance of property, utilities, fuels, vehicle operating expenses and expenses for out-of-town trips. The final interview in the fifth quarter provided the regularly recorded expenses plus information on homeownership costs, work experience, changes in assets and liabilities, estimates of consumer unit income and other selected financial information. The quarter records for each consumer unit were linked to form annual records for each consumer unit. Only consumer units responding in at least the fifth interview were used to form these "linked" records of annual expenditures for estimation.   The samples of consumer units for the CE were selected as follows.
For both the diary and interview surveys the nation was stratified into 216 geographic strata using stratification variables defined for the Current Population Survey of the Census Bureau. Thirty of these areas were designated as self-representing. Half of the housing units in each self-representing area were covered in the first survey year and half in the second survey year. The 186 equal sized non-self-representing areas were divided into two 93-area groups. One sample area from each of the 93 groups was in sample in each of the two survey years. Each sampling area was randomly selected proportional to population from each of the 186 strata.   1. Interview Survey   The universe for sample selection was the 1970 Census 20% sample data file. A sample of 12,613 housing units was designated for the 1972 Interview Survey component, and 13,014 housing units for the 1973 Interview Survey. For the first year 11.1 percent were vacant, nonexistent or ineligible and the refusal rate was 10.3 percent of the designated sample. Interviews were completed in 9,914 units. For the second year 12.9 percent were vacant, nonexistent or ineligible, with a refusal rate of 9 percent. Interviews were completed in 10,158 units.   At the time of selection, housing units for the Interview Survey within a PSU were distributed by month within the quarter to allow for data collection throughout each quarter. Each sample unit was visited once each quarter, at approximately the same time in the quarter, and each consumer unit within the household was interviewed. Data from previous quarters were available for the interviewer to use in bounding expenditure reporting. Bounding is an interviewing technique which unduplicates expenditures reported in the previous interview from the current interview. The type of expenditures reported during each interview varied since the recall periods varied from three months to one year.
Housing, major equipment, automobiles, subscriptions and insurance were annual recall items. A semiannual recall period was used for minor equipment, house furnishings, renting and leasing of vehicles, and education. The following sections were covered each quarter: repair, alterations, and maintenance of owned property; utilities, fuel, and household help; clothing and household textiles; equipment repairs; vehicle operating expenses; and out-of-town trips. Interviewing was conducted with any person available in the consumer unit; no attempt was made to interview all persons in the consumer unit, that is, proxy responses within a consumer unit were used. Proxy responses for persons away at school were the source of data for some of the college members of a consumer unit.   2. Diary Survey   Again the universe for sample selection was the 1970 Census 20% sample data file. A sample of housing units was selected from this Census file for each year of the diary survey. Approximately 14,590 housing units were designated and 12,661 eligible for the 1972 Diary component, and about 15,210 designated and 12,999 eligible for the 1973 Diary component. These numbers included an augmented sample of households which were to be visited during the four-week period preceding the end-of-year holidays. Each housing unit was visited twice, once at the end of each week of the two-week survey period. The response rate among eligible units was 80.1% for the first year and 89.9% for the second year.   IV. CPI Survey Design and Content   The primary longitudinal sample for the CPI is the sample of item/outlet specifications and their respective prices, which are obtained every month. BLS collects prices for the Food, Commodities and Services, Rent, and Property Tax components of the CPI. These prices are collected monthly or bimonthly in all 87 areas. Each one of these components has a separate survey with its own sample design.
Data used for the Mortgage Interest and House Prices components of the CPI are not collected by the Bureau but are obtained from outside sources such as FHA and FHLBB.   The Point of Purchase Survey (POPS) is the source of the outlet sampling frames for about 60% of the CPI items by expenditure weight. The items not covered by the POPS are grouped together under the heading non-POPS and include rent, property tax, mortgage interest, house prices, utilities, transportation, insurance, and several miscellaneous categories. Except for rent, these sample designs are not described here.   1. Point of Purchase Household Survey - Frame Source   In the spring-summer of 1974 a household survey, the Point of Purchase Survey, was conducted by the Census Bureau for BLS to provide the sampling frame of outlets for food and most commodities and services to be priced in the CPI and to provide demographic data for classification of the households reporting an expenditure for an outlet. The survey was conducted in the 85 PSU's defined for the CPI. The commodities and services for which sampling frames were developed in each PSU included food, apparel, drugs, personal care items, household furnishings and housekeeping supplies, beverages, most medical services, sports equipment, gasoline and automobiles, and automotive parts and services. Expenditures, name, and location of the place of purchase were collected for approximately 100 relatively broad categories of expenditures with reference periods of one week to two years depending on the expected frequency of reporting. To control the expected number of responses received from a household and minimize respondent burden, two groups of categories were defined: one set given to 1/4 of the sample households and the second set given to 3/4 of the sample households.
The combination of the number of sample households asked about a category and the reference period for a given POPS category was designed to generate approximately 6 to 12 reported outlets, not necessarily unique, for a given PSU/POPS category.

For POPS the national sample size was 23,000 designated housing units. Since separate frames of outlets were required for individual CPI pricing areas (PSU's), the sample is not self-weighting across PSU's, but within a PSU the households are selected with a uniform probability.

2. CPI Outlet Sampling Procedures

When a sample ELI was selected, a specific POPS category was identified for outlet selection. In self-representing areas, sample households were divided into two independent groups by the first stage order of selection. This defined two frames of outlets for outlet selection to support variance estimation. The following approach was used for outlet selection for frames developed from the POPS and CPOPS surveys.

A systematic selection of outlets reported for a given POPS category for the W population was made, where the measure of size for each outlet was proportional to the average daily expenditure reported for the outlet by all consumer units in the W population. Before January 1982, the outlets for the U population were then selected using a conditional probability technique to maximize the overlap between outlets. The sample outlets for the U population were then selected by a repeat of the systematic selection using the new measures of size. After January 1982 the collection of prices for the W population was discontinued. The sample outlets are now selected systematically with probability proportional to average daily expenditure of the U population.

All outlets reported by CPOPS sample families in any sample area are eligible for pricing.
However, BLS restricts pricing of outlets to those within a 25-mile radius of a given sample PSU unless 10 or more designated items are identified in some clustered area beyond the mileage limitation. In that case there is no mileage limitation and all items in the clustered area are priced.

The non-POPS categories were excluded from the POPS either because existing sampling frames were adequate or because it was felt the POPS would not yield an adequate sampling frame.

Each non-POPS commodities and services item has its own sample design. For each item, the frame consisted of all outlets providing the commodity or service in each sample area. A measure of size was associated with each outlet on the sampling frame. Ideally, this measure of size was the amount of revenue generated by the outlet by providing the item to the CPI-U population in the sample area. Whenever revenue was not available, an alternate measure of size, such as employment, number of customers, or quantity of sales, was substituted. Since no measures of size could be determined strictly for the W population, a single sample of outlets and quotes was selected for estimating the index for each population. All samples were selected using the systematic sampling technique with probability proportional to the measure of size available.

a. CPI Sample Items

The basic CPI item structure is as follows. The seven major groups (food, housing, apparel, transportation, medical care, entertainment, and personal care) are broken into 68 expenditure classes (EC's), such as auto repair. Within each EC, expenditures are grouped into one or more item strata (such as body work, power plant repair, component repair, and maintenance and service). There are a total of 265 item strata. Within each item stratum, one or more substrata, called Entry Level Items (ELI's), are defined. There are a total of 382 ELI's. ELI's are the ultimate sampling units for items as selected in the BLS Central Office.
They are used in the field by the data collectors as their initial level of item definition within an outlet. An ELI is assigned to one and only one POPS or non-POPS outlet category.

Four regional market basket universes were tabulated into the item strata structure from the Diary and Interview surveys to reflect regional differences. Within each of the four regions (Northeast, North Central, South, and West), eight independent samples of ELI's were selected for each item stratum. Thus, eight samples of ELI's were selected for each region and for each population: thirty-two sample selections nationally for each population. Each CPI PSU was assigned one or two of the eight item samples from the corresponding region for pricing. Self-representing published areas were assigned two independent item samples and each non-self-representing area was assigned one item sample. These independent item samples were designed to accommodate variance estimation for the CPI. A given item sample for all item strata assigned to a given PSU is called a half-sample. The sample of ELI's and the appropriate POPS categories are merged to create specific outlet/item samples.

b. Within Outlet Selection for Specific Items

For each ELI, whether in a POPS or non-POPS category, the selection of a specific store item by a data collector is performed using multi-stage probability selection techniques with measures of size proportional to percentages of dollar sales, usually provided by the respondent for the outlet.

To perform this operation, the data collector is provided with a checklist that includes all the descriptive characteristics which are believed to identify the items of the ELI and determine or explain price differences for all items defined within the ELI.
In addition, the data collector is given the definition of the ELI, suggested stages of groupings of items to aid in quickly selecting a specific store item, and a series of worksheets on which to define the categories of items, post the probabilities, and identify the next category within which to select the specific store item by use of the random number table on the worksheet.

In developing this procedure, it was necessary to provide the data collector with several alternative methods for defining the categories and obtaining the percentage of dollar sales or approximations to those sales. The procedures developed to obtain the proportion of sales were:

a. Obtaining the proportions directly from a respondent.
b. Ranking the categories by importance of sales and then obtaining the proportions directly or using preassigned proportions.
c. Using shelf space to estimate the proportions where applicable.
d. Using equal probability if all else fails.

To define the categories, either direct responses from the respondent as to what is sold or an inventory technique was used.

The procedures make possible an objective probability sampling of items throughout the CPI. They also allow broad definitions of ELI's so that the same tight specification need not be priced everywhere. The wide variety of specific items greatly reduces the within-item component of variance, reduces the correlation of price movement between areas, and allows a substantial reduction in the number of quotes required to obtain the same precision as the pre-1978 index. A second important benefit from the broader ELI's, along with the POPS categories, is a significantly higher probability of finding a priceable item within the definition of the ELI within the sample outlet. Procedure a) was used approximately 60% of the time, procedure b) about 30% of the time, procedure c) about 7%, and procedure d) the remainder.
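The staged selection just described can be illustrated with a short sketch. It is not the BLS worksheet procedure itself: the function names, the example categories, and the sales shares below are hypothetical, and a software random number generator stands in for the printed random number table. At each stage one category is drawn with probability proportional to its share of dollar sales, and equal shares are used when no proportions can be obtained (procedure d).

```python
import random


def pick_category(shares, rng):
    """Draw one category with probability proportional to its sales share."""
    names = list(shares)
    total = sum(shares[name] for name in names)
    u = rng.random() * total              # uniform draw on [0, total)
    cumulative = 0.0
    for name in names:
        cumulative += shares[name]
        if u < cumulative:
            return name
    return names[-1]                      # guard against floating-point round-off


def equal_shares(names):
    """Fallback when no sales proportions can be obtained: equal probability."""
    return {name: 1.0 for name in names}


def select_store_item(tree, rng):
    """Walk a tree of item groupings, picking one category at each stage.

    tree maps category -> (sales share, subtree or None).
    Returns the chosen path and its overall probability of selection.
    """
    path, probability = [], 1.0
    node = tree
    while node:
        shares = {name: share for name, (share, _) in node.items()}
        choice = pick_category(shares, rng)
        probability *= shares[choice] / sum(shares.values())
        path.append(choice)
        node = node[choice][1]
    return path, probability


# Hypothetical outlet: shares at each stage are percentages of dollar sales.
example_tree = {
    "fresh": (60.0, {"whole": (70.0, None), "cut-up": (30.0, None)}),
    "frozen": (40.0, None),
}

if __name__ == "__main__":
    rng = random.Random(1978)             # arbitrary seed for a repeatable demo
    print(select_store_item(example_tree, rng))
```

The sketch returns the overall selection probability along with the chosen item because the probabilities posted at each stage are what make the selection an objective probability sample.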
Once the sample items in the sample PSU's are identified, the specifications that define the items within the sample outlets are priced on a monthly or bimonthly basis. This continues for a minimum of five years and is the basis for measuring price change for the CPI. This time series for each individual specification is the longitudinal element of the CPI.

c. Sample Maintenance

Since 1977, the Bureau has sponsored a Continuing Point of Purchase Survey (CPOPS), also conducted by the Census Bureau. This survey is aimed at producing current data on outlets. The CPOPS has been expanded from the original 100 categories of expenditures included in the POPS to 134 categories, of which 102 are asked of each of two equal-size panels. This survey is conducted each year in one-fifth of the 87 PSU's on a rotating basis. From the results of this household survey, new samples of outlets and item specifications are rotated into the CPI data collection to replace the old sample of outlets and items priced for the CPI in a given area.

d. Response Rates

A sample of 24,278 outlets was designated from the original POPS survey for CPI pricing. The out-of-scope rate was 12.6 percent. There were 1,649 nonresponses resulting from no contact, refusals, or temporary agencies. This nonresponse rate for designated sample units was 6.8 percent. Thus the response rate was 93%. Each year one-fifth of the sample areas have all of their outlets reselected for repricing. Approximately 7,300 outlets are selected, of which 11.8% are out of scope, and the response rate has been 95% among those outlets which have sample items available to price. The annual attrition rate for outlets has been 3.3%. In addition, for the outlets which remain in sample, the average annual item substitution rate for items within outlets has been 6.2%.
Substitution occurs because an item selected for the sample is modified or no longer available, and the field representative obtains a description and price for the item most similar to the original item selected from the outlet.

V. Rent Survey

A. Sample Selection

The current CPI rent index is based on a sample of approximately 23,000 rental units, allocated among the 87 PSU's. The units were selected from two universes: a stratified, multistage, systematic, self-weighting area sample of housing units built before 1970 and a continuously updated sample of newly constructed units. The Bureau of the Census provides the sample of new construction units from building permits. Approximately 2,000 units have been obtained from this source as of 1982.

Using an area segment sampling approach, 19,000 rental units were selected from 6,422 area segments. There has been an attrition of about 2,000 units due to conversions to owner housing. This sample has been augmented with approximately 1,500 new segments and 4,000 rental units to minimally support the rental equivalency concept of homeownership. This augmentation followed a process similar to the original area segment sampling approach.

B. Data Collection

In order to collect the monthly information necessary to calculate the rent index, the sample is divided into six panels of approximately 3,800 units each. The units in each panel are visited twice a year on a six-month cycle. The information collected includes the rents paid for the current month and the previous month, information on extra charges and reductions, a description of the unit, and the facilities included in the rent. The latter questions are used to make quality adjustments to the calculated rents in order to ensure that the rent change measured is for a set of units of consistent quality. Data collection is by personal visit or telephone to tenants or property managers.
For the CPI Rent sample the response rate for occupied in-scope units is 88 percent.

VI. Scope and Calculation

A. Index and Non-Rent Estimation

Prices used in calculating the index are collected in 87 urban areas across the country from about 24,000 retail establishments.

Prices of food, fuels, and a few other items are obtained every month in all 87 locations. Prices of most other commodities and services are collected every month in the five largest urban areas and every other month in other areas. Prices of most goods and services are obtained by personal visits. Some repricing of selected, easily identified commodities is done by telephone, and a mail questionnaire is used to obtain electricity rates.

In calculating the index, price changes for the various item strata in each market basket are averaged together with urban area weights which represent their importance in the spending of the appropriate population group. Local data are then combined to obtain a U.S. average. Separate indexes are also compiled by size of city, by region of the country, for cross-classifications of regions and population-size classes, and for 29 local areas. The estimate of the monthly item stratum level price relative, $R_{t,t-1}$, is the ratio of two long term relatives for times t and t-1:

$$R_{t,t-1} = \frac{R_{t,0}}{R_{t-1,0}}$$

Each long term relative is calculated as a weighted sum of individual item price relatives:

$$R_{t,0} = \sum_{i=1}^{m} \frac{W_i}{M} \, \frac{P_{ti}}{P_{0i}}$$

where

$R_{t,0}$ is the long term estimate of price change for a set of items representing the item stratum,

$P_{ti}$ is the price at time t for item i,

$P_{0i}$ is the price at time 0, the base period, for item i,

$W_i$ is an estimate of expenditures for the ELI contained in the item stratum for which the items are a sample, and

$M$ is the number of eligible sample prices in the ELI.

The index each month is a weighted average of the price relatives divided by a base expenditure ($C_0$).
The weights ($C_{t-1,i}$) of the index are estimates of expenditure for each item stratum which reflect the buying patterns of a given reference period and all price change up to the previous month:

$$I_{t,0} = \frac{\sum_{i=1}^{m} C_{t-1,i} \, R_{t,t-1,i}}{C_0}$$

B. Rent Estimation

Estimates of the monthly rent price relatives for each market basket are calculated using special cost weights and 1- and 6-month estimates of rates of change.

Let $S_1$ be the set of units interviewed at time t in a market basket which have rent values for times t and t-1, and let $S_6$ be the set of units interviewed at time t in a market basket which have rent values for times t and t-6. The rents for the ith unit in a market basket for the given time period are represented by $r_{iT}$, where T = t, t-1, or t-6. The 1- and 6-month rates of change, $R_{t,t-1}$ and $R_{t,t-6}$, are calculated by:

$$R_{t,t-1} = \frac{\sum_{i \in S_1} r_{it} W_i}{\sum_{i \in S_1} r_{i,t-1} W_i} \quad \text{and} \quad R_{t,t-6} = \frac{\sum_{i \in S_6} r_{it} W_i}{\sum_{i \in S_6} r_{i,t-6} W_i}$$

where $W_i$ reflects the probability of selection adjusted for nonresponse.

Using $R_{t,t-1}$ and $R_{t,t-6}$, a composite estimate is made of the current month's cost weight $CW_t$ for the market basket:

$$CW_t = P \, R_{t,t-1} \, CW_{t-1} + (1 - P) \, R_{t,t-6} \, CW_{t-6}$$

where P = .65. The value of P was based on simulations of weighted averages of 1- and 6-month rent relatives designed to minimize variances.

A final 1-month estimate of rent price change for the particular market basket is

$$R_{t,t-1} = \frac{CW_t}{CW_{t-1}}$$

C. Rental Equivalency

In January 1983, BLS will begin measuring the housing component of the CPI-U using the rental equivalency method, which assumes the cost of homeownership is the amount which would be paid to rent an equivalent home. Rental equivalency will be measured using a sample of rental units with new weights assigned to each rental unit which reflect the number of homeowner units in the universe for which the rental unit is equivalent.
The rent component of the CPI will continue to be measured in the usual way.

After 1986 rental equivalency will be measured using a sample of owned units. Rent change will be determined for these units by matching the owned units to equivalent rental units based upon unit and neighborhood characteristics. Using estimated owners' rents, monthly change for rental equivalency will be calculated in a fashion similar to that used to calculate the current rent index.

VII. Data Products and Analysis

The monthly CPI is first published in a news release during the fourth week following the month in which the data are collected. (The index for January is published in late February.) The release includes a narrative summary and analysis of major price changes, short tables showing seasonally adjusted and unadjusted percentage changes in major expenditure categories, and several detailed tables. Summary tables are also published in the Monthly Labor Review the following month; shortly thereafter, a great deal of additional information appears in the monthly CPI Detailed Report.

Seasonally adjusted data are presented in addition to unadjusted data because they are preferred for analyzing general price trends in the economy. They eliminate the effect of changes that normally occur at the same time and in about the same magnitude every year, such as price movements resulting from changing climatic conditions, production cycles, model changeovers, holidays, and sales. Seasonal factors used in computing the seasonally adjusted indexes are derived by the X-11 Variant of the Census Method II Seasonal Adjustment Program and are reevaluated annually.

The data collected are item descriptive data plus the price, size, and quantity of the item being priced.
Longitudinal analysis is specifically related to determining the degree of price change and trend for a given commodity sector and explaining the reasons for the change, for both the short and the long term, by examining the micro data and ancillary information for the locale and the nation. In addition, studies are conducted to assess the impact of government policy changes or changing economic conditions on the index. The techniques used are regression, distribution analysis, and simulation.

VIII. Limitations of the Index

The CPI is not an exact measure of price change. It is subject to sampling errors which may cause it to deviate somewhat from the results which would be obtained if actual records of all purchases by consumers could be used to compile the index. These estimating or sampling errors are limitations on the precise accuracy of the index rather than mistakes in the index calculation. The accuracy could be increased by using much larger samples, but the cost is prohibitive. Furthermore, the index is believed to be sufficiently accurate for most of the practical uses made of it.

Another kind of error occurs because people who give information do not always report accurately. The Bureau makes every effort to keep these errors to a minimum, obtaining prices wherever possible by personal observation, and corrects errors whenever they are discovered subsequently. Precautions are taken to guard against errors in pricing, which would affect the index most seriously. The field representatives who collect the price data and the commodity specialists and clerks who process them are well trained to watch for unusual deviations in prices which might be due to errors in reporting.

The CPI represents the average movement of prices for two specified populations but not the change in prices paid by any one family or small group of families. The index is not directly applicable to nonurban workers and others not included in the samples.
The index measures only the change in prices and none of the other factors which affect family living expenses, such as changes in the size of the family or changes in buying patterns. Nor does it reflect consumption, such as fringe benefits.

Area indexes do not measure differences in the level of prices among cities; they only measure the average change in prices for each area since the base period.

Although the CPI has been called a cost-of-living index and used at times as if it were one, there are important conceptual differences between a price index and a cost-of-living index. A true cost-of-living index would take into account not only price changes but also changes in the market basket as consumers adjust their purchases to changes in the relative prices of what they buy. Thus, during a period of rising prices, a cost-of-living index might rise more slowly than a price index if consumers substitute cheaper items for more expensive ones, or generally reduce expenditures on higher priced items in their budget. However, an index such as the CPI does not directly reflect such consumer behavior, since the quality and the implicit quantity weights of the items represented in the CPI remain constant. The index indicates what it would cost to maintain the same level of living, not what consumers actually spend on their living costs. What consumers actually spend may reflect a decision to accept a lower standard of living in order to keep living costs from rising.

There are other differences between the two types of index. For example, the CPI includes only the cost of sales and excise taxes that are included in the purchase price of goods and services, but not income taxes, whereas a cost-of-living index would include both sales and income taxes.

EMPLOYMENT COST INDEX CASE STUDY

I. Purpose

The Employment Cost Index (ECI) measures change in total employee compensation and has been designated as a principal Federal economic indicator by the Office of Management and Budget. The ECI is used in monitoring the effects of monetary and fiscal policies by enabling analysts and policymakers to assess the impact of labor cost changes on the economy, both in the aggregate and by sector. The limitations of the index must be kept in mind. Because the ECI is an index, it only measures change in employee compensation; the index is not a measure of the total cost of labor. Not all labor costs (e.g., training expenses, retroactive pay, etc.) fall under the ECI definition of compensation.

II. Sponsors

The Bureau of Labor Statistics developed the ECI in 1975 to provide a comprehensive measure of employee compensation. The initial design was started in the early 70's by the Office of Wages and Industrial Relations and the Office of Survey Design of BLS. All data collection and data processing are performed by Bureau staff.

III. Sample Design

A. Private Sector Sample Design

A principal concern of the ECI sample design is to provide an ongoing sample that in some sense represents an ongoing, current universe. ECI accomplishes this with what are called replenishment groups. A replenishment group is an establishment sample of SICs which replaces a segment of the current sample. A new replenishment group is introduced each quarter until the entire sample has been replaced, after which the cycle is repeated (currently every four years). The quarterly replenishment groups each have approximately an equal number of establishments. This equality reduces the disruption in the quarterly estimates and is within resource constraints. A replenishment group collection cycle begins every three months and the new sample is introduced into the ECI estimates after the section update.
1) Description of the Private Sector Establishment Selection

Each replenishment sample is composed of a number of related two-digit SIC subsamples. Within each SIC, the frame (Unemployment Insurance File) may be sorted by Census Region, employment, or establishment name. A sample of 450 establishments is selected with probability proportionate to employment for the entire replenishment group. Systematic samples of about 300 establishments comprise the main replenishment sample. The remaining 150 establishments are selected for several supplemental groups. The supplemental groups are held in reserve in case additional sample is required because a larger than expected number of out-of-scope units is obtained. To enable variance estimation by replication techniques, the establishments are assigned to two half-samples.

2) Description of the Occupation Selection

To measure Major Occupation Group (MOG) compensation change, the Occupational Universe (currently based on the 1970 Census occupations) is partitioned into the MOGs, such as professionals, technical workers, etc. Each MOG may be further partitioned into Entry Level Occupations (ELOs), such as Teachers.

There are usually 9 to 13 ELOs, which represent all occupations within an SIC. For each ELO found in the establishment, data are collected to represent that ELO. During the initial visit to a sample establishment each detailed establishment occupation is matched to one of the ELOs. Then a probability proportionate to employment selection is made within each ELO, selecting one specific occupation. Data for wages and benefits are then collected for each of the selected detailed establishment occupations.

B. Public Sector Sample Design

The public sector sample has been fixed since June 1981, when it was introduced. There is no public sector replenishment system because of the lack of an updated frame. An easily accessible frame does not exist for State and local governments.
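The establishment selection described for the private sector above, systematic selection with probability proportionate to employment followed by assignment to two half-samples, can be sketched as follows. This is an illustrative sketch, not the BLS production procedure: the frame, the function names, and in particular the rule of alternating selected units between the two half-samples are assumptions, since the text does not say how the half-samples are formed.

```python
import random


def systematic_pps(frame, n, rng):
    """Select n units systematically with probability proportionate to size.

    frame is a list of (unit_id, employment) pairs, already in sort order.
    A unit larger than the sampling interval can be hit more than once;
    this sketch simply reports each hit.
    """
    total = sum(size for _, size in frame)
    interval = total / n
    start = rng.random() * interval               # random start in [0, interval)
    targets = [start + k * interval for k in range(n)]

    hits, cumulative, t = [], 0.0, 0
    for unit_id, size in frame:
        cumulative += size
        while t < n and targets[t] < cumulative:  # target falls inside this unit
            hits.append(unit_id)
            t += 1
    while len(hits) < n:                          # guard against round-off at the end
        hits.append(frame[-1][0])
    return hits


def assign_half_samples(hits):
    """Assumed rule: alternate units between half-samples 1 and 2."""
    return [(unit_id, 1 + (k % 2)) for k, unit_id in enumerate(hits)]


if __name__ == "__main__":
    # Hypothetical frame of establishments with employment counts.
    frame = [("A", 50), ("B", 30), ("C", 20)]
    rng = random.Random(1975)                     # arbitrary seed for a repeatable demo
    print(assign_half_samples(systematic_pps(frame, 2, rng)))
```

Sorting the frame before selection, as the text describes (by Census Region, employment, or establishment name), gives the systematic sample an implicit stratification on the sort variable.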
1) Public Sector Establishment Sample Design

The public sector frames were divided into four parts: schools, hospitals, State and large local governments (all SICs except schools and hospitals), and small local governments.

a. Schools

The public elementary and secondary schools frame (SIC 821), as well as the higher education frame (SIC 822), came from the 1973-74 National Center for Education Statistics (NCES) listing of all State and local schools. Establishments were stratified by 3-digit SIC; then a sample was selected with probability of selection proportionate to enrollment within the school. A first phase mail survey was conducted to determine ELO employments for the selected schools. Using these ELO employments to obtain measures of size, the second stage sample of 206 establishments was selected employing a two-way controlled selection technique, controlling on respondent burden and the number of designated quotes within each selected ELO.

b. Hospitals

The hospital frame was the 1976 Health, Education and Welfare (HEW) list of public hospitals. The hospital survey design did not include a first phase occupational survey. Public hospitals were stratified by Census region and ownership and selected systematically using probability proportionate to employment. The occupation selection was essentially a systematic sample (equal probability) within each establishment. The 106 establishments in the final sample were then requested to supply data for the appropriate occupations.

c. State and Large Local Governments

No universe listing of establishments was available for State and large local governments. A refinement survey was used to develop a sampling frame. The local government jurisdictions in the refinement survey (cities, counties, special districts, etc.) were selected from the 1972 Census of Governments file provided by the Bureau of the Census.
Only jurisdictions with more than 100 employees were included in the refinement survey (see "small local governments" below). The 3,729 local jurisdictions were stratified into size class/Census region strata. Forty-six jurisdictions were selected with probability proportionate to employment.

In addition, sixteen States were selected with probability proportionate to employment and included in the refinement survey.

Once the refinement was completed, a probability proportionate to employment sample of 780 refined units was selected for a first phase occupational employment survey. Occupational employments were requested for nine occupational groups within each of the 780 units. The final sample includes 350 units.

d. Small Local Governments

Because of their small size (units with fewer than 100 employees), no refinement or first phase survey was done for small local governments. Instead, the list of small local governments was stratified by Census Region and then a probability proportionate to employment sample of 30 units was selected. Any refinement required was accomplished by BLS field representatives at the time of collection.

IV. Survey Design and Content

A. Design

1) Reporting Unit

The ECI reporting unit is the physical location of a business (establishment). Sometimes data can only be collected for a unit which is larger than the originally designated establishment. Usually this is acceptable and a weighting adjustment is made later. It is also possible that data are much more accessible at a finer level than an establishment; in this case, subsampling procedures are available to randomly select a subunit.

2) Following Movers

If the collection unit is essentially unchanged after a physical move, then it is followed, provided it remains within the same State.

3) Weighting

The weight for each establishment/ELO is the reciprocal of the selection probability times the ELO employment.
There is also a nonresponse adjustment factor applied to the weight.

4) Interview Schedule

Each establishment reports wage and benefit data four times a year (March, June, September, and December). The typical private sector establishment will be included in the survey for a four year period, after which the sample is replaced. Currently, there is no definite date when the public sector sample will be replaced.

5) Interview Mode

The initial data collection is always a personal visit. During subsequent quarters a mail update form is used. When necessary, telephone calls are made to obtain required data.

6) Questionnaire

There are two basic types of ECI collections -- initiation and quarterly update collections. During the initiation, the field representative selects a detailed establishment occupation to represent each ELO. Once the establishment occupation is selected, benefit usage, benefit plan, wage, and work schedule data are collected for each selected detailed establishment occupation.

During the quarterly update, wage data and benefit plan change data are collected. When a benefit plan changes, the new plan is incorporated into the database using the initiation usage.

B. Content

The Employment Cost Index is a relatively new Bureau of Labor Statistics survey measuring the change in the employer cost of employing workers. When the ECI first started publication in December 1975, it measured quarterly wage change covering the private non-farm sector, excluding Alaska, Hawaii, and private households. Publications included quarterly change numbers for the overall National series; Major Industry Divisions (MIDs), such as wholesale trade, manufacturing, and services; Major Occupation Groups (MOGs), such as Professionals, Managers, and Clerical Workers; Census Regions (Northeast, South, North Central, and West); Union/Non-Union; and Metropolitan/Non-Metropolitan Area.
Currently, the ECI is an index measuring total compensation change covering the total non-farm civilian sector, excluding private households and the Federal government. Compensation is composed of wages and twenty-three benefits (hours-related benefits, such as vacation; supplemental pay, such as shift differentials; insurance, such as health benefits; pension; and legally required benefits, such as social security). The National series (Overall National, MID, MOG indices) use Laspeyres (fixed weight) industry/occupation estimates. For each of the non-National series 1/ (Census Region, Union/Non-Union, and Metropolitan/Non-Metropolitan), estimates (e.g., union/industry/occupation) are obtained by allocating the fixed weight industry/occupation estimates using current sample data, so the non-National series cannot be considered Laspeyres.

__________________________
1/ For an economic interpretation of the non-national estimates see: Estimation Procedures for the Employment Cost Index, G. Donald Wood, Jr., Monthly Labor Review, May 1982.

V. Response

A. Determination of Private Sector Replenishment Cycle

Assuming the sample is completely replaced after n, 2n, 3n, ... quarters and that the response and attrition rates are equal across replenishment groups, the response rate obtained after n quarters should be maintained each quarter thereafter. We call this the maintainable response rate. The determination of the appropriate length for the complete replenishment cycle can be made by computing the maintainable response rates for various cycles and comparing the rates.

To compute the maintainable response rate, the following wage information from the original sample is used:

proportion of initial sample in scope, 0.85;
proportion of initial in-scope sample responding, 0.82;
proportion of sample remaining each quarter, 0.98; and
number of establishments required at the end of the replenishment cycle, 2000.
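The paper does not give the formula behind the maintainable response rate. One reading of the four inputs above is sketched below. It assumes a steady state in which one replenishment group of each age from 0 to n-1 quarters is in the sample, that a group of age j quarters retains 0.98^j of its initial respondents, and that the rate is expressed relative to the in-scope units initiated. The function name and these assumptions are illustrative and are not taken from the source. Under them, 2-, 3-, and 4-year cycles give about 385, 267, and 208 units initiated per quarter and rates of about 0.76, 0.74, and 0.71.

```python
def replenishment_plan(cycle_years, in_scope=0.85, respond=0.82,
                       retain=0.98, required=2000):
    """Return (units initiated per quarter, maintainable response rate).

    Assumed model: with a cycle of n quarters, the sample in any quarter is
    made up of n groups aged 0..n-1 quarters, and a group aged j quarters
    still has retain**j of its initial respondents.
    """
    n = 4 * cycle_years
    # Sum of surviving fractions over the n groups currently in the sample.
    survival = sum(retain ** age for age in range(n))
    # Responding units per initiated unit, summed over all groups in sample.
    responding_per_unit = in_scope * respond * survival
    per_quarter = required / responding_per_unit
    # Respondents divided by in-scope units initiated over the whole cycle.
    rate = respond * survival / n
    return per_quarter, rate


if __name__ == "__main__":
    for years in (2, 3, 4):
        units, rate = replenishment_plan(years)
        print(years, round(units), round(rate, 2))
```

Longer cycles need fewer initiations per quarter but carry older groups, so the maintainable rate falls as the cycle lengthens; that is the trade-off weighed in the discussion of cycle length.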
Using the above information, the following table on quarterly sample size and maintainable response rate is determined.

   Replenishment     Number of units          Estimated maintainable
   cycle (years)     initiated per quarter    response rate (wages)

        2                   385                      0.76
        3                   267                      0.74
        4                   208                      0.71

Considering the initial work required to introduce an establishment into the survey, a two-year cycle was not considered cost effective. A 0.71 wage response rate with a four-year cycle is lower than desired, considering that the benefit response rate would be closer to 0.6 than to 0.7. A three-year cycle would keep respondents in the survey for a reasonable length of time and provide a benefit response rate at least close to 0.65. Therefore, the initial decision was to proceed with a three-year cycle.

After the first year of replenishment samples, it became apparent that field resource constraints would not allow a three-year cycle. We are currently working on a four-year cycle.

B. Public Sector

The Public Sector does not have a replenishment system in place at this time. The initial response rate, in June 1981, was 81%. Since then the attrition rate has averaged 0.3% each quarter. These numbers are considerably better than those for the private sector. Even though there is no replenishment system, the response rate does not decrease quickly. In addition, the number of establishment births and deaths within the public sector should be much less than the number within the private sector. The universe, therefore, should remain relatively stable until 1990.

C. Imputation Schemes

There are three levels of imputation in the ECI. The first level is a weight adjustment to compensate for the initial nonresponse. The second level is an imputation for temporary nonrespondents (those establishments that will respond next quarter but for some reason cannot respond this quarter). This imputation is done at the item level. Its purpose is to serve as a link for periods when there is a response.
The third level of imputation is at the estimation cell level, whenever there is no data for the entire estimation cell. This imputation assures that the same cells are being compared each quarter.

VII. Data Product

At the present time, no public use tapes of micro ECI data are available. The only data available to researchers are those contained in the quarterly news release, which is available on Labstat. The feasibility of developing a public use tape is being explored.

CASE STUDY 4

NATIONAL LONGITUDINAL STUDY OF THE HIGH SCHOOL CLASS OF 1972

I. Purpose

The basic purpose of NLS-72 is to provide data on the experiences that affect the development and attainment of a current generation of young people. Specifically, this study provides data on:

   . the transition of young people from high school to postsecondary education,
   . the transition from high school to the world of work,
   . persistence in postsecondary education (as opposed to dropping out),
   . the transition from postsecondary education to the world of work.

II. Sponsor

NLS-72 has, since its inception, been sponsored by the National Center for Education Statistics (NCES) within the U.S. Department of Education.

The principal contractors who have played major roles in NLS-72 are:

   1. Educational Testing Service (ETS) -- Base-year survey in 1972.
   2. Research Triangle Institute (RTI) -- First four follow-up surveys in 1974, 1975, 1977, and 1980.
   3. National Opinion Research Center (NORC) -- Fifth follow-up survey and Postsecondary Transcript Study in 1984-85.

III. Sample Design

The sample design for NLS-72 is a stratified multistage probability sample of students from all schools, public and private, in the 50 states and the District of Columbia, which contained a 12th grade class. Stratification variables were: type of control (public vs.
private), geographic region, enrollment size, proximity to a college, percent minority, income level of community, and urbanicity.

The original sample design for the base-year survey called for selecting a probability sample of 1,200 schools from the population of schools with a 12th grade, and within each school randomly selecting 18 seniors. Since 231 of these schools refused to participate and 21 had no seniors enrolled, the number of schools actually participating was 948. The number of students participating was 16,683.

At the time of the first follow-up, in 1974, 205 of the nonparticipating schools were induced to participate and former seniors from those schools were administered retrospective surveys. Ultimately the reconstituted base-year sample consisted of 22,652 students from 1,318 schools.

IV. Survey Design and Content

In the base-year survey, questionnaires and cognitive tests were administered to groups of students in each participating school. Information on courses taken and grades earned was extracted from school records.

Follow-up surveys have been conducted primarily by mail, but when repeated reminders failed to elicit a response, personal interviews, either by telephone or face-to-face, were used. About one third of the mail respondents in each follow-up survey were telephoned to resolve response inconsistencies.

The fifth follow-up, which is now in the field-test stage, is being funded by NCES with the help of a consortium of interested agencies. It will also be conducted primarily by mail. To reduce costs, only a subsample of the original sample will be used.
The various questionnaires tap numerous content areas, including: background characteristics, cognitive ability, socioeconomic status, home background, community environment, relative importance of significant others, current and planned educational and occupational activities, school characteristics, performance in school, work performance and satisfaction, goal orientations, marriage and family, opinions of school, and others. A more detailed listing of survey content areas is displayed in the attached "Table 2."

The content areas for the fifth follow-up survey are being reduced somewhat in order to make room for certain new topics. Education and work history items are retained, however. In addition, special new questionnaires are included to be filled out by those respondents who have become teachers or parents.

V. Response Rates

As a result of extraordinary tracking efforts and intensive data collection activities, the response rate to the various student questionnaires has remained quite high over the 12 years of NLS-72 operation. Student response rates for each of the surveys thus far completed were:

   Base year    87.8%*
   1st FU       94.2%
   2nd FU       92.1%
   3rd FU       88.7%
   4th FU       82.2%

* This figure is the percentage supplying data, based on all targeted students in participating schools in the original base-year survey. The corresponding figure for the reconstituted sample was 73.6%.

VI. Evaluations

To maximize the validity and reliability of the data, several procedures were followed:

1. For each of the surveys thus far completed, the student questionnaire was first pretested on a sample of 1971 seniors. (This will not be possible for the 5th follow-up because tracing efforts for those students were not adequate to retain a sufficiently large subsample.)

2. For the base-year survey, a reliability check was conducted in which 500 respondents were asked to reanswer 10 questions 3 months later.

3.
For the base-year survey, a validity check was conducted by asking the parents of 500 students to confirm or correct the student's report of family income.

4. To improve the quality of mail responses, all questionnaires were checked for completeness and consistency. Respondents whose forms failed these edit checks were telephoned for clarifications.

VII. Data Products and Analysis

NCES makes all NLS-72 data files available to the public at cost. As each new data file becomes available, an announcement to that effect is widely disseminated to potential users.

As of 1981, over 320 research reports based on NLS-72 data had been published. These are listed and annotated in the following publication: National Longitudinal Study of the High School Class of 1972; Study Reports Update: Review and Annotation, by M. E. Taylor, C. E. Stafford, and C. Place. Research Triangle Institute, June 1981.

[Graphics not reproduced.]

CASE STUDY 5

HIGH SCHOOL AND BEYOND

I. Purpose

High School and Beyond is a longitudinal study of a nationally representative sample of 1980 high school sophomores and seniors in the United States. Its basic purpose is to replicate, eight years later, the National Longitudinal Study of the High School Class of 1972. Specifically, HS&B would provide updated information on:

   . factors influencing persistence vs. dropping out of high school or college,
   . the transition of young people from high school to postsecondary education or to the world of work,
   . persistence in postsecondary education,
   . the transition from postsecondary education to the world of work,
   . courses taken and grades received, both at the high school and the college level.

II. Sponsor

Since its inception HS&B has been sponsored by the National Center for Education Statistics (NCES) within the U.S. Department of Education.
The principal contractor, primarily responsible for the details of research design and for data collection, coding, and storage, has been the National Opinion Research Center (NORC).

III. Sample Design

HS&B employs a two-stage, highly stratified sample design. In the first stage 1,122 schools that had either 10th or 12th grade students (or both) were drawn. To make the sample more useful for policy analysis, the following types of schools were oversampled: alternative public schools, public schools with high percentages of Hispanic students, Catholic schools with high percentages of minority group students, and high performing private schools. In the second stage, 36 sophomores and 36 seniors were randomly selected, school size permitting, yielding total samples of 30,030 sophomores and 28,240 seniors.

In the first follow-up survey, conducted in spring 1982, all sophomore cohort members who were still in the same schools were included with certainty, as were all dropouts and other subgroups of policy interest, yielding a sophomore cohort sample size of 29,737. Of these, a subsample of 18,000 was selected for a detailed study of high school transcripts.

In the first follow-up survey a subsample of 11,995 of the 1980 senior sample was selected.

The second follow-up survey took place in spring 1984. At that time, samples of 15,000 members of the sophomore cohort and 11,995 members of the senior cohort were selected for further data collection.

IV. Survey Design and Content

In the base-year survey, questionnaires and cognitive tests were administered to groups of students in each participating school.
The administrator in each school filled out a questionnaire about the school; teachers in each school were asked to comment on students in the sample; and a sample of parents of sophomores and seniors (about 3,600 for each cohort) was surveyed, primarily for information about their plans for financing their child's postsecondary education.

The first follow-up survey of the sophomore cohort took place in spring 1982, when most respondents were seniors. Questionnaires and tests were group administered to all base-year sample members still attending the same school. Dropouts and transferees were contacted by mail or, as a last resort, by personal interview.

For the second follow-up of the sophomore cohort and for all follow-ups of the senior cohort, contact was by mail or, when necessary, by personal interview.

The student questionnaires cover a large number of content areas, including: school work, gainful employment, demographic characteristics, physical condition, parental characteristics, social relations, and life plans. Marital and fertility history are also covered in the follow-up questionnaires.

V. Response Rates

A total of 811 (72 percent) of the 1,122 eligible schools selected for the base-year survey actually participated. Of the 311 schools that were unable or unwilling to participate, 204 were replaced with schools which matched them with regard to geographical area, enrollment size, community type, and other characteristics. This brought the total number of participating schools to 1,015, or 90 percent of the 1,122 target.

The student-level base-year response rate within participating schools was 85 percent. The first follow-up survey response rate was about 94 percent for each cohort.

Response rates for the second follow-up survey were 92 percent and 91 percent for the sophomore and senior cohorts, respectively.

VI.
Evaluations

To maximize the validity and reliability of the data, several steps were taken: (1) All data collection instruments were pretested on a group of respondents similar to those who would participate in the main survey. (2) Ambiguous or inconsistent responses to mail questionnaire items were clarified by means of telephone calls. (3) A special analysis was performed by NCES to compare the estimates of family income given by the students with those given by the parents.

VII. Data Products and Analysis

NCES makes all HS&B data files available to the public at cost. As each new data file becomes available, an announcement to that effect is widely disseminated to potential users.

As of summer 1984, over 150 different research studies based on HS&B data had been published. The principal contractor of HS&B, NORC, is developing a computerized bibliography of all HS&B-based publications.

CASE STUDY 6

NATIONAL LONGITUDINAL SURVEYS OF LABOR MARKET EXPERIENCE

I. Purpose

The National Longitudinal Surveys of Labor Market Experience (NLS) were designed to identify factors that influence the labor market behavior and experience of a group of workers (Parnes:12). Five cohorts were selected to represent workers with labor market problems of special concern to national policy makers.

The NLS was the first national survey of employment-related phenomena to focus on individual labor market behavior through time. Since 1940, cross-sectional data on labor force participation had been available from the Current Population Survey.

Since the 1950's, information on earnings and employer characteristics had been available from the Continuous Work History Sample, based on a sample of the Social Security Administration's records.
Longitudinal data on associated topics are available from the Panel Survey on Income Dynamics, the Longitudinal Retirement History Study, and the Continuous Longitudinal Manpower Survey of CETA participants. None of the other surveys, however, has provided data like those from the NLS on individual gross flows linked to attitudes and experience.

II. Sponsors

In 1965 the U.S. Department of Labor's Manpower, Development and Training Administration (now the Employment and Training Administration) undertook a series of longitudinal studies of the labor force. The Department of Labor (DOL) set up a contract with the Ohio State University Center for Human Resource Research (OSU) under which OSU was responsible for planning and analyzing the surveys. The DOL set up a separate contract with the U.S. Bureau of the Census for data collection for the original cohorts. Data collection for the new youth cohorts was subcontracted to the National Opinion Research Center (NORC).

III. Sample Design

Respondents in the original four cohorts were selected from an area probability sample of the non-institutionalized civilian U.S. population. Primary Sampling Units were selected on the basis of the 1960 Census. Reliable statistics for Whites and Blacks were ensured by selecting about 1,500 Black respondents and 3,500 White respondents in each cohort. This was accomplished by classifying enumeration districts by race and using a sampling rate between 3 and 4 times higher in predominantly Black enumeration districts.

Forty-two thousand housing units were contacted for screening interviews in early 1966. From these, interviewers identified just over 22,000 eligible respondents in 13,500 households. (A number of households contained more than one respondent, sometimes belonging to more than one cohort.)

The new youth cohort selected in 1979 is arranged in 8 strata, by race, ethnicity, income, age and sex.
For these cohorts, the Census Bureau drew a sample from an area probability sample of the U.S. stratified so as to produce segments of varying size but equal with respect to the characteristics of the target sample (OSU, 1979:11). Seventy-five thousand addresses were selected for screening interviews, and from these NORC identified a final sample of about 12,000 respondents between 14 and 21 years of age.

The new young men's cohort includes respondents who are serving in (or returned from) the armed forces. The Department of Defense provided lists of persons on active military duty to NORC for sample selection. In the first stage a sample of military units was drawn; then within these units separate samples of males and females were selected, including some respondents not living on military bases.

IV. Survey Design and Content

A. Design

1. Respondent Rules

Proxy responses are accepted only from relatives or other members of a sample person's household, and only if the sample person is temporarily incapable of answering questions. Specific questions eliciting opinions or attitudes are excluded from proxy interviews.

2. Reporting Units

Separate questionnaires are completed for each respondent in a multiple respondent household. Separate household record cards are also prepared, but data from one may be transcribed to another by the interviewer.

Household composition is recorded at certain interviews. CPS definitions are used for "household members." Household characteristics are tabulated as respondent attributes at each wave. OSU has prepared special tabulations of multiple respondent households, such as a fathers-and-sons tape, a siblings tape, etc.

3. Following Movers

Local government agencies, the Postal Service, neighbors and relatives, and others recorded at the first interview as knowledgeable about the respondent's whereabouts are among the contacts that may be questioned to obtain the current address of a sample person who has moved.
Respondents who have moved are contacted through the field office closest to their new location.

4. Weighting

The basic weight for each sample case is the reciprocal of its selection probability, and reflects the differential sampling ratio by race. The samples have been weighted so that the characteristics for each wave match the known distribution of the characteristics in the population.

5. Interview Schedule

The original NLS plan called for annual interviews of each cohort for five years. To reduce costs, after 1968 the cohorts of adult men and adult women were interviewed only every other year. In 1972 all four cohorts were extended by including two annual telephone surveys and a personal interview on the tenth anniversary (1976-77). The entire survey was extended an additional 5 years in 1977, on the recommendation of a group of analysts and data users convened by the Department of Labor. After 1983 the older and younger men's cohorts were dropped, and the older and younger women's cohorts were extended 5 years (along with the new youth cohorts).

6. Interview Mode

For the original four cohorts the first and final waves consisted of personal interviews. Four of the intervening waves were conducted by telephone (5 for mature women), and one mail questionnaire was sent in 1968. The interview schedule for the new youth cohort called for personal interviews in each year from 1979 to 1984.

B. Content

The NLS was originally composed of 4 separate longitudinal cohorts: adult men, adult women, young men and young women. The cohorts represent four groups important to policy makers: men in the years leading to retirement (between 45 and 59 years old in 1966); women likely to be re-entering the labor market (between 30 and 44 years old in 1967); and young men and young women likely to be finishing their education and entering the labor market (boys between 14 and 24 years old in 1966 and girls between 14 and 24 years old in 1968).
The longitudinal survey of adult men was planned to answer specific research questions about retirement decisions, about skill obsolescence, about the duration of unemployment in this age group, and about the relationship between health and labor market experience.

The sample of adult women was designed to study women's entry or re-entry into the labor force after a period spent primarily in raising children. Special attention was paid to attitudes toward employment in general and toward the propriety of labor market activity for women in particular.

The cohorts of young men and young women were planned to provide information on the extent of occupational knowledge among teenagers, and on attitudes toward education and toward employment experiences. The new youth cohort was developed in 1979 to study employment patterns in low income and minority groups, and to look at changes since 1960.

Many of the interviewing procedures and labor force concepts used in the NLS were similar to those used in the Current Population Survey (CPS), and the Census Bureau's CPS interviewers were often assigned to do NLS interviewing as well. Coding of occupation and industry continues to conform to the definitions used in the 1960 Census, although, most recently, 1980 codes are used as well.

Older Men's Cohort:

In each wave data were collected to measure employment and unemployment. For all jobs held since leaving school, the interviews collected occupation, industry, location and duration of employment. In addition, annual income and earnings were collected for each job, along with measures of job satisfaction.

Mature Women's Cohort:

The surveys of adult females contained similar questions about background and labor force participation. But in place of questions about retirement, there were questions designed to study the process of leaving and re-entering the labor force.
Background questions for women were designed to distinguish labor market participation before and after any interregnum that began with marriage. A large number of questions dealt with household structure and responsibilities for dependents, including attitudes toward child care, costs and preferences for child care, the husband's health limitations, and the husband's attitudes toward women working.

Young Men's and Young Women's Cohorts:

The questionnaires for the original youth cohorts were similar in most ways to the adult questionnaires. Among the unique variables were an inventory of current job characteristics which included variety and autonomy of tasks, feedback from supervisors, and opportunities for contact and friendships on the job. Union membership was measured in several waves, and a large number of questions measured educational performance and experiences. These included curriculum preferences in high school and college, college finances, and reasons for leaving school.

For young men only, retrospective data on military service were collected, including military job series. For the young women's cohort, questions were asked relating to household dependents and child care responsibilities. These were identical to questions asked in the survey of adult females, including the repeated measures of attitudes toward women working.

Intermittent Questions:

For the adult males, questions were asked in some waves pertaining to physical health, retirement plans, and attitudes toward women working. In other waves questions were asked about commuting times and costs, collective bargaining coverage, training after leaving school, spouse's health limitations, and military service. In two waves there were questions calling for retrospective evaluations of career experiences, including perceptions of age, sex and race discrimination, perceptions of individual career progress, and perceptions about job pressures.
For the adult women's cohort, there were questions in some waves about volunteer activities, and questions on attitudes toward women working were repeated at intervals.

A number of attitude measures were collected intermittently for the young men's cohort. In the first interview a score for occupational knowledge was compiled, and in the final interview a standard index of job satisfaction was derived for young men. Questions were asked at intervals to evaluate job aspirations and expectations about education and training.

Data from Administrative Records:

For the adult cohorts, the size of the local area labor force and the annual local unemployment rate were recorded in each file at each wave. In addition, for the adult female cohort, an index of local demand for female labor was also included.

For the youth cohorts, a standard IQ test was administered once to each respondent. The presence of an accredited college in the local area was recorded in each file during the first interview. An index of local demand for female labor was included in six waves for young women. For all the youths, background data were collected on the quality and curriculum of the schools that the respondents were attending at the time of their selection for the sample.

V. Response

The possibility of sample attrition worried the designers of the NLS, but it does not appear that any major attrition biases have detracted from the reliability of generalizations about the populations which the NLS cohorts represent (OSU, 1982).

Overall, after 12 years of the survey, an average of 80 percent of the eligible respondents were still being interviewed (U.S.:321).
When a 5-year extension was considered for the original 4 cohorts, the Census studied the known characteristics of nonrespondents, and concluded that after 15 years those still being interviewed were not significantly different from those who had dropped out of the survey, judging by most socio-demographic characteristics (OSU, 1982).

The attrition rates have differed by cohort. Three years after the first interview for adult males, almost 5 percent of these respondents were no longer eligible (through death or institutionalization) and about 92 percent of the remainder were interviewed.

The worst attrition has been in the original young men's cohort, perhaps due to the exclusion of those serving in the armed forces (Parnes:25). Of those interviewed in 1966, 1.4 percent were dead or institutionalized in 1968, and an additional 12.4 percent were out of scope because they were in the armed forces. Just under 89 percent of the remainder were interviewed.

The figures for women and girls were slightly better. One percent of the women were ineligible after 2 years, and almost 94 percent of those eligible were interviewed. For girls, over 93 percent of the eligible respondents were interviewed in 1970, 2 years after selection.

To monitor sample attrition in the four original cohorts, every 5 years the distribution of such characteristics as occupation, educational attainment, age and marital status was compared to national estimates. To compensate for attrition, interviews and non-interviews are stratified by race, education, and residential mobility, and the weight of interviews in each cell is adjusted for the proportion of non-interview cases in each wave. A final adjustment is made for the re-entry of young men serving in the armed services during the year the sample was selected (1965-66).

There are no allocations or imputations for missing data to prevent inconsistencies with data from other waves.
Only when missing data are clearly due to a record-keeping error are data from one item used to replace those from another.

In 1982, the characteristics of respondents still in the sample were compared to the characteristics of the sample interviewed in the initial year. Age, race, educational attainment, employment status, industry, occupation, marital status, SMSA, and annual income were all compared. For most cohorts, the differences in distribution of characteristics between the 2 samples were less than 2 percent. It was concluded that attrition had not seriously distorted the representativeness of the cohorts, and that any potential bias could be dealt with through weighting (Rhoton:7).

VI. Evaluations

To reduce attrition in the new youth cohort, several procedures were modified, based on experience with the 4 original cohorts. First, some questionnaire items that had caused response problems were changed. Second, more information that could be used in tracing mobile respondents was collected at the first contact. Third, more information about the NLS was provided to respondents, both before and after the interviews, and a newsletter is mailed to respondents to report on survey results. Finally, NORC traced and contacted persons who were nonrespondents in earlier waves. (Previously, nonrespondents were dropped from the sample after 2 years of noninterviews.) This tracing was successful in over one-third of attempted cases (Rhoton:2-12).

VII. Data Products and Analysis

The Ohio State University makes NLS data files and documentation available to other researchers at cost. By 1979, data files were available for adult males 1966-76, for adult females 1967-76, for young men 1966-75, and for young women 1968-75. The data at any release point are composed of the entire longitudinal record, and include revisions to remove errors found in previous releases.

CASE STUDY 7

RETIREMENT HISTORY STUDY

I.
Purpose

The Social Security Administration's Retirement History Study (RHS) is a multiwave panel survey designed to address a number of policy questions relating to the causes and consequences of retirement. Among these questions are: Why do individuals retire before age 65? How well does income in retirement replace preretirement earnings? What happens to the standard of living after retirement? How do Social Security and other laws affect retirement patterns?

Until the RHS was undertaken, data bearing on these issues were based on retrospective questions from cross-sectional surveys. A prospective longitudinal study permits accurate analyses of the factors influencing the retirement decision and an accurate description of the complex of personal adjustments required during preretirement and postretirement years.

II. Sponsors

The RHS was sponsored by the Social Security Administration under direction of staff in the Division of Retirement and Survivor Studies, Office of Research and Statistics. Early consultation was provided by an outside advisory committee. Data collection was performed by the Bureau of the Census.

III. Sample Design

The original sample of 12,549 persons was a multi-stage area probability sample selected from members of households in 19 retired Current Population Survey rotation groups. The sample was nationally representative of persons age 58 through 63 in 1969. The sample included men of all marital status categories and women with no husband in the household. Married women were excluded because they were found in early pretests to have no independent retirement plans. Institutionalized persons were also excluded from the original sample.

IV. Survey Design and Content

A. Design

1. Respondent Rule

Proxy responses were accepted only for that part of the questionnaire dealing with spouse's labor force history. Sample persons who were not interviewed in the first wave (1969) were dropped from the survey.
Respondents who were institutionalized 90 days or more at the time of subsequent waves were kept in the sample. All other noninterviews in later waves were dropped from the sample.

2. Reporting Units

The reporting units were designated sample members (individuals) only.

3. Following Movers

A year before each interview (after the first) the SSA provided the Census with current address listings for all sample persons and/or spouses who were benefit recipients. In addition, the Census checked all previous addresses with the post office to identify movers. Both these procedures reduced the number of unanticipated movers (especially between data collection regions) encountered at the time of interviewing. All movers were followed except those who emigrated or who lived more than 50 miles from any PSU.

4. Weighting

The weighting procedure began with a basic weight based on factors relating to the original CPS rotation groups and was followed by several stages of ratio estimation. Weighting for noninterviews was adjusted after 1969. No further weighting adjustments were made because by 1979 SSA had determined that the differences between weighted and unweighted estimates were too small to justify the procedures.

5. Interview Schedule

Initial interviews were conducted in 1969 and then in alternate years through 1979. In each wave the interviews were conducted over a 3 to 4 month schedule (usually February to June).

6. Interview Mode

The interview mode was personal and face-to-face. At each wave contact began with a letter from the Census informing the sample of the upcoming interview. Interviewers were encouraged to use telephone contacts to schedule their visits, but all interviews were by personal visit. Questionnaires with missing information could be completed by telephone.

B. Content

The interview schedule was designed to elicit a wide range of information about preretirement lives and attitudes of sample members.
The schedule was divided into six sections: (1) respondent's labor force history; (2) preretirement and retirement plans; (3) health; (4) household, family and social activities; (5) income, assets and debts for respondent, spouse and children under age 18; and (6) spouse's labor force history. Base-line labor force history was collected only in the first interview (1969). This explains why all noninterviews in the 1969 wave were dropped from the sample. By collecting labor force history for the sample person's spouse, longitudinal data were available if a surviving spouse later replaced a deceased sample person as respondent. Survey data were also supplemented with individual Social Security earnings and benefit records, yielding information on the continuity of work history and the amount of benefits to which the workers were entitled.

V. Response

Of the original sample of just over 12,500 selected in 1969, 8,700 were interviewed in 1977. This included over 1,000 surviving spouses who were eligible to serve as respondents after the death of a sample person. At each wave nonresponse (composed of refusals, no contact, and persons institutionalized) seldom rose over 4 percent. The remaining attrition was caused by deaths among the sample. The low nonresponse rate was in part attributable to efforts made to contact respondents: no limits were placed on the number of attempts interviewers should make. Some refusals were related to the length of the interviews. The first averaged an hour and 15 minutes long. In subsequent years the length of the interview was the most frequently cited reason for refusal to respond.

VI. Evaluation

(Unknown)

VII. Data Products and Analysis

Most of the published analyses have been organized into a series of reports that are available from the Social Security Administration.

CASE STUDY 8

WORK INCENTIVE EXPERIMENTS

I. Purpose

Section 505(a) of the "Social Security Disability Amendments of 1980" (Pub.
L. 96-265) directs the Secretary of Health and Human Services (HHS) to develop and carry out experiments and demonstration projects designed to encourage disability insurance beneficiaries to return to work and leave the benefit rolls. The objectives of these experiments, specified in the law, are to generate long-range savings to the trust funds and to facilitate the administration of title II of the Social Security Act. Section 505(a) itself contains several suggestions for experimental variables, specifically:

- Benefit reductions based on amount of postentitlement earnings.

- Lengthening the trial work period.

- Altering the 24-month waiting period for Medicare benefits.

- Changing the manner in which the program is administered.

The language in section 505(a) states explicitly that the experiments should be carried out in a way that permits thorough and complete evaluation and on a large enough scale so that the results may be generalized reliably to the future day-to-day operation of the disability program. In addition, the report of the House Ways and Means Committee indicates Congress' desire that no individual be disadvantaged compared to existing law.

II. Sponsors

The law mandating this project (Pub. L. 96-265) directs the Secretary of HHS to carry out the experiments. Planning the experiments has been delegated to SSA. The law authorizes the use of disability insurance trust fund monies to pay for the experiments and authorizes the Secretary to waive the present benefit and eligibility requirements of titles II, XVI and XVIII to the extent necessary to carry out the experiments.

____________________________

*Since their mandate in the Disability Insurance Amendments of 1980 (Pub. L. 96-265), the Social Security Administration disability program work incentive experiments have undergone a number of designs.
Due to a number of administrative problems and the imminent deadline of the legislative mandate, no experimental plan has yet been implemented. Legislative extension of the experimental authority is now under consideration. For expository purposes the plans developed in the Fall of 1982 are presented.

III. Sample/Experimental Design

A. The Study Population

The study population for the WIE consists of all newly awarded beneficiaries except those who fall in one of the following categories:

- Under age 18 or over age 59 at time of award.

- Residing outside the 48 contiguous States or in an institution.

- Received a closed period award.

- Previously entitled to DIB.

- Dually entitled to DI and to title II auxiliary benefits.

- Statutorily blind.

- Career railroad case certified to the ERB for payment.

B. Experimental Design

1. Programmatic Changes

Sample sizes for each experimental group and the control group have been determined in an attempt to ensure the ability to measure important increases in the proportion of work recoveries. Our best estimate is that under current law about three percent of a newly awarded beneficiary cohort will have their benefits terminated after successful completion of a trial work period. We estimate that for the proposed experimental alternatives, a one percentage point increase in the recovery level (that is, a change from three to four percent) would yield significant trust fund savings, on the order of $100 million per year or larger. Thus, the sample sizes we choose ensure a good chance of detecting a one percentage point change if this change occurs in any experimental group. The required sample size totals 21,000 cases, including 3,000 for each of the five experimental groups and 6,000 for the control group.

Schematically, the design of the WIE and the sample sizes and allocations can be pictured as follows:

[Figure not reproduced: schematic of the WIE experimental groups, sample sizes, and allocations. Recovered label: "Medicare extension".]
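The statement that these sample sizes give "a good chance" of detecting a change from three to four percent can be illustrated with a rough power calculation. The sketch below is not the method used by SSA, which the text does not describe; it assumes a one-sided two-proportion z-test at the 5 percent level under the normal approximation, comparing one experimental group of 3,000 with the control group of 6,000. The function name and the choice of test are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_treat, n_control, n_treat, alpha=0.05):
    """Approximate power of a one-sided two-proportion z-test.

    Assumption: normal approximation, pooled variance under the null
    hypothesis and unpooled variance under the alternative.
    """
    z = NormalDist()
    # Pooled proportion and standard error if there were no true difference.
    p_pool = (p_control * n_control + p_treat * n_treat) / (n_control + n_treat)
    se_null = sqrt(p_pool * (1 - p_pool) * (1 / n_control + 1 / n_treat))
    # Standard error when the true proportions differ as specified.
    se_alt = sqrt(p_control * (1 - p_control) / n_control
                  + p_treat * (1 - p_treat) / n_treat)
    # Smallest observed difference that would be declared significant.
    critical_difference = z.inv_cdf(1 - alpha) * se_null
    return z.cdf(((p_treat - p_control) - critical_difference) / se_alt)


# Figures taken from the text: five experimental groups of 3,000, control of 6,000.
total_cases = 5 * 3_000 + 6_000
power = power_two_proportions(0.03, 0.04, n_control=6_000, n_treat=3_000)
print(f"total cases: {total_cases}, approximate power: {power:.2f}")
```

Under these assumptions the computed power is in the neighborhood of 0.8; a two-sided test or a different significance level would give a somewhat different figure.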
Group T.1 represents a control group operating under the provisions of the current law. For each of the experimental groups, T.2-T.6 inclusive, only the programmatic change(s) specified applies.

2. Administrative Changes

Two administrative changes will be instituted to assure that the WIE operates effectively and efficiently. These changes are (1) a face-to-face interview at the start of the experiments that explains the experimental changes to the participating beneficiaries, and (2) use of a quarterly report of work and earnings. With these up-to-date reports it is possible to minimize the problem of benefit overpayments. These changes in themselves may alter beneficiary behavior. The experiment is therefore designed to test whether these administrative changes have a direct effect on recovery.

The following experimental groups make up this portion of the WIE experimental design:

[Figure not reproduced: experimental groups for the administrative-change portion of the WIE design.]

This scheme takes advantage of the 6,000 cases that will already serve as the control group for the WIE. As a result, only an additional 9,000 cases would be required to study the impact of the two administrative changes being tested.

The considerations used in determining sample size and the allocation of cases among the four test groups involved in this portion of the experiment are essentially the same as those discussed for the programmatic revisions. It should be pointed out that none of these cases (the 6,000 as well as the additional 9,000) will involve either increased benefit payments or Medicare reimbursements. They all operate under present program provisions.

C. Sample Design

1. Stratification

In order to improve the efficiency of the experimental design the award population will be stratified by two factors -- age and medical diary status. Since younger beneficiaries are more likely to return to work and leave the benefit rolls, they are more likely than older beneficiaries to take advantage of the experimental provisions.
Beneficiaries who are scheduled for medical reexaminations might be less likely to be granted trial work periods because they are judged more likely to recover.

The following table defines four age/diary strata.

Stratum   Age             Medical diary
S1        18-44 (young)   Yes
S2        18-44 (young)   No
S3        45-59 (old)     Yes
S4        45-59 (old)     No

Taking these strata into account, the full experimental design has the following dimensions:

Experimental                        Stratum
group          Total      S1      S2      S3      S4
Total         36,000   3,600   7,200   3,600  21,600
T1             6,000     600   1,800     600   3,000
T2             3,000
T3             3,000
T4             3,000
T5             3,000     300     900     300   1,500
T6             3,000
T7             3,000
T8             3,000
T9             3,000
T10            6,000     600   1,800     600   3,000

The allocation to stratum will be roughly proportionate to size.

Note that an additional experimental group, T.10, is shown. This group represents a "silent" control group. The beneficiaries in this group do not receive any program or administrative changes, as is the case for group T.9. The beneficiaries in T.10, however, will not be processed by the WIE review unit. This allows us to test the experimental effect of establishing the special unit itself through comparison of T.9 and T.10 outcomes. Thus, the total number of beneficiaries with any involvement in the WIE is now 36,000.

2. Geographic Clustering and Stratification by Date of Award

In order to limit the impact of the face-to-face treatment application on SSA field staff and costs, SSA's Office of Field Operations has asked that WIE sample cases be in no more than 200 SSA districts. (A district is defined to be an SSA district office and its associated branch offices.) We, therefore, group the WIE population into clusters of SSA districts. The selection of a sample of clusters is the first stage of selection for the WIE sample.

The size of these clusters depends on a number of interrelated requirements.
The first requirement is our desire to put a full replicate of the experimental design (or multiples thereof) into each cluster of districts as indicated in the following table:

Experimental                        Stratum
group          Total      S1      S2      S3      S4
Total            120      12      24      12      72
T1                20       2       6       2      10
T2                10
T3                10
T4                10
T5                10       1       3       1       5
T6                10
T7                10
T8                10
T9                10
T10               20       2       6       2      10

One hundred and twenty cases is the minimum number required to simultaneously satisfy the allocations discussed above among the strata and among the experimental groups.

Placing a full replicate in each cluster induces orthogonality between treatment (and strata) and cluster and facilitates the analysis of experimental results. In particular, under the assumption of no interaction between treatment and cluster in producing experimental outcomes, the association between treatment and outcome can be measured by tabulating treatment (and strata, if necessary) results alone, essentially ignoring geographic effects. The ability to display the results of the experiments in an uncomplicated manner is of great importance in presentations to those persons responsible for program and operating policy.

The second aspect of the determination of minimum cluster size is that each cluster should have a high probability of providing the necessary number of sample cases in each stratum to complete the design; that is, 12 cases for S1, 24 for S2, 12 for S3 and 72 for S4. It turns out that a population of 250 will yield the needed cases with a probability greater than .998.

The third aspect to be considered is that the number of districts in the sample must not exceed 200. This constraint has implications for the length of the sampling period. There are about 614 districts contained in the 48 contiguous States with an average of about 350 new awards per district per year.
Since each cluster will require 250 awards to achieve a 120-case replicate, about 75,000 awards will have to be available to obtain the full 36,000-case sample. Since 200 districts can supply about 70,000 cases a year, the 200-district constraint implies the need for a 1-year sampling period.

The 1-year sampling frame will be divided into 6 bimonthly sampling periods, with a full 120-case replicate of the design going into each cluster of districts in each sampling period. Each cluster will need to supply 1,350 awards in each year. Since each cluster supplies 720 (120 times 6) sample cases, 50 clusters are required for the sample to complete the design in 1 year (50 x 720 = 36,000).

IV. Survey Design and Content

A. Design

1. Respondent rule.

No proxy responses are accepted.

2. Reporting units.

Individual beneficiary and spouse.

3. Following movers.

All movers will be followed.

4. Weighting.

The basic weight for each sample case will be the reciprocal of the probability of selection. No need for ratio estimation is anticipated.

B. Interview Schedule/Mode and Content

In addition to data from administrative records, a baseline questionnaire and followup mail questionnaire will be administered.

At the start of the experiment, field personnel will contact all persons (except those in T1.1 and T1.3 and the silent control group) to explain the experimental changes to them in person. At that time the interviewer will administer a short questionnaire designed to obtain data on demographic characteristics, family composition, amount and source of family income and private disability insurance benefits. The questionnaires will be mailed to members of groups that are not contacted for face-to-face interviews.

A supplemental mail questionnaire will be sent every 6 months over 4 years to a subsample of 10,000 beneficiaries.
The questionnaire will be designed to elicit information that will update the baseline interview and describe how beneficiaries find jobs and the factors involved in the success or failure of sustained work.

V. Response

Since all participants will be tracked through administrative records, there will be no actual attrition from the study. Response to the supplemental questionnaires is expected to be high because they will be administered in conjunction with required administrative reports.

VI. Evaluation

None planned.

VII. Analysis Plans

(See text discussion.)

CASE STUDY 9

NATIONAL MEDICAL CARE EXPENDITURE SURVEY

I. Purpose

The National Medical Care Expenditure Survey (NMCES) was designed to assess the use of health care services and to determine the patterns and character of health expenditures and health insurance for the U.S. noninstitutionalized civilian population in 1977. The survey was conducted by the National Center for Health Services Research (NCHSR), as part of a landmark study, the National Health Care Expenditures Study (NHCES), which is providing information on a number of critical issues of national health policy. Topics of particular interest to government agencies, legislative bodies, health professionals, and others concerned with health care policies and expenditures include:

- The cost, utilization, and budgetary implications of changes in federal financing programs for health care and of alternatives to the present structure of private health insurance.

- The breadth and depth of health insurance coverage.

- The proportion of health care costs paid by various insurance mechanisms.

- The influence of Medicare and Medicaid programs on the use and costs of medical care.

- How and why Medicaid participation changes over time.

- Patterns of use and expenditures as well as sources of payment for major components of care.
- The cost and effectiveness of different federal, state, and local programs aimed at improving access to care.

- The loss of revenue resulting from current tax treatment of medical and health insurance expenses, particularly with regard to the benefits currently accruing to different categories of individuals and employers, and the potential effects on the federal budget of proposed changes to tax laws.

- How costs of care vary according to diagnostic categories and treatment settings.

The data for these studies were obtained from the National Medical Care Expenditure Survey (NMCES), which has provided the most comprehensive statistical picture to date of how health services are used and paid for in the United States. The survey was completed in September 1979.

Data were obtained in three separate, complementary stages. About 14,000 randomly selected households in the civilian, noninstitutionalized population were interviewed six times over an 18-month period during 1977 and 1978. This survey was complemented by additional surveys of physicians and health care facilities providing care to household members during 1977 and of employers and insurance companies responsible for their insurance coverage.

II. Sponsors

Funding for NMCES was provided by the National Center for Health Services Research, which co-sponsored the survey with the National Center for Health Statistics. Data collection for the survey was done by Research Triangle Institute, NC, and its subcontractors, National Opinion Research Center of the University of Chicago, and Abt Associates, Inc., of Cambridge, MA. Data processing support is being provided by Social and Scientific Systems, Inc. of Washington, D.C.

III. Sample Design

The survey sample was designed to produce statistically unbiased national estimates that are representative of the civilian noninstitutionalized population of the United States.
To this end, the study used the national multi-stage area samples of the Research Triangle Institute and the National Opinion Research Center. Sampling specifications required the selection of about 14,000 households. Data were obtained for about 91 percent of eligible households in the first interview and 82 percent by the fifth interview.

The NMCES area sampling design can be characterized as a stratified three-stage area probability design from two independently drawn national area samples. The fourth stage involved the selection of ultimate sampling units (e.g., housing units and a special class of group quarters). An essential ingredient of this design is that each sample element has a known, nonzero selection probability. Also, the national general purpose area samples from the Research Triangle Institute (RTI) and the National Opinion Research Center (NORC) used in the survey are similar in structure and, therefore, compatible. Except for difficulties associated with survey nonresponse and other nonsampling errors, statistically unbiased national and domain estimates can be produced from each sample or from the two samples combined.

The first stage in both designs consists of primary sampling units which are counties, parts of counties, or groups of contiguous counties. The second stage consists of secondary sampling units which are census enumeration districts or block groups (Bureau of the Census, 1970). Smaller area segments generally consisting of at least 60 housing units constitute the third stage in both designs; a subsample of households was randomly selected from each of these segments in the final stage of sampling. Combined stage-specific sample sizes for the two designs were 135 primary sampling units (covering 108 separate localities), 1,290 secondary sampling units, and 1,290 segments.
Here, the number of separate primary areas is less than the sum of the number of primary sampling units in the two national primary samples since units from some of the large Standard Metropolitan Statistical Areas (SMSAs) were selected in both samples. Selection procedures for the fourth stage included a disproportionate sampling scheme to obtain a target of 3,500 uninsured households.

IV. Survey Design and Content

As noted, about 14,000 households participated in six separate rounds of interviews during 1977 and early 1978. The first interviews began in mid-January 1977; subsequent rounds of interviews were conducted at intervals of about three months. The first, second, and fifth rounds of interviews were conducted in person, as were about 20 percent of the third and fourth rounds and about half of the sixth round; the remainder were conducted by telephone.

During each of the first five rounds of interviews, information was obtained on use of medical services, charges for services and sources of payment, numbers and types of disability days, and status of health insurance coverage. Data collected during the first interview covered the period from January 1, 1977, through the date of interview. Data collected during the second, third, and fourth rounds covered the period from the immediately preceding interview through the date of the current interview. The fifth interview covered the period from the previous interview through December 31, 1977.

Beginning in the second round of interviews and continuing through the fifth, the household respondent was asked to review a computer-generated summary of data previously reported on health care services received and costs. This review permitted a check for accuracy and completeness and provided the necessary information to check continuity among the interview rounds for such data as health insurance coverage and charges for multiple services.
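The reference periods for the first five rounds, as described above, chain from one interview date to the next and are bounded by the 1977 calendar year. The following sketch shows one way to derive them. The interview dates are hypothetical, the function name is illustrative, and the treatment of the boundary day (each period is taken to start on the date of the preceding interview) is an assumption, since the text does not say whether that day belongs to the earlier or the later period.

```python
from datetime import date


def reference_periods(interview_dates,
                      year_start=date(1977, 1, 1),
                      year_end=date(1977, 12, 31)):
    """Return (start, end) reference periods for interview rounds 1 through 5."""
    if len(interview_dates) != 5:
        raise ValueError("expected interview dates for rounds 1 through 5")
    periods = []
    previous = year_start  # round 1 reaches back to January 1, 1977
    for round_number, interviewed in enumerate(interview_dates, start=1):
        # Round 5 closes out the survey year regardless of the interview date.
        end = year_end if round_number == 5 else interviewed
        periods.append((previous, end))
        previous = interviewed
    return periods


# Hypothetical interview dates for one household, roughly three months apart.
example_dates = [
    date(1977, 1, 20),
    date(1977, 4, 18),
    date(1977, 7, 15),
    date(1977, 10, 12),
    date(1978, 1, 25),
]
periods = reference_periods(example_dates)
for number, (start, end) in enumerate(periods, start=1):
    print(f"round {number}: {start} through {end}")
```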
The sixth round of interviews consisted of a series of supplemental questions covering limitations of activity, status of income tax filing, and the amount of itemized medical deductions. Supplemental questions also were asked during the second through fifth round interviews. These questions covered employment, health insurance, access to health care, barriers to care, ethnicity, and income and assets.

In addition to answering questions, each survey participant was asked to sign a permission form so that each physician or facility that had been reported as providing medical care during 1977 could release information about the patient. In cases where a person had not reported receiving medical care in 1977 from his usual source of medical care, a permission form for his usual source of medical care was requested. Persons with health insurance policies were asked to sign a permission form authorizing release of information by the employer, union group, or insurance company. When employed persons reported no health insurance coverage, they were asked to sign a permission form authorizing the employer to provide information about the insurance coverage that was available. These forms were collected at various times during the survey and provided data that were the basis for the subsequent surveys of medical providers and health insurers.

V. Response Rates

Data were obtained for approximately 91 percent of eligible households in the first interview and 82 percent by the fifth interview. Of 38,815 participants in the NMCES, 4,146, or 10.7 percent, failed to respond for the entire time period of 1977 for which they were eligible to respond. For example, a person could have refused participation after initially cooperating in the first interview by not responding for the remainder of the interviews. Similarly, the inability to reestablish contact with a participant after change of residence would result in this type of nonresponse.
This problem of partial nonresponse is not limited or unique to the NMCES, but is characteristic of national panel surveys in general.

VI. Evaluation Component

The NMCES used several methodological innovations to ensure data reliability. During each round of interviews, respondents were asked to report the diagnosis, total charge and sources of payment for each inpatient hospital stay, medical provider visit, dental visit, prescription drug, or purchase of eyeglasses or other medical equipment. In addition, respondents were asked to provide information about their health insurance coverage. Data on health care use and expenditures were updated each round through the use of a computerized summary of the information reported in the previous interview. Respondents were asked to review this information and make any needed additions or corrections. In particular, the summary was expected to allow respondents a means to provide more complete charge and payment data at a later date if it was unknown at the time of the interview. All respondents were asked to complete the summary.

Approximately 32 percent of household survey respondents were also included in the medical provider survey. The medical provider survey (MPS) was a record check or verification procedure to obtain expenditure and diagnostic data from physicians and hospitals who treated a sample of household respondents during the year. Thus, for each person in the household survey the data obtained from the questionnaire were checked in a subsequent interview through the summary mechanism and, in about a third of the cases, subjected to verification through the MPS. In addition, household data on health insurance coverage were verified through the Health Insurance/Employer Survey (HIES), which collected, for each private health insurance plan reported in the household survey, data from employers, insurance carriers or other insuring organizations.

VII.
Data Products and Analysis

NCHSR has developed National Medical Care Expenditure Survey data files and documentation for public use. As of spring 1985, over 100 different research studies based on NMCES data had been published. A detailed Annotated Bibliography of Studies from the National Medical Care Expenditure Survey is available from the National Center for Health Services Research.

CASE STUDY 10

NATIONAL MEDICAL CARE UTILIZATION AND EXPENDITURE SURVEY

I. Purpose

The National Medical Care Utilization and Expenditure Survey (NMCUES) was designed to collect data on health, access to and use of medical services, charges and sources of payment for medical services, and health insurance coverage for the U.S. civilian noninstitutionalized population during 1980. NMCUES was developed from a series of surveys concerning health, health care, and expenses for health care. However, NMCUES drew most heavily from two surveys -- the National Health Interview Survey (HIS) and the National Medical Care Expenditure Survey (NMCES).

The HIS is a continuing survey that began in 1957 and is conducted by the National Center for Health Statistics (NCHS). Its primary purpose is to collect information on illness, disability, and use of medical care. Although some medical expenditure and insurance information has been collected in the HIS, a cross-sectional survey design was inefficient for obtaining complete and accurate information of this type. It was concluded that a panel survey procedure would be required, and a pilot survey was conducted for the NCHS by the Johns Hopkins University Health Services Research and Development Center and by Westat Research, in 1975-76.

Based on information obtained during the pilot study, the National Center for Health Services Research (NCHSR) and NCHS cosponsored the National Medical Care Expenditure Survey in 1977-78. This was a panel survey for which households were interviewed six times to obtain data for 1977.
NMCUES was similar to the NMCES in survey design and questionnaire wording, to allow analysis of change during the 3 years between 1977 and 1980. Both NMCUES and NMCES are similar to the HIS in terms of question wording in areas common to all three surveys. However, each survey differs, placing special emphasis on different areas. Together they provide extensive information on illness, disability, use of medical care, costs of medical care, sources of payment for medical care, and health insurance coverage at two points in time.

II. Sponsors

NMCUES was cosponsored by NCHS and the Health Care Financing Administration (HCFA). Data collection was provided under contract by the Research Triangle Institute (RTI) of Research Triangle Park, North Carolina, and its subcontractors, National Opinion Research Center (NORC) of Chicago, Illinois, and SysteMetrics, Inc., of Santa Barbara, California. The contract was awarded in September 1974.

III. Sample Design

NMCUES utilized two frames, the first to provide a national household sample and the second to provide a State Medicaid household sample. The process of selecting each sample was different, and is described separately.

A. The National Household Sample:

The NMCUES sample of dwelling units is derived from two independently selected national samples, one provided by RTI and the other by NORC. The sample designs used by RTI and NORC are quite similar with respect to principal design features. Both can be characterized as self-weighting, stratified, multistage area probability designs. The principal differences between the two designs are the type of stratification variables and the specific definitions of sampling units at each stage.

B. The State Medicaid Household Sample:

The November 1979 Medicaid eligibility files in California, Michigan, New York and Texas were used as frames to select a sample of cases for the State Medicaid household component of the survey.
A case generally consisted of all members of a family receiving Medicaid within the same category of aid. The State aid categories were collapsed into three or four strata, depending on the State. These were: (1) aid to the blind and disabled; (2) aid to the elderly (those with Supplemental Security Income); (3) Aid to Families With Dependent Children (AFDC); and (4) State-only aid in California, Michigan, and New York, which provided some Medicaid coverage without Federal reimbursement. Cases in other Federal aid categories were excluded from the target population because the counts were too few to permit separate stratification. Approximately equal numbers of cases were selected from each stratum, and cases were clustered by zip codes for ease of interviewing. The lack of a central automated eligibility file in New York State (outside of the five New York City boroughs and a few other counties) required selection of counties before stratification. Within many of these counties, the lack of automation also required cases to be selected without consideration of zip codes.

C. Links to Administrative Records:

In addition to the data collected during interviews with sample households, another phase of data collection occurred after the final round of household interviewing was completed. Medicaid and Medicare numbers provided by the household were used to extract data from the Medicaid files of the Federal government. Data from the administrative records were merged with the household data to increase the analysis capabilities of the data.

IV. Survey Design and Content

A. Design

1. Respondent Rules --

The respondent for the interview was required to be a household member, 17 years of age or older. A non-household proxy respondent was permitted only if all eligible household members were unable to respond because of health, language, or mental condition.

2.
Following Movers --

The rules for following movers were slightly different for the national household samples and the State Medicaid sample. First, for the national household survey all persons living in the housing units or group quarters at the time of the first interview contact became part of the sample. Unmarried students 17-22 years of age who lived away from home were included in the sample if the parent or guardian was included in the sample. In addition, persons who died or were institutionalized between January 1st and the date of first interview were included in the sample if they were related to persons living in the sampled housing units or group quarters. All of these persons were considered "key" persons, and data were collected for them for the full 12 months of 1980 or for the proportion of time they were part of the U.S. civilian noninstitutionalized population. In addition, babies born to key persons were also considered key persons, and data were collected for them from the time of birth.

Relatives from outside the original population (i.e., institutionalized, in the Armed Forces, or outside the United States between January 1 and the first interview) who moved in with key persons after the first interview also were considered key persons, and data were collected for them from the time that they joined the key person. Relatives who moved in with key persons but were part of the civilian noninstitutionalized population on January 1, 1980, were classified as "non-key" persons. Data were collected for non-key persons for the time that they lived with a key person. Because non-key persons had a chance of selection in the initial sample, their data will not be used for general analysis. However, data for non-key persons are used for family analysis because they do contribute to the family's utilization of and expenditures for health care during the time that they are a part of the family.
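The key/non-key rules for the national household sample can be restated as a small decision function. The sketch below is illustrative only: the class, the field names, and the "not followed" label are hypothetical and are not taken from the NMCUES documentation, and the flag `original_sample_member` is assumed to cover everyone the text brings in at the first interview (residents of the sampled unit, qualifying students living away from home, and related persons who died or were institutionalized before the first interview).

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical field names that paraphrase the rules in the text.
    original_sample_member: bool = False    # part of the sample at first interview contact
    born_to_key_person: bool = False        # baby born to a key person during 1980
    moved_in_with_key_person: bool = False  # relative who joined after the first interview
    in_population_on_jan1: bool = True      # in the civilian noninstitutionalized
                                            # population on January 1, 1980


def classify(person: Person) -> str:
    """Return 'key', 'non-key', or 'not followed' (national household sample)."""
    if person.original_sample_member or person.born_to_key_person:
        return "key"
    if person.moved_in_with_key_person:
        # Joiners who already had a chance of selection on January 1 are
        # non-key; joiners from outside the original population are key.
        return "non-key" if person.in_population_on_jan1 else "key"
    return "not followed"
```

Under this reading, a relative returning from the Armed Forces who joins a key person is classified as key, while a relative who was already in the civilian noninstitutionalized population is classified as non-key and would enter only family-level analysis.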
For the State Medicaid sample, interviewers obtained information for each eligible member of each case. Case members who died before January 1, 1980, or who were continuously institutionalized between January 1, 1980 and the first interviewer contact, were excluded from the survey. Any related person living with a case member when the interviewer contacted the household also was designated a key person, and was tracked for the complete year.

In addition, babies born to key persons were considered key persons, and data were collected for them from the time of birth. Relatives outside the U.S. noninstitutionalized population between January 1 and the date of the first interview who moved in with a key person after the first interview also were considered key persons. Data were collected for them for the remainder of 1980. Persons who were part of the U.S. noninstitutionalized population on January 1, 1980 and who moved in with a key person after the first interview were classified as non-key persons; data were collected only for the time that non-key persons lived with a key person. These non-key persons are included only in family analysis.

3. Weighting --

For the analysis of NMCUES data, sample weights are required to compensate for unequal probabilities of selection, to adjust for the potentially biasing effects of failure to obtain data from some persons or households (i.e., nonresponse), and to adjust for failure to cover some portions of the population because the sampling frame did not include them (i.e., undercoverage).

Basic Sample Design Weights -- Development of weights reflecting the sample design of NMCUES was the first step in the development of weights for each person in the survey. The basic sample weight for a dwelling unit is the product of four weight components which correspond to the four stages of sample selection.
Each of the four weight components is the inverse of the probability of selection at that stage (when sampling was without replacement), or the inverse of the expected number of selections (when sampling was with replacement and multiple selections of the sample unit were possible).

Two Sample Adjustment Factor -- As previously described, the NMCUES sample is comprised of two independently selected samples. Each sample, together with its basic sample design weights, yields independent unbiased estimates of population parameters. As the two NMCUES samples were of approximately equal size, a simple average of the two independent estimators was used for the combined sample estimator. This is equivalent to computing an adjusted basic sample design weight by dividing each basic sample design weight by two. In the subsequent discussion, only the combined sample design weights are considered.

Ratio Adjustment (Household Level) -- The basic sampling weights were adjusted to decrease sampling variation and to compensate for household level nonresponse and undercoverage. In total there were 63 ratio adjustment cells, which were formed by cross-classifying race, age, and type of household head and size of household. Estimates from the 1980 CPS were used for population controls.

Ratio Adjustment (Person Level) -- The household level adjusted weights were further ratio adjusted at the person level. A total of 59 ratio adjustment cells (based on age, race and sex) were utilized. Population controls, which were provided by the U.S. Census Bureau, were based on projections from the 1980 Census.

4. Interview Schedule

The sample dwelling units were interviewed at approximately 3 month intervals beginning in February, 1980 and ending March, 1981.
The core questionnaire was administered during each of the five interview rounds to collect data on health, health care, health care charges, sources of payment, and health insurance coverage. A summary of responses was used to update information reported in previous rounds. Supplements to the core questionnaire were used during the first, third, and fifth interview rounds to collect data that did not change during the year, or that were needed only once.

5. Interview Mode

Approximately 80 percent of the third and fourth round interviews were conducted by telephone; all remaining interviews were conducted in person.

6. Survey Costs

The basic survey design and data collection contract with RTI and NORC cost approximately $18.9 million.

B. Content:

1. Core and Intermittent Questions --

The repetitive core of questions for NMCUES included health insurance coverage, episodes of illness, the number of bed days, restricted activity days, hospital admissions, physician and dental visits, other medical care encounters, and purchase of prescribed medicine. For each contact with the medical care system, data were obtained on the nature of the health conditions, characteristics of the provider, services provided, charges, sources, and amounts of payment. Questions asked only once included data on access to medical care services, limitation of activities, occupation, income, and other sociodemographic characteristics.

2. Cross-wave Controls

Collection of data from the households was facilitated by the use of a calendar and a summary. At the time of the first interview, the household respondent was given a calendar on which to record information about health problems and health services utilization, and to assemble physician and other provider bills between interviews. Following each household interview, information about health provider contacts and the payment of charges associated with them was used to generate a computer summary of information provided.
This summary was then printed out in a simple format and mailed to the household for review of its accuracy and completeness prior to the next interview. At the subsequent interview, the interviewers reviewed this information with the household respondent to ensure accuracy and to obtain information not available during a previous interview.

V. Response

A. Survey Nonresponse

Response rates for households and persons in the NMCUES were high, with approximately 90 percent of the sample households agreeing to participate in the survey, and approximately 94 percent of the individuals in the participating households supplying information. Even though the overall response rates are high, survey based estimates of means and proportions may be biased if nonrespondents tend to have different health care experiences than respondents, or if there is a substantial response rate differential across subgroups of the target population. Furthermore, annual totals will tend to be underestimated unless allowance is made for the loss of data due to nonresponse.

Two methods commonly used to compensate for survey nonresponse are data imputation and the adjustment of sampling weights. For NMCUES, data imputation was used to compensate for attrition and for item nonresponse, and weight adjustment was used to compensate for total nonresponse. The calculations of the weight adjustment factors were discussed previously in the section on sampling weights.

1. Attrition Imputation --

A special form of the sequential hot deck imputation method was used for attrition imputation. First, each sample person with incomplete annual data (referred to as a "recipient") was linked to a sample person with similar demographic and socioeconomic characteristics who had complete annual data (referred to as a "donor"). Secondly, the time periods for which the recipient had missing data were divided into two categories: imputed eligible days and imputed ineligible days.
The imputed eligible days were those days for which the donor was eligible (i.e., in scope), and the imputed ineligible days were those days for which the donor was ineligible (i.e., out of scope).

The donor's medical care experiences, such as medical provider visits, dental visits, hospital stays, etc., during the imputed eligible days were imputed into the recipient's record for those days. Finally, the results of the attrition imputation were used to make the final determination of a person's respondent status. If more than two-thirds of the person's total eligible days (both reported and imputed) were imputed, then the person was considered to be a total nonrespondent and the data for the person were removed from the data file.

2. Item Nonresponse and Imputation --

Among persons who are classified as respondents, there is still the possibility that they may fail to provide information for some or many items in the questionnaire. In the NMCUES, item nonresponse was particularly a problem for expenditures for health care, income, and other sensitive topics. The extent of missing data varied by question, and imputation for all items in the data file would have been expensive. Imputations were made for missing data on key demographic, economic, and expenditure items across the five data files in the Public Use Data Tape. Table 1 illustrates the extent of the item nonresponse problem for selected survey measures which received imputations in the four data files used in this report.

Demographic items tend to require the least amount of imputation, some at insignificant levels such as for age, sex, and education. Income items had higher levels of nonresponse, and for total personal income, which is a cumulation of all earned income and 11 sources of unearned income, nearly one-third of the persons required imputation for at least one component.
The bed disability days, work loss days, and cut down days have levels of imputation that are intermediate between the demographic and income items.

The highest levels of imputation occurred for the important charge items on the various visit, hospital stay, and medical expenses files. Total charges for medical visits, hospital stays, and prescribed medicines and other medical expense records were imputed for 25.9, 36.3, and 19.4 percent of the events, respectively. Among the source of payment data, the imputation rates for the source of payment were small, but the amount paid by the first source of payment was generally subject to high rates of imputation. Nights hospitalized on the hospital stay file was imputed at a rate comparable to the first source of payment.

The methods used to impute for missing items were diverse and tailored to the measure requiring imputation. Three types of imputation predominate: editing or logical imputations; a sequential hot deck; and a weighted sequential hot deck.

The imputation process will be described for two items to illustrate the nature of imputation for the NMCUES. For Hispanic Origin, two different imputation procedures were used: logical and sequential hot deck. Since Hispanic Origin was not recorded during the interview for children under 17 years of age, a logical imputation was made by assigning the Hispanic Origin of the head of the household to the child. For the remaining cases which were not assigned a value by this procedure, the data were grouped into classes by race of the head of the household, and within classes the data were sorted by household identification number, primary sampling unit, and segment. An unweighted sequential hot deck was used to impute values of Hispanic Origin for the remaining cases with missing values.
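The unweighted sequential hot deck described for Hispanic Origin can be sketched in a few lines. This is a minimal illustration of the general technique and not the NMCUES production procedure: the function name, the record layout, and the field names (`head_race`, `hh_id`, `psu`, `segment`, `hispanic`) are assumptions made for the example, and a record with no earlier donor in its class is simply left missing, whereas a production system would typically supply a starting value.

```python
from itertools import groupby
from operator import itemgetter


def sequential_hot_deck(records, class_key, sort_keys, item):
    """Fill missing `item` values from the most recent reported value
    within the same imputation class, after sorting within class."""
    ordered = sorted(
        records,
        key=lambda r: (r[class_key],) + tuple(r[k] for k in sort_keys),
    )
    result = []
    for _, group in groupby(ordered, key=itemgetter(class_key)):
        last_reported = None  # the "hot deck" value for this class
        for rec in group:
            rec = dict(rec)  # copy so the input records are not modified
            rec["imputed"] = False
            if rec[item] is None:
                if last_reported is not None:
                    rec[item] = last_reported
                    rec["imputed"] = True
            else:
                # Only reported values refresh the deck.
                last_reported = rec[item]
            result.append(rec)
    return result


# Hypothetical records: classes by race of household head, sorted by
# household identification number, primary sampling unit, and segment.
sample = [
    {"hh_id": 3, "psu": 1, "segment": 1, "head_race": "white", "hispanic": None},
    {"hh_id": 1, "psu": 1, "segment": 1, "head_race": "white", "hispanic": "no"},
    {"hh_id": 2, "psu": 1, "segment": 2, "head_race": "white", "hispanic": None},
]
filled = sequential_hot_deck(sample, "head_race", ["hh_id", "psu", "segment"], "hispanic")
```

Because the file is sorted before the pass, each recipient takes its value from a nearby record in the sort order, which is what makes the choice of sort variables matter. The flag `imputed` plays the role of an imputation indicator so that analysts can separate reported from imputed values.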
The imputations for medical visit total charge were made after extensive edits had been done to eliminate as many inconsistencies as possible between sources of payment data and total charge. The medical visit records were then separated into three types: emergency room, hospital outpatient department, and doctor visit. Within each type, the records were classed and sorted by several measures which differed across visit types prior to a weighted hot deck imputation. For example, for doctor visits the records were classified by reason for visit, type of doctor seen, whether work was done by a physician, and age of the individual. Within the groups formed by these classing variables, the records were then sorted by type of insurance coverage and the month of visit. The weighted hot deck procedure was then used to impute for missing total charge, sources of payment, and sources of payment amounts for the classified and sorted data file.

Since imputations were made for missing items for a large number of the important items in the NMCUES, they can be expected to influence the results of the survey in several ways. In general, the weighted hot deck is expected to preserve the means of the nonmissing observations when those means are for the total sample or classes within which imputations were made. However, means for other subgroups, particularly small subgroups, may be changed substantially by imputation.

In addition, sampling variances can be substantially underestimated when imputed values are used in the estimation process. For a variable with one-quarter of its values imputed, for instance, sampling variances based on all cases will be based on one-third more values than were actually collected in the survey for the given item. That is, the variance would be too small by a factor of one-third, at least. Finally, the strength of relationships between measures which received imputations can be substantially attenuated by the imputation.
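The one-quarter and one-third figures in the variance example can be traced with a short calculation. The sketch below is a simplification introduced here for illustration: it assumes the variance of a mean behaves as under simple random sampling (element variance S^2 divided by the number of values), and the symbols n, m, N are not notation from the report. Design effects and the additional variability introduced by the imputation itself are ignored, which is one reason the text says "at least".

```latex
% n = values actually reported, m = values imputed, N = n + m
\[
  \frac{m}{N} = \frac{1}{4}
  \quad\Longrightarrow\quad
  n = \frac{3}{4}\,N , \qquad N = \frac{4}{3}\,n .
\]
% Variance of a mean: computed on all N cases versus on the n reported cases
\[
  \widehat{V}_{\mathrm{all}} \approx \frac{S^{2}}{N},
  \qquad
  V_{\mathrm{reported}} \approx \frac{S^{2}}{n},
  \qquad
  \frac{V_{\mathrm{reported}}}{\widehat{V}_{\mathrm{all}}}
  = \frac{N}{n} = \frac{4}{3}.
\]
```

On this reading, the estimate computed from all cases treats the file as holding one-third more observations than were collected, so the variance appropriate to the reported data is about one-third larger than the computed one; equivalently, the computed variance is about one-quarter below it.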
VI. Analysis and Evaluation

Since 1980 NCHS has awarded a number of contracts for the review and analysis of NMCUES data to evaluate the quality of the data and the data collection and processing methods. This includes a contract with Westat (of Rockville, Maryland) to evaluate NMCUES data collection and data processing, and a series of 3 contracts with the University of Michigan to analyze findings related to physicians' charges, patient expenditures and sources of payment. Another contract, with Applied Management Sciences, examined family characteristics and expenditures for health care.

VII. Data Products

Data from the NMCUES are available with documentation on public use tapes from the National Technical Information Service, a division of the Department of Commerce in Springfield, Virginia. Additional information concerning the public use tapes is available from the Utilization and Expenditure Statistics Branch, NCHS.

Findings from the survey were presented in official publications, primarily from the government's Public Health Service and Health Care Financing Administration, during 1983-85. A number of analyses of NMCUES appeared in a Working Paper series published by the NCHS, which now has over 20 titles, as well as in professional journals dealing with public administration and public health.

Table 1.
Percent of Data Imputed for Selected Survey Items in Four of the NMCUES Public Use Data Files

Tape Location / Survey Item               Percent Imputed

Person File (n = 17,123)
  Age                                        0.1
  Race                                      20.1 (1)
  Sex                                        0.1
  Highest Grade Attended                     0.1
  Perceived Health Status                    0.8
  Functional Limitation Score                3.2

  Number of Bed Disability Days              7.9
  Number of Work Loss Days                   8.9
  Number of Cut Down Days                    8.2

  Wages, Salary, Business Income             9.7
  Pension Income                             3.5
  Interest Income                          121.6
  Total Personal Income                     30.4 (2)

Medical Visit File (n = 86,594)
  Total Charge                              25.9
  First Source of Payment                    1.8
  First Source of Payment Amount            11.6

Hospital Stay File (n = 2,946)
  Nights Hospitalized                        3.1
  Total Charge                              36.3
  First Source of Payment                    2.2
  First Source of Payment Amount            17.6

Medical Expenses File (n = 58,544)
  Total Charge                              19.4
  First Source of Payment                    2.9
  First Source of Payment Amount            10.0

(1) Race for children under 14 imputed from race of head
(2) Cumulative across 12 types of income

CASE STUDY 11

LONGITUDINAL ESTABLISHMENT DATA FILE

Historically the economist has relied upon aggregate economic information from various sources (including the Census of Manufactures and Annual Survey of Manufactures (ASM) programs) to investigate the changing structure of the manufacturing sector of the United States economy. It has not been possible to observe the variations in behavior among establishments (plants) or to determine how changes in the behavior of individual establishments affected the enterprise (firm) or the aggregate statistical totals. The Census Bureau has developed a Longitudinal Establishment Data (LED) file which, when coupled with recent advances in econometric computer software, makes possible a wide range of empirical analysis at the manufacturing establishment level.

The LED file was developed in cooperation with the National Science Foundation under the general direction of Nancy and Richard Ruggles of Yale University.
The LED file is a time series of economic variables collected from manufacturing establishments in the Census of Manufactures and Annual Survey of Manufactures programs. The LED file contains establishment level identifying information; basic information on the factors of production (inputs, such as levels of capital, labor, energy and materials) and the products produced (outputs); and other basic economic information used to define the operations of a manufacturing plant. The LED file resides in a random access database environment which facilitates immediate access to individual data values.

History

The ASM program was initiated in 1949 and provides detailed economic information on the functioning of manufacturing plants in intercensal years. Since the inception of the ASM program the Census Bureau has understood the potential of linking establishment records across ASM survey years to create a longitudinal micro level data file suitable for time series analysis. The Ruggleses were particularly interested in developing such a file for various types of macroeconomic studies.

The first real attempt at creating such a file was undertaken in the late 1950's using the 1954 Census of Manufactures as a starting point. This first attempt tried to match establishments across time using survey identification numbers as keys. While a significant portion of the establishments had retained their identification numbers for several years, many identification numbers had been changed and no audit trail was maintained. There was really no way of linking such establishments except by laborious search of the name and address records in the mailing directory. In those days, shuttle forms were used and thus the linkage of identification numbers in different years was not critical in order to measure year-to-year change in manufacturing establishments.
This first attempt at matching identification numbers required a labor intensive effort to ensure accurate matches. This experience led to modifications in the ASM processing that placed greater responsibility on the directory to document identification number changes and to link old and new identification numbers. It also led to the introduction of the concept of the permanent plant number that would be assigned to an establishment throughout its life in the ASM program. This permanent identification number became critical not only to the directory controls but also to new methods of editing and tabulation.

Considerable staff and computer time were expended on this first effort and a large segment of the ASM file was successfully matched for the years 1954-1962. However, since the computer record for many establishments did not include all corrections resulting from the survey review, and because many nonmatches were left unresolved, the file was not developed to the extent necessary to be usable for a wide variety of longitudinal studies.

The first effort at creating a time series file of establishment level microdata was discontinued in 1968 because of budget restrictions. However, the experience gained from the first effort added significantly to the directory, editing and tabulation techniques used in the ASM; specifically, the computer edits of the Census and ASM programs were modified to incorporate more year-to-year analysis.

During the 1970's several major advances were made at the Bureau which made it possible to renew the effort to develop a longitudinal establishment file. First, the Industrial Directory was started in 1972, which solved the problems of linkage of identification numbers due to changes in ownership. Second, the establishment correction system introduced into the Census and ASM programs in 1975 assures that all corrections made by the staff during the review of the data are applied to the data records.
Prior to 1975, budgetary constraints prevented the complete correction of the computer data files, although the corrected data were included in the official published statistics.

The current effort to develop the LED file was undertaken as a joint effort by the Census Bureau and Richard and Nancy Ruggles of Yale University, with funding provided by the NSF and the Small Business Administration. The Census Bureau has created a longitudinal data file of individual manufacturing establishment data from the Census of Manufactures and ASM for the years 1972 to 1981. This process required the linkage of establishment level records based upon identification numbers. This linkage process was complicated by the numerous plant closings, plant openings, mergers and acquisitions that transpired during the decade covered by the file.

A computer match was performed to link establishment records over time, and linkage problems were resolved by the data analysts so that a consistent series of economic surveys is available for each establishment in operation during the period covered. The linked data were reformatted into a data structure suitable for such a file, and extraction routines were developed so that data can be retrieved from the file.

Contents of the File

The basic unit of collection for the Census of Manufactures and the ASM is the manufacturing establishment. Thus the establishment is the basic unit of data storage in the LED file. An establishment is defined as a single physical location engaged in one of the categories of industrial activity in the Standard Industrial Classification (SIC) system. The SIC system is used in the classification of manufacturing establishments by type of activity in which they are engaged; it facilitates the collection, tabulation, presentation and analysis of census data relating to establishments.
The data are stored as a time sequence of survey responses for establishments rather than as a time series of annual observations for variables. The data are sorted by a permanent establishment identification number and survey year. The data for a particular year are stored in modular sets of fixed length records; data for a module (a set of variables) have a consistent format for all years.

The variables available from the LED file are presented in Table 1, the LED Directory. As this table indicates, basic economic information on the factors of production (inputs), such as employment, payrolls, supplementary labor costs, worker hours, cost of fuels and electricity, cost of materials, capital expenditures, rental payments, and inventories, and on the products produced (outputs), such as value of shipments and value added, is available for all years. In recent years, a number of new items have been added, including the consumption of specific types of fuels, methods of valuation of inventories, purchases of used structures and machinery, retirements, and depreciation. The detailed information obtained in census years on materials consumed and on products shipped is not available from the ASM; thus a continuous time series is not available for those variables.

Methodological Problems

Data Comparability through Time:

The main objective of survey processing is to identify "significant errors", i.e., those that affect the quality of the aggregate data or the test for confidentiality. We cannot afford the cost of cleaning up "insignificant" data errors. Therefore, we do not always insist on complete and correct data for each establishment, even in a sample, and rely instead on our computer edit to maintain the completeness of the record, to "estimate" data for establishments that fail to report, and to identify "significant" errors (edit failures) that are referred to the analysts for review.
This means that some data errors remain in the records of the individual establishments. It should be noted that data "flags" included in the longitudinal file will indicate which cells have been computer changed or analyst corrected.

Most importantly, because of cost, we have concentrated on year-to-year comparisons of establishment data. Our computer edit has been designed to work with only two periods of data: current year and previous year. Our aggregate review focuses on two years of data, current and previous, although trends are also considered. For economic research purposes, where micro data for several years are needed, this type of editing and review may not be sufficient. Different problems will come into focus when establishment data are edited and reviewed over a long period of time as compared to using only two years.

Another factor that affects data comparability over time involves the errors that are identified during the survey processing, but which are not carried back to the file because of cost considerations. As noted earlier, this situation was virtually eliminated with the introduction of an establishment correction system for the 1975 ASM. For the 1972 Census and the 1973 and 1974 ASM, this system was not available, but efforts were taken to assure that most of the corrections were carried back to the file. Therefore for these years a tabulation of the computer file will yield results very close to the publication totals.

Data comparability over time may also be affected by two other factors. The first involves a change in the definition of an individual item. An example of this will occur for the 1982 Census of Manufactures in regard to inventories. Prior to 1982, information on the book value of inventories was collected.
Investigations of methods used by individual companies to compile inventories indicate that the best way to obtain consistent data among different companies and even among individual establishments of the same company is to request LIFO (last-in-first-out) inventories before the application of the LIFO adjustment or reserve. Therefore, the inventories inquiry has been revised for 1982 to collect data on a pre-LIFO basis (i.e., gross value before any LIFO reserve or adjustment). However, since we will be requesting additional information including the amount of the LIFO reserve, we will be able to "estimate" book value for 1982.

The second factor that would affect data comparability involves modification of the computer editing procedure used for a particular item. An example occurred in the 1977 census when the addition of retirements and detailed capital expenditure items to the report form resulted in a complete change of the editing procedure used for the assets-expenditures-retirements complex. Assets data continue to be collected as in the past, but the new computer editing procedure probably resulted in a "break" in the series for a few establishments whose assets data were edited differently for 1972 through 1976 as compared to 1977 and subsequent years.

Availability of "Processed" rather than "Raw" data:

In analysis of an establishment file, some researchers feel that the actual data reported by the respondent are preferable to the data that have been edited and changed (without verification by the company). However, the data files used for the development of the time series file include a mixture of "raw" (originally reported) and computer-corrected data. The "raw" data are no longer available for all establishments.

Therefore, researchers who advocate economic research based only on "raw" microdata will find the Census/ASM LED to be of limited use.
We have already noted that data "flags" included in the longitudinal file will indicate which cells have been computer changed or analyst corrected. As a result, researchers may choose to isolate only the "raw" microdata that remain unchanged as a result of Census Bureau processing procedures.

Disclosure

The last problem to be discussed, and the most complex, involves disclosure implications. Data collected by the Bureau of the Census are protected by Title 13 of the U.S. Code from disclosure to outside parties. All tabulations and analyses of longitudinal data must be reviewed to ensure that no individually identifiable confidential data are released to outside users. Bureau of the Census policy also requires that the Center for Economic Studies prevent actual estimation or close approximation of individual confidential data from released statistics. This is accomplished by applying the Census Bureau's respondent and concentration rules, which may require suppression of individual data cells. Additional suppression of nondisclosure cells may be required in cross-tabulations to avoid complementary or indirect disclosure of confidential data.

After a request for tabulation or analysis is received by the Center, a comprehensive analysis of possible disclosure of sensitive information will be performed. The user will be notified of possible disclosure which would require the suppression of information. Due to the complex nature of the LED file, each disclosure analysis will be handled on a case by case basis. Under no circumstances will the Bureau release names or addresses of establishments in the file. Also the Bureau will not release microdata in any format which would allow identification of individual establishments.

The results of each project must be carefully scrutinized in terms of disclosure implications before the data can be released to the researchers.
The effects of ownership changes, industry changes, corrections made as a result of reviewing the establishment data, and so forth, must be taken into consideration. Furthermore, if the time-series data are subject to regression analysis or other mathematical analysis, interesting questions are raised on what information can be released. Finally, the results of each project must be compared against the results of previous studies in order to avoid complementary disclosure problems. This is quite an undertaking, and, at present, a systematic approach to handling disclosure problems has not been developed.

How will the File be Used

Users of the LED file will work through the staff of the Center for Economic Studies (CES). A major purpose of the CES is to make industrial data available to the data user community of economic policymakers and researchers to facilitate analysis and research. The result of that analysis and research will then help the Bureau to improve its economic measurement programs. The Census confidentiality policies and the U.S. Code limit direct access to individual establishment data to Census employees who have sworn to protect their confidentiality. This regulation precludes direct access to the LED data by outside researchers; only sworn Census employees will have direct access to the LED file.

The CES will act as the interface between the data user community and the LED file by processing requests by outside researchers for tabulations and analyses of the LED file. The CES is creating a computer environment that will permit low-cost, expeditious processing of user requests. It will be possible for an outside analyst to request cross-tabulations of aggregate statistics, estimations of econometric models, and other economic and statistical relationships based on the establishment level data. These tasks will be performed on a cost-reimbursable basis.

The types of tasks that can be performed using the LED file include:

1.
Analysis of a wide range of issues from the field of industrial organization, including diversification, concentration, ownership patterns and changes, and monopolistic and oligopolistic industries.
2. Analysis of productivity, technological change and efficiency and their diffusion within and across establishments, enterprises and industries.
3. A wide range of descriptive statistics such as cross-tabulation of important variables (productivity, value added, wage rates) by size of establishment or enterprise, by industry or by geographic area.
4. A wide range of studies of various economic surveys by comparing detail and summary statistics across surveys.
5. Analysis of the sources and nature of productivity growth, including geographic, size and industry dimensions.
6. Analysis of geographic patterns in input markets, especially labor and energy markets.
7. Analysis of energy use in manufacturing establishments.
8. Analysis of the geographic dimensions of, for example, labor and energy markets.

The data user/research community benefits from analysis of a rich longitudinal data base for manufacturing establishments and (through integration with other economic survey results) whole enterprises. The Bureau's economic survey programs will benefit from validation and evaluation studies through time and across economic surveys. Feedback on the scope of the surveys, uses of the data, and data anomalies discovered during analysis will improve both the content and the quality of the survey data and the statistical products based on them. Also, generalized data manipulation and analysis software produced for analytical uses of the file can be made available to the economics division for use in its production processes.

CASE STUDY 12

STATISTICS OF INCOME PROGRAM

I. 
Purpose

The Internal Revenue Service, in addition to its primary mission of enforcing the Federal tax laws, is also charged with publishing statistics on the operation of the tax laws. The data, based on tax returns, are released in a series of reports called Statistics of Income (SOI).

The SOI reports have been used extensively from the very beginning (1916) for tax research and for estimating revenue, especially by officials in the Department of the Treasury. The main emphasis of the annual statistics has always been individual and corporation income tax data. Other subjects, based on other types of returns, for which data have been tabulated either annually or periodically are partnerships, estates and gifts, fiduciaries, farmers' cooperatives, and foundations and other tax-exempt organizations. Data are also published on the international income and taxes of U.S. persons and corporations.

Traditionally, the SOI Program has been based on cross-sectional samples. However, these statistics told very little about the relationships between the events being described. For example, was it the people who moved who achieved increases in income? Did people whose tax rates went down give more or less to charitable organizations? Only with longitudinal studies has IRS been able to relate status at one point in time to status at another. This is done by focusing on specified observational units in one year and following their status through successive (or preceding) years. In addition, when dealing with attitudes, such as the response of taxpayers to tax law and economic changes, longitudinal samples are as close as SOI can come to performing controlled experiments.

Most of the longitudinal studies have been panel studies, in which the same variables are measured for the same observational units at different periods in time. This is done by creating a file of individual tax return data for a group of taxpayers for each of a succession of years.
The IRS has also done transtemporal studies, in which different variables have been measured in different years for the same taxpayers. An example would be the matching of individual income tax returns filed during a taxpayer's lifetime with the estate tax return (which indicates the taxpayer's wealth) filed after his or her death. A third type of longitudinal study is the non-identical study, in which one set of variables is measured for one set of observational units at one time, and another set of variables is measured for a related but not identical group of observational units at another. This occurs when the estate tax return of one individual is matched to the income tax returns filed in later years by his or her heirs.

Because IRS is dealing with administrative files, one more set of distinctions deserves to be made. Each of the types of longitudinal studies mentioned above can be either prospective or retrospective in nature. In other words, the historical data can be built by going either backwards or forwards in time from the point at which the sample was selected. The SOI Division has created both types of files, as well as hybrids which move in both directions.

From a paper presented to the American Statistical Association by Robert A. Wilson and John DiPaolo, and a presentation to the Joint U.S. and Canadian Conference on Tax Modelling by Peter J. Sailer.

II. Sponsorship

The SOI program is the responsibility of the Statistics of Income Division of the IRS Office of Returns and Information Processing. The Statistics of Income Division is responsible not only for SOI, but also for conducting special statistical studies and for providing advice on sample designs to help other organizations in IRS conduct studies of their own.

III. Sample Design

The SOI program has the following basic character. Returns filed with the ten service centers are processed for administrative purposes to determine the correct tax liability.
During processing, the returns are entered on tape for eventual posting to the IRS Master File. It is when the return records are on tape that they are designated for SOI. After the returns are designated, they are subjected to additional editing and relational testing for the SOI program.

A. Design Problems

The first task is to identify the same observational units. In the case of individual taxpayers, this is not too difficult, at least in theory. All records are identified by social security number (SSN), and most of the electronic files are sorted in SSN order.

Many things, however, can cause non-matches. Deaths (in the case of prospective studies) and births (in the case of retrospective studies) guarantee that not all records will match to a record for another year. (Births and deaths mean coming into the system or leaving the system. This leads to the phenomenon that a taxpayer can be born into the estate tax system only by dying.) Unfortunately (for the SOI program), many taxpayers show a tendency to die only temporarily, and then to be reborn a few years later.

However, neither processing errors, nor births, nor deaths create as many problems as marriages. When a male in an SOI panel gets married, he will generally start filing a joint return with his wife, using his SSN as the primary SSN on the return. This means that he will still be in the panel but, in contrast to earlier years, he may well have a second person's income and taxes mixed in with his. On the other hand, when a female gets married, she is generally lost to a panel, especially if the sample selection is performed at the service centers, where secondary SSN's are not always key-entered. No matter how much effort is made to keep all the observational units from one year to the next, the fact remains that it will not be possible to include completely comparable data items, since joint returns always combine data items for both taxpayers.
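The matching problems described above can be sketched as a simple year-to-year match on primary SSN. This is only an illustration under stated assumptions: the record layout, the field names `filing_status` and `income`, the function name, and the sample values are all hypothetical and are not taken from any IRS file format.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: primary SSN -> {"filing_status": ..., "income": ...}
Record = Dict[str, object]


def match_years(base: Dict[str, Record],
                other: Dict[str, Record]) -> Tuple[List[str], List[str], List[str]]:
    """Split base-year primary SSNs into comparable, status-changed, and unmatched."""
    comparable, status_changed, unmatched = [], [], []
    for ssn in sorted(base):
        if ssn not in other:
            # "Death" or "birth" in the system, or a taxpayer now filed under a secondary SSN.
            unmatched.append(ssn)
        elif base[ssn]["filing_status"] != other[ssn]["filing_status"]:
            # For example single -> joint: the unit matches but its items are not comparable.
            status_changed.append(ssn)
        else:
            comparable.append(ssn)
    return comparable, status_changed, unmatched


# Made-up identifiers and amounts, for illustration only.
year_1 = {
    "000-00-0001": {"filing_status": "single", "income": 21000.0},
    "000-00-0002": {"filing_status": "single", "income": 18000.0},
    "000-00-0003": {"filing_status": "single", "income": 15000.0},
}
year_2 = {
    "000-00-0001": {"filing_status": "single", "income": 22500.0},
    "000-00-0002": {"filing_status": "joint", "income": 41000.0},
}

comparable, status_changed, unmatched = match_years(year_1, year_2)
```

A record in the unmatched group is not necessarily lost for good, since a taxpayer may reappear in a later year; a fuller treatment would repeat the match against each available year before treating the unit as gone.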
The problem of marriages is compounded when one is trying to establish a panel of corporations. While multiple marriages do occur among individuals, at least they occur serially. In the case of corporations, the frequent and cumulative merging of observational units, often with units from totally unrelated industrial groupings, can wreak havoc with corporation panels. For that reason, corporation panel studies undertaken by the Statistics of Income Division have been confined to very small pilot efforts.

Although setting up a panel file may be much more complicated than simply selecting a series of cross-sectional samples, panel files have one additional benefit. While the sampling variability of the estimates for each year should be about the same as it would be for a cross-sectional sample of the same size for each year, the sampling variability of the changes from one year to the next should be considerably smaller. This happens because the same units are observed in both years, so the two annual estimates are positively correlated and the differences between one year and the next truly are differences, not the results of selecting different samples.

IV. Survey Design and Content

A. The 1967-73 Individual SOI Panel

The 1967-73 panel was created by incorporating two four-digit social security number endings in each stratum of each Statistics of Income sample for those years. In other words, anybody whose SSN ended in one of those two combinations of digits was included in the larger, stratified sample selected to produce the annual Statistics of Income report. In theory, at least, this created a general-purpose panel at a very low cost. The cost of abstracting, keying, and testing important data items from selected tax returns was absorbed as part of the regular statistical processing.

One problem arose because an annual 2 percent delinquency rate added up to quite a few incomplete observational units over a seven-year period -- over 10 percent, as a matter of fact.
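The "over 10 percent" figure is consistent with simple compounding of the annual rate. The sketch below assumes a constant 2 percent delinquency rate that applies independently in each year, and it treats a unit as incomplete if it is delinquent in at least one year; neither assumption is stated in the text, and the function name is hypothetical.

```python
def cumulative_incomplete_share(annual_rate: float, years: int) -> float:
    """Share of units missing at least one year, assuming independent years."""
    return 1.0 - (1.0 - annual_rate) ** years


# Seven years at an assumed constant 2 percent annual delinquency rate.
share = cumulative_incomplete_share(0.02, 7)
print(f"{share:.1%}")
```

Under these assumptions about 13 percent of the units would be incomplete after seven years, which agrees with the statement that the loss exceeded 10 percent.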
Further complications arose because of the many tax law changes and consequent redesign of the tax forms over the 7-year period of the panel. Because of these changes, the file format changed considerably over the period, with old items being dropped and new ones added. IRS finally decided to create a completely new file format, which would work for all the years in the panel. Fields were created for all items that existed at any time during the 7-year period, and were filled in for those years for which they existed.

When the completeness of the file was evaluated starting from the 1973 taxpayers and going back only one year (i.e., to 1972), returns for 11.7 percent of the taxpayers in the sample were missing. Going back another year, some of the lost taxpayers reappeared, while others dropped out, for a net loss of 18.4 percent. By the time IRS had gone back 6 years to the beginning of the panel, no returns could be found for 32.6 percent of the 1973 taxpayers. The number for which IRS did not have complete records was closer to 50 percent. In spite of its limitations, the file proved useful in studying a number of issues.

B. The Capital Gains Panel

With Tax Year 1973, the Statistics of Income Division began assembling "capital gains panels." These are 5-year, retrospective/prospective panels, with the base year in the middle. A highly stratified sample of Schedule D returns (Capital Gains and Losses), with sampling rates ranging from 1/48,000 to 1/5, is selected for the middle year. The IRS Individual Master File is then used to locate the returns for the two previous years and, eventually, for the two following years. The returns are pulled, and details on each capital transaction are edited and transcribed.

C. 
The Estate Collation Study

While a panel of Forms 1040 can provide information about the realization of capital gains, and a panel of Schedule D data can indicate what type of assets have been traded and how long they have been held, neither shows how these relate to the total wealth of the taxpayer. Wealth, in fact, is reported at most once for any given taxpayer -- on Form 706 (Estate Tax Return), by the taxpayer's estate, after he or she has died. The purpose of the SOI's estate collation studies is to establish a connection between the income and the wealth of taxpayers, and to trace the transfer of wealth (and consequent changes in income) when a taxpayer dies. This is done by matching a decedent's estate tax return first to his or her income tax returns prior to death, then to the beneficiaries' income tax returns both before and after the death. In other words, this is a hybrid of every type of longitudinal study mentioned above: a retrospective and prospective, non-identical, transtemporal panel.

For the 1976 Estate Collation Study, IRS matched estate tax returns filed in 1977 with the decedent's income tax returns filed for the two previous years. In addition, IRS matched the income tax returns for nonspousal heirs to whom a bequest of $50,000 or more had been made, obtaining data for the two years before and the three years after the bequest.

D. Taxpayer Migration Data

This project is probably one of the largest panel studies ever undertaken. It is not done by the Internal Revenue Service, but it involves data files that are provided by IRS to the Bureau of the Census. The Census Bureau matches every computer record of individual income tax returns filed from January through September of a given year to the previous year's record.
The Census Bureau is given access to return records, among other things, to make intercensal population and income estimates, and to provide county and minor civil division level data to the Treasury Department for the Federal Revenue Sharing program. The matching of return records is in part an operational necessity. Taxpayers frequently use a business or Post Office Box address on their returns. Therefore, the Bureau persuaded IRS to put a question on the return about the exact governmental unit in which a taxpayer lives. However, this is done only once every few years -- the most recent year was 1980.

Among the series of data which Census creates from these files are matrices which show the origins and destinations of population shifts, and county migration data which show how many taxpayers entered and left each county within a given period of time, how many exemptions they claimed, and, for some years, the amount of income for the in-migrants, the out-migrants, and the non-migrants.

E. Department of Defense (DOD) Salary Study

The DOD Salary Study is the result of a public law passed by the U.S. Congress which requires the Department of Defense to perform an evaluation of the military pay structure at least once every four years. Part of this study entails following the earnings of persons who leave the Armed Forces -- "separatees," as DOD calls them -- to learn what the "opportunity costs" are for persons who stay in the Armed Forces.

The sample of separatees is chosen by DOD. New separatees are sampled each year. Once selected for the sample, the individual stays in it forever. DOD gives IRS the social security numbers of the new designees, along with codes indicating their DOD characteristics. By going to Forms W-2 (Wage and Tax Statements), rather than to income tax returns, IRS gets only the salaries of the individuals in the sample.

Because of the taxpayer's right to privacy, no identifiable data are returned to DOD.
All SSN's are removed from the data before they are sent back to DOD. Furthermore, DOD supplies IRS with at least three individuals with any given combination of DOD characteristics codes, so that there will not be any way to match back to the SSN's.

One of the limitations of this panel is that of missing data. Nothing on the Form W-2 indicates whether a person for whom data are missing is self-employed, unemployed, retired, or dead, or whether IRS has made a processing error. At this point, there is no alternative to simply leaving these individuals out of the analysis.

F. The Individual Panel Beginning with Tax Year 1979

The Tax Year 1979 sample was designed to study certain questions related to mortality and morbidity rates by occupation of taxpayer. Funds had been made available for this purpose by the Social Security Administration and the National Cancer Institute. Since future links to certain data items from the Social Security Administration's Continuous Work History Sample (CWHS) were anticipated, five SSN endings were chosen to overlap with the CWHS sample. There is now a 3-year panel of some 45,000 randomly selected tax return records, and a 4-year panel of 9,000 records.

G. Corporation Tax Adjustment Study (CORTAX)

This study is intended to quantify the effects of adjustments (through carrybacks of net operating losses and unused credits, IRS examination activity, etc.) to corporate tax liability after the corporation's original tax return (Form 1120 series) has been filed. By linking the employer identification numbers (EIN's) in the SOI corporate sample to their Business Master File (BMF) accounts, SOI expects to tabulate these adjustment amounts for all tax years on the BMF extract -- usually the most recent five or so.

For example, CORTAX 86 will commence in 1986 by extracting these adjustment data for Tax Years 1978-1982, using the Tax Year 1982 sample file of EIN's as the extract or link variables.
While a significant portion of the SOI corporate sample (like other SOI sampling frames) is already longitudinal, CORTAX will lend an additional longitudinal aspect with its five years of adjustment data for each CORTAX year's record. In addition, CORTAX will show cumulative adjustment effects (and, thus, annual changes) for certain tax years over time for the longitudinal "core" of records in the SOI corporate samples.

CORTAX 87 is expected to provide tax liability adjustment data for an accounting period range ending with Tax Year 1985, and may expand tabulations to include interest and penalty assessment amounts as well. Thereafter, CORTAX studies are planned for annual occurrence, and should continue to provide Treasury's Office of Tax Analysis and Congress' Joint Committee on Taxation with the supplemental data bases necessary for the development of more current and detailed tax policy/legislation analyses.

V. Future Studies

There is no doubt that longitudinal studies are essential to the IRS mandate to produce statistics on how the internal revenue laws are operating. A new estate collation study is being planned for 1982 decedents. In this new, improved study, wealth transferred to trusts and other estates, as well as to individuals, will be traced. One of the most ambitious plans is the study of Intergenerational Transfers of Wealth. The only time an actual accounting is available for an heir's wealth will be when that heir, in turn, passes away. This is what the study of intergenerational transfers is all about. By linking estate tax returns filed by succeeding generations of heirs -- a classic non-identical longitudinal study -- it is possible to study changes in the concentration of wealth during the history of the tax system, and the role intergenerational transfers of wealth have played in this process.
Additional plans for the future include improved individual panel studies using data from the Individual Master File of all tax return records, including one in which the postal ZIP code will be used to trace migration patterns. Also planned are additional capital gains panels, and a panel study of large private foundations.

Reports Available in the Statistical Policy Working Paper Series

1. Report on Statistics for Allocation of Funds; GPO Stock Number 003-005-00178-6, price $2.40
2. Report on Statistical Disclosure and Disclosure-Avoidance Techniques; GPO Stock Number 003-005-00177-8, price $2.50
3. An Error Profile: Employment as Measured by the Current Population Survey; GPO Stock Number 003-005-00182-4, price $2.75
4. Glossary of Nonsampling Error Terms: An Illustration of a Semantic Problem in Statistics (A limited number of copies are available from OMB)
5. Report on Exact and Statistical Matching Techniques; GPO Stock Number 003-005-00186-7, price $3.50
6. 
Report on Statistical Uses of Administrative Records; GPO Stock Number 003-005-00185-9, price $5.00   7. An Interagency Review of Time-Series Revision Policies (A limited number of copies are available from OMB)   8. Statistical Interagency Agreements (A limited number of copies are available from OMB)   9. Contracting for Surveys (Available through NTIS Document Sales, PB83-233148)   10. Approaches to Developing Questionnaires (Available through NTIS Document Sales, PB84-105055)     11. A Review of Industry Coding Systems (Available through NTIS Document Sales, PB84-135276)   12. The Role of Telephone Data Collection in Federal Statistics (Available through NTIS Document Sales, PB85- 105971)   13. Federal Longitudinal Surveys (Available through NTIS Document Sales, PB86-139730)   Copies of these working papers, as indicated, may be ordered from the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402 (202-783-3238) or from NTIS Document Sales, 5285 Part Royal Road, Springfield, VA 22161 (703-487-4650).      


 


Page Last Modified: April 20, 2007