NLS: That All May Read

Patron Survey (2003)

3.0 Methodology

The methodology employed to conduct the NLS Patron Survey is presented in the following sections: 

Sampling Plan and Procedures

The population of interest consisted of active NLS subscribers who were adults (ages 18 and over, born in 1985 or before) living in one of the 50 United States. The sampling frame was constructed from the Comprehensive Mailing List System (CMLS). Because there were likely to be important age or generational differences in subscribers’ needs with respect to the new Talking Book technology, the design involved stratified sampling based on age. Study participants who represented a given age stratum, based on CMLS data, were systematically selected from the total population of NLS patrons in that age group. The strata were defined by the following age groups: 18 to 39, 40 to 64, 65 to 84, 85 and over, and age unknown.

The first four strata corresponded to young adulthood, middle age, early-to-mid old age, and late old age. The fifth stratum consisted of the subscribers for whom CMLS information on year of birth was missing (2% – 3%). These subscribers were included in the sampling frame to avoid bias from any systematic differences that may have existed between patrons for whom birth year was available from the CMLS and patrons for whom it was missing. For final analysis, these "age unknown" cases were reclassified into the appropriate age group based on information that participants supplied during the interviews.

Stratified random sampling, specifically disproportionate stratified random sampling, was used rather than simple random sampling in order to narrow confidence intervals, increase the precision of the findings, and make sure the sample contained enough subscribers in each age group for comparisons. Instead of sampling each age group at a rate that reflected its prevalence in the total NLS population, each age group was sampled at a rate that produced age subgroups of approximately equal size. Stated differently, the younger age groups (which were less numerous in the NLS population) were over-sampled, and the older age groups (which were more numerous) were under-sampled.

Create Sampling Frame

The sampling frame consisted of all study-eligible NLS patrons as defined in Exhibit 3-1 below:

Exhibit 3-1.  Eligibility Criteria for the Initial CMLS Sampling Frame

An eligible subscriber is . . .

Eligibility Criteria | Definition
An individual | Subscriber is a person, not an institution
A U.S. resident | Lives in one of the 50 states
An active subscriber | Status listed in CMLS as "active"
Age: An adult age 18 or over or someone whose age is unknown | The subscriber either (1) was born in 1985 or before or (2) date of birth is unknown

To create the frame, we removed ineligible records, removed duplicate cases, created age group-specific sampling frames, and scrambled the five frames in preparation for sample selection. 
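The frame-building steps above can be sketched in code. This is only an illustration under stated assumptions: the `Subscriber` record and its field names are hypothetical (the actual CMLS layout is not described here), duplicates are assumed to share a subscriber ID, and age is approximated as 2003 minus the year of birth.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional

SURVEY_YEAR = 2003  # assumption: age is approximated as survey year minus birth year
STRATA = ["18-39", "40-64", "65-84", "85+", "unknown"]


@dataclass(frozen=True)
class Subscriber:
    """Hypothetical record layout; not the real CMLS schema."""
    subscriber_id: str
    is_individual: bool       # a person, not an institution
    in_50_states: bool        # lives in one of the 50 states
    status: str               # e.g. "active"
    birth_year: Optional[int]  # None when the date of birth is unknown


def is_eligible(s: Subscriber) -> bool:
    """Apply the four criteria listed in Exhibit 3-1."""
    adult_or_unknown = s.birth_year is None or s.birth_year <= 1985
    return (s.is_individual and s.in_50_states
            and s.status == "active" and adult_or_unknown)


def age_stratum(s: Subscriber) -> str:
    """Assign an eligible subscriber to one of the five strata."""
    if s.birth_year is None:
        return "unknown"
    age = SURVEY_YEAR - s.birth_year
    if age <= 39:
        return "18-39"
    if age <= 64:
        return "40-64"
    if age <= 84:
        return "65-84"
    return "85+"


def build_frames(records: List[Subscriber], seed: int) -> Dict[str, List[Subscriber]]:
    """Drop ineligible records and duplicates, split by stratum, then shuffle."""
    frames: Dict[str, List[Subscriber]] = {name: [] for name in STRATA}
    seen = set()
    for s in records:
        if not is_eligible(s) or s.subscriber_id in seen:
            continue
        seen.add(s.subscriber_id)
        frames[age_stratum(s)].append(s)
    rng = random.Random(seed)
    for frame in frames.values():
        rng.shuffle(frame)  # scramble each frame before sample selection
    return frames
```

Shuffling each frame separately keeps the five strata independent, so a sample can later be drawn from each one at its own rate.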

Sample Selection

We used systematic sampling with a random start (SSRS) to select the sample for each age group. After determining the sampling interval for each age group, we used a table of random numbers to select the first person. After that, participants were selected systematically, according to the appropriate sampling interval. 
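A minimal sketch of systematic sampling with a random start follows. It assumes a whole-number sampling interval (frame size divided by target sample size, rounded down) and uses Python's pseudo-random generator in place of the table of random numbers described above; the function name is illustrative.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def systematic_sample(frame: Sequence[T], n: int,
                      rng: Optional[random.Random] = None) -> List[T]:
    """Select n items from frame at a fixed interval after a random start."""
    if n <= 0 or n > len(frame):
        raise ValueError("n must be between 1 and the size of the frame")
    rng = rng or random.Random()
    interval = len(frame) // n       # sampling interval (assumed whole number)
    start = rng.randrange(interval)  # random start within the first interval
    # Every interval-th record after the start is selected.
    return [frame[start + i * interval] for i in range(n)]
```

For example, `systematic_sample(list(range(100)), 10)` returns ten values spaced exactly ten apart, beginning at a randomly chosen position between 0 and 9.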

The main sample was to include 725 subscribers: 175 or 176 in each of age groups 1-4, plus 22 subscribers with unknown birth dates. The first reserve sample was to include 250 subscribers: 60 or 61 in each of age groups 1-4, plus 8 subscribers with unknown birth dates. (The actual main and reserve samples received from NLS included 723 and 250 subscribers, respectively.) We anticipated that this sample would generate a total of 500 completed interviews (125 in each of the 4 age groups). However, the original main and reserve lists included subscribers with missing or incomplete phone information as well as subscribers who were deceased or suspended. Once cleaned, the main and reserve sample lists included 522 and 180 subscribers, respectively. These 702 subscribers did not produce the number of completed interviews needed, because some subscribers could not be reached or did not finish the interview, so NLS drew an additional sample of 1,001 subscribers. From this second reserve sample, 744 subscribers were considered eligible; however, only 549 were needed as replacements for the main and first reserve samples. Exhibit 3-2 provides a breakdown of the sample size and number of interviews completed from each sample.

Exhibit 3-2.  Sizes for the Main Sample, the Reserve Samples, and Number of Completed Interviews by Sample Category

Measure | Main | Reserve 1 | Reserve 2 | Total
Size of sample received from NLS | 723 | 250 | 1,001 | 1,974
Size of cleaned sample (ineligibles removed prior to calling) | 522 | 180 | 549* | 1,251
% of original NLS sample determined eligible prior to time of calling | 72% | 72% | 55% | 63%
Total completed interviews by category | 210 | 77 | 160 | 447

* A total of 744 (or 74%) of reserve 2 was considered eligible prior to calling; however, only the replacements needed to complete age group interviews (a total of 549 subscribers) were included in the Reserve 2 sample.

Questionnaire Development

At the start of the project, key issues of the survey were identified and served as the foundation for developing the questionnaire.  These key issues were as follows: 

The questionnaire consisted of 82 questions and was designed to be completed within 25 minutes.  Appendix A presents the NLS Patron Survey Questionnaire.   In addition to the survey questions, the response codes for the survey were developed and are defined in Appendix B.  

Pre-Test and Interviewer Training

Prior to conducting the survey, interviewers received training which included an overview of NLS, its subscribers, its playback equipment, and the NLS Patron Survey.  Training included the rationale for the information collection, the survey questions, refusal script, and use of the computer-assisted telephone interviewing (CATI) system.  Training activities also included role-playing and mock interviews to prepare the interviewers for potential questions about the survey and refusals to participate.   Following training, the questionnaire was pre-tested on 20 subscribers from the sample.   The results of pre-testing indicated that subscribers understood the questionnaire and interviewers were able to complete the questionnaire within 25 minutes.  Only minor script and questionnaire adjustments were necessary.

Once NLS approved the content of the questionnaire, the survey questions and the interviewers’ script were programmed into a CATI system.  The programmed version was carefully checked for proper execution.  Test data were keyed into the system to ensure proper operation.  This method ensured that once the data collection began, the risk of operational errors or delays was minimized. 

Additionally, a pre-survey letter was sent to all eligible subscribers in the main and first reserve sample to announce the survey and inform subscribers they might be asked to participate.  The pre-survey letter also indicated that participation was voluntary and responses were confidential.  A copy of the pre-survey letter is provided in Appendix C.

Data Collection

Data collection took place between October 10 and November 14, 2003. CR Dynamics conducted telephone interviews Monday through Saturday from 10 a.m. to 9 p.m. Eastern Time.

After the first two weeks of calling, it became clear that the main and first reserve sample would not produce 500 completed interviews.  As described previously, NLS drew a second reserve sample of 1,001 subscribers.  Of those, 744 subscribers were determined eligible; however only 549 were used as replacements in the age groups for which we had not already achieved 125 completed interviews. 

The CATI system managed the sample, automatically produced status reports, dialed the subscribers’ telephone numbers, and scheduled at least 5 callbacks to subscribers who were not reached on the initial call. 

Cooperation Rates

The cooperation rate is all people interviewed as a percentage of all eligible people for whom contact was ever attempted. NLS sought to achieve a cooperation rate of 70 percent or better based on the Cooperation Rate 2 (COOP2) formula for random digit dialing (RDD) surveys as described in the American Association for Public Opinion Research report Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys.

As defined in American Association for Public Opinion Research. 2000. Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys. Ann Arbor, Michigan: AAPOR, the formula for COOP2 is (I + P) / ((I + P) + R + O).

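The COOP2 formula counts as respondents people who provided either a complete or a partial interview. The denominator encompasses all eligible people selected for the sample for whom contact was attempted, including respondents (I+P), refusals and break-offs (R), and other eligible non-respondents (O).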

For the NLS Patron Survey, the criteria used for defining complete and partial interviews reflected NLS’ particular interest in subscribers’ demographic characteristics and the physical and cognitive capabilities most likely to affect machine operation. The criteria were as follows: 

Using the COOP2 formula, the NLS Patron Survey achieved a cooperation rate of 51 percent for the entire sample. The rate, however, varied considerably among age groups, as Exhibit 3-3 shows. We met our goal of a 70% cooperation rate for subscribers between the ages of 65 and 84 (72%), and we approached that goal for subscribers in the 40 to 64 age group (65%). The cooperation rates for the youngest and oldest subscribers, however, were much lower (33% and 44% respectively).

Exhibit 3-3.  Cooperation Rate 2 Data for Age Groups and Entire Sample

Interview Status | 18-39 | 40-64 | 65-84 | 85+ | Unknown* | Total
Complete plus Partial Interview (I+P) | 69 | 134 | 129 | 115 | 0 | 447
Refusal plus Break Off (R) | 43 | 41 | 37 | 102 | 14 | 237
Other Eligible Non-Respondents for whom contact was attempted (O) | 99 | 31 | 14 | 42 | 7 | 193
Total (I+P+R+O) | 211 | 206 | 180 | 259 | 21 | 877
COOP2 Rate | 33% | 65% | 72% | 44% | -- | 51%

*Note for the Unknown column: Eligible non-respondents whose age was unknown based on CMLS data are included in the COOP2 rate. They remain classified as unknown because no interview took place to establish their birth date. Completed interviews of age unknowns were reclassified to the appropriate age group following completion of the survey.
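The rates in Exhibit 3-3 can be reproduced from its counts. The sketch below applies the COOP2 ratio, (I+P) divided by (I+P)+R+O, to each age group and to the whole sample; the function and variable names are illustrative, and the Unknown column is left out because the exhibit reports no rate for it.

```python
def coop2(complete_plus_partial: int, refusals: int, other: int) -> float:
    """Cooperation Rate 2: (I+P) / ((I+P) + R + O)."""
    denominator = complete_plus_partial + refusals + other
    if denominator == 0:
        raise ValueError("no eligible cases")
    return complete_plus_partial / denominator


# (I+P, R, O) counts copied from Exhibit 3-3
counts = {
    "18-39": (69, 43, 99),
    "40-64": (134, 41, 31),
    "65-84": (129, 37, 14),
    "85+": (115, 102, 42),
    "Total": (447, 237, 193),
}

# Whole-percent rates, rounded the same way the exhibit reports them
rates = {group: round(100 * coop2(*c)) for group, c in counts.items()}
```

The Total row includes the 21 age-unknown non-respondents in its denominator, which is why it is entered directly rather than summed from the four age groups.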

Data Analysis Issues

The following section presents two factors that affected the analysis of the NLS Patron Survey: 1) assessment of the survey’s eligible non-respondents, and 2) how the completed interview data were weighted.

Assessment of Non-Respondents

By a margin of almost two to one, non-participation of eligible non-respondents resulted from unwillingness to participate rather than subscribers’ apparent busyness or mobility (63% vs. 37% of all eligible non-respondents). The single most common reason for non-participation by far is refusal to be interviewed. Subscribers who refused make up almost half of all eligible non-respondents (207 out of 430, or 48%). Further, data collection took place just after the Federal government established its do-not-call registry. Although the program does not prohibit calls for research purposes or calls to current customers, we lost an additional 34 eligible subscribers who said they had signed up for this registry. When added to the subscribers who declined to participate on other grounds, people who refused to begin the interview make up more than half of all eligible non-respondents (241/430, or 56%).  Exhibit 3-4 presents the interview termination codes for non-respondents by age group.

The reasons for non-response are strongly age-related. Although unwillingness to participate occurs in all age groups, it increases with age. It is least common among the young adults (39%). Unwillingness accounts for much of the non-response among working-age adults (64%) and nearly all of the non-response among retirement-age subscribers (78% to 80%).

The youngest subscribers’ apparent busyness or mobility accounts for much of the non-response in this age group (61%). People between the ages of 18 and 39, moreover, make up more than half of all eligible subscribers whose non-participation is attributable to our inability to establish contact with them.

Ineligible non-respondents were those subscribers who were either found to be ineligible at the time of interviewer contact or were never reached because of inaccurate phone information. Most of the non-respondents in this group (78%) were ineligible because of their contact information. Again, the younger age groups (18 – 64) had the highest rate (81%) of disconnected numbers, fax lines, or wrong numbers; however, in all age groups inaccurate phone information proved to be an obstacle in contacting potential respondents.

Exhibit 3-4.  Interview Termination Codes for Eligible Non-Respondents and Ineligible Subscribers by Age. Age is based on CMLS information.

Termination Code | 18-39 | 40-64 | 65-84 | 85+ | Unknown | TOTAL
I. Eligible non-respondents:
A. Uncooperativeness
Break-off | 6 | 6 | 5 | 12 | 1 | 30
Do not call | 13 | 5 | 4 | 11 | 1 | 34
Refusal | 37 | 35 | 32 | 90 | 13 | 207
Total uncooperative | 56 | 46 | 41 | 113 | 15 | 271
B. Busyness or mobility
Answering machine | 20 | 0 | 0 | 6 | 4 | 30
Away | 39 | 15 | 8 | 10 | 0 | 72
Busy | 2 | 3 | 0 | 3 | 0 | 8
No answer | 25 | 8 | 2 | 12 | 2 | 49
Total busy or mobile | 86 | 26 | 10 | 31 | 6 | 159
Total eligible non-respondents | 142 | 72 | 51 | 144 | 21 | 430
% of eligible non-respondents who were . . .
Uncooperative | 39% | 64% | 80% | 78% | 71% | 63%
Busy or mobile | 61% | 36% | 20% | 22% | 29% | 37%
II. Ineligible subscribers
A. Found ineligible at time of interviewer contact | 24 | 16 | 10 | 28 | 6 | 84
B. Ineligible due to inaccurate phone info
Disconnected | 38 | 32 | 18 | 26 | 5 | 119
Fax line | 1 | 0 | 5 | 1 | 0 | 7
Wrong phone number | 61 | 35 | 18 | 44 | 6 | 164
Total ineligible due to inaccurate phone info | 100 | 67 | 41 | 71 | 11 | 290
Total ineligible | 124 | 83 | 51 | 99 | 17 | 374
Ineligibility due to inaccurate phone information: % of ineligible subscribers | 81% | 81% | 80% | 72% | 65% | 78%

All data were weighted during analysis to adjust for the disproportionate stratified sampling strategy. The weight applied to each member of a particular subgroup was based on a formula that took into account the rate at which members of that group were sampled, the total sample size, and the total population size. In short, weighting restored the ability to generalize from the stratified sample to all NLS subscribers.  Exhibit 3-5 presents the sample weights for the survey and Exhibit 3-6 presents the unweighted and weighted data for each age group.

Exhibit 3-5.  Sample Weights for NLS Patron Survey (see note)

Age group | Pop. N | Sample n | Sampling interval (N/n) | Sampling rate (1/sampling interval) [1] | 1/sampling rate | Total sample N / Total pop. N [1] | Stratum weight
18 – 39 | 56,476 | 69 | 818 | 1.2224 | 0.8181 | 1.2155 | 0.9944
40 – 64 | 89,056 | 134 | 665 | 1.5037 | 0.6650 | 1.2155 | 0.8083
65 – 84 | 129,725 | 129 | 1,006 | 0.9940 | 1.0060 | 1.2155 | 1.2228
85+ | 92,472 | 115 | 804 | 1.2437 | 0.8040 | 1.2155 | 0.9773
Total | 367,729 | 447 | | | | |

Exhibit 3-5 note. Sample Weight Formula: The weight for each stratum = (1 / sampling rate) x (Total sample N / Total population N)

[1] Figures were multiplied by 1,000 to make them less awkward to work with. Because 1,000 was used as a constant, i.e., it was applied to both elements of the weight formula, this strategy does not distort the resulting weights.

Exhibit 3-6.  Unweighted and Weighted Data by Age Group

Age Group | Unweighted n | Weighted n | Percent of subscribers in each age group (weighted n/447)
18 – 39 | 69 | 69 | 16%
40 – 64 | 134 | 108 | 24%
65 – 84 | 129 | 158 | 35%
85 & over | 115 | 112 | 25%
Sample total | 447 | 447 | 100%

The percentages, although calculated using weighted sample data, describe the age distributions both for the sample and for the portion of the NLS population from which the sample was drawn (i.e., subscribers age 18 or over who were eligible for the study based on CMLS information). Our weighting procedure ensured that the age distributions in the sample and in the population are the same. One would obtain identical results by using the NLS population information shown in Exhibit 3-5. For example, according to CMLS information, there are 92,472 study-eligible subscribers who are at least 85 years old; dividing this figure by 367,729 (the total number of study-eligible subscribers) would show that people in the 85+ age group make up 25% of the total study-eligible population, the same figure presented in Exhibit 3-6.
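The weighting arithmetic behind Exhibits 3-5 and 3-6 can be sketched as follows. The population and sample counts are taken from Exhibit 3-5; the function name is illustrative, and rounding the sampling interval to a whole number is an assumption made to match the intervals shown in the exhibit. Because the factor of 1,000 cancels, the weight reduces to the sampling interval multiplied by total sample n over total population N.

```python
# (population N, sample n) for each age stratum, from Exhibit 3-5
STRATUM_COUNTS = {
    "18-39": (56_476, 69),
    "40-64": (89_056, 134),
    "65-84": (129_725, 129),
    "85+": (92_472, 115),
}

TOTAL_POP = sum(pop for pop, _ in STRATUM_COUNTS.values())  # 367,729
TOTAL_N = sum(n for _, n in STRATUM_COUNTS.values())        # 447


def stratum_weight(pop_n: int, sample_n: int) -> float:
    """Weight = (1 / sampling rate) x (total sample n / total population N)."""
    interval = round(pop_n / sample_n)  # assumption: whole-number interval
    # 1 / sampling rate equals the sampling interval itself.
    return interval * TOTAL_N / TOTAL_POP


weights = {g: stratum_weight(pop, n) for g, (pop, n) in STRATUM_COUNTS.items()}

# Weighted n for a stratum is its unweighted n times its weight.
weighted_n = {g: STRATUM_COUNTS[g][1] * w for g, w in weights.items()}
```

Computed this way, the weights agree with Exhibit 3-5 to about three decimal places; the small remaining differences come from the rounding of intermediate figures in the exhibit. The weighted counts round to the values in Exhibit 3-6.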

Once weighted, the data were analyzed using SPSS 12.0. The next section presents the data analysis.

Posted on 2006-05-30