Transcript of Interview with Kathy Cronin, Ph.D., mathematical statistician at the National Cancer Institute (NCI)

Why use statistical modeling to study breast cancer?

A number of clinical trials have been done in the past, and based on those trials, it's accepted that screening does save lives by reducing breast cancer mortality. But a number of questions are left unanswered, like what is the best screening schedule, and it's unlikely that those questions will be addressed directly by clinical trials, for several reasons. One is that screening is already widespread in the US population and has been for over 20 years. Also, there are guidelines in place based on the evidence we have today, and randomizing people into screening schedules that conflict with those guidelines might be problematic. So it's unclear whether women would be willing to participate in a trial to look at these specific questions. And on top of the ethical concerns, screening trials are very difficult because they start with healthy women and follow them all the way through for the outcome of breast cancer mortality. So they're very large trials that take a long time to follow up.

How did you decide which screening regimens to study?

There were a number of screening schedules that were considered. We started with [ages] 50 to 69 as the base case because the evidence is strongest for that age group. Then we considered a number of strategies that started earlier, to get an idea of the additional benefit of starting at 40 rather than 50, and strategies that continued to later ages, to see the additional benefit of extending screening past 69. We also looked at the interval between screenings, comparing annual screening with screening every other year to see what the change in benefit and the change in harms would be.
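To make that strategy grid concrete, here is a minimal Python sketch of how such a set of schedules might be enumerated. The specific start ages, stop ages, and intervals are illustrative stand-ins, not the exact set evaluated in the study.

```python
from itertools import product

# Illustrative strategy grid (not the exact set used in the study):
# vary the start age, stop age, and interval between screens.
start_ages = [40, 45, 50]   # earlier starts vs. the age-50 base case
stop_ages = [69, 74, 79]    # extending screening past age 69
intervals = [1, 2]          # annual vs. every-other-year screening

strategies = [
    {"start": start, "stop": stop, "interval_years": interval}
    for start, stop, interval in product(start_ages, stop_ages, intervals)
]

for s in strategies:
    screen_ages = list(range(s["start"], s["stop"] + 1, s["interval_years"]))
    print(f"Start {s['start']}, stop {s['stop']}, "
          f"every {s['interval_years']} yr -> {len(screen_ages)} screens")
```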

How do the models you studied compare, and how do they differ?

The 6 models are all independently designed, in that each has its own approach to modeling disease progression and its own way of estimating what the benefit of screening would be. But to do this joint analysis, they all agreed to use a common set of inputs and to report a common set of results so that we could compare across the models. So all 6 models looked at the same screening strategies. They also assumed the same characteristics of the test and the same treatments. And for each screening strategy, they all produced the same measures of benefit, which were the percent mortality decline compared to no screening and the years of life gained. And they looked at the same harms, which were false positive exams, unnecessary biopsies, and over-diagnosis.
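One way to picture this arrangement is as a shared input/output contract: each model keeps its own internal machinery but accepts the same strategy description and reports the same benefit and harm measures. The sketch below is a hypothetical illustration of that contract; the class, function names, and all numbers are invented, not drawn from the actual models.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Results:
    # The common outputs every model reports for a given strategy.
    pct_mortality_decline: float   # percent decline vs. no screening
    life_years_gained: float       # per 1000 women
    false_positives: float         # per 1000 women screened
    unnecessary_biopsies: float
    overdiagnosed_cases: float

# Each model keeps its own natural-history machinery but conforms to
# the same contract: same strategy in, same measures out.
def compare(models: dict[str, Callable[[dict], Results]], strategy: dict) -> None:
    for name, run in models.items():
        r = run(strategy)
        print(f"{name}: {r.pct_mortality_decline:.1f}% mortality decline, "
              f"{r.life_years_gained:.0f} life-years gained per 1000 women")

# A stand-in model with made-up numbers, just to show the interface.
def toy_model(strategy: dict) -> Results:
    n_screens = (strategy["stop"] - strategy["start"]) // strategy["interval_years"] + 1
    return Results(15.0, 120.0, 6.0 * n_screens, 8.0, 5.0)

compare({"toy_model": toy_model}, {"start": 50, "stop": 69, "interval_years": 2})
```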

Can the statistical models account for all of the variability that real-life conditions present?

Like any model, they really cannot account for everything. The approach we've used is to identify the elements that we think are most influential for the question of interest, focus our energies on those elements, and get them right. In this case, we designed the study specifically to look at different screening schedules, and what was modeled was an idealized version in which, for each screening schedule, every woman would follow it exactly. So we weren't trying to model what would actually happen in the population, but rather a hypothetical situation that would give insight into how the screening schedules compare in terms of harms and benefits. It's also important to realize that it was a population study, so we looked at the whole population. We didn't specifically look at women at higher risk of breast cancer, and we didn't consider whether a woman was healthier or sicker than the general population. Nor did we include every possible outcome that could be associated with screening. We didn't look at morbidity that might be reduced if a woman were diagnosed earlier with breast cancer, and we didn't consider quality-of-life issues that might be associated with false positive tests or over-diagnosis.

For the models to work, what assumptions did you have to make about the progression of breast cancer?

Well, the models actually make a range of assumptions. They have to assume something about disease progression before it's clinically detected, in the preclinical phase. That phase is not directly observable, which is why assumptions need to be made. Some of the models assume a series of discrete states and that a tumor progresses from one state to another; others use a continuous tumor growth model. For a tumor to be screen detected, a test has to occur during the preclinical phase, and whether or not that screening test detects the tumor depends on its stage or size at that point in the disease.

There are also a number of assumptions related to DCIS (ductal carcinoma in situ) and the possibility of tumors that are non-progressive. Some of the models include DCIS as a precursor to invasive disease. The developers of other models feel that there is too much uncertainty around the natural history of DCIS, and they don't include it. There are also some models that specifically assume there is a fraction of disease that is either non-progressive or progresses so slowly that it is likely to be over-diagnosed; other models do not specifically assign a fraction of disease as having limited or no potential to progress. Whether or not you believe there is a percentage of disease that would never progress will certainly affect your estimate of over-diagnosis.

One of the goals of CISNET (the Cancer Intervention and Surveillance Modeling Network) is to make modeling as transparent as possible, so one of the things they've done is create a website where the models are described in detail. So if somebody wanted to look specifically at the assumptions each model makes and how they differ between models, they could.
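To make the preclinical-detection mechanics described above concrete, here is a minimal, self-contained simulation sketch. Every number in it (onset age, mean sojourn time, screen sensitivity) is invented purely for illustration and does not come from any of the six models; it simply shows how a screen can only catch a tumor if a test falls inside the preclinical window, with detection probability standing in for size or stage.

```python
import random

def simulate_woman(screen_ages, rng):
    """One illustrative natural history: a tumor enters a preclinical
    (screen-detectable but asymptomatic) phase at some age, then either
    surfaces clinically or is caught by a screen during that window.
    For simplicity every simulated woman develops a tumor; a real model
    would also simulate the majority who never do."""
    onset = rng.gauss(60, 10)            # age the preclinical phase begins
    sojourn = rng.expovariate(1 / 3.0)   # years spent preclinical (mean 3)
    clinical = onset + sojourn           # age of clinical diagnosis

    for age in screen_ages:
        if onset <= age < clinical:
            # Detection is probabilistic: sensitivity rises the longer the
            # tumor has been growing (a stand-in for size/stage).
            sensitivity = min(0.9, 0.4 + 0.15 * (age - onset))
            if rng.random() < sensitivity:
                return "screen-detected", age
    return "clinically detected", clinical

rng = random.Random(42)
biennial = list(range(50, 70, 2))        # screens at ages 50, 52, ..., 68
outcomes = [simulate_woman(biennial, rng) for _ in range(10_000)]
frac = sum(mode == "screen-detected" for mode, _ in outcomes) / len(outcomes)
print(f"Fraction screen-detected under biennial 50-69: {frac:.2f}")
```

Swapping in an annual schedule, a different sojourn-time distribution, or a non-progressive fraction of tumors is exactly the kind of assumption change that makes the models' over-diagnosis estimates differ.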