
Comparability Test

The comparability test compares two sets of trend data whose mean functions are represented by joinpoint regression. It tests either:

  1. whether the two joinpoint regression mean functions are identical (test of coincidence), or
  2. whether the two joinpoint regression mean functions are parallel (test of parallelism).

Details can be found in Kim et al. (2004):

H.-J. Kim, M. P. Fay, B. Yu, M. J. Barrett and E. J. Feuer (2004), Comparability of Segmented Line Regression Models, Biometrics, 60(4), 1005-1014.

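Consider the joinpoint regression mean function at the jth year of the ith group, $x = x_{ij}$,

$$E[y_{ij} \mid x_{ij}] = \beta_{i,0} + \beta_{i,1} x_{ij} + \delta_{i,1}(x_{ij} - \tau_{i,1})^{+} + \cdots + \delta_{i,\kappa}(x_{ij} - \tau_{i,\kappa})^{+}, \qquad a^{+} = \max(a, 0),$$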

where $\kappa$ is the unknown number of joinpoints, the $\tau$'s are the unknown joinpoints, and the $\beta$'s and $\delta$'s are the regression coefficients. In general, the two groups may have different numbers of joinpoints, say $\kappa_i$ for the ith group, but the program fits both groups with a common, larger model with $k_{\max}$ joinpoints.
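The mean function is piecewise linear: a straight line whose slope changes by one δ coefficient at each joinpoint τ. The following is a minimal sketch of evaluating such a function for a single group, assuming the standard segmented-line form. The function name `joinpoint_mean` and its argument names are illustrative and do not come from the Joinpoint software.

```python
def joinpoint_mean(x, beta0, beta1, deltas, taus):
    """Evaluate a segmented-line mean at x.

    Assumed form: beta0 + beta1 * x + sum_k deltas[k] * max(x - taus[k], 0).
    All names here are hypothetical, chosen for illustration only.
    """
    if len(deltas) != len(taus):
        raise ValueError("deltas and taus must have the same length")
    value = beta0 + beta1 * x
    for delta, tau in zip(deltas, taus):
        # Each joinpoint contributes only to the right of its location.
        value += delta * max(x - tau, 0.0)
    return value
```

With no joinpoints (empty `deltas` and `taus`) the function reduces to an ordinary straight line, which is the κ = 0 case.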

What are the hypotheses?

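The null hypotheses for the test of coincidence and the test of parallelism are

$$H_0: \beta_{1,0} = \beta_{2,0},\ \beta_{1,1} = \beta_{2,1},\ \tau_{1,l} = \tau_{2,l},\ \delta_{1,l} = \delta_{2,l} \quad (l = 1, \ldots, \kappa)$$

and

$$H_0: \beta_{1,1} = \beta_{2,1},\ \tau_{1,l} = \tau_{2,l},\ \delta_{1,l} = \delta_{2,l} \quad (l = 1, \ldots, \kappa)$$

respectively. Coincidence requires every parameter to match across the two groups, whereas parallelism leaves the two intercepts free to differ.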

What is the test statistic?

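The test statistic with $k = k_{\max}$ is

$$T = \frac{(RSS_0 - RSS_1)/d_1}{RSS_1/d_2},$$

where $RSS_0$ and $RSS_1$ denote the residual sums of squares obtained from least squares fitting under the null and alternative models, respectively, and $d_1$ and $d_2$ are the appropriate degrees of freedom.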
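As a rough illustration, the statistic can be computed from the two residual sums of squares once both models have been fitted. This sketch assumes the usual F-type ratio: the drop in RSS from the null to the alternative model per d1, divided by the alternative model's RSS per d2. The function name is hypothetical, and the degrees of freedom are supplied by the caller rather than derived here.

```python
def comparability_statistic(rss_null, rss_alt, d1, d2):
    """F-type ratio ((rss_null - rss_alt) / d1) / (rss_alt / d2).

    rss_null, rss_alt: residual sums of squares under the null and
    alternative models. d1, d2: degrees of freedom chosen by the caller.
    This is an assumed form, not the Joinpoint program's own code.
    """
    if d1 <= 0 or d2 <= 0:
        raise ValueError("degrees of freedom must be positive")
    if rss_alt <= 0:
        raise ValueError("rss_alt must be positive")
    return ((rss_null - rss_alt) / d1) / (rss_alt / d2)
```

Larger values indicate that relaxing the null constraints reduces the residual sum of squares by more than the alternative model's noise level would suggest.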

How is $k_{\max}$ chosen?

The recommended data-driven choice of $k_{\max}$ is the largest number of joinpoints estimated under the null and alternative models:

$$k_{\max} = \max(\hat{\kappa}_0, \hat{\kappa}_1, \hat{\kappa}_2),$$

where $\hat{\kappa}_0$ is the number of joinpoints estimated for the two groups combined under the null model, and $\hat{\kappa}_1$ and $\hat{\kappa}_2$ are the numbers of joinpoints estimated separately for each group.
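In code, this choice is simply a maximum over the three estimated joinpoint counts. The helper below is a hypothetical sketch and assumes the three counts have already been estimated elsewhere.

```python
def choose_kmax(kappa_null, kappa_group1, kappa_group2):
    """Return the largest of the three estimated joinpoint counts.

    kappa_null: count estimated for the combined groups under the null model.
    kappa_group1, kappa_group2: counts estimated separately for each group.
    """
    counts = (kappa_null, kappa_group1, kappa_group2)
    if any(k < 0 for k in counts):
        raise ValueError("joinpoint counts must be non-negative")
    return max(counts)
```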

How is the p-value estimated?

The p-value of the test is estimated from the permutation distribution of the test statistic. The residuals obtained under the null model are permuted to generate that distribution, and the p-value is estimated as the proportion of permuted data sets whose test statistic exceeds the original test statistic.
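The permutation procedure can be sketched as follows. The sketch assumes the caller supplies the fitted values and residuals from the null model, plus a function `statistic` that refits both models to a response vector and returns the test statistic. All names, the default number of permutations, and the seeding scheme are illustrative assumptions, not the Joinpoint program's implementation.

```python
import random


def permutation_p_value(fitted_null, residuals_null, statistic,
                        n_perm=999, seed=0):
    """Estimate a p-value by permuting null-model residuals.

    fitted_null, residuals_null: equal-length sequences from the null fit.
    statistic: hypothetical callable mapping a response list to a number.
    """
    if len(fitted_null) != len(residuals_null):
        raise ValueError("fitted values and residuals must align")
    if n_perm <= 0:
        raise ValueError("n_perm must be positive")
    rng = random.Random(seed)
    # The original data are the null fit plus the unpermuted residuals.
    observed = statistic([f + r for f, r in zip(fitted_null, residuals_null)])
    shuffled = list(residuals_null)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        y_perm = [f + r for f, r in zip(fitted_null, shuffled)]
        # Count permuted data sets whose statistic is strictly greater.
        if statistic(y_perm) > observed:
            exceed += 1
    return exceed / n_perm
```

Fixing `seed` makes the estimate reproducible. A larger `n_perm` reduces the Monte Carlo error of the estimated proportion, at the cost of more refits.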