Multistage Carcinogenesis and the Incidence of Thyroid Cancer in the US by Sex, Race, Stage and Histology

Rafael Meza; Joanne T. Chang


BMC Public Health. 2015;15(789) 

Materials and Methods

Data Sources

Thyroid cancer incidence data were obtained from the Surveillance, Epidemiology, and End Results (SEER)-9 registries for the years 1973–2010. We extracted reported thyroid cancer cases by sex, race, age, stage, histology and calendar year in the nine SEER geographic areas, which together represent an estimated 9.5 % of the U.S. population.[34] Thyroid cancer cases were coded using the International Classification of Diseases for Oncology, third edition (ICD-O-3).[35] We restricted our analysis by histology to papillary (8050, 8052, 8130, 8260, 8340–8344, 8450, 8452) and follicular (8290, 8330–8332, 8335) types. Person-years by sex, race, age, and calendar year were obtained from the SEER registry. We analyzed combined thyroid cancer incidence to allow for comparisons by race and sex, and also performed independent analyses by histology for all males and all females separately.
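The histology restriction above can be expressed as a simple lookup. The following is a minimal Python sketch, not the authors' extraction code: the set and function names are hypothetical, and the only information taken from the text is the list of ICD-O-3 histology codes, with ranges treated as inclusive.

```python
from typing import Optional

# ICD-O-3 histology codes as listed in the text; ranges are inclusive.
PAPILLARY = {8050, 8052, 8130, 8260, 8450, 8452} | set(range(8340, 8345))
FOLLICULAR = {8290, 8335} | set(range(8330, 8333))


def classify_histology(code: int) -> Optional[str]:
    """Return 'papillary' or 'follicular', or None for codes outside the analysis."""
    if code in PAPILLARY:
        return "papillary"
    if code in FOLLICULAR:
        return "follicular"
    return None
```

A record whose histology code returns None would simply be excluded from the histology-specific analyses under this sketch.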

Multistage Model and Age-Period-Cohort Analyses

We performed likelihood-based analyses of the incidence of thyroid cancer in the SEER registries using APC models, where we replaced the non-specific age effects of traditional APC analyses with the hazard of multistage models. Secular trends, i.e., period and cohort effects, were modeled in the usual fashion as described below.[28,31,32] This approach constrains the age effects parametrically and in principle resolves the non-identifiability problem of APC models, allowing us to jointly estimate the age, period and cohort trends.[36] Briefly, the thyroid cancer age-specific incidence at age a occurring in calendar year j is modeled as:

$$h_{ij}(a) = b_i \, c_j \, h(a)$$

where h(a) is the Two-Stage Clonal Expansion (TSCE) model hazard described below, cj is a coefficient that adjusts for calendar year j, and the coefficient bi adjusts for birth cohort i (i = j - a, stratified in 5-year groups: <1890, 1890–1894, …, 1985–1990, and ≥1991). We used single ages from 0–84 and single calendar years from 1973–2010. We then fitted the model to the number of observed thyroid cancer cases (papillary and follicular combined) stratified by age and calendar year. We obtained parameter estimates by maximizing the likelihood across all age-calendar-year strata, assuming that the number of cases in each stratum is Poisson distributed with mean Nij*hij(a), where Nij is the person-years at risk in the stratum indexed by birth cohort i and calendar year j, and hij(a) is as defined above. Separate analyses for all sex, race, stage, and histology combinations were also performed. Multistage model analyses were done using the Bhat likelihood minimization package in R (R version 3.0.3).
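To make the fitting criterion concrete, the sketch below computes the Poisson negative log-likelihood for a model of the form hij(a) = bi * cj * h(a), summed over age-by-calendar-year strata. It is an illustration in Python rather than the authors' R/Bhat code. The function names, the dictionary-based data layout, and the cohort labels are assumptions made for this example, and the baseline hazard is passed in as an arbitrary function of age.

```python
import math


def cohort_group(birth_year):
    """Label the birth-cohort group: <1890, 1890-1894, ..., 1985-1990, >=1991."""
    if birth_year < 1890:
        return "<1890"
    if birth_year >= 1991:
        return ">=1991"
    # The last regular group is 1985-1990, so 1990 is folded into "1985".
    return str(min(1890 + 5 * ((birth_year - 1890) // 5), 1985))


def neg_log_likelihood(cases, person_years, hazard, cohort_coef, period_coef):
    """Poisson negative log-likelihood over (age, calendar year) strata.

    cases, person_years: dicts keyed by (age, year).
    hazard: function of age returning the baseline hazard h(a).
    cohort_coef: dict of b_i keyed by cohort label; period_coef: dict of c_j keyed by year.
    """
    nll = 0.0
    for (age, year), n in cases.items():
        b = cohort_coef[cohort_group(year - age)]
        c = period_coef[year]
        mean = person_years[(age, year)] * b * c * hazard(age)
        if mean <= 0.0:
            # A zero expected count is only compatible with zero observed cases.
            if n == 0:
                continue
            return math.inf
        nll += mean - n * math.log(mean) + math.lgamma(n + 1)
    return nll
```

In practice this function would be handed to a numerical optimizer, with the hazard parameters and the period and cohort coefficients as the free arguments; one coefficient in each set typically has to be fixed as a reference level.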

For comparison, we also fitted traditional APC models using the Epi package in R,[37] and performed a joinpoint regression analysis using the statistical software Joinpoint, version 3.5 (Surveillance Research Program, US National Cancer Institute)[38] to characterize trends in age-adjusted incidence rates by sex, race, histology, and stage.

Two-Stage Clonal Expansion Model

The TSCE model posits that cells initiated via a Poisson process undergo clonal expansion and malignant conversion via a birth–death–mutation process, and is based on the initiation-promotion-malignant conversion paradigm in carcinogenesis. The details of this model are presented elsewhere.[27,29] Although the TSCE model is a simplification of carcinogenesis and does not necessarily incorporate current knowledge about the natural history of thyroid cancer, it does capture the main aspects of tumor initiation, promotion and malignant conversion, and thus has been used to analyze the temporal trends of a variety of cancers. In particular, this model and its generalizations have been used to analyze the incidence of a variety of cancers in SEER, including colorectal cancer,[28,31,33] esophageal cancer,[30,36] mesothelioma,[32] and pancreatic cancer.[31] The TSCE model has four biological parameters: the rate of initiation, μ0X; the rates of division, a, and death, b, of initiated cells; and the rate of malignant conversion, μ1. Figure 1 shows a schematic representation of the TSCE model. Not all four parameters can be estimated from cancer incidence data alone. We estimated three identifiable parameters as described below. With constant parameters, the hazard function for this model at age t (written t here because a denotes the division rate in this section) takes the following form:

$$h(t) = \frac{r \, p \, q \left( e^{-qt} - e^{-pt} \right)}{q \, e^{-pt} - p \, e^{-qt}}$$

Figure 1. Two-stage clonal expansion (TSCE) carcinogenesis model

where p and q are the roots of a quadratic equation, with p + q = -g = -(a - b - μ1) and p*q = -a*μ1. We estimated p (≈ -g), q, and r = μ0X/a, which together form a set of identifiable parameters. Note that -p is roughly the net rate of proliferation of initiated cells, g (since μ1 is a mutation rate and thus much smaller than a and b), q ≈ μ1/(1 - b/a), and r is related to the rate of tumor initiation.
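The relationship between the biological rates and the identifiable parameters can be checked numerically. The Python sketch below is an illustration only and is not taken from the authors' code. It assumes the standard constant-parameter TSCE hazard, h(t) = r p q (e^{-qt} - e^{-pt}) / (q e^{-pt} - p e^{-qt}), with the root convention p + q = -g and p*q = -a*μ1. The function names and the numerical values in the usage lines are hypothetical and chosen only to show orders of magnitude; division and death rates are named alpha and beta in the code to keep them distinct from age.

```python
import math


def tsce_identifiable(alpha, beta, mu1, mu0X):
    """Map biological rates (division alpha, death beta, conversion mu1,
    initiation mu0X) to the identifiable parameters (p, q, r)."""
    g = alpha - beta - mu1                      # net proliferation rate
    disc = math.sqrt(g * g + 4.0 * alpha * mu1)
    p = 0.5 * (-g - disc)                       # negative root, close to -g
    q = 0.5 * (-g + disc)                       # positive root, close to mu1 / (1 - beta/alpha)
    r = mu0X / alpha
    return p, q, r


def tsce_hazard(t, p, q, r):
    """TSCE hazard at age t with constant parameters.

    Numerator and denominator are both multiplied by exp(p*t) so that only a
    decaying exponential is evaluated, which avoids overflow at large t.
    """
    e = math.exp(-(q - p) * t)
    return r * p * q * (e - 1.0) / (q - p * e)


# Hypothetical rates per cell per year, for illustration only.
p, q, r = tsce_identifiable(alpha=10.0, beta=9.85, mu1=1e-6, mu0X=0.1)
hazard_at_60 = tsce_hazard(60.0, p, q, r)
```

Under these assumptions the hazard starts at zero, grows roughly exponentially at rate g at intermediate ages, and levels off toward -r*p at very old ages.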