Incidence of Extrahepatic Cancers Among Individuals With Chronic Hepatitis B or C Virus Infection

A Nationwide Cohort Study

Chai Yeong Hong; Dong Hyun Sinn; Danbee Kang; Seung Woon Paik; Eliseo Guallar; Juhee Cho; Geum-Youn Gwak


J Viral Hepat. 2020;27(9):896-903. 

Materials and Methods

Study Population and Design

Korea has a single-payer national health system, the National Health Insurance Service (NHIS), which maintains records of all reimbursed inpatient and outpatient visits, procedures and prescriptions. For this retrospective population-based cohort study, we used the NHIS-National Sample Cohort (NHIS-NSC), which contains data on 2.2% of all Korean citizens.[14] The cohort was drawn by systematic stratified random sampling, with proportional allocation within each stratum. The sampling procedures and cohort representativeness have been described elsewhere.[14] We used individual, longitudinal NHIS-NSC registration and claims data collected from 1 January 2002 to 31 December 2013.[14] In Korea, the NHIS provides free annual or biennial health examinations, and approximately 72% of all eligible persons undergo them.[15] Our study population (N = 549 395) included all subjects ≥20 years of age in the NHIS-NSC who underwent at least one examination during our study period. We then excluded participants who had a previous history of cancer (N = 12 290). The final sample size was 537 103 (272 856 males and 264 247 females). The Institutional Review Board of the National Cancer Center approved this study and waived the requirement for informed consent because we only used de-identified data.


The NHIS-NSC database includes information about insurance eligibility, medical treatments, medical institutions visited and general health examinations. The insurance eligibility subdatabase contains information about age, sex, residential area, type of health insurance, income level and any disability. To ascertain vital status, the NHIS was linked to mortality data from the population records of the Ministry of the Interior.[14] The medical treatment subdatabase contains information about treatments, including diseases and prescriptions.[14] The general health examination subdatabase contains results of the NHIS-funded health examinations.

All NHIS claims for inpatient and outpatient visits, procedures and prescriptions were coded using the International Classification of Diseases, Tenth Revision (ICD-10), and the Korean Drug and Anatomical Therapeutic Chemical Codes.[16] The NHIS routinely audits all claims, so these data are considered reliable and have been used in many peer-reviewed publications.[14,17]

Cancer was defined as the presence of the same C code more than three times within a year or an inpatient hospitalization with a C code. In Korea, once a person receives a cancer diagnosis, that person is registered in the National Cancer Registry with a specific code (called a C code), which informs the system that the person has been diagnosed with cancer so that special insurance benefits can be provided. The NHIS reviews C codes carefully because they entitle patients to these additional benefits. Once a person has a C code, it is carried forward in all medical records and claims created for that patient. Therefore, cancer diagnoses based on claims are considered reliable. For the same reason, a 1-year look-back window effectively excludes patients with a cancer diagnosis at any time in the past (a history of cancer), not just patients diagnosed in the preceding year.
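The claims-based case definition above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function name, the tuple layout of a claim, the reading of "more than three times" as at least four claims, the reading of "within a year" as a rolling 365-day window, and matching on the full code string are all assumptions.

```python
from collections import defaultdict
from datetime import date

def meets_cancer_definition(claims, min_repeats=4, window_days=365):
    """Hypothetical helper. `claims` is an iterable of
    (claim_date, icd10_code, is_inpatient) tuples for one person.

    Returns True if any C code appears on an inpatient claim, or if the
    same C code appears at least `min_repeats` times within `window_days`
    (assumed reading of "more than three times within a year").
    """
    dates_by_code = defaultdict(list)
    for claim_date, code, is_inpatient in claims:
        if not code.startswith("C"):
            continue  # only cancer (C) codes count
        if is_inpatient:
            return True  # one hospitalization with a C code is sufficient
        dates_by_code[code].append(claim_date)

    for dates in dates_by_code.values():
        dates.sort()
        # slide over every run of `min_repeats` consecutive claims
        for i in range(len(dates) - min_repeats + 1):
            if (dates[i + min_repeats - 1] - dates[i]).days <= window_days:
                return True
    return False
```

For example, four outpatient claims carrying the same C code within a few months would satisfy the definition, whereas three would not.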

Chronic HBV or HCV infection was defined as the presence of any of the following disease codes from 2002 until 2013 or death (HBV: B18.0, B18.1, B18.10, B18.18 or Z22.5; HCV: B18.2). Comorbidities during the year prior to the start of follow-up were also defined using ICD-10 codes[10] and summarized using the Charlson comorbidity index (CCI) score.[18,19] The CCI included diabetes with and without complications, myocardial infarction, congestive heart failure, peripheral vascular disease, cerebrovascular disease, dementia, chronic pulmonary disease, connective tissue disease, peptic ulcer disease, mild liver disease, moderate/severe liver disease, paraplegia/hemiplegia and renal disease. Data on baseline income and residential area were obtained from the insurance eligibility database. Income level was categorized by percentile: ≤30%, 30%-70% and >70%. Residential area was classified as metropolitan or rural. Metropolitan areas were defined as Seoul, six other large cities, and 15 cities with populations >500 000 legally designated as municipal cities. Smoking habits, history of diabetes and medication use were collected using self-administered questionnaires. Smoking status was categorized as never or past smoker vs current smoker. Current alcohol consumption was categorized as none, moderate (<30 g/day in men and <20 g/day in women) or heavy (≥30 g/day in men and ≥20 g/day in women). Height, weight and blood pressure were measured. Body mass index (BMI) was calculated as weight in kilograms divided by height in metres squared. It was classified according to Asian-specific criteria (underweight, BMI <18.5 kg/m²; normal weight, BMI of 18.5–22.9 kg/m²; overweight, BMI of 23–24.9 kg/m²; and obese, BMI ≥25 kg/m²).[20]
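As a minimal sketch, the BMI and alcohol categories described above could be coded as follows. The function names and the "M"/"F" sex coding are hypothetical. The stated BMI ranges leave small gaps (for example between 22.9 and 23), so the sketch assumes continuous cut-offs at 18.5, 23 and 25 kg/m².

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2

def bmi_category(bmi_value):
    """Asian-specific BMI categories; cut-offs at 18.5, 23 and 25 are
    an assumed continuous reading of the published ranges."""
    if bmi_value is None:
        return "unknown"
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 23:
        return "normal"
    if bmi_value < 25:
        return "overweight"
    return "obese"

def alcohol_category(grams_per_day, sex):
    """Current alcohol intake; `sex` is "M" or "F" (hypothetical coding).
    Heavy is >=30 g/day in men and >=20 g/day in women."""
    if grams_per_day is None:
        return "unknown"
    if grams_per_day == 0:
        return "none"
    heavy_threshold = 30 if sex == "M" else 20
    return "heavy" if grams_per_day >= heavy_threshold else "moderate"
```

A person weighing 70 kg at 1.75 m has a BMI of about 22.9 kg/m² and falls in the normal-weight group under these cut-offs.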

Statistical Analyses

The study endpoint was the development of cancer. Participants were included in the study at the baseline screening examination. They were followed up until the development of cancer, death or the end of the study period (31 December 2013).
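The follow-up rule can be illustrated with a short Python sketch. It is not the authors' code: the function name and return layout are hypothetical, and two details the text does not specify are assumed here, namely that events recorded after the study end are ignored and that cancer takes precedence when cancer and death fall on the same date.

```python
from datetime import date

STUDY_END = date(2013, 12, 31)  # end of the study period stated in the text

def follow_up(baseline, cancer_date=None, death_date=None, study_end=STUDY_END):
    """Return (end_date, reason, years_of_follow_up) for one participant.

    Follow-up stops at the earliest of cancer, death or study end.
    """
    end, reason = study_end, "end_of_study"
    if death_date is not None and death_date <= end:
        end, reason = death_date, "death"
    # checked last, so cancer wins a tie with death (assumption)
    if cancer_date is not None and cancer_date <= end:
        end, reason = cancer_date, "cancer"
    years = (end - baseline).days / 365.25
    return end, reason, years
```

Only participants whose reason is "cancer" contribute an event; all others are censored at their end date.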

As age is a strong determinant of the probability of cancer development and of death, we used age as the time scale.[21] As vertical transmission is a major mode of HBV infection in Korea,[10] participants with chronic HBV infection were considered 'exposed' to HBV from the baseline visit. Chronic HCV infection was regarded as a time-varying variable. The onset of chronic HCV infection was defined as the age at the first appearance of an ICD code for chronic HCV infection (B18.2). Participants who developed chronic HCV infection during follow-up contributed to the 'unexposed' group prior to chronic HCV infection, and to the 'exposed' group from the onset of chronic HCV infection. In a sensitivity analysis, chronic HBV infection was regarded as a time-varying variable in the same way as chronic HCV infection. Cumulative incidence curves were generated using the Kaplan-Meier product-limit method and compared using the log-rank test. We calculated hazard ratios (HRs) with 95% confidence intervals (CIs) for developing cancer using Cox proportional hazards regression models. We adjusted for sex (male or female), BMI (underweight, normal, overweight, obese or unknown), smoking status (never or past, current or unknown), alcohol intake (none, light, moderate, heavy or unknown), income percentile (≤30%, >30% to ≤70% or >70%), residential area (metropolitan or rural) and CCI score.
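The time-varying treatment of chronic HCV infection on the age scale amounts to splitting each participant's follow-up at the age of HCV onset. A minimal Python sketch of that split follows, using the usual counting-process layout of (start, stop] intervals. The function name and tuple layout are hypothetical, and this is an illustration of the general technique rather than the authors' Stata code.

```python
def split_by_hcv_onset(entry_age, exit_age, event, hcv_onset_age=None):
    """Split one participant's follow-up into age intervals.

    Returns a list of (start_age, stop_age, exposed, event) tuples.
    Time before HCV onset is unexposed (0); time from onset is exposed (1).
    The cancer event, if any, is attached to the last interval only.
    """
    # never diagnosed, or first diagnosed at or after the end of follow-up
    if hcv_onset_age is None or hcv_onset_age >= exit_age:
        return [(entry_age, exit_age, 0, event)]
    # already infected at baseline: exposed throughout
    if hcv_onset_age <= entry_age:
        return [(entry_age, exit_age, 1, event)]
    # onset during follow-up: unexposed first, then exposed
    return [
        (entry_age, hcv_onset_age, 0, 0),
        (hcv_onset_age, exit_age, 1, event),
    ]
```

A participant who enters at age 40, receives a first B18.2 code at 45 and develops cancer at 50 contributes five unexposed years without an event and five exposed years ending in an event.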

We examined the plausibility of the proportional hazards assumption using plots of the log(−log) survival function and Schoenfeld residuals. A P-value <.05 was considered statistically significant. All analyses were performed using Stata version 14 (StataCorp LP).
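For reference, the reasoning behind the log(−log) check can be written out. The notation below is not from the original text: a denotes age (the time scale), x a vector of time-fixed covariates, β the log hazard ratios, and h₀ and S₀ the baseline hazard and survival functions. This is the standard textbook form of the model, given here as a sketch.

```latex
% Cox proportional hazards model on the age scale (assumed notation)
h(a \mid \mathbf{x}) = h_0(a)\,\exp\!\left(\boldsymbol{\beta}^{\top}\mathbf{x}\right)

% Implied survival function
S(a \mid \mathbf{x}) = S_0(a)^{\exp\left(\boldsymbol{\beta}^{\top}\mathbf{x}\right)}

% Taking log(-log) of both sides
\log\!\left[-\log S(a \mid \mathbf{x})\right]
  = \log\!\left[-\log S_0(a)\right] + \boldsymbol{\beta}^{\top}\mathbf{x}
```

Under proportional hazards, the log(−log) survival curves for different covariate groups therefore differ only by a constant vertical shift, so roughly parallel curves support the assumption.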