Risk of Hepatic and Extrahepatic Cancer in NAFLD

A Population-based Cohort Study

Karl Björkström; Linnea Widman; Hannes Hagström


Liver International. 2022;42(4):350-362. 


Materials and Methods

Study Population

We used the Swedish National Patient Registry (NPR) to identify all patients diagnosed with NAFLD in Sweden from 1 January 1987 to 31 December 2016. The NPR contains data on all patients discharged from hospitals in Sweden, and from 2001 the registry also contains data on all specialized care outpatient visits. The NPR's positive predictive value (PPV) for most chronic diseases ranges between 85% and 95%.[22] The PPV is 91% for HCC in patients with established liver disease.[23] Each patient with NAFLD was matched on sex, age, county of residence and calendar year of diagnosis with up to 10 controls free of NAFLD, who were obtained from Statistics Sweden. The International Classification of Diseases (ICD) codes used to identify patients with NAFLD were 571.8 in ICD-9 and K75.8 or K76.0 in ICD-10. We defined the presence of cirrhosis using the ICD codes 571.5 in ICD-9 and K74.6 in ICD-10.
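The matching step described above can be sketched as follows. This is a minimal illustration, not the procedure Statistics Sweden applied: the record fields (`id`, `sex`, `birth_year`, `county`), the use of birth year as a stand-in for age at the calendar year of diagnosis, and sampling without replacement are all assumptions made for the example.

```python
import random
from collections import defaultdict

def match_controls(cases, pool, max_controls=10, seed=0):
    """Assign each case up to `max_controls` controls that share its matching key.

    Hypothetical record layout: dicts with "id", "sex", "birth_year", "county".
    """
    rng = random.Random(seed)
    by_key = defaultdict(list)
    for person in pool:
        key = (person["sex"], person["birth_year"], person["county"])
        by_key[key].append(person["id"])
    for ids in by_key.values():
        rng.shuffle(ids)  # random order, so pop() below draws a random control

    matched = {}
    for case in cases:
        key = (case["sex"], case["birth_year"], case["county"])
        candidates = by_key[key]
        n = min(max_controls, len(candidates))
        # Assumption: sampling without replacement, so a control serves one case only.
        matched[case["id"]] = [candidates.pop() for _ in range(n)]
    return matched
```

A case with fewer than 10 eligible controls simply receives all that remain, which mirrors the "up to 10" wording.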

We excluded patients with NAFLD and controls with any of the following: liver diseases other than NAFLD, a history of drug or alcohol abuse, a previous liver transplant, or any cancer except non-melanoma skin cancer before baseline (Figure 1). We censored patients who were diagnosed during follow-up with another liver disease or with alcohol or drug abuse. Study participants were also censored at emigration from Sweden, death or liver transplantation, whichever occurred first (eTable 1 presents specific ICD codes).
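As an illustration of the censoring rules, the end of follow-up for one participant can be taken as the earliest of the outcome and all censoring dates. The sketch below assumes that follow-up also ends administratively on 31 December 2016; the text gives that date only as the end of the inclusion period, so `STUDY_END` is an assumption, as is the function name.

```python
from datetime import date

# Assumption: administrative end of follow-up equals the end of the inclusion period.
STUDY_END = date(2016, 12, 31)

def follow_up_end(outcome=None, censoring_dates=(), study_end=STUDY_END):
    """Return (end_date, event); event is True only if the outcome came first.

    `censoring_dates` holds the dates of emigration, death, liver transplantation,
    another liver disease, or alcohol/drug abuse; use None for events that never occurred.
    """
    candidates = [d for d in censoring_dates if d is not None] + [study_end]
    censor = min(candidates)
    if outcome is not None and outcome <= censor:
        return outcome, True
    return censor, False
```

For example, a participant who emigrated in 2012 and was diagnosed with cancer in 2013 contributes follow-up only until emigration and is counted as censored.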

Figure 1. Flow chart of study participants


The primary outcome was time to the first diagnosis in the Swedish Cancer Registry (SCR) of any type of cancer except non-melanoma skin cancer (ICD-7: 140–207 except 191). The SCR has data on approximately 96% of all cancer diagnoses in Sweden.[24] Secondary outcomes were time to the first diagnosis of the following cancers in the SCR: HCC, colorectal, gastric, kidney, bladder, cervical, ovarian, uterine, breast, lung, oesophageal and prostate cancer (see eTable 1 for ICD codes). Diagnoses of cancer were not deemed outcomes if they occurred earlier than 1 year after baseline. When analysing secondary outcomes, we were interested in specific cancer diagnoses even if they did not occur as a first cancer during follow-up. Hence, if a patient had a first diagnosis of cancer earlier than 1 year after baseline, and then a second diagnosis of another kind of cancer, specified as a secondary outcome, later than 1 year after baseline, the second diagnosis was counted as the outcome in the analysis of secondary outcomes.
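The 1-year rule and the difference between primary and secondary outcomes can be sketched as follows. This is one reading of the text, not the authors' code: the function names are hypothetical, "1 year" is approximated as 365 days, and a diagnosis falling exactly on the 1-year mark is treated as an outcome.

```python
from datetime import date, timedelta

LAG = timedelta(days=365)  # assumption: "1 year after baseline" approximated as 365 days

def primary_outcome(baseline, diagnoses):
    """First cancer of any type, counted only if not earlier than 1 year after baseline.

    `diagnoses` is a list of (date, cancer_type) tuples for one person.
    """
    if not diagnoses:
        return None
    first_date, first_type = min(diagnoses)
    if first_date >= baseline + LAG:
        return first_date, first_type
    return None

def secondary_outcome(baseline, diagnoses, cancer_type):
    """First diagnosis of `cancer_type`, even when another cancer came earlier."""
    dates = [d for d, kind in diagnoses if kind == cancer_type]
    if not dates:
        return None
    first = min(dates)
    return first if first >= baseline + LAG else None
```

With a colorectal cancer 5 months after baseline and a lung cancer 3 years after baseline, `primary_outcome` returns `None`, while `secondary_outcome(..., "lung")` returns the date of the lung cancer diagnosis.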

To investigate whether the risk of cancer could be influenced by differences between patients with NAFLD and controls in non-cancer mortality, we used data from the Causes of Death Registry, which contains information on the cause of death for all citizens in Sweden.[25] We defined causes of death as either cancer related or non-cancer related. Cancer-related death was defined as having an ICD code of any cancer, except non-melanoma skin cancer, as a primary or secondary cause of death.
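A minimal sketch of this classification is shown below. It is restricted to ICD-10 and assumes that "any cancer" corresponds to the malignant neoplasm codes C00 to C97, with C44 as the non-melanoma skin cancer code; the code lists actually used are in eTable 1, which is not reproduced here, so the ranges and function names are assumptions.

```python
import re

# Assumption: ICD-10 malignant neoplasms are C00-C97; C44 is non-melanoma skin cancer.
_C_CODE = re.compile(r"^C(\d{2})")

def is_cancer_code(code):
    """True for an ICD-10 malignant neoplasm code other than C44."""
    match = _C_CODE.match(code.strip().upper())
    if not match:
        return False
    block = int(match.group(1))
    return block <= 97 and block != 44

def cancer_related_death(primary_cause, secondary_causes=()):
    """A death is cancer related if any primary or secondary cause is a cancer code."""
    return any(is_cancer_code(c) for c in (primary_cause, *secondary_causes))
```

Under these assumptions, a death with myocardial infarction as the primary cause and HCC as a secondary cause is classified as cancer related.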


We included diabetes, hypertension, hyperlipidaemia and chronic obstructive pulmonary disease (COPD) as covariates in the regression models. Diabetes, hypertension and hyperlipidaemia were included as markers of metabolic health, which is related both to NAFLD and to the risk of cancer. The registers used do not contain more detailed data on possible confounders, such as waist circumference, body mass index, plasma glucose or blood lipid profiles. Because of the lack of direct data on smoking, we included COPD as a proxy for smoking. All covariates were defined by the corresponding ICD codes in the NPR, and both primary and secondary codes were used to identify covariates (eTable 1 lists the specific ICD codes).

Statistical Analysis

Differences in baseline variables between patients with NAFLD with and without cirrhosis at baseline were tested using Fisher's exact test for categorical variables and the Wilcoxon rank-sum test for continuous variables. We estimated incidence rates (IRs) per 1000 person-years for primary and secondary outcomes as the total number of outcomes divided by person-years of follow-up. Univariate and multivariable (adjusted for diabetes, hypertension, hyperlipidaemia and COPD) Cox regression models were used to estimate the association between NAFLD and the primary and secondary outcomes. We chose to use the same adjustment factors for all cancer subtypes to improve model comparability between these outcomes.
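The incidence rate calculation can be made concrete with a short sketch. The function names and the convention of 365.25 days per year are assumptions for illustration; the formula itself (outcomes divided by person-years, scaled to 1000) is the one stated above.

```python
from datetime import date

def person_years(intervals):
    """Sum follow-up time in years from (start, end) date pairs.

    Assumption: one year is counted as 365.25 days.
    """
    return sum((end - start).days for start, end in intervals) / 365.25

def incidence_rate_per_1000(events, total_person_years):
    """Number of outcomes per 1000 person-years of follow-up."""
    if total_person_years <= 0:
        raise ValueError("person-years of follow-up must be positive")
    return 1000.0 * events / total_person_years

# Hypothetical numbers: 25 outcomes over 2000 person-years.
rate = incidence_rate_per_1000(25, 2000)  # 12.5 per 1000 person-years
```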

Sensitivity Analyses

First, we investigated the impact of cirrhosis on cancer risk, comparing patients without cirrhosis to their respective controls.

Second, given that the risk of any cancer might be positively influenced by an unbalanced risk for HCC in patients with NAFLD, one sensitivity analysis excluded all individuals in whom HCC was the first diagnosed cancer.

Third, because the risk of HCC might be attributed to cirrhosis not diagnosed at baseline but instead detected during follow-up, one analysis excluded any patient who developed cirrhosis during follow-up.

Fourth, in an adjunct analysis, we applied a competing risk regression in which non-cancer death was the competing risk and adjusted for the same covariates as the adjusted Cox model. This analysis was done because the risk of death might be higher in the NAFLD population, possibly inflating the estimates for cancer risk.
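To show why non-cancer death matters as a competing risk, the sketch below computes a nonparametric cumulative incidence of cancer and compares it with a naive estimate that treats non-cancer deaths as censored. This is not the competing risk regression applied in the study, whose exact model is not specified here; it is a simpler, unadjusted estimator used only for illustration, and the status coding (0 = censored, 1 = cancer, 2 = non-cancer death) is an assumption.

```python
from itertools import groupby

def cumulative_incidence(records, cause=1):
    """Cumulative incidence of `cause` with other event types as competing risks.

    `records` is an iterable of (time, status); status 0 means censored.
    """
    at_risk = len(records)
    event_free = 1.0  # probability of no event of any type before the current time
    cif = 0.0
    curve = []
    for time, group in groupby(sorted(records), key=lambda r: r[0]):
        statuses = [status for _, status in group]
        d_cause = statuses.count(cause)
        d_any = sum(1 for s in statuses if s != 0)
        if d_any:
            cif += event_free * d_cause / at_risk
            event_free *= 1 - d_any / at_risk
            curve.append((time, cif))
        at_risk -= len(statuses)
    return curve

def naive_cumulative_risk(records, cause=1):
    """One minus Kaplan-Meier, treating competing events as censored."""
    at_risk = len(records)
    surv = 1.0
    curve = []
    for time, group in groupby(sorted(records), key=lambda r: r[0]):
        statuses = [status for _, status in group]
        d_cause = statuses.count(cause)
        if d_cause:
            surv *= 1 - d_cause / at_risk
            curve.append((time, 1 - surv))
        at_risk -= len(statuses)
    return curve

# Hypothetical toy data: (years of follow-up, status)
toy = [(1, 2), (2, 1), (3, 0), (4, 1), (5, 2)]
```

On the toy data, the competing risk estimate of cancer at the end of follow-up is 0.5, whereas the naive estimate is 0.625, which illustrates how ignoring non-cancer deaths can inflate the apparent cancer risk.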

Finally, we examined the risk of all cancers and secondary outcomes in males and females separately.