Review Article

Shared Disease Mechanisms Between Nonalcoholic Fatty Liver Disease and Metabolic Syndrome

Translating Knowledge From Systems Biology to the Bedside

Silvia Sookoian; Carlos J. Pirola

Aliment Pharmacol Ther. 2019;49(5):516-527. 

Results

Literature searches yielded 6270 genes/proteins associated with NAFLD, NAFLD-associated progressive phenotypes, and diseases of the Metabolic syndrome. The number of retrieved terms (genes) from the Genie server used for further analyses was as follows: non-alcoholic fatty liver disease n = 810; diabetes mellitus n = 1562; dyslipidemia n = 469; hypertension n = 552; obesity n = 455; inflammation n = 1283 and fibrosis n = 482.

To identify pathways associated with disease mechanisms, we analysed each gene list associated with disease phenotypes (NAFLD, type 2 diabetes, hypertension, obesity and dyslipidemia) using gene enrichment analysis for biological processes, as explained in the Methods section. Overrepresented biological processes are summarised in Figure 1A-D and Figure S2. Inflammatory response was significantly overrepresented in the list of genes associated with almost all phenotypes, including NAFLD, type 2 diabetes and arterial hypertension. In addition, some expected pathways—for example, glucose and cholesterol metabolism, and positive regulation of cold-induced thermogenesis (GO:0120162)—were significantly overrepresented in the list of genes associated with NAFLD, type 2 diabetes and obesity. In contrast, response to hypoxia was significantly overrepresented in the list of genes associated with type 2 diabetes and hypertension, and cytokine-mediated signalling pathway was significantly associated with NAFLD and type 2 diabetes.

Figure 1.

Non-alcoholic fatty liver disease (NAFLD) and Metabolic syndrome associated biological pathways. Gene enrichment analysis for biological processes (Gene Ontology database, http://www.geneontology.org/) of the list of genes associated with NAFLD (A), type 2 diabetes (B), arterial hypertension (C), and obesity (D) in biomedical literature annotations using the FunRich (http://www.funrich.org) tool. Bonferroni and Benjamini-Hochberg FDR (false discovery rate) methods were used to correct for multiple testing. Pathways were ranked according to the P value (purple bar), whereby P < 0.05 was considered statistically significant. The light blue bar indicates the percentage of altered genes in the whole pathway. GO:0120162: positive regulation of cold-induced thermogenesis. The input list of genes for each phenotype was generated by searching the Genie web server (http://cbdm.uni-mainz.de/genie). This resource performs literature-enrichment analysis by computing associations of genes with keywords using biomedical literature annotations.

Supplementary Figure S2.

Gene enrichment analysis for biological processes associated with dyslipidemia. Gene enrichment analysis for biological processes (Gene Ontology database, http://www.geneontology.org/) of the list of genes associated with dyslipidemia in biomedical literature annotations using the FunRich (http://www.funrich.org) tool. Bonferroni and Benjamini-Hochberg FDR (false discovery rate) methods were used to correct for multiple testing.
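As a rough illustration of the over-representation testing and multiple-testing correction named in the figure legends, the following Python sketch computes a one-sided P value for a single term and then applies Bonferroni and Benjamini-Hochberg adjustments. It assumes a hypergeometric model, which the text here does not specify; the function names and all numbers are hypothetical and are not taken from FunRich or from this study.

```python
from math import comb


def hypergeom_sf(k, background, term_size, list_size):
    """P(X >= k): chance of seeing at least k term genes in a list of
    list_size genes drawn from a background containing term_size term genes."""
    upper = min(term_size, list_size)
    total = comb(background, list_size)
    favourable = sum(
        comb(term_size, i) * comb(background - term_size, list_size - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def bonferroni(pvals):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 12 of 100 listed genes fall in a 50-gene term,
# against a background of 1000 genes.
p_term = hypergeom_sf(12, background=1000, term_size=50, list_size=100)

# Hypothetical raw P values for four terms.
raw = [0.01, 0.04, 0.03, 0.005]
print(p_term, bonferroni(raw), benjamini_hochberg(raw))
```

Bonferroni controls the family-wise error rate and is the stricter of the two, whereas Benjamini-Hochberg controls the false discovery rate, so its adjusted values are never larger than the Bonferroni ones.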

Next, we sought to assess the degree of shared genes/proteins among NAFLD, NAFLD progression-related disease phenotypes (inflammation and fibrosis), and the Metabolic Syndrome disease components. To accomplish this aim, we used the FunRich tool to create a matrix of pairwise comparisons between two phenotypes, which shows the number and percentage of shared genes between phenotypes (Figure 2A). It is noteworthy that a high percentage of genes (above 35%) in the NAFLD list overlapped with the genes associated with type 2 diabetes, obesity and dyslipidemia (Figure 2A), with NAFLD and dyslipidemia sharing 40.4% of the genes. Hence, we examined the intersection of the lists of genes associated with NAFLD and the Metabolic syndrome phenotypes (Figure 2B-D), as well as the unique list of genes that were shared among the cluster of the diseases (Figure 2E). The refined intersection analysis was performed using 412 genes (Figure 2B-D), and the final list of shared genes/proteins contained 50 terms (Table 1). This strategy allowed us to enrich and prioritise further analyses by focusing on the gene candidates that had the highest degree of commonality among all diseases and traits of interest. A word cloud diagram shown in Figure 2F highlights the most represented genes among the cluster of diseases, for example, adiponectin (ADIPOQ), retinol binding protein 4 (RBP4), tumour necrosis factor (TNF), insulin like growth factor 1 (IGF1), insulin (INS), insulin receptor (INSR), angiotensinogen (AGT), angiotensin-converting enzyme (ACE), leptin (LEP), vitamin D receptor (VDR) and peroxisome proliferator activated receptor gamma (PPARG), among other genes.

Figure 2.

Genes that are common to Non-alcoholic fatty liver disease (NAFLD) and Metabolic syndrome. A, A matrix showing pairwise comparison of shared genes (expressed both as a number and a percentage) between components of the metabolic syndrome, NAFLD, and progression of NAFLD-related disease phenotypes (inflammation and fibrosis). The chart was generated using the FunRich tool available at http://www.funrich.org. The input list of genes for each phenotype was generated by searching the Genie web server as explained above. B, Venn diagram showing the number of genes that are common (overlapping areas) and dissimilar (non-overlapping areas) in NAFLD, arterial hypertension, type 2 diabetes (T2D) and obesity; C, Venn diagram showing the number of genes that are common (overlapping areas) and dissimilar (non-overlapping areas) in NAFLD, type 2 diabetes (T2D), inflammation, and fibrosis; D, Venn diagram showing the number of genes that are common (overlapping areas) and dissimilar (non-overlapping areas) in NAFLD, arterial hypertension, dyslipidemia and obesity; E, Venn diagram showing a common set of 50 genes that were identified as being shared among lists of genes shown in the overlapping area of Venn diagram B (n = 138), C (n = 140) and D (n = 135). Gene list is shown in Table 1. F, Word cloud diagram showing the genes in the intersection list of genes depicted in panel B (n = 138), C (n = 140) and D (n = 135). Specific terms (genes) that were overrepresented in the lists are emphasised in the word cloud by the largest font; in the word cloud, overrepresented genes are shown by scaling the size of each term by the degree of repetition in the seed lists. The shared list of genes as the intersection set from the gene lists was obtained using the platform NetworkAnalyst (http://www.networkanalyst.ca).
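The pairwise comparison matrix and the intersection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene memberships below are toy lists chosen for the example (not the study's lists), the helper names are hypothetical, and the percentage is assumed to be expressed relative to the first list of each pair, which the text does not state.

```python
def pairwise_overlap(gene_lists):
    """Return {(a, b): (n_shared, percent_of_a)} for every ordered pair of lists."""
    matrix = {}
    for a, genes_a in gene_lists.items():
        for b, genes_b in gene_lists.items():
            shared = genes_a & genes_b
            pct = 100.0 * len(shared) / len(genes_a) if genes_a else 0.0
            matrix[(a, b)] = (len(shared), pct)
    return matrix


def shared_core(*gene_sets):
    """Genes present in every one of the supplied sets."""
    return set.intersection(*gene_sets)


# Toy gene lists for illustration only; real lists would come from a
# literature-mining step such as the one described in the text.
toy_lists = {
    "NAFLD": {"ADIPOQ", "TNF", "LEP", "PPARG", "INS"},
    "T2D": {"INS", "INSR", "ADIPOQ", "TNF", "PPARG"},
    "obesity": {"LEP", "ADIPOQ", "PPARG", "TNF"},
    "hypertension": {"AGT", "ACE", "TNF", "ADIPOQ"},
}

overlap = pairwise_overlap(toy_lists)
core = shared_core(*toy_lists.values())
print(overlap[("NAFLD", "T2D")], sorted(core))
```

Because the percentage is taken relative to the first list, the matrix is not symmetric: the same number of shared genes can be a large fraction of a short list and a small fraction of a long one.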

Once we had performed the intersection analysis on the lists of genes and identified a unique list of shared genes/proteins, we proceeded with determining the network of biological interactions associated with them. To this end, we performed REACTOME analysis utilising the PANTHER resource. The four most frequently overrepresented pathways were interleukin-10 signalling (∼65-fold), SUMOylation of intracellular receptors (∼60-fold), interleukin-4 and interleukin-13 signalling (∼40-fold) and regulation of insulin-like growth factor transport and uptake by insulin-like growth factor binding proteins (∼27-fold). The full list of significantly overrepresented Reactome pathways is shown in Table 2, indicating that, overall, immune-related pathways were typically overrepresented. The connectivity network of genes in the shared list shows a high degree of protein-protein interaction (Figure S3).

Supplementary Figure S3.

Connectivity network. The connectivity gene/protein interaction network shows the set of 50 shared genes (nodes) connected by edges representing functional relationships among them. The functional products of genes (proteins) interact with each other by different mechanisms; the figure shows different levels of evidence (extracted from curated databases, experimentally determined, predicted, etc.), as explained in the figure footnote. The connectivity network was generated using the STRING website.
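The fold values quoted for the Reactome pathways above are fold enrichments, that is, the number of listed genes observed in a pathway divided by the number expected by chance. A minimal Python sketch of that ratio follows; the function name and the counts are hypothetical and are not the study's values.

```python
def fold_enrichment(hits, list_size, term_size, background_size):
    """Observed hits divided by the hits expected if the list were a random draw."""
    expected = list_size * term_size / background_size
    return hits / expected


# Hypothetical counts: 5 of 50 listed genes fall in a 40-gene pathway,
# against a background of 20000 genes.
fold = fold_enrichment(hits=5, list_size=50, term_size=40, background_size=20000)
print(round(fold, 1))
```

With a short input list and small pathways, the expected count is well below one gene, so even a handful of hits yields a large fold value; the accompanying P value indicates whether such a ratio is statistically meaningful.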

NAFLD and Metabolic Syndrome Associated Network of Genes Exhibits a High Degree of Pleiotropy With Systemic Diseases

Results pertaining to Reactome pathways suggest that mechanisms potentially associated with NAFLD and diseases of the Metabolic syndrome are indeed pathogenic pathways common to many complex traits. Hence, we sought to detect cross-phenotype associations between the 50 unique genes shared among NAFLD and Metabolic syndrome-related phenotypes and other systemic diseases, including cancer. To infer gene-disease associations, we used the ToppCluster tool, as explained in the Materials and Methods section.[2] Consistent with our hypothesis, the results related to the gene-disease interaction network showed a high level of predicted associations between the seed list of 50 prioritised genes and immune-mediated disorders, respiratory, kidney, central nervous system-related and infectious diseases, as well as solid and haematological cancers (Figure 3).

Figure 3.

Non-alcoholic fatty liver disease (NAFLD) and Metabolic syndrome: Pleiotropic genes with systemic diseases. The network is shown as a Cytoscape graph. The training set consisted of the 50 shared genes (cluster 1) as explained above (Figure 2E). Prediction analysis was performed by the ToppCluster resource available at https://toppcluster.cchmc.org/. The list of predicted disease terms was manually curated to highlight examples of overrepresented systemic diseases and was restricted to P values lower than 1 × 10⁻¹⁰. The enrichment map shows terms corresponding to the 50 genes in the shared gene list (red hexagons) and the predicted associated diseases that had significance scores (green squares). The analysis was gene-centred on the seed gene list of 50 prioritised genes (cluster 1).

On the other hand, pairwise comparisons between the lists of genes associated in the biomedical literature with NAFLD and those associated with type 2 diabetes (T2D), obesity, and/or hypertension, and enrichment analysis based on genes associated with all inherited diseases that are annotated in the OMIM database, reinforced the concept that NAFLD and the Metabolic syndrome-associated disorders share a similar gene burden predisposing the affected individual to cardiovascular and metabolic diseases, cirrhosis, and neoplasias. Remarkable differences in the percentage of genes associated with late-onset diseases, insulin resistance and obesity were noted (Figure 4). For example, compared with the T2D-associated gene list, the NAFLD-associated gene list included a significantly higher proportion of genes that were enriched with progressive diseases that usually present an insidious onset in mid- or late adulthood (9.5% vs 7.5%, respectively, P < 0.001). Conversely, relative to the obesity-associated gene list, the NAFLD-associated gene list contained a significantly lower proportion of genes that were enriched with late-onset diseases (9.6% vs 17.7%, respectively, P < 0.001). Of note, the pleiotropic effects of NAFLD-associated genes have been highlighted recently using data from several biobanks.[16]

Figure 4.

Functional enrichment analysis based on clinical phenotypes annotated in the OMIM database. Doughnut charts showing functional enrichment analysis of the individual gene lists, respectively, associated with NAFLD and the Metabolic syndrome diseases (type 2 diabetes, obesity, arterial hypertension and dyslipidemia) and bulk data deposited in the OMIM database. Charts show the proportion of genes observed in each individual list that were also associated with diverse clinical phenotypes in the OMIM database; doughnut charts were generated by the FunRich tool.

NAFLD and Metabolic Syndrome Drug Interaction Network

To build a drug-gene interaction network for NAFLD and Metabolic syndrome, and to infer gene-drug associations, we used the list of 50 shared genes as explained above. We created a gene-drug interaction connectivity network using the ToppCluster resource based on the gene-centric approach. The list of predicted drug terms was manually curated to highlight the most representative examples of drugs that could be used alone or in combination in the treatment of NAFLD and associated comorbidities; the list was restricted to P values below 1 × 10⁻¹⁰. Notably, the drug interaction network predicted approved pharmacological agents that are currently in use for treating cardiovascular diseases, anti-inflammatory drugs, and phytochemicals, among other compounds, as shown in Figure 5. In addition, the interaction network included antibacterial or antimalarial agents, and an antagonist of CCR1 (C-C Motif Chemokine Receptor 1), the latter being a member of the beta chemokine receptor family that plays a central role in leukocyte trafficking and that is highly expressed in autoimmune diseases. These findings highlight the possibility of repositioning the drugs currently used in the treatment of Metabolic syndrome to ameliorate NAFLD progression.

Figure 5.

Non-alcoholic fatty liver disease (NAFLD) and Metabolic syndrome: gene/protein drug connectivity network. The network is shown as a Cytoscape graph. The training set consisted of the 50 shared genes (cluster 1) as explained above (Figure 2E). Prediction analysis was performed by the ToppCluster resource available at https://toppcluster.cchmc.org/. The list of predicted drug terms was manually curated to highlight examples of drug repositioning and was restricted to P values lower than 1 × 10⁻¹⁰. The enrichment map shows terms corresponding to the selected genes in the shared gene list (red hexagons) and the predicted drugs/bioactive compounds that had significance scores (orange squares).

Integration of Functional OMIC Data and Biological Implications

To understand the biological implications of the predicted NAFLD-Metabolic Syndrome interaction network of genes, we integrated the list of shared genes with functional OMICs data, including transcriptomics and proteomics. We first aimed to establish whether genes that are common to NAFLD, NAFLD-progression associated outcomes and the Metabolic Syndrome-related diseases are specifically relevant to the normal liver physiology. To achieve this aim, we constructed a heat map for the list of 50 shared genes (input gene list) that shows liver gene expression (RNA-seq expression) levels generated using data extracted from The GTEx Consortium (Figure 6A). It is noteworthy that some genes showed high levels of expression in the liver tissue, for instance, APOA1 (apolipoprotein A1), CRP (C-reactive protein), HP (haptoglobin), RBP4 (retinol binding protein 4) and AGT (angiotensinogen). Conversely, certain transcripts, such as IL1A (Interleukin 1 Alpha), PTX3 (Pentraxin 3) or LEP (leptin), exhibit no or very low expression in normal liver (Figure 6A). In addition, we explored protein tissue expression predominance of the shared genes under physiological conditions. For this purpose, we integrated the shared gene list with the whole proteome data, which allowed us to establish that liver, heart, and pancreas present similar levels of protein expression, either up- or downregulation, of shared relevant proteins (Figure 6B). The heat map in Figure 6B shows protein expression levels of the shared list in immune cells and platelets as well, which might be relevant to the search for circulating biomarkers. Interestingly, DPP4 is highly expressed in pancreas and liver, which may explain the putative beneficial effects of DPP4 inhibitors in NAFLD, a rapidly proliferating research domain.[17,18,19]

Figure 6.

Integration of functional OMIC data and biological implications of the gene connectivity network. A, Heat map for the list of the 50 genes shared among NAFLD, progression of NAFLD-associated outcomes, and the Metabolic syndrome diseases (input gene list) showing liver gene expression levels (RNA-seq expression). Heat map was generated using data extracted from The GTEx Consortium (Data source: GTEx Analysis Release V7). Expression values are shown in transcripts per million (TPM). B, Heat map for the list of 50 genes showing protein expression levels generated using Human proteome map-quantitative dataset as background. Heat map was generated by the FunRich tool based on the information retrieved from the UniProt database.
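For readers unfamiliar with the unit used in panel A, transcripts per million (TPM) normalises read counts first by transcript length and then by the sample total, so that values within a sample sum to one million. The Python sketch below shows the standard calculation; the gene labels, counts and lengths are hypothetical, and the study itself used TPM values as released by GTEx rather than computing them.

```python
def tpm(counts, lengths_kb):
    """Convert read counts to transcripts per million.

    counts: reads assigned to each gene
    lengths_kb: transcript length of each gene in kilobases
    """
    # Length-normalise first, so long transcripts do not dominate.
    rates = [c / length for c, length in zip(counts, lengths_kb)]
    scale = sum(rates)
    # Rescale so that the values of one sample sum to one million.
    return [r / scale * 1_000_000 for r in rates]


# Hypothetical counts and lengths for three genes in one sample.
genes = ["gene_a", "gene_b", "gene_c"]
values = tpm(counts=[900, 300, 10], lengths_kb=[1.0, 1.5, 2.0])
print(dict(zip(genes, values)))
```

Because every sample sums to the same total, TPM values describe the relative abundance of a transcript within a sample, which is what makes a single-tissue heat map such as panel A readable on one colour scale.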
