Systematic Review With Meta-analysis

Artificial Intelligence in the Diagnosis of Oesophageal Diseases

Pierfrancesco Visaggi; Brigida Barberio; Dario Gregori; Danila Azzolina; Matteo Martinato; Cesare Hassan; Prateek Sharma; Edoardo Savarino; Nicola de Bortoli


Aliment Pharmacol Ther. 2022;55(5):528-540. 


Abstract and Introduction


Background: Artificial intelligence (AI) has recently been applied to endoscopy and questionnaires for the evaluation of oesophageal diseases (ODs).

Aim: We performed a systematic review with meta-analysis to evaluate the performance of AI in the diagnosis of malignant and benign OD.

Methods: We searched MEDLINE, EMBASE, EMBASE Classic and the Cochrane Library. A bivariate random-effects model was used to calculate the pooled diagnostic efficacy of AI models and endoscopists. The reference tests were histology for neoplasms and the clinical and instrumental diagnosis for gastro-oesophageal reflux disease (GERD). The pooled area under the summary receiver operating characteristic curve (AUROC), sensitivity, specificity, positive and negative likelihood ratios (PLR and NLR) and diagnostic odds ratio (DOR) were estimated.

Results: For the diagnosis of Barrett's neoplasia, AI had an AUROC of 0.90, sensitivity 0.89, specificity 0.86, PLR 6.50, NLR 0.13 and DOR 50.53. The performance of AI models was comparable with that of endoscopists (P = 0.35). For the diagnosis of oesophageal squamous cell carcinoma, the AUROC, sensitivity, specificity, PLR, NLR and DOR were 0.97, 0.95, 0.92, 12.65, 0.05 and 258.36, respectively. In this task, AI performed better than endoscopists, although the difference was not statistically significant. In the detection of abnormal intrapapillary capillary loops, the performance of AI was: AUROC 0.98, sensitivity 0.94, specificity 0.94, PLR 14.75, NLR 0.07 and DOR 225.83. For the diagnosis of GERD based on questionnaires, the AUROC, sensitivity, specificity, PLR, NLR and DOR were 0.99, 0.97, 0.97, 38.26, 0.03 and 1159.6, respectively.

Conclusions: AI demonstrated high performance in the clinical and endoscopic diagnosis of OD.
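
The likelihood ratios and DOR reported above are linked to sensitivity and specificity by the standard definitions PLR = sensitivity / (1 - specificity), NLR = (1 - sensitivity) / specificity and DOR = PLR / NLR. The Python sketch below applies these definitions to the pooled sensitivity and specificity for Barrett's neoplasia. It is illustrative only: the function name is hypothetical, and it is a plug-in calculation, not the bivariate random-effects model used in the review.

```python
def diagnostic_ratios(sensitivity, specificity):
    """Return (PLR, NLR, DOR) from a single sensitivity/specificity pair.

    Plug-in calculation from the textbook definitions; this is NOT the
    bivariate random-effects pooling described in the Methods.
    """
    plr = sensitivity / (1.0 - specificity)   # positive likelihood ratio
    nlr = (1.0 - sensitivity) / specificity   # negative likelihood ratio
    dor = plr / nlr                           # diagnostic odds ratio
    return plr, nlr, dor


# Pooled values reported for Barrett's neoplasia (rounded to two decimals)
plr, nlr, dor = diagnostic_ratios(0.89, 0.86)
print(f"PLR={plr:.2f} NLR={nlr:.2f} DOR={dor:.1f}")
```

With these inputs the sketch gives a PLR of about 6.4, an NLR of about 0.13 and a DOR of about 50, close to the reported 6.50, 0.13 and 50.53. Small differences are to be expected, presumably because the reported ratios come from the pooling model and the inputs used here are rounded.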


Artificial intelligence (AI) is being extensively applied in different medical settings to improve diagnostic performance for various diseases, including gastrointestinal (GI) diseases. The term AI generically refers to complex computer algorithms that mimic human cognitive functions, including learning and problem-solving.[1] Machine learning (ML) is a field of AI in which algorithms are taught to discriminate characteristics of data samples and then apply that experience to interpret previously unseen information.[2] Supervised ML with a support vector machine (SVM) is based on hand-crafted algorithms: researchers, drawing on clinical knowledge, manually indicate features of interest in an input data set (labelled data set) to train the system to recognise discriminative features and provide appropriate outputs.[1] Deep learning (DL) is a subset of ML that can autonomously extract discriminative attributes of input data through artificial neural networks, often organised as convolutional neural networks (CNNs), which consist of multiple layers of non-linear functions.[1,3]
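
To make the idea of a layer of non-linear functions concrete, the following pure-Python sketch applies one convolution followed by a non-linearity to a toy image. It is a minimal illustration under stated assumptions, not a model from any of the reviewed studies: the function names, the 4x4 image and the edge-detecting kernel are hypothetical, and the ReLU non-linearity is an assumed choice. In a trained CNN the kernel weights are learned from data rather than set by hand, and many such layers are stacked.

```python
def conv2d_valid(image, kernel):
    """Slide a kernel over a 2D image with stride 1 and no padding.

    Like most deep learning libraries, this computes a cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Element-wise non-linearity: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Toy 4x4 "image" with a vertical edge between the second and third columns
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical hand-set kernel that responds to dark-to-bright vertical edges;
# a real CNN would learn these weights during training
kernel = [
    [-1, 1],
    [-1, 1],
]

features = relu(conv2d_valid(image, kernel))
for row in features:
    print(row)
```

The resulting 3x3 feature map is non-zero only in the middle column, where the kernel overlaps the edge. This is the sense in which a layer extracts a discriminative attribute of the input.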

AI is increasingly being integrated into computer-aided diagnosis (CAD) systems for GI diseases to improve detection (CADe) and characterisation (CADx) of pathology. Consistently, recent meta-analyses concluded that the use of AI during lower endoscopic procedures significantly increased the detection of colorectal neoplasia.[4,5]

More recently, various studies evaluating the performance of AI in the diagnosis of oesophageal diseases (ODs) have been published. The main application of AI in the upper GI tract is endoscopy and neoplasia detection. Ideally, upper GI endoscopies and biopsies should not miss lesions, but the ability to recognise endoscopic images depends on individual expertise. This is particularly relevant for subtle upper GI lesions, where experienced endoscopists can make a difference in the diagnosis. In this setting, CAD tools have the potential to successfully assist both trainee and expert physicians to reduce variability in the detection of upper GI pathology, increasing the diagnostic accuracy regardless of individual expertise and virtually overcoming inter- and intra-observer variability.[6]

In addition, deep learning using multi-layered neural networks powered by high-performance computing clusters is capable of recognising complex non-linear patterns in data types that were previously intractable to process, such as endoscopic images and videos. In this setting, AI has been applied to clinical questionnaires for gastro-oesophageal reflux disease (GERD), to pH-impedance and oesophageal manometry tracings, and to the evaluation of mRNA transcripts in the diagnosis of eosinophilic oesophagitis (EoE).

DL models are black boxes in which the input data and the output (diagnosis) are known, but the processes by which the diagnosis is reached are not, and this may be counterproductive.[6] Accordingly, research is already under way to understand how DL models make decisions and so close this interpretability gap, and methods to explain CNN-based choices are being developed.[7,8]

AI support in decision-making is a fascinating and rapidly evolving topic. Accordingly, we performed a systematic review with meta-analysis of currently available evidence on the performance of AI in the diagnosis of oesophageal diseases (ODs), updating previous evidence on oesophageal cancer[9,10] and assessing evidence on the performance of AI in the detection of intrapapillary capillary loops (IPCLs) and in the diagnosis of benign ODs.