Deep Learning Applications in Ophthalmology

Ehsan Rahimy


Curr Opin Ophthalmol. 2018;29(3):254-260. 

Age-related Macular Degeneration

Recent studies have reported on the use of deep learning for automated assessment of AMD. Burlina et al.[24] applied two different deep learning algorithms to a two-class AMD classification problem, categorizing fundus images from the National Institutes of Health AREDS dataset (n > 130 000 images) as either disease free/early-stage AMD (for which dietary supplements are not considered) or intermediate/advanced-stage AMD (for which supplements and monitoring would be considered). The investigators found that both deep learning methods yielded accuracies between 88.4 and 91.6%, with areas under the receiver operating characteristic curve (AUC) between 0.94 and 0.96. These findings were promising and indicated performance levels comparable with those of physicians.
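
The accuracy and AUC figures quoted throughout this section can be computed directly from per-image predicted probabilities; a minimal sketch of both metrics (the AUC via the Mann-Whitney rank statistic), using entirely hypothetical scores, is:

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of images whose thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def auc(labels, scores):
    """AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen diseased image scores higher than a randomly chosen
    disease-free image (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical per-image probabilities of intermediate/advanced AMD
y_true = [0, 0, 0, 1, 1, 1]
y_score = [0.1, 0.6, 0.35, 0.8, 0.4, 0.9]
print(accuracy(y_true, y_score))  # 4 of 6 correct at the 0.5 cutoff
print(auc(y_true, y_score))       # 8 of 9 positive-negative pairs ranked correctly
```

Note that accuracy depends on the chosen probability cutoff, whereas the AUC summarizes ranking performance over all possible cutoffs, which is why studies typically report both.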

With the promising results from deep learning interpretation of fundus photography, efforts quickly expanded towards OCT analysis, given its widespread adoption and integration into routine management of retinal diseases. Several groups have successfully utilized deep learning for segmentation of OCT scans to detect morphological features such as intraretinal fluid (IRF) or subretinal fluid (SRF) in various retinovascular diseases.[25–29] With respect to AMD, application of deep learning techniques to OCT may be advantageous compared with traditional fundus photography, given the superior resolution of SD-OCT and the potential for more precise, earlier detection of nonneovascular and neovascular disease states.

Lee et al.[30] demonstrated that deep learning techniques were effective in differentiating OCT scans from normal individuals versus those afflicted with AMD. For their study, training and validation sets were derived by automated extraction from their institution's Heidelberg Spectralis OCT imaging database, which were then linked to the corresponding clinical data extracted from their Epic electronic medical record. A total of 80 839 images (39 765 normal and 41 074 AMD) were used for training and 20 163 images (8547 normal and 11 616 AMD) were used for validation. The investigators found that at the level of each individual OCT image, the deep learning algorithm demonstrated an accuracy of 87.6%, with an AUC of 0.928. When the probabilities from each individual image in the same OCT acquisition were aggregated and averaged, the accuracy improved to 88.9%, with an AUC of 0.938. Furthermore, when the probabilities from all images of the same patient were averaged, the accuracy improved again to 93.5%, with an AUC of 0.975. The peak sensitivity and specificity with optimal cutoffs were 92.6 and 93.7%, respectively. In a smaller scale study using a different deep learning system, Treder and colleagues similarly reported very high accuracy in detecting exudative AMD changes on OCT imaging.
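
The aggregation step Lee et al. describe, averaging per-image probabilities within an OCT acquisition (or across all of a patient's images) before thresholding, can be sketched as follows; the grouping keys and probabilities here are hypothetical:

```python
from collections import defaultdict

def aggregate_probabilities(image_probs):
    """Average per-image AMD probabilities within each group
    (e.g. one OCT acquisition, or one patient), yielding a single
    group-level probability that can then be thresholded."""
    groups = defaultdict(list)
    for group_id, prob in image_probs:
        groups[group_id].append(prob)
    return {gid: sum(ps) / len(ps) for gid, ps in groups.items()}

# Hypothetical (patient_id, per-B-scan probability) pairs
per_image = [("pt1", 0.55), ("pt1", 0.80), ("pt1", 0.75),
             ("pt2", 0.30), ("pt2", 0.10)]
patient_probs = aggregate_probabilities(per_image)
labels = {gid: int(p >= 0.5) for gid, p in patient_probs.items()}
print(patient_probs)  # pt1 mean 0.70, pt2 mean 0.20
print(labels)         # pt1 classified AMD, pt2 classified normal
```

Averaging over more images of the same eye or patient smooths out noisy per-B-scan predictions, which is consistent with the stepwise accuracy gains (87.6% → 88.9% → 93.5%) the study reports.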

Beyond diagnosing disease, researchers are investigating deep learning methodologies to identify OCT structural biomarkers in hopes of predicting clinical treatment outcomes.[31,32] Schmidt-Erfurth and colleagues applied deep learning techniques to OCT images from 614 clinical trial patients (HARBOR trial) aiming to predict functional response to intravitreal anti-vascular endothelial growth factor (VEGF) therapy. In one study, a deep learning algorithm was applied to delineate retinal layers and the choroidal neovascularization (CNV)-associated lesion components: IRF, SRF, and pigment epithelial detachment.[31] These features were extracted together with visual acuity measurements at baseline and months 1–3, then used to predict vision outcomes at month 12 via random forest machine learning. The group found that the most relevant OCT biomarker for predicting the corresponding visual acuity was the horizontal extension of IRF within the foveal region, whereas SRF and pigment epithelial detachment ranked lower. With respect to predicting final visual acuity outcomes after 1 year of treatment, the algorithm's accuracy increased in a linear fashion with each successive month of data included from the initiation phase, with the most accurate predictions being generated at month 3 (R2 = 0.70). In a separate study, the same group applied their deep learning techniques to assess whether low and high ranibizumab injection requirements in the pro re nata (PRN) arm of the HARBOR trial could be predicted based on the OCT scans at baseline, month 1, and month 2.[32] Of 317 eligible patients, 71 had low (≤5), 176 had medium, and 70 had high (≥16) injection requirements during the PRN phase of treatment extending from month 3 to month 23. The authors found that classification into the low and high treatment-requirement categories demonstrated AUCs of 0.70 and 0.77, respectively. Additionally, the most relevant OCT biomarker for prediction of injection burden was the volume of SRF within the central 3 mm at month 2.
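
The two-stage design described above, deep-learning-derived OCT features plus early visual acuity fed into a random forest regressor, could be sketched along these lines. The feature names and synthetic data are entirely hypothetical, and the original work used its own segmentation pipeline and clinical trial data rather than this toy setup:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 200  # synthetic "patients"

# Hypothetical per-patient features: foveal IRF extent, SRF volume,
# PED volume, and visual acuity at baseline through month 3
irf_foveal = rng.uniform(0, 1, n)
srf_volume = rng.uniform(0, 1, n)
ped_volume = rng.uniform(0, 1, n)
va_early = rng.uniform(20, 80, (n, 4))  # letters; baseline..month 3

# Synthetic month-12 acuity, driven mainly by month-3 VA and foveal IRF,
# loosely mirroring the reported ranking of predictors
va_month12 = va_early[:, 3] - 15 * irf_foveal + rng.normal(0, 2, n)

X = np.column_stack([irf_foveal, srf_volume, ped_volume, va_early])
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, va_month12)
r2 = model.score(X, va_month12)  # in-sample R^2 (optimistic; for illustration)
print(r2)
```

In practice the reported R2 = 0.70 was obtained on held-out predictions, not in-sample fit, and the observation that accuracy improved with each additional month of initiation-phase data corresponds to appending further visual acuity and OCT feature columns to the design matrix.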