Google AI Matches Diabetic Retinopathy Screens

Rabiya S. Tuma, PhD

November 30, 2016

Ophthalmologists do not have to worry just yet, but data from a new machine learning study suggest computers will be able to screen for diabetic retinopathy with similar accuracy, and better cost-effectiveness, in the not-too-distant future, according to a report published online November 29 in JAMA.

Experts are divided, however, as to how and when such a technology might be integrated into care, and how specialists such as ophthalmologists and radiologists, who are trained to evaluate image data, should respond.

Prior studies have already shown that computers can learn to recognize diabetic retinopathy in retinal photographs, but until now, they have not reached the accuracy needed for clinical use.

The most promising of these efforts has used a process called deep machine learning, in which the algorithm is not told which features of an image are important but, rather, develops its own rules as it is exposed to an increasing number of annotated images.

Therefore, to improve on the earlier efforts, Varun Gulshan, PhD, a research scientist in machine learning at Google Inc, Mountain View, California, and colleagues trained their algorithm on more images than any previous effort had used, with more detailed annotation for each image.

Deep Neural Networks Can Be Trained

Specifically, the investigators used 128,175 images from the EyePACS data set that were obtained from diabetic retinopathy screening sites in the United States and three eye hospitals in India. Each image was graded for diabetic retinopathy, diabetic macular edema, and image quality by three to seven ophthalmologists or ophthalmology trainees in their final year of training.

In the training set, 118,419 images had sufficient quality to be assessed for retinal disease. Of those, 33,246 (28.1%) had referable diabetic retinopathy, "defined as moderate or worse diabetic retinopathy or referable macular edema by the majority decision of a panel of at least 7 US board-certified ophthalmologists."

The algorithm was then tested on two independent data sets. The first, EyePACS-1 (with no overlap with the training set), included 9963 images from 4997 patients; the prevalence of diabetic retinopathy was 7.8% among the gradable images. The second, the Messidor-2 data set, included 1748 images from 874 patients; disease prevalence was 14.6% among the gradable images.

Using a restrictive cutpoint chosen to mimic the specificity of ophthalmologists, the algorithm had a sensitivity and specificity for detecting diabetic retinopathy of 90.3% and 98.1%, respectively, for the EyePACS-1 data set, and 87.0% and 98.5%, respectively, for the Messidor-2 set.

With a more lenient cutpoint, tailored to meet the needs of a screening program, the algorithm had a sensitivity of 97.5% and a specificity of 93.4% for the EyePACS-1 data set. For the Messidor-2 set, the corresponding values were 96.1% and 93.9%.
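The trade-off between the two cutpoints can be made concrete with a small worked example. The scores and labels below are hypothetical, not from the study; they simply illustrate how lowering the decision threshold raises sensitivity at the expense of specificity.

```python
def sensitivity_specificity(labels, predictions):
    """Compute sensitivity and specificity from binary labels and predictions.

    sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)
    """
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical risk scores from a screening model, with true disease labels.
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1,    1,    1,    0,    1,    0,    0,    0]

# A restrictive cutpoint favors specificity; a lenient one favors sensitivity.
for threshold in (0.5, 0.25):
    preds = [1 if s >= threshold else 0 for s in scores]
    sens, spec = sensitivity_specificity(labels, preds)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

On these toy data, the restrictive threshold (0.5) yields perfect specificity but misses one diseased eye, while the lenient threshold (0.25) catches every case at the cost of one false referral, mirroring the two operating points the researchers report.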

"These results demonstrate that deep neural networks can be trained, using large data sets and without having to specify lesion-based features, to identify diabetic retinopathy or diabetic macular edema in retinal fundus images with high sensitivity and high specificity," the researchers write.

Integrating Into Care

Although the authors note that further work will be needed before such a program could be used in regular care, other experts raise questions about how that integration would work and what it might look like.

Writing in the first of two accompanying editorials, Tien Yin Wong, MD, PhD, medical director of the Singapore National Eye Centre, and Neil M. Bressler, MD, chief of the Retina Division at the Wilmer Eye Institute at Johns Hopkins Medicine in Baltimore, Maryland, and editor of JAMA Ophthalmology, note that an automated screening program could increase capacity and cost-effectiveness for an overloaded system.

In addition, the screening accuracy achieved is "substantially better than what screening guidelines would recommend (typically >80% sensitivity and specificity)," they write.

However, there are gaps in the data, according to Dr Wong and Dr Bressler. For example, the researchers did not test the algorithm's ability to distinguish severe cases of diabetic retinopathy.

"These most severe cases typically require urgent referral and clinical care and ideally should not be missed by any screening program (whether human or software)," they stress.

Similarly, they note that a clinician screening for one disease would also be looking for other problems at the same time. Yet, by design, the algorithm is limited to looking for diabetic retinopathy, and thus misses an opportunity to pick up other vision-threatening ailments.

Dr Wong and Dr Bressler also raise the issue of how an automated system would fit into current care models. For example, would a clinician of some sort take the picture, but then leave the interpretation to the computer? Or would the clinician still feel the need to evaluate the image?

Get Ahead of the Game, but How Near Is the Future?

In a separate viewpoint article, Saurabh Jha, MBBS, MRCS, associate professor of radiology at the University of Pennsylvania in Philadelphia, and Eric J. Topol, MD, Gary and Mary West Endowed Chair of Innovative Medicine and professor of genomics at Scripps Research Institute in La Jolla, California, elaborate on that point. They challenge radiologists and pathologists in particular to get ahead of the anticipated changes in their fields and embrace the pattern recognition tools of computers.

"Although reports of radiologists and pathologists being replaced by computers seem exaggerated, these specialties must plan strategically for a future in which artificial intelligence is part of the health care workforce," they write.

Images, they continue, are only the source of the information. In a world with artificial intelligence algorithms, such as that developed by Dr Gulshan and colleagues, clinicians may not be needed to identify a problem or irregularity in an image, but they will be needed to decide what to do next for the patient. These specialties must therefore be ready to serve as information and clinical managers, rather than as data readers.

Already, a few companies are trying to apply such software in other areas of medical image analysis, such as lung cancer detection, note Andrew L. Beam, PhD, a postdoctoral fellow at the Center for Biomedical Informatics at Harvard University, Boston, Massachusetts, and Isaac S. Kohane, MD, PhD, director of the Children's Hospital Informatics Program and the Department of Biomedical Informatics, Harvard Medical School, Boston, in a second editorial.

And integrating algorithms for image analysis into clinical care could be highly cost-effective. Dr Beam and Dr Kohane note that the computer chip required to run a program similar to that developed by Dr Gulshan and colleagues costs approximately $1000 and could process 3000 images per second.
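A back-of-the-envelope calculation shows why those figures matter for screening at scale. The chip cost and throughput are taken from the editorial as reported above; the hourly extrapolation is our own illustration.

```python
# Figures cited by Dr Beam and Dr Kohane for a chip capable of running
# an algorithm similar to that of Dr Gulshan and colleagues.
chip_cost_usd = 1000        # approximate cost of the chip
images_per_second = 3000    # stated processing rate

# Simple extrapolation: images such a chip could process in one hour.
images_per_hour = images_per_second * 3600
print(f"Images per hour on a ${chip_cost_usd} chip: {images_per_hour:,}")
```

At that rate, a single $1000 chip could in principle grade more retinal photographs in an hour than a large screening program collects in a year, which is the core of the cost-effectiveness argument.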

However, even with the high degree of accuracy for diabetic retinopathy screening reported by Dr Gulshan and colleagues, Dr Beam and Dr Kohane caution that widespread adoption is unlikely to come quickly.

"Given that artificial intelligence has a 50-year history of promising to revolutionize medicine and failing to do so, it is important to avoid overinterpreting these new results," they conclude.

Dr Gulshan and several coauthors report a patent pending on processing fundus images using machine learning models. Dr Wong reports a patent on automated diabetic retinopathy screening software and receipt of consulting fees and advisory board membership for Abbott, Novartis, Pfizer, Allergan, and Bayer. Dr Bressler reports a patent on a system and method for automated detection of age-related macular degeneration and other retinal abnormalities. Dr Beam and Dr Kohane have disclosed no relevant financial relationships. Dr Jha reports speaker fees from Toshiba Medical Systems. Dr Topol reports advisory fees from Google Inc and Apple. Dr Topol is the editor-in-chief of Medscape Medical News.

JAMA. Published online November 29, 2016.

