AI Beats Radiologists for Accuracy in Lung Cancer Screening

Roxanne Nelson, RN, BSN

May 23, 2019

Lung cancer screening using low-dose computed tomography (LDCT) is now recommended for certain populations, and while it has been shown to reduce mortality, there are persistent challenges with this technology, including inter-grader variability and high rates of false-positive and false-negative results.

Artificial intelligence (AI) may be able to help circumvent some of those limitations, a new study suggests.

Researchers from Google trained a deep-learning algorithm on more than 42,000 CT scans to detect malignant lesions in the lungs. The algorithm then produced 11% fewer false positives and 5% fewer false negatives than trained radiologists who reviewed the same scans.

While the results are provocative, the authors caution that these findings need to be clinically validated in large patient populations.

The study was published online May 20 in Nature Medicine.

"In the study, we showed that the AI tool shows promise in better diagnosing patients with cancer, and better determining those who do not have cancer," said study coauthor Mozziyar Etemadi, MD, PhD, a research assistant professor of anesthesiology at Northwestern University Feinberg School of Medicine, Chicago, Illinois. "All of the data used in the study were retrospective and the next step is to perform a prospective study to see if the tool, when used by a radiologist, can lead to earlier and more accurate diagnosis of cancer, and hopefully, better outcomes for patients."

Etemadi told Medscape Medical News that the research team is still planning the prospective study but is moving quickly toward launching it. "In the ideal case, such a study would capture a large, diverse patient population, but this is a significant challenge," said Etemadi. "Hospital computer systems are not designed to play nice with each other, let alone something as cutting edge as an AI algorithm that runs in the cloud. A big part of what my team is working on now is building this 'middleware' to make this a reality."

He pointed out that work remains to be done, in collaboration with Google, to learn precisely how a radiologist or other physician would like to use the AI.

"Is it part of their existing workflow? Do we create a separate workflow? These are all very interesting questions to answer that we are actively working on," he emphasized. "As both an engineer and a physician, this is truly a dream come true."

New Methods Needed

LDCT screening for lung cancer is recommended by the US Preventive Services Task Force for certain groups at high risk for the disease. But a persistent problem with it is the high rate of false positives. About a quarter (24%) of LDCT screening exams produce a positive result that requires follow-up, but 96% of these findings are false positives. This has prompted researchers to investigate new methods of differentiating malignant from benign nodules.
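The arithmetic behind those two percentages can be made explicit. The short Python sketch below applies the 24% positive rate and the 96% false-positive share quoted above to a hypothetical cohort of 10,000 screening exams; the cohort size and the function name are illustrative assumptions, not figures or code from the study.

```python
# Illustrative arithmetic only. The cohort size is a hypothetical round number;
# the two rates are the ones quoted in the article.
def screening_breakdown(n_exams, positive_rate, false_share_of_positives):
    """Split screening exams into positives, false positives, and true positives."""
    positives = n_exams * positive_rate
    false_positives = positives * false_share_of_positives
    true_positives = positives - false_positives
    return {
        "positives": round(positives),
        "false_positives": round(false_positives),
        "true_positives": round(true_positives),
        # Positive predictive value: share of positive exams that are real cancers.
        "ppv": true_positives / positives,
    }


result = screening_breakdown(10_000, 0.24, 0.96)
print(result)
```

Under these assumptions, about 2400 of the 10,000 exams would be flagged, about 2304 of them falsely, leaving roughly 96 true positives, a positive predictive value of about 4%.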

One group from the University of Pittsburgh, for example, incorporated a machine-learning algorithm to improve the prediction of lung cancer. In their model, they integrated features of an LDCT scan with other clinical data and comorbidities.

In the current study, Etemadi and his colleagues investigated how AI could help address some of the current challenges associated with LDCT scans for lung cancer screening. In a blog post, study coauthor Shravya Shetty, MS, technical lead at Google, notes that with advances made in 3D volumetric modeling, along with datasets from their partners (including Northwestern University), "we've made progress in modeling lung cancer prediction as well as laying the groundwork for future clinical testing."

When looking at hundreds of 2D images within a single CT scan, cancer "can be miniscule and hard to spot," Shetty writes. The model created by her team at Google can not only generate an overall lung cancer malignancy prediction from the full 3D volume but can also identify subtle malignant tissue in the lungs, she writes. The model can also factor in information from previous scans, which is useful in predicting lung cancer risk because the growth rate of suspicious lung nodules can be indicative of malignancy.

Very Promising Results

Overall, there were three key components in this new approach. The first was a 3D model built with deep convolutional neural networks (CNNs), a type of AI architecture, that would perform end-to-end analysis of whole CT volumes, using LDCT volumes with pathology-confirmed cancer as training data.

The second component was a CNN "region-of-interest" (ROI) detection model trained to detect 3D cancer candidate regions in the CT volume. The third was a CNN cancer risk prediction model that would operate independently on outputs from both the cancer ROI detection model and the full-volume model.
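The way the three components feed into one another can be sketched in a few lines of Python. Everything below is a hypothetical stand-in: the function names, the toy "volume," the threshold, and the simple averaging are assumptions made for illustration, and none of it is the study's code or uses real neural networks. It shows only the data flow, in which a whole-volume model and an ROI detector both pass their outputs to a final risk model.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Toy stand-in for a CT volume: nested lists indexed as [depth][row][column].
Volume = Sequence[Sequence[Sequence[float]]]


@dataclass
class Region:
    """A candidate region of interest (hypothetical structure)."""
    z: int
    y: int
    x: int
    score: float


def full_volume_model(volume: Volume) -> float:
    """Placeholder for the end-to-end 3D CNN: returns one whole-scan feature."""
    voxels = [v for plane in volume for row in plane for v in row]
    return sum(voxels) / len(voxels)


def roi_detector(volume: Volume, threshold: float = 0.5, top_k: int = 2) -> List[Region]:
    """Placeholder for the ROI detection CNN: keeps the brightest voxels."""
    candidates = [
        Region(z, y, x, v)
        for z, plane in enumerate(volume)
        for y, row in enumerate(plane)
        for x, v in enumerate(row)
        if v >= threshold
    ]
    return sorted(candidates, key=lambda r: r.score, reverse=True)[:top_k]


def risk_model(global_feature: float, regions: List[Region]) -> float:
    """Placeholder for the risk CNN: combines both upstream outputs."""
    strongest = max((r.score for r in regions), default=0.0)
    return min(1.0, 0.5 * global_feature + 0.5 * strongest)


def predict_risk(volume: Volume) -> float:
    """Run both upstream components, then the risk model on their outputs."""
    return risk_model(full_volume_model(volume), roi_detector(volume))


toy_volume = [
    [[0.0, 0.1], [0.0, 0.0]],
    [[0.0, 0.9], [0.2, 0.0]],
]
risk = predict_risk(toy_volume)
print(risk)
```

In a real system each placeholder would be a trained network; the point of the sketch is only that the final risk score depends on both the whole-scan view and the localized candidate regions.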

The Google team developed the deep-learning algorithm and applied it to 6716 de-identified CT scan sets to validate the accuracy of the new system. The model achieved an area under the curve (AUC) of 94.4% for this group of cases, and then achieved similar performance on an independent clinical validation set of 1139 cases.
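For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen cancer case receives a higher risk score than a randomly chosen non-cancer case, with ties counted as half. The Python sketch below computes it that way on a handful of made-up scores. The data and the function name are hypothetical, and this is not the study's evaluation code.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (cancer, non-cancer) pairs ranked correctly.

    labels: 1 for cancer, 0 for no cancer. scores: model risk scores.
    Ties between a cancer and a non-cancer score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: three cancer cases and four non-cancer cases.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.05]
auc = auc_from_scores(labels, scores)
print(round(auc, 3))
```

In this toy example, 11 of the 12 cancer/non-cancer pairs are ranked correctly, so the AUC is about 0.917. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.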

Two reader studies were then conducted. When previous CT scans were not available, the new model outperformed all six radiologists, with absolute reductions of 11% in false positives and 5% in false negatives. Where prior scans were available, the AI model performed on par with the same radiologists.

Important Work Ahead

"The '3D' deep learning — with time as the added dimension — appears to markedly improve accuracy well into the 90+%, which is a welcome advance," commented Eric Topol, MD, director of the Scripps Research Translational Institute in La Jolla, California, and Medscape editor-in-chief.

"But, as the researchers pointed out in the paper, prospective clinical validation is necessary to confirm the data," he commented. "And for any AI algorithm there will always need to be careful surveillance of performance once introduced in the clinic."

Study coauthor Daniel Tse, MD, a project manager at Google, told Medscape Medical News that they are planning clinical validation studies and that the "goal is understanding both how the model generalizes and can be tuned to new patient populations."

"We believe that there is still important work to be done on the user interface/user experience side to ensure we can surface the models in a productive way in clinical settings," Tse said.

This study was funded by Google Inc. Etemadi received funding from Google Inc to support the research collaboration. Tse is an employee of Google Inc; several of the authors are also Google employees or have other relationships with industry.

Nature Medicine. Published online May 20, 2019.
