AI Thyroid Nodule Classification Could Reduce Biopsies by 50%

Nancy A. Melville

November 07, 2019

CHICAGO — In the latest effort to apply artificial intelligence (AI) to the challenging task of classifying thyroid nodules, an image similarity algorithm shows accuracy that is similar to, and in some respects better than, that of the best available ultrasound-based classification systems.

"By using image similarity AI models we can eliminate subjectivity and decrease the number of unnecessary biopsies," by as much as 50%, said Johnson Thomas, MD, who presented the findings here at the 89th American Thyroid Association (ATA) Annual Meeting.

"When compared to published results of the American College of Radiology's Thyroid Imaging Reporting and Data System (TI-RADS) and ATA classification system, our image similarity model has comparable negative predictive value, with better sensitivity, specificity, and positive predictive value," added Thomas, who is section chair of the Department of Endocrinology, Mercy Hospital, Springfield, Missouri.

And this system is different from other AI "black box" systems that have been developed for thyroid nodule classification, he stressed.

"The advantage of this system is it doesn't just give a binary outcome and it's very intuitive," he told Medscape Medical News.

Asked for input, Franklin N. Tessler, MD, of the Department of Radiology, University of Alabama at Birmingham, agreed that AI has the potential to add a critical element of objectivity to thyroid nodule assessment.

"I'm very much a proponent of applying computer techniques to nodule characterization — I think there's a very bright future in it...[and] in general, machine and AI imaging learning will be an overlay and assistive type of technology," he told Medscape Medical News.

"There may be a lot of demographic factors that a human might not take into account...I think this is going to have a tremendous impact on imaging — not just thyroids but pretty much everything."

Current Noninvasive Classification Systems Leave Much to Be Desired

Increased use of imaging has led to the current estimate that as many as one out of two women over the age of 50 may have thyroid nodules — yet fewer than 10% of these are cancerous.

But current noninvasive classification systems leave much to be desired, Thomas said, and "all are subjective, with significant inter- and intraobserver variation."

AI systems meanwhile have been designed to use deep learning models to decrease the subjectivity that can muddle image interpretation, and such systems are already approved by the US Food and Drug Administration (FDA) for diagnostic purposes in diabetic retinopathy, stroke, and breast lesions, Thomas noted.

And while various AI algorithms have been developed for thyroid nodule detection with machine learning based on a binary "black box" system, the system Thomas and his colleagues developed differs by using an image similarity algorithm, providing clinicians with matched comparisons.

To create the model, deep learning was used to process all available images for 482 nodules from patients who underwent a biopsy or thyroid surgery at Mercy Hospital from February 2012 to February 2017.

Nodules were excluded from the model if there was no definitive diagnosis of being benign or malignant.

In evaluating the system's accuracy, the authors further tested 103 thyroid nodules in patients who underwent biopsy or surgery from March 2017 through July 2018.

Overall, 66 nodules were malignant in the training set and 33 were malignant in the test set.

The results showed the AI system had a sensitivity and specificity of 87.8% and 78.5%, respectively. The model had a negative predictive value (NPV) of 93.2% and positive predictive value (PPV) of 65.9% for diagnosis.

Overall, the system's accuracy was 81.5%.

"The results suggest the use of the image similarity AI system could result in a 57.3% reduction in biopsies," Thomas said.
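The reported figures are mutually consistent, which a quick calculation can illustrate. The sketch below assumes a 2×2 confusion matrix for the 103 test nodules. The four counts are not reported in the article; they are inferred here only because they reproduce the published percentages to within rounding. It also assumes the projected biopsy reduction corresponds to the share of test nodules the model classified as benign.

```python
# Hypothetical counts (NOT reported in the article), inferred for the
# 103 test nodules, 33 of which were malignant.
tp, fn = 29, 4    # malignant nodules flagged / missed by the model
tn, fp = 55, 15   # benign nodules cleared / flagged by the model


def diagnostic_metrics(tp, fn, tn, fp):
    """Return standard diagnostic accuracy measures as fractions."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
        # Assumption: nodules called benign would not be biopsied.
        "predicted_benign_share": (tn + fn) / total,
    }


metrics = diagnostic_metrics(tp, fn, tn, fp)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, each measure lands within a fraction of a percentage point of the value presented, and the share of nodules called benign is close to the 57.3% biopsy reduction Thomas cited.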

By comparison, a study recently published in Endocrine Practice that directly compared the conventional ATA and ACR TI-RADS classification systems for the diagnosis of 323 nodules showed lower values with both systems on most measures than those Thomas reported for the AI system.

That study showed a sensitivity of 77.3% vs 78.4% for ATA and ACR TI-RADS, respectively; a specificity of 76.6% vs 73.2%; a PPV of 55.3% vs 52.3%; and an NPV of 90% for both, he noted.

Unique Image Vectors Serve as AI "Fingerprints"

Thomas explained how this new system differs. For the AI image similarity algorithm, ultrasound images are processed through convolutional layers to generate a unique image vector, which is then stored in a database.

"Each layer extracts one certain feature, such as microcalcification or irregular borders, resulting in an image vector that is similar to a unique fingerprint."

The unique image vector from a test image is then compared to the nearest neighbor from the database of other vectors, allowing for risk stratification of nodules at the point of care.
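The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the vectors are invented three-element toys (real image vectors from a convolutional network would be far longer), cosine similarity is an assumed choice of distance measure that the article does not specify, and the names `DATABASE`, `cosine_similarity`, and `most_similar` are hypothetical.

```python
import math

# Toy database of (image vector, known diagnosis) pairs. All values are
# invented for illustration; in the described system each vector would be
# produced by passing an ultrasound image through convolutional layers.
DATABASE = [
    ([0.9, 0.1, 0.8], "malignant"),
    ([0.8, 0.2, 0.7], "malignant"),
    ([0.1, 0.9, 0.2], "benign"),
    ([0.2, 0.8, 0.1], "benign"),
    ([0.3, 0.7, 0.3], "benign"),
]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, database, k=3):
    """Return the k stored entries most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), label) for vec, label in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical vector for a new test image.
query_vector = [0.85, 0.15, 0.75]
neighbors = most_similar(query_vector, DATABASE, k=3)
for score, label in neighbors:
    print(f"{label}: similarity {score:.3f}")
```

Returning the matched entries with their known diagnoses, rather than a single label, mirrors the idea of letting the physician inspect similar cases and judge for themselves.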

Thomas noted that in providing an image similarity system, the AI model offers an improvement over the AI "black box algorithm" model, which only gives the diagnosis of malignant or benign.

"Physicians can compare similar images with known diagnosis and make their own decision. I think once physicians start using our system, they will trust this more than a black box algorithm," he told Medscape Medical News.

The system incorporates images from the most popular ultrasound machines in the United States, but it may not be compatible with every machine and so might not work well on images captured from all ultrasound devices, he acknowledged.

"But unlike image classification AI algorithms, our image similarity algorithm will display similar images from our database and the evaluating physician can then compare his test images to the output images and verify the accuracy," he said.

Thomas noted that the system still needs to be validated with data from other institutions. Eventually, he said, it could allow for risk stratification of nodules at the point of care and be used on mobile devices, on computers, or as a cloud service.

"Hopefully this will decrease unnecessary biopsies," he concluded.

89th Annual Meeting of the American Thyroid Association. Abstract #27. Presented November 1, 2019.

