AI Challenges Sonographers in Heart Function Assessment

Marilynn Larkin

April 06, 2023


For patients undergoing echocardiography to assess left ventricular ejection fraction (LVEF), an initial evaluation by artificial intelligence (AI) was noninferior and even superior to evaluation by sonographers in a blinded, randomized study.

More than 3500 echocardiographic studies were screened, and the proportion of studies substantially corrected after review by cardiologists was 16.8% in the AI group and 27.2% in the sonographer group, suggesting superiority for AI.

The mean absolute difference between the initial and final LVEF assessments was 2.79% in the AI group versus 3.77% in the sonographer group, again showing superiority for AI.

"We were surprised that the AI did better than sonographers," David Ouyang, MD, a cardiologist at the Smidt Heart Institute, Cedars-Sinai, Los Angeles, California, and a researcher in the Division of Artificial Intelligence in Medicine at the hospital, told Medscape Cardiology.

"We initially only hoped to show that AI and sonographers were equivalent but were pleasantly surprised to show that AI was superior," he said. "In some ways, this AI passed the Turing test for reading echocardiogram videos."

However, he said, "We very much want clinicians to still be in charge. They still need to review and confirm findings, even though the AI can make it faster and more precise. AI needs clinician supervision."

The study was published online April 5 in Nature.

Sonographer or AI?

The investigators assessed initial interpretations by AI and sonographers of echocardiographic studies screened at a different institution in 2019.

In total, 1740 of the echocardiographic studies were randomly assigned to the AI group and 1755 to the sonographer group. Patients had a mean age of 66 years, and 57% were women. By race and ethnicity, 58% were non-Hispanic White; 14%, Black; 12%, Hispanic; 8%, Asian; and 8%, other or unknown.

The primary endpoint was change in LVEF between the initial AI or sonographer assessment and final cardiologist assessment, evaluated by the proportion of studies with substantial changes (greater than 5% change).
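To make the endpoint concrete, here is a minimal Python sketch of how the two summary measures described in this article (the proportion of studies with a substantial change and the mean absolute difference) could be computed from paired LVEF readings. The function name, the sample values, and the reading of "greater than 5% change" as an absolute difference of more than 5 LVEF points are illustrative assumptions, not the investigators' code or data.

```python
from statistics import mean

def summarize_changes(initial, final, threshold=5.0):
    """Summarize how far final LVEF readings moved from initial ones.

    `threshold` is assumed to be an absolute difference in LVEF points;
    that interpretation of "greater than 5% change" is an assumption.
    """
    # Absolute difference between each initial and final reading
    diffs = [abs(f - i) for i, f in zip(initial, final)]
    # Count studies whose change exceeds the threshold
    substantial = sum(d > threshold for d in diffs)
    return {
        "proportion_substantial": substantial / len(diffs),
        "mean_absolute_difference": mean(diffs),
    }

# Hypothetical LVEF values (percent), invented for illustration only
initial = [55.0, 60.0, 35.0, 62.0]
final = [57.0, 52.0, 35.0, 68.0]

result = summarize_changes(initial, final)
print(result)
```

In this toy example, two of the four studies change by more than 5 points, so the proportion of substantial changes is 0.5 and the mean absolute difference is 4.0.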

In addition, after completing each study, cardiologists were asked to predict whether a sonographer or AI made the initial interpretation.

Overall, cardiologists were unable to tell which assessments were made by AI and which were made by sonographers. They correctly predicted the initial assessment method for 32.3% of studies, guessed incorrectly for 24.2%, and were unsure for 43.4%.

Substantial changes between the initial and final assessments were made in 16.8% of studies in the AI group versus 27.2% in the sonographer group; in other words, cardiologists more frequently agreed with the AI initial assessment.
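As a rough check on the size of that gap, the sketch below estimates the difference in proportions with a simple Wald 95% confidence interval. The event counts are back-calculated from the rounded percentages and group sizes reported in this article, so they are approximations, and the Wald interval is a textbook method chosen for illustration rather than the analysis used in the study.

```python
import math

# Group sizes as reported in the article
n_ai, n_sono = 1740, 1755

# ASSUMPTION: counts reconstructed from rounded percentages (16.8%, 27.2%)
events_ai = round(0.168 * n_ai)
events_sono = round(0.272 * n_sono)

p_ai = events_ai / n_ai
p_sono = events_sono / n_sono

# Difference in proportions (AI minus sonographer)
diff = p_ai - p_sono

# Wald standard error for a difference of two independent proportions
se = math.sqrt(p_ai * (1 - p_ai) / n_ai + p_sono * (1 - p_sono) / n_sono)

# 95% confidence interval using the normal critical value 1.96
lo, hi = diff - 1.96 * se, diff + 1.96 * se

print(f"difference: {diff:.3f}, 95% CI: ({lo:.3f}, {hi:.3f})")
```

Under these assumptions the difference is roughly 10 percentage points in favor of AI, and the interval lies entirely below zero, which is consistent with the superiority finding described above.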

The mean absolute difference between the initial and final assessments of LVEF was 2.79% for AI versus 3.77% for sonographers.

The mean absolute difference between previous and final cardiologist assessments was 6.29% for AI and 7.23% for sonographers.

Furthermore, according to the authors, the AI-guided workflow saved time for both sonographers and cardiologists.

Study limitations included the single-center population, lack of power to assess long-term outcomes based on differences in LVEF assessments, and the need for more training examples for the AI model.

The authors also noted that experienced sonographers were used as an active comparator for the initial LVEF assessment, but "different levels of experience and types of training can change the relative impact of AI compared with clinician judgement."

"We are deploying this algorithm in general clinical practice at Cedars and also seeking to get US Food and Drug Administration approval for general use," Ouyang said. "AI of this sort, once trained on more than 100K videos, should generalize to most institutions. We think this is strong validation that once models get enough data, they can generalize."

Strengths, But Limited for Now

Commenting on the study for Medscape Cardiology, Y. Chandrashekhar, MD, editor-in-chief of JACC: Cardiovascular Imaging, said: "The low-hanging fruit for AI will likely be improving logistics and throughput, for example, using AI for automating and optimizing many of the processes that happen before a clinician starts to read a study, such as test protocoling, image acquisition, perhaps image denoising, and creating a preliminary 'pre-read' report."

"Fully autonomous image interpretation and multiparameter reporting via AI will remain the holy grail for many more years and will have to overcome multiple methodology, external validation, generalizability, and regulatory hurdles before coming to clinical fruition," he continued.

Although the strength of the study is blinding and randomization, he said, "it ends up addressing a rather limited question — EF as a 'pre-read' report," which is known from previous studies to be feasible.

As such, the current study is "a demonstration of what AI is known to do best, within the limited tasking and testing parameters, rather than what it can realistically substitute in the reading cardiologist's workflow," he said. "Cardiologists will still need to read and make an EF determination independent of what was written in the 'pre-read.' As currently structured, it does not make the cardiologist any more efficient or optimized."

Douglas Mann, editor-in-chief of JACC: Basic to Translational Science, also commented on the findings. "There are at least 50 other measurements that are time-consuming and laborious for sonographers and cardiologists to manually read on 2D echoes. It would have been more interesting to see how AI did with measurements other than LVEF, which is a relatively low bar," he said.

"Clinicians often have to render clinical interpretations based on less-than-ideal images," he noted. "AI is trained on optimal images, and...this represents a significant unknown for AI in the clinical setting." The fact that technically difficult images were excluded from the current study "significantly biases the study in favor of AI."

With regard to the future of AI in cardiology, Mann said: "While there may come a time when AI technology replaces what a skilled technician or clinician can do, all current AI technology is only as good as the dataset that it trains on. While this limitation may go away as (the) technology evolves, at the time of writing, AI is still not there."

Chandrashekhar said: "Stay engaged and stay tuned. This is an exciting area that is replete with unbelievably powerful advances that are coming down the pike at great speed. Many of these innovations will change how you function as a clinician in the future. Some clinical activities (for example, those that mostly depend on pattern recognition) will be gradually replaced by AI, but many more have the potential to be optimized or even enhanced, with significant benefit for both the clinician and the patient."

No external funding was obtained for the study. Ouyang and two coauthors have reported holding a provisional patent for the previously published AI model. Chandrashekhar and Mann have reported no relevant financial relationships.

Nature. Published online April 5, 2023.
