Diagnosis by AI 'As Effective As Healthcare Professionals'

Peter Russell

September 25, 2019

Experts welcomed results from a review which suggested that artificial intelligence (AI) could be as effective as health professionals at diagnosing disease from medical imaging.

However, there was agreement that a paucity of high quality research meant that the true value of AI remained uncertain.

The authors of the study, published in The Lancet Digital Health, called for higher standards of research into the specific challenges posed by 'deep learning' in order to improve future evaluations.

Some experts said a combination of AI with judgements made by healthcare professionals might be the preferred future of medical diagnosis.

First Study Found a Lack of Good Quality Data

The systematic review and meta-analysis – the first to be undertaken – found only a few studies of sufficient quality since 2012.

Prof Alastair Denniston from University Hospitals Birmingham NHS Foundation Trust, who led the research, said in a news release: "We reviewed over 20,500 articles, but less than 1% of these were sufficiently robust in their design and reporting that independent reviewers had high confidence in their claims. What's more, only 25 studies validated the AI models externally, and just 14 studies actually compared the performance of AI and health professionals using the same test sample."

However, despite reservations about high quality data, he said: "Within those handful of high-quality studies, we found that deep learning could indeed detect diseases ranging from cancers to eye diseases as accurately as health professionals. But it’s important to note that AI did not substantially out-perform human diagnosis."

That close call was borne out in an analysis of data from the 14 studies, which found that AI could detect disease in 87% of patients, compared with 86% by healthcare professionals.

When it came to excluding people who did not have disease, AI scored 93% against 91% for healthcare professionals.

Among major drawbacks identified by the study authors were that diagnoses by AI were frequently made in isolation in a way that did not reflect clinical practice. For example, only four studies provided health professionals with additional clinical information that they would normally have used to make a diagnosis in clinical practice.

Co-author Dr Livia Faes from Moorfields Eye Hospital in London said: "Evidence on how AI algorithms will change patient outcomes needs to come from comparisons with alternative diagnostic tests in randomised controlled trials. So far, there are hardly any such trials where diagnostic decisions made by an AI algorithm are acted upon to see what then happens to outcomes which really matter to patients, like timely treatment, time to discharge from hospital, or even survival rates."

The View of Experts

A number of experts shared their view of the study with the Science Media Centre.

David Curtis, honorary professor at University College London Genetics Institute, commented: "I think the most striking aspect is that out of over 20,000 studies of applications using AI for medical imaging published in scientific journals only 14 were good enough to use. That's fewer than one in a thousand.

"Almost all published studies of AI for medical imaging did not use proper methods, and can be safely ignored.

"Of the tiny handful of studies which were actually valid, the results show that AI may interpret imaging about as well as medical professionals but in many cases the professionals were denied access to information which would have been available to them in a real clinical scenario."

Richard Mitchell, professor of cybernetics at the University of Reading, commented: "There are some examples where a combination of human and artificial intelligence gives an even better result, and that may be the better route to take."

Prof David Spiegelhalter, chair of the Winton Centre for Risk and Evidence Communication at the University of Cambridge, said: "Deep learning can be a powerful and impressive technique, but clinicians and commissioners should be asking the crucial question: what does it actually add to clinical practice?"

Dr Nils Hammerla, director of machine learning and natural language processing at Babylon, said: "Machine learning can have a massive impact on problems in healthcare, big and small, but unless we can convince clinicians and the public of its safety and ability then it won't be much use to anybody."

Dr Nils Hammerla: Works at, and is a shareholder in, Babylon, which uses artificial intelligence and machine learning to provide healthcare tools for patients and clinicians.

A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. The Lancet Digital Health.

