Challenges of Artificial Intelligence Models in Thrombosis

Javier Cotelo, MD

November 08, 2022

MADRID, Spain — Artificial intelligence (AI) algorithms in thrombosis are complex and face several challenges. These challenges include obtaining quality data and good explicability without loss of accuracy. These emerging tools require more trained staff and patient input to become established in clinical practice.

Rosa Vidal, MD, of the Jiménez Díaz Foundation University Hospital in Madrid, Spain, and Adrián Mosquera, MD, PhD, of the Santiago de Compostela University Clinical Hospital in Coruña, Spain, coordinated the symposium of the 64th National Congress of the Spanish Society of Hematology and Hemotherapy (SEHH). This symposium highlighted the opportunity that AI provides, in terms of health outcome programs that help improve the quality of life of patients with thrombosis, thereby optimizing the efficiency of our health system. "As healthcare professionals, we must train ourselves on these new technologies and implement them in our daily practice, bearing in mind that this is no easy task."

Sara Martín Herrero, MD, of the Jiménez Díaz Foundation University Hospital in Madrid, explained in detail the difficulties involved in implementing a big data program in thrombosis. "Venous thromboembolic disease is complex, in terms of patient identification and definition of variables for data exploitation," she said. The challenges for learning models in general and for this pathology in particular focus on the quality and quantity of data, privacy, the transparency and explicability of algorithms, robustness, autonomy and human oversight, responsibility, regulation, and diversity and equality.

The cornerstone of all AI systems is data that are protected under data protection law, varied, representative of reality, and certified, said Martín. In healthcare, 80% of data are unstructured, and the goal is to convert them into knowledge that brings value.

Explicability vs Accuracy

Regarding the explicability of the models, Martín stated, "In the healthcare field, it's essential, although the challenge arises of favoring accuracy over explicability. We need regulation mechanisms for these algorithms to apply them in clinical practice. It would also be necessary to explain part of the algorithms and generate evidence by applying them extensively to demonstrate that they are accurate, in addition to having external validations and traceability to see how the model works from the standpoint of the methodology it uses."

Among the various proposals to reinforce transparency and explicability, Martín referred to healthcare worker training and collaborative culture. These models include techniques that favor this explicability, transparency of data, methodology, and results, as well as decisions that can be explained, evaluated, and audited.

Predicting Thrombotic Recurrence

Martín discussed a machine learning tool that is being developed at her center and is still in a preliminary phase (data cleaning). Its developers seek new risk factors for recurrence of venous thromboembolic disease in patients with a first event to prevent a second episode. They aim to obtain an individualized recurrence prediction model and to select patients who benefit from anticoagulation. In addition to reducing recurrence, they want to control associated morbidity and mortality.

"They have collected data from 2016 to 2022 that indicated that the incidence of venous thromboembolic disease during the COVID-19 pandemic was very striking, with a clear increase in incidence in 2020 and 2021, which allowed us to accurately draw the pandemic waves with only the data extracted from a single question, which was whether they had had a venous thromboembolic disease," said Martín.

She mentioned important conclusions for developing models. "The ICD-9 and ICD-10 disease classifications don't reach sufficient sensitivity in the search for patients with venous thromboembolic disease in electronic medical records, so other diagnostic coding standards should be associated in accordance with daily clinical practice.

"It is also essential to promote a collaborative culture in the development of systems applicable to clinical needs. It can be an extremely valuable tool to reduce gaps and inequality in access to healthcare," she added.

"There's no longer any doubt that the use of AI tools can substantially help us make decisions in clinical practice and improve the quality of care, contributing to the development of more personalized and accurate medicine. However, its implementation is not easy, since we face ethical and legal issues that remain to be resolved," said Mosquera.

Venous Thromboembolism

Ang Li, MD, professor of hematology and oncology at Baylor College of Medicine, Houston, Texas, discussed epidemiology, prevention, and treatment of venous thromboembolism, thrombotic microangiopathy in cancer treatment, and hematopoietic cell transplantation.

He also commented on the development of computable phenotypes and risk prediction models in patients with cancer and thrombosis. Thrombosis occurs seven times more often in patients with cancer than in those without tumors. Some types of cancer carry a 10% to 20% risk of venous thromboembolism in the first 6 months after diagnosis, and venous thromboembolism is the second leading cause of death, after infection, in these patients.

"There are risk prediction models in this field, and they are easy to use, but they don't give an optimal prediction, since it's difficult to incorporate nonstandardized markers, and only 50% of venous thromboembolisms are classified as high risk," said Li.

He also mentioned the following five general steps for the use of machine learning in cancer and thrombosis:

  1. Start by defining venous thromboembolism with exquisite accuracy.

  2. Identify relevant clinical predictors.

  3. Look for a robust statistical analysis.

  4. Build the final model with the selected samples.

  5. Have adequate internal and external databases.

Data Extraction Matters

"It is necessary to determine the phenotype of venous thromboembolism with structured and unstructured data," said Li. "The latter include data from the radiologic report, hospital discharge, and relevant clinical notes. Machine learning-based models must have a set of specific words, use a recurrent neural network, and employ appropriate language Bidirectional Encoder Representations from Transformers (BERT)."

Li analyzed the advantages and limitations of these tools in medical research in terms of predictability (where they improve on traditional methods in the same dataset), explicability, legitimacy, generalizability, and interoperability.
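
As a rough illustration of what "a set of specific words" can mean when phenotyping venous thromboembolism from unstructured reports, here is a minimal Python sketch. It is not the method Li presented: the term list, the negation cues, and the function name `mentions_vte` are all assumptions made for illustration, and a real system would pair such rules with the neural language models he mentioned.

```python
import re

# Hypothetical term list (an assumption); a real project would build it with clinicians.
VTE_TERMS = [
    r"pulmonary embol(?:ism|us|i)",
    r"deep vein thrombosis",
    r"deep venous thrombosis",
    r"\bDVT\b",
]

# Hypothetical negation cues (an assumption); deliberately crude.
NEGATION_CUES = [r"\bno\b", r"\bwithout\b", r"negative for", r"ruled out"]

VTE_PATTERN = re.compile("|".join(VTE_TERMS), re.IGNORECASE)
NEGATION_PATTERN = re.compile("|".join(NEGATION_CUES), re.IGNORECASE)


def mentions_vte(report: str) -> bool:
    """Return True if any sentence names a VTE term and has no negation cue."""
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        if VTE_PATTERN.search(sentence) and not NEGATION_PATTERN.search(sentence):
            return True
    return False


if __name__ == "__main__":
    # Invented example sentences, not real patient data.
    reports = [
        "CT angiography shows acute pulmonary embolism in the right lower lobe.",
        "No evidence of deep vein thrombosis.",
        "No pneumothorax. Findings consistent with pulmonary embolus.",
    ]
    for text in reports:
        print(mentions_vte(text), "-", text)
```

Sentence-level negation like this misses many real-world phrasings, which is one reason keyword rules alone are usually treated as a starting point rather than a finished phenotype.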

He pointed out three relevant aspects. "Data extraction and harmonization are more important than the model. Machine learning is a good complement, but not a substitute for traditional statistical models. Clinical knowledge is essential for the selection of variables and interpretation of the model."

The Patient's Contribution

Cindy de Jong, a PhD student at Leiden University Medical Centre, Leiden, Netherlands, discussed the development of a standardized Patient Reported Outcome Measures System (PROMS) for patients with venous thromboembolism that improves decision making and treatment quality.

She described the SCOPE project, in which the patients enrolled were older than 16 years with pulmonary embolism or deep vein thrombosis, including antiphospholipid antibody syndrome, end-of-life status, pregnancy-associated venous thrombosis and thrombosis associated with cancer.

Patient-reported outcomes such as quality of life, functional limitation, pain, dyspnea, and treatment satisfaction were collected through various questionnaires and specific standardized tools. The investigators also gathered data on clinical events such as procedure-related complications, hemorrhages, survival, recurrences, chronic pulmonary hypertension, and pulmonary thromboembolic disease.

Risk factors are considered demographic (sex, race, BMI, age, level of education) or clinical (comorbidities, anticoagulants, previous deep vein thrombosis, interventions carried out). These risk factors are recorded at baseline, 3 months, 6 months, 1 year, and annually thereafter.

De Jong mentioned the most striking results in terms of the differing perceptions of the patient and doctor. "Changes in perceived quality of life were indicated by 70% of patients, compared with 48% of professionals. Chronic pulmonary hypertension was reported by half of the patients and 83% of the professionals, and pulmonary thromboembolic disease by 45% of the patients, compared with 79% of the professionals."

Implementing AI Models

The symposium coordinators indicated that scientific societies have prepared a manifesto for active participation in the digitization of the health system and for improving data quality and security. There is currently growing interest among the national working groups in implementing these tools in diagnosis and risk stratification, as well as in the prediction of responses to new hematology drugs.

"However, we are facing a significant training deficit on the part of health specialists, given that it is a very new field and unrelated to medicine. Therefore, it is necessary to increase the training of specialists in advanced data analysis, as well as to foster collaboration with experts in intelligent information technologies," they added.

Thromboembolism Prediction

Inés Martínez, MD, of the Jiménez Díaz Foundation in Madrid, described research that compared machine learning vs logistic regression in the generation of predictive models of venous thromboembolism in patients with multiple myeloma.

The multicenter retrospective study included 133 patients (88 without venous thromboembolic disease and 45 with it) with 131 clinical and biological variables. The investigators used logistic regression methods (eight significant variables) and machine learning (six variables) and then compared the two approaches. The following four variables coincided: age under 65, high Revised International Staging System score, C-reactive protein greater than 0.60 mg/dL, and history of surgery.

"Both models are good predictors of venous thromboembolic disease in patients with newly diagnosed myeloma," said Martínez. "We hypothesize that these four variables that the patients share would have greater predictive power, although an independent validation cohort is required to assess the level of overadjusting that their models have to choose more accurately between these two approaches."

Martín, Li, and de Jong declared no relevant financial conflicts of interest.

Follow Javier Cotelo, MD, of Medscape Spanish Edition on Twitter: @Drjavico.

This article was translated from the Medscape Spanish edition.
