Artificial Intelligence for the Orthopaedic Surgeon

An Overview of Potential Benefits, Limitations, and Clinical Applications

Eric C. Makhni, MD, MBA; Sonya Makhni, MD, MBA; Prem N. Ramkumar, MD, MBA


J Am Acad Orthop Surg. 2021;29(6):235-243. 

Applications in Orthopaedics

Remote Patient Monitoring

Remote patient monitoring systems (Table 2) represent an avenue that can increase the value of care during the perioperative period and have become increasingly important since the coronavirus disease 2019 (COVID-19) pandemic. Although many companies have developed software to monitor step counts and activity level, the application of ML with an open architecture system (eg, one that allows broad sharing and integration with other systems) allows patients and healthcare providers to track patients' participation in home exercise programs and general activity levels.[7,8] The surgical team can therefore track rehabilitation and intervene with calls or additional office visits if postoperative milestones are not being met.

Remote patient monitoring systems have been shown to be effective for patients undergoing primary total knee arthroplasty (TKA) for osteoarthritis. In one cohort study, 25 patients undergoing this procedure downloaded an AI-based, open architecture mobile application (FocusMotion) onto their personal iPhones (Apple) and recorded preoperative mobility and PROMs, beginning 2 to 4 weeks before surgery.[27] A knee sleeve was paired with the patient's iPhone via Bluetooth, and the application notified the patient to complete weekly exercises. Home exercise compliance and range of motion were detected by AI-based interpretation of the sensors on the knee sleeve, and the application displayed range of motion and overall compliance with exercise form. This system was found to be reliable, low maintenance, and well received during the process of recovery from TKA.[27]
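The two quantities reported by such a system, range of motion and exercise compliance, can be illustrated with a minimal sketch. The source does not describe how the application derives them from the sleeve sensors, so the function names, the angle samples, and the definition of compliance as completed sessions divided by prescribed sessions are all assumptions made for illustration.

```python
def range_of_motion(angles_deg):
    """Return the arc (in degrees) between the smallest and largest
    knee flexion angle seen in one exercise session.

    `angles_deg` is a hypothetical list of flexion angles sampled
    from the sleeve sensors; it is not a format taken from the study.
    """
    if not angles_deg:
        raise ValueError("at least one angle sample is required")
    return max(angles_deg) - min(angles_deg)


def weekly_compliance(completed_sessions, prescribed_sessions):
    """Return the fraction of prescribed sessions completed, capped at 1.0.

    This simple ratio is an assumed definition of compliance.
    """
    if prescribed_sessions <= 0:
        raise ValueError("prescribed_sessions must be positive")
    return min(completed_sessions / prescribed_sessions, 1.0)


if __name__ == "__main__":
    session = [5.0, 40.0, 95.0, 110.0, 20.0]  # made-up flexion angles
    print("Range of motion (deg):", range_of_motion(session))
    print("Compliance:", weekly_compliance(4, 5))
```

A care team could compare values like these against postoperative milestones to decide when a call or an additional visit is warranted.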

Postoperative Outcomes and Cost

ML has been shown to be useful in using patient-specific factors to predict postoperative outcomes. This feature can be applied to further improve payment models by bringing greater, more nuanced specificity to tiered reimbursement. The Comprehensive Care for Joint Replacement model for bundled payments and quality measures was established to improve value and incentivize high-quality care at lower costs. However, hospitals that have demonstrated savings with bundled payments are more likely to be large, high volume, and associated with postacute care facilities.[28] Although bundling care has been shown to improve outcomes (readmissions decreased from 5% to 1.6%–2.7%, and patients are more likely to be discharged home), bundling care does not account for the specific factors each patient possesses.[29] Patient-level factors are essential in predicting the likely true cost and outcome of a procedure.[30,31] These specificities may shape or determine the course of their treatment. A single reimbursement fee for a single procedure may therefore fall short, failing to acknowledge or incorporate patient-level factors that influence cost. A comprehensive model that can identify patient-specific factors that influence cost may be able to help determine more appropriate reimbursements and reduce the phenomenon of "cherry-picking" or "lemon-dropping" patients.[32,33]

ML has also been applied to predict the necessity of prolonged opioid prescription after an operation. A 2019 study by Karhade et al[21] developed ML algorithms for preoperative prediction of prolonged opioid prescriptions after total hip arthroplasty. In addition, the algorithm's predictive power presents an opportunity to more accurately estimate the true cost of a procedure—a cost estimate that includes the likelihood of prolonged opioid prescription or dependence, in addition to the direct costs of the procedure.[21]

In this manner, ML technologies can be applied to determine a particular patient's likelihood of increased resource utilization. In the highlighted study, investigators predicted prescription utilization, but others have examined length of stay and inpatient charges. These specific outcomes help characterize a case's predicted complexity based on the patient's specific factors. Identifying and understanding the complexity of a case creates an opportunity to more accurately understand value. Although value has been understood as the relative benefit of the outcome to the cost, value is not standardized because individuals may require different resources, bring different goals, and achieve different outcomes.

Beyond simply identifying risk factors for increased cost, risk stratification for the purpose of improving cost may be a useful tool in improving equity. A 2018 study by Navarro et al[22] used a Bayesian model to forecast length of stay and cost, using factors such as age, race, sex, and comorbidity scores. A proposed risk-based patient-specific payment model was created based on the output. As patient complexity increased, cost add-ons increased in tiers of 3%, 10%, and 15% for moderate, major, and extreme mortality risks, respectively. This proposition has the potential to encourage cost sharing, reduce patient selection, and even reinforce patient access by reimbursing in proportion to complexity. The ability to predict a specific patient's outcomes and resource utilization based on their preoperative variables has important implications in increasing the efficiency of payment models to improve cohort health. In the present era, risk is not distributed equally between payers and the treating team: surgeons are not incentivized to take on a large proportion of patients with increased comorbidities or case complexity. However, using ML to predict complexity offers an opportunity to fairly reward surgeons and institutions who take on greater risk. In the context of an aging cohort with increased comorbidities, a flat bundle reimbursement fee for patients with varying risk fails to match value with reimbursement.
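The arithmetic of a tiered add-on can be sketched briefly. Only the 3%, 10%, and 15% tiers come from the text above; the base payment of 20,000, the "minor" tier with no add-on, and the function name are hypothetical and are not taken from the cited study.

```python
# Add-on percentages by mortality-risk tier, as quoted in the text.
# The "minor" tier with no add-on is an assumption for illustration.
ADD_ON_BY_TIER = {
    "minor": 0.00,
    "moderate": 0.03,
    "major": 0.10,
    "extreme": 0.15,
}


def tiered_reimbursement(base_payment, risk_tier):
    """Return the base bundle payment plus the add-on for the risk tier."""
    if risk_tier not in ADD_ON_BY_TIER:
        raise ValueError(f"unknown risk tier: {risk_tier!r}")
    return round(base_payment * (1 + ADD_ON_BY_TIER[risk_tier]), 2)


if __name__ == "__main__":
    base = 20_000.00  # hypothetical base bundle payment
    for tier in ADD_ON_BY_TIER:
        print(tier, tiered_reimbursement(base, tier))
```

Under these assumed numbers, the payment rises with the predicted risk tier rather than staying flat across patients of differing complexity.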

Similarly, a 2019 study by Ramkumar et al[20] developed and validated an artificial neural network that was able to use patient-specific factors and outcomes to "learn" and predict length of stay, inpatient charge, and discharge disposition in unfamiliar patients undergoing TKA. Furthermore, this predictive model was applied to propose a risk-based, patient-specific payment model. The neural network was created using 175,042 total knee arthroplasties and had an area under the curve of 0.748 for length of stay, 0.828 for charges, and 0.761 for discharge disposition. The model "learns" iteratively from training groups until it is able to predict value-based patient outcomes. This predictive capability has promise in application to patient-specific payment models and tiering reimbursement based on case complexity, in which patients may be preoperatively assigned to a tier based on their risk factors, with a reimbursement commensurate with their stratified risk.
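For readers less familiar with the area under the curve, the following minimal sketch computes it in its rank-based form: the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The labels and scores are made up for illustration and are not data from the cited study, which does not specify how its values were computed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = event, 0 = no event).

    Computed as the fraction of positive/negative pairs in which the
    positive case has the higher score; tied pairs count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical example: 1 = prolonged length of stay, 0 = not prolonged.
    labels = [1, 1, 0, 0, 1, 0]
    scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]  # made-up predicted risks
    print("AUC:", roc_auc(labels, scores))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which gives context for the values of 0.748 to 0.828 reported above.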

With the advancement of data aggregation and deep learning algorithms, the field of orthopaedics is on the cusp of a transformation. The adoption of ML in orthopaedics has the power to improve patient care by estimating complexity of cases and supporting progress toward patient-specific payment models that are more capable of incorporating specificities of each case.[8]

Imaging and Gait Analysis

ML has important applications in diagnosis, using both imaging and gait analysis. It has been used to automatically detect osteoarthritis using imaging patterns and movement patterns, a feature that holds promise for efficient and objective automated diagnosis.[23]

For example, preprogrammed mathematical algorithms and measurements have been shown to accurately diagnose arthritis on a radiograph. Urish and Reznik[23] describe the use of medical imaging data in a technique that analyzes pixels in a radiograph image to recognize pertinent structures and specific features to create a pattern. When presented with an unknown image, the algorithm was shown to "decide" whether it was consistent with a known model for osteoarthritis or whether it did not match. This algorithm may be used in both clinical applications and research applications to confirm the presence of osteoarthritis. Furthermore, it could be expanded to predict which patients have more advanced pathology or would benefit most from surgical intervention. With an algorithm capable of processing images, a health system may be able to more efficiently triage a patient to the appropriate care provider—whether it be a specialist arthroplasty surgeon, sports medicine surgeon, or a nonsurgical physician. These clinical decisions can be made using data rather than reliance on nonclinical schedulers. In addition, they provide the benefit of increasing efficiency.

ML has utility in detecting knee osteoarthritis using gait analysis. A computer system developed by Kotti et al[24] took input body kinetics and produced as output an estimate of the likelihood of the presence of knee osteoarthritis. Furthermore, it identifies the discriminating parameters and set of rules that led to the decision. This explanation mimics "interpretation" and increases the value of the diagnosis. With an accuracy of 72.6%, this automatic detection of knee osteoarthritis provides a unique opportunity to create objective, sensitive diagnostic tools that can increase efficiency and quality of care delivered to patients.

Recently, Karnuta et al[34] trained, validated, and externally tested a deep-learning system to classify total hip arthroplasty and hip resurfacing arthroplasty femoral implants as one of 18 different manufacturer models from 1,972 retrospectively collected AP plain radiographs from four sites in one quaternary referral health system (Figure 1). After 1,000 training epochs, the deep-learning system discriminated 18 implant models with an area under the curve of 0.999, accuracy of 99.6%, sensitivity of 94.3%, and specificity of 99.8% in the external-testing data set of 206 AP radiographs. Similarly, the same group[25] built a deep-learning system to identify TKA, unicompartmental knee arthroplasty, and distal femoral replacement images and found that the model discriminated nine implant models with an area under the curve of 0.99, accuracy of 99%, sensitivity of 95%, and specificity of 99% in the external-testing data set of 74 radiographs.
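The relationship between accuracy, sensitivity, and specificity can be shown with a short sketch based on the standard definitions for a single class treated as "positive" against all others. The counts are invented for illustration; they are not the study's confusion matrix, and the source does not state how the reported values were aggregated across implant models.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion counts.

    tp/fn: positive cases classified correctly/incorrectly.
    tn/fp: negative cases classified correctly/incorrectly.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }


if __name__ == "__main__":
    # Hypothetical counts for one implant model versus all others.
    for name, value in binary_metrics(tp=18, fp=2, tn=178, fn=2).items():
        print(f"{name}: {value:.3f}")
```

When negatives greatly outnumber positives, as in a one-versus-rest comparison across many implant models, accuracy and specificity can both be high while sensitivity is noticeably lower.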

Implant Design

Optimization of implants and devices can increase both the value that they provide to the patient and the value of investments made by developers. Currently, implant design is not as efficient as possible because of constraints in testing fit.

Kozic et al[26] present a method to assess specific anatomical and morphological criteria that transcend shape variability in a cohort to optimize orthopaedic implant design. Although implants are mostly designed and tested through fitting on cadaver bones, which provide only a limited sample that does not represent the entire variability of the patient cohort, this technology provides an alternative. The framework allows an implant design to be virtually fit to samples drawn from a statistical model, determining which range of the cohort is most appropriate for a particular implant. Certain patterns of bone variability are more important for implant fitting, and this method allows for improvement of implant design such that a maximum target cohort can have a benefit.
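The idea of virtually fitting a design to samples drawn from a statistical model can be sketched with a toy example. This is not the authors' method: the two-measurement "shape", the mean values, the modes of variation, the tolerance-based fit criterion, and all function names are assumptions chosen only to show the principle of estimating what share of a simulated cohort a given design would fit.

```python
import random

# Hypothetical statistical shape model: a mean shape described by two
# measurements (mm) plus two modes of variation, each given as a
# (unit direction, variance) pair. All values are illustrative.
MEAN_SHAPE = [75.0, 50.0]
MODES = [
    ([0.8, 0.6], 16.0),
    ([-0.6, 0.8], 4.0),
]


def sample_shape(rng):
    """Draw one synthetic bone shape: mean plus randomly weighted modes."""
    shape = list(MEAN_SHAPE)
    for direction, variance in MODES:
        weight = rng.gauss(0.0, 1.0) * variance ** 0.5
        for i, component in enumerate(direction):
            shape[i] += weight * component
    return shape


def fits(shape, implant, tolerance_mm):
    """Assumed fit criterion: every measurement lies within the tolerance."""
    return all(abs(s - d) <= tolerance_mm for s, d in zip(shape, implant))


def coverage(implant, tolerance_mm, n_samples=10_000, seed=0):
    """Estimate the fraction of the simulated cohort that the implant fits."""
    rng = random.Random(seed)
    hits = sum(
        fits(sample_shape(rng), implant, tolerance_mm)
        for _ in range(n_samples)
    )
    return hits / n_samples


if __name__ == "__main__":
    print("Centered design:", coverage([75.0, 50.0], tolerance_mm=3.0))
    print("Oversized design:", coverage([90.0, 50.0], tolerance_mm=3.0))
```

Comparing coverage across candidate designs mirrors, in a very simplified form, the goal of choosing a design that benefits the largest target cohort.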

Using their proposed design and virtual validation method, the study demonstrated optimization of the design of an implant for the proximal human tibia used in internal fracture fixation. Overall, implant design can benefit from these methods to improve fit for the patient, designer, and physician.