Role of Machine Learning in Management of Degenerative Spondylolisthesis

A Systematic Review

Sherif El-Daw, MD; Ahmad El-Tantawy, MD; Tarek Aly, MD; Mohamed Ramadan, MD


Curr Orthop Pract. 2021;32(3):302-308. 



This systematic review evaluated machine learning in the management of degenerative spondylolisthesis. Although the search retrieved 1,142 articles on machine learning and artificial intelligence, only eight fulfilled the inclusion criteria (Figure 1); five of these included spondylolisthesis as one of the vertebral column pathologies discussed.[23–27]

Figure 1.

Results of the literature search.

Results for Question 1: Are Machine Learning Tools Helpful in Diagnosing the Type of Spondylolisthesis?

No articles were found that distinguished the type of spondylolisthesis. However, four articles concerned with diagnosis relied on the dataset publicly available at "The Data Mining Repository of University of California Irvine," which contained 310 patients (100 with normal spines, 150 with spondylolisthesis, and 60 with disc hernia).[28,29] Those articles used six measurements for diagnosis: degree of spondylolisthesis, pelvic incidence, pelvic tilt, angle of lumbar lordosis, angle of sacral slope, and pelvic radius. These measurements have major roles in describing the spinal posture and the sagittal balance of the body.[30]

Use of artificial intelligence technology can be complicated. Ansari et al.[23] used two classifiers to diagnose spinal pathologies: the artificial neural network (ANN) and the support vector machine (SVM). Both classifiers were divided into three stages: (1) the preprocessing stage, in which all the attributes were incorporated according to their significance and influence on the class labels; (2) the design classifier stage, in which classifiers learn the pattern from the samples that have been provided to them and diagnose the correct disorder, and the performance of all six classifiers (three classifiers with two different data distributions) is evaluated; and (3) the postprocessing stage, in which the results of those classifiers are converted into an easily understandable form. The final findings clarify whether the person is normal or has spinal pathology and demonstrate the type of pathology. Minimum values for the six parameters of pelvic incidence, pelvic tilt, lumbar lordosis angle, sacral slope, pelvic radius, and degree of spondylolisthesis were 26.1479, −3.7599, 14.00, 13.3669, 70.0826, and −11.0582, respectively, and the maximum values for these parameters were 129.834, 49.4319, 125.7424, 121.4296, 163.071, and 418.5431, respectively.
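As an illustration of what a preprocessing stage might do with those ranges, the sketch below rescales each of the six attributes to the interval [0, 1] using the minimum and maximum values reported above. Min-max scaling is an assumption chosen for illustration; the text does not state which normalization Ansari et al. applied, and the names RANGES and min_max_scale are hypothetical.

```python
# Reported (minimum, maximum) for each attribute, copied from the text above.
RANGES = {
    "pelvic_incidence": (26.1479, 129.834),
    "pelvic_tilt": (-3.7599, 49.4319),
    "lumbar_lordosis_angle": (14.00, 125.7424),
    "sacral_slope": (13.3669, 121.4296),
    "pelvic_radius": (70.0826, 163.071),
    "degree_spondylolisthesis": (-11.0582, 418.5431),
}


def min_max_scale(patient):
    """Map each attribute of one patient record onto the interval [0, 1]."""
    scaled = {}
    for name, value in patient.items():
        low, high = RANGES[name]
        scaled[name] = (value - low) / (high - low)
    return scaled


# Hypothetical record sitting exactly at the reported minimum of every attribute.
lowest = {name: low for name, (low, high) in RANGES.items()}
print(min_max_scale(lowest))
```

Rescaling of this kind keeps an attribute with a wide range, such as the degree of spondylolisthesis, from dominating distance-based or gradient-based classifiers.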

Two newer algorithms, the logistic model tree (LMT) and the synthetic minority over-sampling technique (SMOTE) for preprocessing, were added to help in clinical decision support.[24,31] The decision tree and the neural networks were noted to be biased towards the majority class when learning from an imbalanced dataset.[32] To benefit from the advantages of both, the LMT,[33] which is a combination of a decision tree and linear logistic regression technique, was proposed. In spondylolisthesis, Karabulut and Ibrikci[24] found that without SMOTE, there was a sensitivity of 0.960 and a specificity of 0.981, with accuracy of 86.45%, and with SMOTE, there was a sensitivity of 0.960 and specificity of 0.991, with accuracy of 89.73%.
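The idea behind SMOTE can be sketched in a few lines: a new minority-class sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. The code below is a minimal illustration of that interpolation step only, not the implementation used by Karabulut and Ibrikci; the function name smote, its parameters, and the sample values are hypothetical.

```python
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by interpolating towards near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical two-feature minority class (values are made up for illustration).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
new_points = smote(minority, n_synthetic=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority samples, the minority class grows without duplicating records exactly, which is the property that helps a classifier trained on an imbalanced dataset such as the 150/100/60 split described above.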

Akben[25] used Naive Bayes as the tested classifier. Naive Bayes assumes that the input variables are independent of one another given the class. This is a strong assumption and unrealistic for real data; however, the technique is very effective on a large range of complex problems. It is the most common classifier for artificial intelligence and machine learning problems.[34] It handles both continuous and discrete data, scales well with the number of predictors and data points, is fast enough to make real-time predictions, and is not sensitive to irrelevant features. All data were divided into three categories (healthy subjects, patients with disc herniation, and patients with spondylolisthesis) in a single classification process, so the class for each patient was detected according to the measured parameters. The results were evaluated according to single or combined use of attributes. Akben[25] stated that the process is much more likely to succeed when using three or more attributes than when using a single attribute. Because he used the same data as a previous researcher, whose data had a much greater number of spondylolisthesis patients than disc herniation patients, the results may have been misleading and led to a false success rate towards spondylolisthesis. It is preferable to compare each pathology to healthy individuals to get more accurate results. Akben[25] also concluded that the use of spondylolisthesis grade parameters was enough for classification.
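The independence assumption is easiest to see in code: the log likelihood of a record is simply the sum of one term per attribute. The following is a minimal Gaussian Naive Bayes sketch, assuming continuous attributes that are roughly normally distributed within each class; it is not Akben's implementation, and the function names and the six toy records are made up for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a prior plus per-feature mean and variance for every class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            # Small constant keeps the variance strictly positive.
            var = sum((x - mean) ** 2 for x in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (len(members) / len(rows), stats)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one record."""
    def log_posterior(label):
        prior, stats = model[label]
        total = math.log(prior)
        # Independence assumption: per-feature log likelihoods are simply added.
        for x, (mean, var) in zip(row, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)


# Made-up records: (pelvic incidence, degree of spondylolisthesis).
rows = [(45.0, 1.0), (50.0, 3.0), (48.0, -2.0),
        (75.0, 40.0), (80.0, 55.0), (70.0, 35.0)]
labels = ["normal"] * 3 + ["spondylolisthesis"] * 3
model = fit_gaussian_nb(rows, labels)
print(predict(model, (78.0, 50.0)))
```

The class prior in this sketch is the share of training records in each class, which is why a dataset dominated by spondylolisthesis patients can tilt predictions towards that class, as noted above.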

Another approach, the pairwise fuzzy C-means based feature weighting method, was proposed by Unal et al.[26] for data processing to detect the spinal pathology. In the first of the two stages of this method, each feature in the spinal dataset is transformed from a nonlinearly separable case to a linearly separable case by pairwise fuzzy C-means based feature weighting. In the second stage, the weighted (linearly separable) spinal dataset is classified by one of four classifier algorithms: Naive Bayes, k-nearest neighbor (k-NN), multilayer perceptron (MLP), or SVM. The classification accuracies obtained on the unweighted data were 78.3871%, 85.4839%, 83.2258%, and 81.6129% for the k-NN, MLP, Naive Bayes, and SVM classifiers, respectively; after applying Unal et al.'s[26] method, the accuracies were 95.4839%, 96.7742%, 97.4194%, and 96.4516% for the k-NN, MLP, Naive Bayes, and SVM classifiers, respectively.
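For readers unfamiliar with fuzzy C-means, the sketch below runs the standard algorithm on a single feature: each value receives a degree of membership in every cluster, and each cluster centre is the membership-weighted mean of the values. It shows only the clustering building block. The pairwise weighting rule of Unal et al. is not given in the text and is not reproduced here, and the function name, the fuzzifier m = 2, and the sample values are assumptions.

```python
def fuzzy_c_means_1d(values, centers, m=2.0, iterations=50):
    """Run standard fuzzy C-means on one feature.

    Returns the final centres and the memberships computed in the last pass.
    """
    centers = list(centers)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in values:
            # Distance to every centre; the floor avoids division by zero.
            dists = [max(abs(x - c), 1e-12) for c in centers]
            row = [
                1.0 / sum((d_i / d_k) ** (2.0 / (m - 1.0)) for d_k in dists)
                for d_i in dists
            ]
            memberships.append(row)
        # Each centre becomes the membership-weighted mean of the feature.
        centers = [
            sum(u[i] ** m * x for u, x in zip(memberships, values))
            / sum(u[i] ** m for u in memberships)
            for i in range(len(centers))
        ]
    return centers, memberships


# Made-up values of one feature forming two loose groups.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
centers, memberships = fuzzy_c_means_1d(values, centers=(min(values), max(values)))
print(centers)
```

Initializing the centres at the minimum and maximum of the feature keeps the sketch deterministic; practical implementations usually start from random memberships.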

Generally, diagnosis of spondylolisthesis depends on patient symptoms and signs as well as imaging studies. Servadei et al.[27] analyzed an ANN-supported data management platform (DMP) and its correlation with the diagnostic imaging method, and found that the ANN supported the imaging diagnosis. The input data were the patient's symptoms of backache, the budget available for the imaging test, patient conditions, radiation risk, and available imaging (radiograph, CT, or MRI), taken as a constant. The output of the DMP was the appropriate imaging method. The six features measured on the radiograph were then classified by the ANN, and the ANN-supported diagnosis was the output. In the next step, the DMP was used by the doctor for a therapy decision. The classification accuracy for this dataset through ANNs was 96.1%, which is believed to be better than any in the known literature.

Results for Question 2: Is Conservative or Surgical Treatment More Successful in Patients With Spondylolisthesis?

No articles were found in the literature that used artificial intelligence or machine learning to compare surgical and nonsurgical treatment of degenerative spondylolisthesis.

Results for Question 3: Is Decompression Alone or With Fusion, With or Without Instrumentation, Better for Treating Spondylolisthesis?

Two articles by the same author[35,36] were found that discussed types of surgery for degenerative spondylolisthesis. He analyzed data from 48,911 discharges of surgical patients in the National Inpatient Sample.[37] Regarding patients with grade I spondylolisthesis, he found that nearly 70% of patients were stable and improved with decompression without fusion, but 30% of patients developed some sort of instability and later needed fusion surgery. Besides the National Inpatient Sample, the state inpatient databases (SIDS), which represent patients from each state in the United States, have been used to study reoperation and readmission rates. A recent study used SIDS to determine the reoperation rate following surgery for lumbar spondylolisthesis and found that it was about 17% at 5 yr when decompression was performed, regardless of whether a fusion was applied or not.[38]

Results for Question 4: Is Adding Reduction of Displaced Vertebra to Instrumentation With Fusion More Successful than Instrumentation Without Reduction in Patients With Spondylolisthesis?

No articles were found in the literature that used artificial intelligence algorithms or machine learning to compare reduction with nonreduction methods in patients with degenerative spondylolisthesis.

Discharge Placement After Surgery for Spondylolisthesis

Discharge placement after surgical treatment is part of the recovery period following surgery. Although this finding is linked to question number 2, it is listed separately because it is unique to surgically treated rather than conservatively treated patients. Ogink et al.[39] developed a machine learning algorithm for accurate detection of discharge placement in patients with degenerative spondylolisthesis, using the American College of Surgeons National Surgical Quality Improvement Program database. This dataset consisted of prospectively collected patient demographics, comorbidities, laboratory values, and perioperative and postoperative outcomes for 30 days after surgery in the period between 2009 and 2016. The number of patients in that database was 9,338.

The patients' placements were either home discharge or nonhome discharge (eg, rehabilitation center, skilled or unskilled nursing facility). The best model for detecting a patient's placement depended upon the following significant variables: age, gender, body mass index (BMI), elective surgery, number of levels operated, American Society of Anesthesiologists (ASA) class, and preoperative investigations, especially white blood cell count, creatinine, and diabetes screening.[40] This model was available in a web-based application that could easily be used on smartphones, computers, and tablets. It allowed the user to input the necessary variables, calculate the scores using the selected algorithm, and obtain the results.

In the study by Ogink et al.,[39] only 18.6% of the 9,338 included patients were discharged to a nonhome facility. The age of the patients ranged from 54 to 71 yr, and 63% of the patients were female. By race, 8,369 patients (90%) were Caucasian, 695 (7.4%) were African American, and 274 (2.9%) were "other." Average BMI was 30 kg/m², creatinine levels ranged from 0.72 to 1.00 mg/dL, and white blood cell count ranged from 5.7 to 8.3 × 10³/μL. Regarding patients with diabetes, 1,123 (12%) used oral medicines and 494 (5.3%) were insulin-dependent.

Regarding the surgical procedure, elective surgeries were done in 9,114 patients (98%). Decompression and fusion were done in 5,897 patients (63%), fusion only in 2,857 patients (31%), and decompression only in 584 patients (6.3%). The ASA classifications were as follows: 238 patients (2.6%) in ASA I, 4,632 patients (50%) in ASA II, 4,288 patients (46%) in ASA III, and 180 patients (1.9%) in ASA IV.
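The procedure counts above can be cross-checked with a short tally, which confirms that the three procedure groups add up to the 9,338 patients in the cohort and recomputes each share. The variable and function names are illustrative only.

```python
TOTAL_PATIENTS = 9338

# Procedure counts as reported in the text above.
procedures = {
    "decompression and fusion": 5897,
    "fusion only": 2857,
    "decompression only": 584,
}


def share(count, total=TOTAL_PATIENTS):
    """Return a count as a percentage of the cohort."""
    return 100 * count / total


# The three procedure groups should account for every patient.
assert sum(procedures.values()) == TOTAL_PATIENTS

for name, count in procedures.items():
    print(f"{name}: {count} patients ({share(count):.1f}%)")
```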