Can Machine-learning Methods Really Help Predict Suicide?

Catherine M. McHugh; Matthew M. Large


Curr Opin Psychiatry. 2020;33(4):369-374. 

Abstract and Introduction


Purpose of review: In recent years there has been interest in the use of machine learning in suicide research, in reaction to the failure of traditional statistical methods to produce clinically useful models of future suicide. The current review summarizes recent prediction studies in the suicide literature, including those using machine-learning approaches, to understand what value these novel approaches add.

Recent findings: Studies using machine learning to predict suicide deaths report areas under the curve (AUC) that are only modestly greater than, and sensitivities that are equal to, those reported in studies using more conventional predictive methods. Positive predictive value remains around 1% among cohort studies with a base rate that was not inflated by case–control methodology.

Summary: Machine learning or artificial intelligence may afford opportunities in mental health research and in the clinical care of suicidal patients. However, application of such techniques should be carefully considered to avoid repeating the mistakes of existing methodologies. Prediction studies using machine-learning methods have yet to make a major contribution to our understanding of the field and are unproven as clinically useful tools.


The prediction of suicide has been a focus of mental health research for decades. However, a practical method for anticipating individual suicides, or even usefully stratifying patients according to suicide risk, has remained elusive.[1,2] While the fallibility of suicide risk assessment is now more clearly understood,[3,4] suicide risk assessment is still held to be a key element of suicide prevention.[5,6] Perceived suicide risk can be used to justify coercive and restrictive care of individual patients, for example by involuntary hospitalization, with the reasoning that the certain but limited harms associated with the treatment are outweighed by the less certain but catastrophic effects of suicide.[7] Similarly, it has been suggested that suicide prevention can be achieved by targeted interventions given to groups at higher suicide risk.[8] In recent years, there has been a surge of interest in the prediction of suicidal behaviour that seems to have been fuelled by reports that artificial intelligence or machine learning can produce predictive models with a higher degree of accuracy than has been achieved by clinical or more traditional statistical methods.[9] Proponents of machine learning have suggested that a central limitation in suicide research has been the number and complexity of interacting suicide risk factors. Traditional statistical approaches, such as logistic regression and various survival analysis methods, have a limited ability to analyse large numbers of variables and cannot consider more complex and contingent interactions. In contrast, machine learning has the potential to recognize complex patterns of interacting risk factors and might, as a result, have greater utility.

Perhaps the most important limitation in suicide prediction that needs to be overcome is the base rate problem. In line with Bayes' theorem, events like suicide that have a very low base rate can only be predicted using methods that have an extraordinary ability to discriminate between lower and higher risk groups.[10] Some researchers have argued that, in the face of base rate issues and the limited predictive value of current methods, future studies should focus on more common outcomes such as suicide attempts.[11] However, there is strong evidence that suicide attempts are distinct from suicide deaths. For example, suicide attempts are more strongly associated with female sex and younger age, while suicide deaths occur more frequently among men and older people.[12]
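The base rate problem can be made concrete with a short calculation. The sketch below uses hypothetical figures (not drawn from any study in this review) and applies Bayes' theorem to show how even a test with respectable sensitivity and specificity yields a very low positive predictive value when the outcome is rare:

```python
# Hypothetical illustration of the base rate problem via Bayes' theorem.
# None of these figures come from the studies reviewed here.

def positive_predictive_value(sensitivity: float, specificity: float,
                              base_rate: float) -> float:
    """P(outcome | positive test), by Bayes' theorem."""
    true_positives = sensitivity * base_rate
    false_positives = (1 - specificity) * (1 - base_rate)
    return true_positives / (true_positives + false_positives)

# A test with 80% sensitivity and 90% specificity, applied to a cohort
# in which 0.5% die by suicide over follow-up:
ppv = positive_predictive_value(0.80, 0.90, 0.005)
print(f"PPV = {ppv:.1%}")  # → PPV = 3.9%, i.e. most positives are false
```

Even with discrimination well beyond what the literature reports, roughly 25 of every 26 people flagged would never die by suicide, which is why extraordinary discrimination is needed before prediction becomes clinically meaningful.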

Another central issue in suicide prediction research has been the nature and quality of the risk factor variables. For example, a lifetime history of suicidal ideation or behaviour is commonly used as a predictor variable when current or recent suicidal thoughts and behaviours might be more pertinent,[13] because suicidal ideation is known to fluctuate over short periods of time, and even highly lethal suicidal behaviour can occur with little planning.[14] However, a lack of distinction between static risk factors (such as previous suicide attempts or childhood trauma) and dynamic or fluctuating risk factors (such as depressed mood, hopelessness, agitation and intoxication) may not be the most important limitation of risk factor data. Recent research has suggested that the recording of historical data about suicide attempts is less accurate than might be imagined,[15] and one recent study found that suicide prediction is enhanced if more accurate data about past hospitalization is available.[16]

Suicide Prediction Using Established Statistical Techniques

Putting aside concerns about the nature and quality of suicide risk factors, in recent years the statistical limitations of both individual risk factors and combinations of risk factors have come into focus. After performing a meta-analytic review of over 50 years of suicide research into the association between risk factors and suicide, Franklin et al.[1] emphasized the surprisingly modest predictive strength of well known risk factors in longitudinal studies, and in a related study Ribeiro et al.[3] found that a history of self-injurious thoughts and behaviours 'only provide marginal improvement in diagnostic accuracy above chance'. Similarly, a recent meta-analytic review of studies reporting the association between suicidal ideation and later suicide found a meta-analytic area under the curve (AUC) of 0.676 and a positive predictive value (PPV) of 1.7% over a mean study duration of 9.1 years.[17] These metrics were associated with a pooled sensitivity for suicide of 40%, and studies outside of specialist psychiatric settings had a lower sensitivity, such that four in five suicides would be missed if suicidal ideation were used as the sole test for later suicide.

It is likely that the number and complexity of contributing risk factors places a limit on the statistical association between any single risk factor and suicide. To test this hypothesis, Large et al. examined 50 years of studies that used two or more clinical risk factors in longitudinal studies of suicide prediction. Overall, they found that a higher risk category was associated with an odds ratio of 4.84 compared with the lower risk category, with a sensitivity of 56% and a PPV of 5.5% over 5 years of follow-up.[2] The authors found that models including a greater number of suicide risk factors were not associated with better predictive metrics. This surprising finding led them to conclude that risk assessment using multiple factors has much the same statistical limitations as individual risk factors.[18] However, the vast majority of the studies included in the meta-analysis used experimenter-derived scales or regression models to combine suicide risk factors. The inability of researcher-designed risk scales or regression methods to utilize complex and possibly contingent interactions between risk factors may account for the disappointing results.

In 2019, a review by Belsher et al.[19] examined 17 studies in which suicide prediction models had been tested in both development and validation samples. The review concluded that the ability of such models to predict future suicidal events was near zero. The authors suggested that any accurate predictive model should be evaluated within clinical frameworks to assess its potential impact on clinical pathways and other outcomes, such as adverse events and costs to the service. Further, the authors suggest that policy makers and researchers should consider what an acceptable PPV may be, and that if it is too low to inform change in clinical care, such modelling should be abandoned.

Suicide Prediction Using Machine Learning

In recent years, there has been growing interest in using machine learning (used here synonymously with artificial intelligence) to predict suicide (Table 1).[20–24] While these studies examine diverse groups of patients and used differing methods, three observations can be made. First, the AUC results seem to be only modestly greater than those reported in studies using more conventional predictive methods. Second, the sensitivities in the highest risk group vary, but might not greatly exceed the sensitivity of 56% derived by meta-analysis of more conventional studies. Third, the PPV was limited: about 1% in most of the cohort studies with a base rate that was not inflated by case–control methodology. Moreover, two studies explicitly compared the predictive metrics of machine-learning methods and more conventional statistical approaches. Choi et al.[25] found that Cox regression and the machine-learning methods of support vector machines and deep neural networks produced similar AUC statistics of 0.688, 0.687 and 0.683, respectively. Amini et al.[26] also compared a variety of prediction methods, examining logistic regression, decision trees, support vector machines and artificial neural networks, and found similar AUC statistics of 0.752, 0.725, 0.719 and 0.748, respectively.

Other studies have examined the utility of machine learning to predict suicide attempts. Walsh et al.[27,28] examined the predictive accuracy of a machine-learning algorithm to detect suicide attempts in adults, and later adolescents, from hospital health record data and found an AUC of 0.8–0.9. Barak-Corren et al. used Bayesian modelling to predict future suicidal behaviour (attempts or deaths) in a retrospective cohort of electronic health records of 1,728,549 individuals who had visited a large healthcare centre three or more times over a 15-year period. The authors reported that the model had an accuracy of approximately 90%.[29] However, this apparent level of overall accuracy was associated with a sensitivity of 33–45%, a specificity of 90–95% and a positive predictive value of 3–5% – figures quite similar to those derived using traditional statistical techniques. The reported accuracy in each of these studies benefited from the relatively higher base rate of suicide attempts compared with suicide deaths. In addition, reporting 'accuracy' may be misleading. Accuracy refers to the true positives and true negatives divided by the total number of individuals; thus, when the base rate of an outcome is low (i.e. there are few true positives), a high level of accuracy can still be achieved through a large number of true negatives, as is generally the case in suicide research.
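The gap between accuracy and positive predictive value can be illustrated with a hypothetical confusion matrix. The numbers below are illustrative only, chosen to mirror the order of magnitude of the metrics reported above rather than any specific study:

```python
# Hypothetical confusion matrix showing why 'accuracy' misleads at low
# base rates. Figures are illustrative, not taken from the cited studies.

def metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary classification metrics from a confusion matrix."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }

# 100,000 patients; 1% (1,000) attempt suicide. A model that flags
# 10,300 patients and catches 400 of the 1,000 cases:
m = metrics(tp=400, fp=9_900, fn=600, tn=89_100)
# accuracy ≈ 0.895 and specificity = 0.90 look impressive, yet
# sensitivity = 0.40 and PPV ≈ 0.039: about 96% of flagged patients
# are false positives, and 60% of true cases are missed.
```

The ~90% accuracy here is driven almost entirely by true negatives, which is why accuracy alone says little about a model's clinical value for a rare outcome.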

Potential Limitations of Machine Learning Prediction Studies

Some authors believe that prediction studies using machine learning may have been limited by the way they reported their findings. In addition to the difficulty of interpreting accuracy, Fazel and O'Reilly[30] highlight that the calibration of the models is rarely reported, limiting the interpretation of the metrics of sensitivity, specificity and AUC. This lack of calibration reporting has been raised consistently as a limitation of machine learning outside suicide research.[31] In a recent review of prediction studies in the neurosciences, Poldrack et al. (2019) raised another limitation of prediction studies using machine learning: the practice of using the same or overlapping data to train and then test models.[32] This methodological issue may limit the generalizability of findings and may be particularly relevant in suicide research because health services can be highly specific to the populations they serve.[33]

It is also uncertain how the detection of suicide risk from electronic health data can be applied in practice.[19] However, leading researchers in the field of machine learning and suicide believe that machine-learning predictive models may be worthwhile even if they have statistical limitations, because of their low cost. For example, a clinician might be reminded to screen a patient for suicide risk on the basis of an artificial intelligence algorithm that works seamlessly with the medical record.[34] However, the utility of such an application is questionable given that guidelines for primary care already include screening for common mental disorders and suicidal ideation in high-risk groups.[35]

While it may be that machine learning can achieve somewhat stronger suicide predictions than more conventional methods, this does not mean that machine-learning tools are immune to the pitfalls that have hindered the adoption of earlier suicide prediction methods. For example, the goal of clinical risk prediction is rarely to predict whether individuals will engage in suicidal behaviour in the next 12 months, a common outcome in predictive studies.[36] Rather, the clinical goal of a suicide risk assessment is usually to identify people at sufficiently high risk to justify a set of clinical interventions over much shorter periods. When studies of very short-term prediction are conducted, they will face an even more challenging base rate problem.

One ethically concerning suggestion is that predictive analytics could be used to rule out those who do not truly have risk despite presenting for mental healthcare.[33] In medicine, a 'rule out' test must have high sensitivity, so as not to exclude suicide cases – a sensitivity that has not been achieved by machine learning (Table 1). More broadly, machine learning does not alter the distinction between statistical validity and clinical utility. In the past, it has been argued that the low PPV of suicide risk assessment limits the range of indicated treatments to those that are benign and tolerable to the numerous false positives. Following on from this, critics have suggested that risk assessment is not useful in guiding treatment because benign and tolerable interventions should also be offered to low-risk groups, among whom many if not most suicides will occur.[37,38] We fear that the distinction between statistical validity and clinical utility might be lost in the enthusiasm for machine learning, particularly given the difficulty clinicians and policy makers may have in understanding the statistical methods. For example, the high degree of accuracy reported in some studies may sound more convincing than other, less intuitive predictive metrics that might be less remarkable. This enthusiasm for new methods might even trump concern about the lack of transparency in decisions from the 'black boxes' of machine learning.[30]
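Why a low-sensitivity model cannot underpin a 'rule out' strategy can also be shown numerically. In the hypothetical sketch below (the 56% sensitivity echoes the pooled figure mentioned earlier; the 90% specificity and 0.5% base rate are assumed), negative predictive value appears near-perfect simply because suicide is rare, while a large share of eventual suicides still falls in the 'ruled out' group:

```python
# Hypothetical sketch: at a low base rate, negative predictive value (NPV)
# is near-perfect for almost any test, so a high NPV alone cannot justify
# ruling patients out. Figures are illustrative assumptions.

def negative_predictive_value(sensitivity: float, specificity: float,
                              base_rate: float) -> float:
    """P(no outcome | negative test), by Bayes' theorem."""
    true_negatives = specificity * (1 - base_rate)
    false_negatives = (1 - sensitivity) * base_rate
    return true_negatives / (true_negatives + false_negatives)

# Sensitivity 56%, specificity 90% (assumed), base rate 0.5%:
npv = negative_predictive_value(0.56, 0.90, 0.005)
print(f"NPV = {npv:.2%}")  # → NPV = 99.75%
# Yet 44% of eventual suicides (1 - sensitivity) sit among those
# the model would 'rule out'.
```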

Applications of Artificial Intelligence and Prediction Beyond Identifying Risk

Some authors suggest that the identification of individuals at risk may be less relevant in clinical services than the practice of managing risk factors. Torous and Walker suggest that artificial intelligence may be useful in identifying risk where a previous suicide attempt or a diagnosis of mental illness is not present. These authors also identify opportunities for using artificial intelligence to efficiently disseminate interventions, particularly in the areas of safety planning, lethal means reduction and providing supportive contacts.[39] Artificial intelligence might be a driver of more sophisticated interventions such as virtual counsellors, although such technology is still in the early stages, and early reviews by users suggest these methods will not be alternatives to face-to-face therapy in the near future.[40]

Artificial intelligence might also be used to analyse continuously collected data, including data extracted or monitored on social media. To date, the understanding of proximal and fluctuating risk factors for suicide has been limited by practical research considerations. An increased capacity to study proximal factors using technology has been identified as a major direction for future research.[41] Use of smartphones, wearable devices, social media and other connected devices to conduct ecological momentary assessment might allow for collection of data in real-world settings and across different time periods. Such sources create complex data sets that would be unanalysable without machine learning and might hold opportunities for both the understanding of suicidal behaviour and its prevention.