Background: Suicide is a significant public health issue. Policy and decision makers can play an important role in suicide prevention. However, few prediction models for population risk of suicide have been developed.
Objective: To develop and validate prediction models for population risk of suicide using health administrative data.
Methods: We used a case-control study design to develop sex-specific risk prediction models for suicide, using health administrative data from Quebec, Canada. The training data included all suicide cases (n = 8,899) that occurred from January 1st, 2002 to December 31st, 2010. The control group was a 1% random sample of living individuals in each year between January 1st, 2002 and December 31st, 2010 (n = 645,590). The developed models were converted into synthetic estimation models with community characteristics as predictors. The models were then applied directly to the validation data from January 1st, 2011 to December 31st, 2019.
Results: The sex-specific models based on individual data had good discrimination (male model: C = 0.79; female model: C = 0.85) and calibration (Brier score: male model = 0.01; female model = 0.005). With the regression-based synthetic models, the absolute difference between the synthetic risk estimates and the observed suicide risk ranged from 0 to 0.001%. The root mean square errors were under 0.2. The synthetic estimation model for males correctly predicted 4 out of 5 high-risk regions in 8 years, and the model for females correctly predicted 4 out of 5 high-risk regions in 5 years.
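For readers unfamiliar with the performance measures named in the Results, the following is a minimal Python sketch of how a C statistic, a Brier score, and a root mean square error are conventionally computed. It is an illustration only, not the study's code: the function names and the toy outcome and risk values are hypothetical, and the study's exact computation (for example, any adjustment for the case-control sampling) is not described here.

```python
from itertools import product
from math import sqrt


def c_statistic(y_true, y_prob):
    """Share of case-control pairs in which the case has the higher
    predicted risk; tied predictions count as half a concordant pair."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for p_case, p_control in product(cases, controls):
        if p_case > p_control:
            concordant += 1.0
        elif p_case == p_control:
            concordant += 0.5
    return concordant / (len(cases) * len(controls))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical toy data: 1 = case, 0 = control, with made-up predicted risks.
outcomes = [1, 0, 1, 0, 0]
risks = [0.8, 0.3, 0.4, 0.4, 0.1]

print("C statistic:", c_statistic(outcomes, risks))
print("Brier score:", brier_score(outcomes, risks))
print("RMSE:", rmse([0.010, 0.020], [0.012, 0.018]))
```

A C statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls, while lower Brier scores and root mean square errors indicate predictions closer to the observed outcomes.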
Conclusion: Prediction models built on routinely collected health administrative data can accurately predict population risk of suicide. This effort can be enhanced by timely access to other critical information at the population level.