PREDICTIVE MODELLING OF ISCHEMIC STROKE RISK: A SYSTEMATIC COMPARISON OF SVM AND LSTM ARCHITECTURES UNDER MULTI-METRIC ASSESSMENT CRITERIA

Fariha Shahid; Aftab Ahmed Chandio*; Qamar-ul-Nisa Chandio; Farhat Noureen Memon

Abstract

Background: Ischemic stroke is a major and increasingly common cause of death and disability, and identifying individuals at high risk is vital to its prevention. Machine learning and deep learning techniques offer a means of improving stroke prediction. However, which modelling framework provides the most accurate and clinically meaningful predictions, particularly with respect to the balance between sensitivity and precision, has not been adequately investigated.

Methods: This paper develops and compares four stroke prediction models: a Support Vector Machine (SVM), Random Forest (RF), XGBoost, and a Long Short-Term Memory (LSTM) network. All four were trained and assessed on a structured dataset of 50,000 patient records containing demographic and clinical risk factors. Preprocessing comprised median imputation of missing values, one-hot encoding of categorical variables, and standard scaling. The models were trained, validated, and evaluated on the same data splits (70%/15%/15%), obtained by stratified sampling. Performance was assessed with a multi-metric framework covering accuracy, precision, recall, F1-score, ROC-AUC, log loss, Brier score, Cohen's kappa, and Matthews Correlation Coefficient (MCC).
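The stratified 70/15/15 split and the multi-metric assessment described above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names stratified_split and evaluate, the 0.5 decision threshold, the random seed, and the pairwise ROC-AUC estimate are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that preserve class ratios."""
    rng = random.Random(seed)  # assumed seed, for reproducibility only
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def evaluate(y_true, y_prob, threshold=0.5):
    """Compute the nine metrics for binary labels and predicted probabilities."""
    y_pred = [int(p >= threshold) for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    n = len(y_true)

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    # Matthews Correlation Coefficient from the confusion matrix.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Brier score and log loss both use the predicted probabilities.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / n
    eps = 1e-15  # clip probabilities to avoid log(0)
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                    for t, p in zip(y_true, clipped)) / n

    # ROC-AUC as the probability that a positive outranks a negative.
    # This pairwise form is O(n^2); fine for a sketch, slow on large data.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    auc = wins / (len(pos) * len(neg)) if pos and neg else float("nan")

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "roc_auc": auc, "log_loss": log_loss, "brier": brier,
            "cohen_kappa": kappa, "mcc": mcc}
```

In such a setup, evaluate would be called once per model on the same held-out test indices, so that all nine metrics are compared on identical records.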

Results: The SVM achieved the best overall accuracy (0.9556), precision (0.8517), and F1-score (0.8521), together with better calibration and agreement statistics. The Random Forest produced a closely comparable performance profile (accuracy = 0.9543; precision = 0.8491; F1 = 0.8472; AUC = 0.9867). XGBoost showed a sensitivity-oriented profile (recall = 0.9022; accuracy = 0.9435; AUC = 0.9852), while the LSTM achieved the highest recall of all the models (0.9724). ROC-AUC values were closely clustered across the four models (0.9852-0.9877), indicating similar rank-order discriminative ability regardless of architectural complexity.
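Recall is not stated explicitly for the SVM and Random Forest, but it can be recovered from the reported precision and F1-score, since F1 = 2PR / (P + R) rearranges to R = F1 * P / (2P - F1). The sketch below applies this identity to the figures quoted above. The helper name recall_from_precision_f1 is hypothetical, and because the inputs are rounded to four decimals the implied values are approximate.

```python
def recall_from_precision_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for recall R."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall exists for this precision/F1 pair")
    return f1 * precision / denominator


# (precision, F1) pairs as reported in the Results paragraph.
reported = {
    "SVM": (0.8517, 0.8521),
    "Random Forest": (0.8491, 0.8472),
}

implied_recall = {
    name: recall_from_precision_f1(p, f1) for name, (p, f1) in reported.items()
}

for name, recall in implied_recall.items():
    print(f"{name}: implied recall = {recall:.4f}")
```

Both implied recalls come out at roughly 0.85, below the 0.9022 reported for XGBoost, which suggests that the highest recall figure of 0.9724 belongs to neither the SVM nor the Random Forest.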

Discussion: The results indicate that carefully tuned classical machine learning models remain highly competitive with ensemble and deep learning models on structured tabular health data. The study proposes a model selection framework for clinically grounded applications: SVM and Random Forest are recommended for high-precision diagnostic support, whereas XGBoost and LSTM are better suited to high-sensitivity screening settings. The multi-metric benchmarking framework proposed in this research offers a more informative basis for model evaluation than a single-metric approach.