Stroke Prediction with Random Forest Machine Learning Model


Published: 2024-06-17

Page: 122-131


Okpe Anthony Okwori *

Department of Computer Science, Federal University Wukari, Nigeria.

Moses Adah Agana

Department of Computer Science, University of Calabar, Nigeria.

Ofem Ajah Ofem

Department of Computer Science, University of Calabar, Nigeria.

Obono I. Ofem

Department of Computer Science, University of Calabar, Nigeria.

*Author to whom correspondence should be addressed.


Abstract

Stroke is a medical condition in which the blockage or rupture of a blood vessel prevents the free flow of blood to the brain, causing brain cells to die. When brain cells die, the parts of the body they control malfunction, and this can result in permanent disability. Both ischemic and hemorrhagic stroke, though they occur suddenly, are associated with risk factors such as age, hypertension, and body mass index, among others. Both types are dangerous to human health and a threat to life, and ischemic stroke occurs more frequently than hemorrhagic stroke. To reduce stroke occurrence, medical doctors use stroke biomarkers to predict stroke and confirm suspected cases with several diagnostic tests. This approach to prediction and diagnosis is highly time-consuming, especially at an early stage when decision-making matters most, and no individual candidate biomarker or multimarker panel has proven to perform adequately in an acute clinical setting. Hence there is a need for more efficient stroke prediction techniques, such as machine learning models. Machine learning is a modern area of artificial intelligence that deals with the ability of a machine to imitate intelligent human behavior. It is widely applied in healthcare because ever-growing patient datasets can be used to train machine learning algorithms to detect patterns, enabling medical professionals to recognize new diseases, predict treatment outcomes, and make decisions about the risk of developing a disease or a medical condition such as stroke. This paper aims to predict the stroke vulnerability status of patients using a random forest (RF) machine learning model. The model was built in the Python programming language using the healthcare_dataset_stroke data obtained from the Kaggle machine learning dataset repository. The dataset was cleaned, and the cleaned dataset was used to train the random forest model. The results obtained from the model were evaluated using a confusion matrix, and the model achieved a prediction accuracy of 93%, indicating that random forest is a very good choice of algorithm for predicting stroke vulnerability.
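As an illustration of the evaluation step described above, the following minimal sketch shows how accuracy can be derived from a binary confusion matrix. It is not the authors' code: the function names (confusion_counts, accuracy, recall) and the label lists are hypothetical and chosen only for this example, and recall is included as an additional figure that the same matrix yields, not as something the paper reports.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    pairs = Counter((t == positive, p == positive) for t, p in zip(y_true, y_pred))
    return {
        "tp": pairs[(True, True)],    # stroke predicted, stroke observed
        "fn": pairs[(True, False)],   # stroke observed but missed
        "fp": pairs[(False, True)],   # stroke predicted but not observed
        "tn": pairs[(False, False)],  # no stroke predicted, none observed
    }


def accuracy(counts):
    """Fraction of all predictions that were correct."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


def recall(counts):
    """Fraction of actual positive cases that the model detected."""
    positives = counts["tp"] + counts["fn"]
    return counts["tp"] / positives if positives else 0.0


# Hypothetical labels (1 = stroke, 0 = no stroke); NOT data from the paper.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 1, 0, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)
print(f"accuracy = {accuracy(counts):.2f}")
print(f"recall   = {recall(counts):.2f}")
```

In practice the predicted labels would come from the trained random forest model applied to held-out records, and the same counts would be read off its confusion matrix.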

Keywords: Stroke, biomarkers, prediction, machine learning, random forest


How to Cite

Okwori, O. A., Agana, M. A., Ofem, O. A., & Ofem, O. I. (2024). Stroke Prediction with Random Forest Machine Learning Model. Asian Research Journal of Current Science, 6(1), 122–131. Retrieved from https://jofscience.com/index.php/ARJOCS/article/view/111

