Modeling and Predicting the Risk of Coronary Artery Disease Using Data Mining Algorithms

Saadi, Paria; Zeinalnezhad, Masoomeh; Movahedi Sobhani, Farzad

Volume 8, Issue 2 (9-2021) jhbmi 2021, 8(2): 193-207


Saadi P, Zeinalnezhad M, Movahedi Sobhani F. Modeling and Predicting the Risk of Coronary Artery Disease Using Data Mining Algorithms. jhbmi 2021; 8(2): 193-207
URL: http://jhbmi.ir/article-1-592-en.html


Paria Saadi, Masoomeh Zeinalnezhad*, Farzad Movahedi Sobhani

Ph.D. in Industrial Engineering, Assistant Professor, Industrial Engineering Dept., Faculty of Engineering, West Tehran Branch, Islamic Azad University, Tehran, Iran

Abstract:

Introduction: Coronary artery disease (CAD) is one of the most common causes of death in adults, yet accurate, early diagnosis greatly improves patients' chances of treatment and survival. The objective of this study was therefore to identify the factors that contribute to the disease and to develop a data-driven model that assists physicians in predicting and diagnosing it.
Method: This applied study considered 2038 medical records collected from Shahid Rajaei Heart Hospital in Tehran over 5 years. After data preprocessing, random balanced sampling reduced the dataset to 1000 records (500 CAD and 500 normal). Important features were determined through a literature review, consultation with specialist physicians, and feature weighting with the Chi-square method. Support Vector Machine, Neural Network, and Random Forest algorithms were applied in RapidMiner and Python.
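
The sampling and feature-weighting steps described in the Method can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the abstract does not say which tool or library performed each step, so the sketch uses only the Python standard library, and the function names `balanced_sample` and `chi_square`, the field names, and the toy records are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def balanced_sample(records, label_of, n_per_class, seed=0):
    """Randomly draw n_per_class records from each class, without replacement."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for rec in records:
        by_class[label_of(rec)].append(rec)
    sample = []
    for label in sorted(by_class):
        # random.sample raises ValueError if a class has too few records
        sample.extend(rng.sample(by_class[label], n_per_class))
    rng.shuffle(sample)
    return sample


def chi_square(feature_values, labels):
    """Pearson chi-square statistic of a categorical feature against the class label."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            expected = f_count * c_count / n
            stat += (observed[(f, c)] - expected) ** 2 / expected
    return stat


# Toy, made-up records (NOT the hospital data): 60 "CAD" and 40 "Normal".
toy = (
    [{"label": "CAD", "chest_pain": i % 4 != 0, "flag": i % 2 == 0} for i in range(60)]
    + [{"label": "Normal", "chest_pain": i % 4 == 0, "flag": i % 2 == 0} for i in range(40)]
)

# Balance the classes, then weight each feature by its chi-square statistic.
subset = balanced_sample(toy, lambda r: r["label"], n_per_class=30)
labels = [r["label"] for r in subset]
scores = {
    name: chi_square([r[name] for r in subset], labels)
    for name in ("chest_pain", "flag")
}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

A larger statistic indicates a stronger association between the feature and the class label, which is the sense in which Chi-square weighting ranks candidate features. Continuous measurements such as LDL or TG would first need to be binned before a statistic of this form applies.
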
Results: Among the 35 identified variables, the most important features were VHD, chest pain, LDL, RWMA, TG, Na, K, BP, and weight. For the Random Forest algorithm, the F-measure, accuracy, precision, and recall were 82.11%, 81.40%, 79.07%, and 85.40%, respectively, and the error rate was 18.6%.
Conclusion: Random Forest predicted the risk of CAD with reasonable accuracy. By comparison, the error rate of the Neural Network model was higher (23.6%), owing to its large number of input nodes.
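
The reported figures can be cross-checked against the standard metric definitions, assuming the study used them: the error rate is the complement of accuracy, and the F-measure is the harmonic mean of precision and recall. The short sketch below (helper names are hypothetical) shows that an error rate of 18.6% corresponds to an accuracy of 81.40%, that a precision of 79.07% with a recall of 85.40% gives the reported F-measure of 82.11%, and that the Neural Network's 23.6% error rate corresponds to an accuracy of 76.4%.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (inputs and output in percent)."""
    return 2 * precision * recall / (precision + recall)


def accuracy_from_error(error_rate_pct):
    """Accuracy as the complement of the error rate (both in percent)."""
    return 100.0 - error_rate_pct


print(round(f_measure(79.07, 85.40), 2))    # 82.11
print(round(accuracy_from_error(18.6), 2))  # 81.4 (Random Forest)
print(round(accuracy_from_error(23.6), 2))  # 76.4 (Neural Network)
```
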

Keywords: Coronary Artery Disease, Prediction, Support Vector Machine, Neural Network, Random Forest


Type of Study: Original Article | Subject: Data Mining
Received: 2021/05/09 | Accepted: 2021/08/23
