Application of data mining in a cohort of Italian subjects undergoing myocardial perfusion imaging at an academic medical center

Cesarelli M.;
2020-01-01

Abstract

Introduction: Coronary artery disease (CAD) is still one of the primary causes of death in developed countries. Stress single-photon emission computed tomography is used to evaluate myocardial perfusion and ventricular function in patients with suspected or known CAD. This study sought to test data mining and machine learning tools and to compare several supervised learning algorithms in a large cohort of Italian subjects with suspected or known CAD who underwent stress myocardial perfusion imaging. Methods: The dataset consisted of 10,265 patients with suspected or known CAD. The analysis was conducted using the KNIME Analytics Platform to implement Random Forests, C4.5, gradient boosted trees, Naïve Bayes, and K nearest neighbor (KNN) after a feature filtering procedure. K-fold cross-validation was employed. Results: Accuracy, error, precision, recall, and specificity were computed for each of the above-mentioned algorithms. Random Forests and gradient boosted trees obtained the highest accuracy (>95%), while the accuracy of the other algorithms ranged between 83% and 88%. The highest sensitivity was obtained by C4.5 (99.3%) and the highest specificity by gradient boosted trees (96.9%). Naïve Bayes had the lowest precision (70.9%) and specificity (72.0%), and KNN had the lowest recall, i.e. sensitivity (79.2%). Conclusions: The high scores obtained by the algorithms suggest that health facilities should consider including advanced data analysis services to help clinicians in decision-making. Similar applications of this kind of study in other contexts could support this idea.
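
The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' KNIME workflow: the helper names (`ConfusionCounts`, `binary_metrics`, `k_fold_indices`) and the example counts are hypothetical, chosen only to show how accuracy, error, precision, recall (sensitivity), and specificity follow from a binary confusion matrix, and how a K-fold split partitions a dataset into training and test folds.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts of a binary confusion matrix (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def binary_metrics(c: ConfusionCounts) -> dict:
    """Compute the five measures listed in the abstract.

    Assumes every denominator is non-zero; a real pipeline would
    need to guard against empty classes.
    """
    total = c.tp + c.fp + c.tn + c.fn
    accuracy = (c.tp + c.tn) / total
    return {
        "accuracy": accuracy,
        "error": 1.0 - accuracy,
        "precision": c.tp / (c.tp + c.fp),
        "recall": c.tp / (c.tp + c.fn),       # also called sensitivity
        "specificity": c.tn / (c.tn + c.fp),
    }


def k_fold_indices(n_samples: int, k: int):
    """Yield (train, test) index lists for K contiguous, unshuffled folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


if __name__ == "__main__":
    # Illustrative counts only; they are not taken from the study.
    example = ConfusionCounts(tp=80, fp=20, tn=90, fn=10)
    for name, value in binary_metrics(example).items():
        print(f"{name}: {value:.3f}")
    for train, test in k_fold_indices(10, 3):
        print(len(train), len(test))
```

In a cross-validated comparison, each classifier would be trained on the `train` indices of every fold and its confusion counts on the `test` indices would be accumulated before the measures are computed.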
2020
Analytics platform
Cardiology
Data mining
Decision-making
Myocardial perfusion imaging

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12070/67656
Citations
  • Scopus 46