Designing an Interpretable Algorithm for Predicting Heart Diseases using Data Mining Methods

Kermani, Javad; Khedmati, Majid

Kermani, Javad | 2024

Type of Document: M.Sc. Thesis
Language: Farsi
Document No: 57623 (01)
University: Sharif University of Technology
Department: Industrial Engineering
Advisor(s): Khedmati, Majid
Abstract:
In recent years, interest has grown in using intelligent methods in healthcare systems, particularly for predicting the probability that various diseases will occur. According to official statistics from the World Health Organization, heart disease is one of the leading causes of mortality worldwide. Predicting heart disease with machine learning methods, as a decision-support tool for physicians, can therefore significantly aid healthcare systems in the prevention, diagnosis, and treatment of these conditions. This study uses machine learning techniques to classify and predict the occurrence of heart disease in individuals visiting medical centers. The objective is to apply these techniques to predict heart disease and to extract counterfactuals that interpret the results, helping to create conditions that prevent the disease and ensure timely treatment. The classification methods used are k-nearest neighbors, decision tree, naive Bayes, random forest, logistic regression, and support vector machine. Given the high cost of false negatives in diagnosing serious diseases, several measures were taken to minimize such errors, and the algorithms were assessed using metrics such as recall (sensitivity). Notably, the random forest algorithm achieved the highest recall, about 95%, a significant improvement over similar studies. Because early diagnosis matters and directly affects human lives, the interpretability of machine learning models' outputs is essential for healthcare professionals and patients. For this reason, various explainable AI methods have been developed alongside predictive methods. One way to enhance the interpretability of a model's outputs is through counterfactuals.
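As an illustration of the recall metric mentioned above, the following minimal sketch computes recall as TP / (TP + FN) and shows how lowering a classifier's decision threshold can reduce false negatives. The abstract does not say which measures the thesis actually used, so threshold lowering here is an assumption, and the labels, probabilities, and function names are hypothetical.

```python
def recall(y_true, y_pred):
    """Recall (sensitivity) = TP / (TP + FN) for the positive class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def predict_with_threshold(probs, threshold):
    """Label a case positive when its predicted probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


# Hypothetical data: 1 = heart disease, 0 = no heart disease.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical predicted probabilities of disease from some classifier.
probs = [0.9, 0.7, 0.45, 0.2, 0.6, 0.3, 0.1, 0.05]

# Lowering the threshold turns a borderline missed case into a detected one.
recall_default = recall(y_true, predict_with_threshold(probs, 0.5))
recall_lowered = recall(y_true, predict_with_threshold(probs, 0.4))
print(recall_default, recall_lowered)
```

In this toy data the lower threshold recovers one additional true case without adding a false positive; on real data the trade-off against false positives would need to be checked.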
Extracting counterfactuals from machine learning models moves the analysis into the prescriptive layer of data mining, where it can assist specialist physicians in treating and controlling disease. In this study, in addition to extracting counterfactuals for the random forest model, the feasibility of extracting them with the k-nearest neighbors algorithm, based on the distance between samples, was also examined. This novel approach was compared with conventional methods from the literature, and the trustworthiness of the counterfactuals it produces, relative to those of existing methods, was assessed using the naive Bayes method.
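The distance-based idea described above can be sketched as follows, assuming the simplest possible reading: for a patient predicted to have heart disease, take the closest known sample from the healthy class as the counterfactual and report which features differ. This is not the thesis's algorithm; the feature names, the scaled values, and the helper functions are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_counterfactual(query, samples, labels, target_label):
    """Return the sample with the target label that lies closest to the query."""
    candidates = [s for s, l in zip(samples, labels) if l == target_label]
    if not candidates:
        return None
    return min(candidates, key=lambda s: euclidean(query, s))


def feature_changes(query, counterfactual, names):
    """Map each differing feature to the change needed to reach the counterfactual."""
    return {n: c - q for n, q, c in zip(names, query, counterfactual) if c != q}


# Hypothetical features, assumed already scaled to the range [0, 1].
names = ["cholesterol", "resting_bp", "max_heart_rate"]
samples = [
    [0.8, 0.7, 0.3],
    [0.7, 0.8, 0.4],
    [0.3, 0.4, 0.7],
    [0.6, 0.6, 0.4],
    [0.2, 0.3, 0.8],
]
labels = [1, 1, 0, 0, 0]  # 1 = heart disease, 0 = no heart disease

query = [0.75, 0.7, 0.35]  # hypothetical patient predicted as class 1
cf = nearest_counterfactual(query, samples, labels, target_label=0)
changes = feature_changes(query, cf, names)
print(cf, changes)
```

Because the counterfactual is an observed sample rather than a synthetic point, it is a realistic combination of feature values, which is one reason distance-based approaches are attractive; whether the suggested changes are clinically actionable is a separate question.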
Keywords:
Machine Learning; Heart Diseases; Classification Algorithms; Interpretability; Heart Disease Prediction; Counterfactual Analysis
