Investigation and Development of an Interpretable Machine Learning Model in Therapeutic Applications by Providing Solutions to Change the Condition of Patients
Damandeh, Moloud | 2023
- Type of Document: M.Sc. Thesis
- Language: Farsi
- Document No: 56216 (01)
- University: Sharif University of Technology
- Department: Aerospace Engineering
- Advisor(s): Haji, Alireza
- Abstract:
- Despite the significant progress of machine learning models in the health domain, current advanced methods usually produce non-transparent, black-box models and are therefore not widely used in medical decision-making. Interpretable machine learning models have been developed to address this lack of transparency. In the health domain, counterfactual scenarios can provide personalized explanations for predictions and suggest to physicians the changes needed to move a patient from an undesirable outcome class to a desirable one. The aim of this study is to present an interpretable machine learning framework for the health domain that, in addition to having high predictive accuracy, offers explanations for improving patients' conditions. To this end, we first formulate the counterfactual explanation problem for predicting the risk of coronary heart disease. We then propose a three-stage interpretable framework to solve it. The dataset used is the Framingham Heart Study dataset, which includes potential risk variables for coronary heart disease. In the first phase (pre-processing) and the second phase (risk prediction of coronary artery disease using machine learning classification models), we improve the accuracy of the classification models using pre-processing and processing methods. We also select the best classification model for predicting coronary heart disease, based on evaluation criteria from the medical domain, for use in the third phase of the proposed framework. The third phase is dedicated to generating counterfactual scenarios. These scenarios identify the minimum changes in risk factors needed to reduce the risk of coronary heart disease.
The proposed framework is evaluated using performance criteria relevant to medical domain objectives and surveys of expert physicians, with five machine learning classification models and two counterfactual explanation generation algorithms. Comparing the results of the second phase with previous work shows that pre-processing and processing methods play an effective role in improving the performance of machine learning models; for example, the decision tree, support vector machine, logistic regression, random forest, and multilayer perceptron algorithms achieved accuracies of 85.39%, 87.61%, 87.27%, 91.66%, and 88.11%, respectively, which are 1.65%, 1.83%, 0.76%, 3.03%, and 3.10% higher than in previous studies. The quantitative evaluation of the scenarios generated in the third phase indicates that interpretable machine learning models have high potential for generating explanations for predictions of chronic diseases, specifically coronary artery disease; for instance, the DiCE algorithm achieved a sparsity of 5.5 and a local outlier factor of 0.0182. Lastly, the survey results show that, overall, expert physicians agree with the proposed framework, although in some cases the suggested scenarios for reducing the risk of coronary artery disease are considered insufficient.
- Keywords:
- Machine Learning ; Coronary Arteries ; Interpretability ; Coronary Arteries Disease (CAD) ; Risk Prediction ; Counterfactual Explanations
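
The counterfactual step described in the abstract (finding the minimum changes in risk factors that move a patient out of the high-risk class) can be illustrated with a minimal sketch. This is not the thesis's method and not the DiCE algorithm: it is a brute-force search over a toy logistic risk model. The weights, bias, feature names, candidate target values, and 0.5 threshold are all hypothetical and are not fitted to the Framingham data. Sparsity is assumed here to mean the number of features changed, which may differ from the definition used in the thesis.

```python
import math
from itertools import combinations, product

# Hypothetical coefficients for a toy logistic risk model
# (illustrative only, NOT fitted to the Framingham Heart Study data).
WEIGHTS = {"age": 0.06, "sys_bp": 0.02, "cigs_per_day": 0.03, "total_chol": 0.004}
BIAS = -7.5

# Features a patient could act on, each with hypothetical target values to try.
# "age" is deliberately absent: a counterfactual must not change it.
ACTIONABLE = {
    "sys_bp": [120, 130],
    "cigs_per_day": [0, 5],
    "total_chol": [180, 200],
}


def risk(patient):
    """Predicted probability of the undesirable class under the toy model."""
    z = BIAS + sum(WEIGHTS[name] * patient[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def counterfactual(patient, threshold=0.5):
    """Search for a modified patient whose risk falls below the threshold.

    Subsets of actionable features are tried in order of increasing size,
    so the first hit changes as few features as possible.
    Returns None if no candidate crosses the threshold.
    """
    names = list(ACTIONABLE)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            for values in product(*(ACTIONABLE[name] for name in subset)):
                candidate = dict(patient)
                candidate.update(zip(subset, values))
                if risk(candidate) < threshold:
                    return candidate
    return None


def sparsity(original, modified):
    """Number of features whose value differs between the two records."""
    return sum(1 for name in original if original[name] != modified[name])


patient = {"age": 60, "sys_bp": 160, "cigs_per_day": 20, "total_chol": 260}
cf = counterfactual(patient)
print("original risk:", round(risk(patient), 3))
if cf is not None:
    changed = {k: (patient[k], cf[k]) for k in patient if patient[k] != cf[k]}
    print("counterfactual risk:", round(risk(cf), 3))
    print("changes:", changed, "sparsity:", sparsity(patient, cf))
```

Exhaustive search is only practical for a handful of features and candidate values. Dedicated counterfactual algorithms such as those evaluated in the thesis also account for the plausibility of the generated scenario, for example through the local outlier factor, which this sketch does not attempt.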