DIAGNOSTIC MODEL OF COVID-19 DISEASE FROM CLINICAL DATA BASED ON XGBOOST METHOD

  • Dương Thị Kim Chi

Abstract

Clinical data here are the results of laboratory tests such as blood counts and urinalysis, procedures that are performed very commonly during examination, treatment and disease monitoring. For treating physicians, the results of these paraclinical tests are an effective diagnostic aid, especially when the patient's symptoms are functional, unclear or non-specific. COVID-19 is likewise often asymptomatic, or presents with unclear symptoms that are easily confused with influenza or hemorrhagic fever. Using modern machine learning methods to support the screening of infectious diseases from clinical data samples helps identify disease quickly and accurately and can be applied to a large number of samples at once, which makes screening fast, accurate and cost-effective. This study proposes a model that combines automatic clinical data processing with a Gradient Boosting classifier to predict COVID-19. The proposed model can learn directly from raw clinical test results without deleting blank (missing) entries. It has two phases: the first evaluates and processes the data; the second builds a disease classification model based on the XGBoost (Extreme Gradient Boosting) method. The study uses a dataset from the Israelita Albert Einstein hospital in Brazil, compiled by Teich from patients hospitalized from April to May 2020 and published publicly in the journal einstein_journal. The results show that combining the automated data processing technique with the XGBoost model yields a COVID-19 classifier from clinical data that performs well and is superior to studies on the same topic using the same dataset, with overall accuracy above 0.998.
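Learning from test results without deleting blank entries can be illustrated with a toy example. Tree learners such as XGBoost can route missing values to a default branch chosen during training. The sketch below is not the study's implementation: it is a single pure-Python split that uses Gini impurity rather than XGBoost's gradient-based gain, and the function names and sample values are hypothetical.

```python
from typing import List, Optional, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split_with_missing(
    values: List[Optional[float]], labels: List[int]
) -> Tuple[float, bool, float]:
    """Find a threshold and a default side for missing values.

    Returns (threshold, missing_goes_left, weighted_impurity).
    Missing entries (None) are never dropped: each candidate split is
    scored twice, once with them sent left and once with them sent right.
    """
    present = sorted({v for v in values if v is not None})
    # Candidate thresholds: midpoints between consecutive observed values.
    thresholds = [(a + b) / 2 for a, b in zip(present, present[1:])]
    if not thresholds:
        raise ValueError("need at least two distinct observed values")

    n = len(labels)
    best: Optional[Tuple[float, bool, float]] = None
    for t in thresholds:
        for missing_left in (True, False):
            left: List[int] = []
            right: List[int] = []
            for v, y in zip(values, labels):
                go_left = missing_left if v is None else v < t
                (left if go_left else right).append(y)
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (t, missing_left, score)
    assert best is not None
    return best


# Hypothetical lab values for one test; None marks a blank result.
lab_values = [1.0, 2.0, None, 8.0, 9.0, None]
diagnosis = [0, 0, 1, 1, 1, 1]  # 1 = positive, 0 = negative (made-up labels)

threshold, missing_goes_left, impurity = best_split_with_missing(lab_values, diagnosis)
```

In this made-up sample the blank results all belong to positive cases, so the split sends them to the same side as the high values instead of discarding those rows.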
To confirm the accuracy and performance of the proposed model, we compared it with other authors' studies of the same prediction task and found that it gave better accuracy and sensitivity. Recall, specificity, F1 score and ROC results were all at 0.99. In the future, the model from this study will help make patient diagnosis simple and accurate. It will also help the medical system diagnose diseases automatically, give patients more opportunities for timely treatment, and help prevent disease outbreaks.
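For reference, the evaluation measures named above can be computed from a binary confusion matrix. The following is a minimal sketch with a hypothetical helper name and made-up labels; it does not reproduce the study's data or its reported scores.

```python
from typing import Dict, List


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (recall), specificity, precision and F1 for 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num: int, den: int) -> float:
        # Report 0.0 when a denominator is empty instead of dividing by zero.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    f1_den = precision + sensitivity
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,  # also called recall
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": 2 * precision * sensitivity / f1_den if f1_den else 0.0,
    }


# Made-up labels for illustration only (1 = COVID-19 positive, 0 = negative).
example_true = [1, 1, 1, 0, 0, 0, 0, 1]
example_pred = [1, 1, 0, 0, 0, 1, 0, 1]
example_metrics = classification_metrics(example_true, example_pred)
```

Sensitivity and recall are the same quantity, TP / (TP + FN). Specificity is TN / (TN + FP), and the F1 score is the harmonic mean of precision and recall.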

Published
2023-06-15
Section
Articles