Identification of high death risk coronavirus disease-19 patients using blood tests
Elaheh Zadeh Hosseingholi1, Saeede Maddahi2, Sajjad Jabbari2, Ghader Molavi3
1 Department of Biology, Faculty of Basic Sciences, Azarbaijan Shahid Madani University, Tabriz, Iran
2 Department of Internal Medicine, Faculty of Medicine, Tabriz University of Medical Sciences, Tabriz; Emam Hossein Hospital, Tabriz University of Medical Sciences, Hashtrood, Iran
3 Emam Hossein Hospital, Tabriz University of Medical Sciences, Hashtrood; Immunology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
Date of Submission: 19-Jun-2021
Date of Acceptance: 18-Dec-2021
Date of Web Publication: 29-Jul-2022
Dr. Ghader Molavi
Immunology Research Center, Tabriz University of Medical Sciences, Tabriz 5166614733
Source of Support: None, Conflict of Interest: None
Background: The coronavirus disease (COVID-19) pandemic has had a great impact on health-care services. Predicting the severity of the disease helps reduce mortality by prioritizing the allocation of hospital resources. Early prediction of mortality from key biomarkers is the main aim of this study. Materials and Methods: In this retrospective study, a total of 205 confirmed COVID-19 patients hospitalized from June 2020 to March 2021 were included. Demographic data, levels of important blood biomarkers, and patient outcomes were investigated using machine learning and statistical tools. Results: Random forests, the best model for mortality prediction (Matthews correlation coefficient = 0.514), were employed to find the dataset features most relevant to mortality. Aspartate aminotransferase (AST) and blood urea nitrogen (BUN) were identified as important death-related features. The decision tree method identified cutoff values of BUN >47 mg/dL and AST >44 U/L as decision boundaries of mortality (sensitivity = 0.4). Data mining results were compared with those obtained through statistical tests. Statistical analyses also identified these two factors as the most significant ones, with P values of 4.4 × 10^−7 and 1.6 × 10^−6, respectively. The demographic trait of age, some hematological changes (thrombocytopenia, increased white blood cell count, neutrophils [%], RDW-CV, and RDW-SD), and some blood serum changes (increased creatinine, potassium, and alanine aminotransferase) were also identified as mortality-related features (P < 0.05). Conclusions: These results could help physicians detect COVID-19 patients with a higher risk of mortality in a timely manner and manage hospital resources better.
Keywords: Aspartate aminotransferases, blood urea nitrogen, coronavirus disease-19, machine learning, prognosis
How to cite this article:
Zadeh Hosseingholi E, Maddahi S, Jabbari S, Molavi G. Identification of high death risk coronavirus disease-19 patients using blood tests. Adv Biomed Res 2022;11:58
How to cite this URL:
Zadeh Hosseingholi E, Maddahi S, Jabbari S, Molavi G. Identification of high death risk coronavirus disease-19 patients using blood tests. Adv Biomed Res [serial online] 2022 [cited 2022 Dec 5];11:58. Available from: https://www.advbiores.net/text.asp?2022/11/1/58/352934
Introduction
A new member of the coronavirus family, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has spread throughout the world since December 2019. The World Health Organization declared this novel coronavirus disease (COVID-19) a pandemic. The limited number of staff and equipment in hospitals is one of the important issues in pandemic situations. The symptoms of infected individuals are nonspecific. Patients may develop a variety of symptoms such as fever, cough, loss of appetite, fatigue, and shortness of breath. Most COVID-19 patients experience only mild illness. These low-risk patients can be treated by simple methods and home-based self-quarantine. However, in patients with severe COVID-19, the disease progresses to pneumonia, acute respiratory syndromes, and even death. The mortality rate in critical cases is high. Predicting the mortality risk of patients is therefore necessary to allocate hospital resources efficiently. Early identification of high-risk patients will help make admission to the intensive care unit (ICU) more efficient. Recent studies have indicated several biomarkers that can help in this task by providing crucial information regarding the patients' health status.
Early prediction scoring systems using machine learning algorithms have been proposed to identify the most appropriate discriminatory biomarkers for determining survival or death outcomes in COVID-19 patients. These models categorize patients into low-, moderate-, and high-risk groups. Self-reported symptoms, computed tomography (CT) scans or chest X-rays, and hematological parameters are the input data sources of most models. Routine blood tests are low-cost and quick. False results are rare, and the tests can be run in low-resource settings. Some studies have introduced feasible blood test-based biomarkers for the early detection of COVID-19 cases and for distinguishing patients at high risk of mortality. However, few machine learning predictive models have been applied to these biomarkers. In this study, we retrospectively analyzed the blood test results of 205 patients with machine learning and statistical tools to identify significant markers of mortality risk. The findings provide easy-to-use predictive biomarkers for identifying high-risk COVID-19 patients. This simple method can be used as a complement for detecting individuals who require immediate medical attention and for prioritizing their therapy and hospitalization.
Materials and Methods
The analyzed dataset includes demographic information (age and gender), medical records (blood tests), and definite survival outcomes (survived or deceased) of 205 confirmed COVID-19 patients, collected at Emam Hossein Hospital (Hashtrood, Iran) between June 2020 and March 2021. COVID-19 cases were confirmed through clinical symptoms, polymerase chain reaction test results, and chest X-ray results. The data contain no missing or uncertain values. The patients consisted of 112 women (54.6%) and 93 men (45.4%), and their ages ranged between 16 and 90 years. The survival outcomes are imbalanced, with 28 (13.7%) deceased and 177 (86.3%) survived instances. Each patient profile contains 20 features related to blood tests (described in more detail in [Table 1]).
Table 1: Blood test features with their measurement units and their minimum, maximum, and median values in the survived and deceased groups of patients
Machine learning prediction classifiers
The data, including all features, were partitioned into a training set (70%) and a test set (30%). The final prediction scores are the medians over ten separate random data splits. Three well-established classification methods, Linear Discriminant Analysis (LDA), Support Vector Machines (SVM), and Random Forests (RF), were used in this study. These methods were implemented with the “MASS,” “e1071,” and “randomForest” packages in the R statistical environment, respectively.
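The repeated-split evaluation described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' R code; the function name `median_score_over_splits`, the `fit_and_score` callback, and the fixed seed are assumptions introduced only for this example.

```python
import random
import statistics


def median_score_over_splits(rows, fit_and_score, n_splits=10,
                             train_frac=0.7, seed=0):
    """Median of a score over repeated random train/test splits.

    rows          -- list of patient records (any structure)
    fit_and_score -- hypothetical callback: (train, test) -> numeric score
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    scores = []
    for _ in range(n_splits):
        shuffled = rows[:]          # copy so the caller's list is untouched
        rng.shuffle(shuffled)
        cut = int(train_frac * len(shuffled))   # 70% of rows go to training
        train, test = shuffled[:cut], shuffled[cut:]
        scores.append(fit_and_score(train, test))
    return statistics.median(scores)


# Toy usage: a placeholder scorer that only reports the test-set size.
rows = list(range(205))
print(median_score_over_splits(rows, lambda train, test: len(test)))
```

In practice, `fit_and_score` would train a classifier on `train` and return, for example, its MCC on `test`; taking the median rather than the mean keeps a single unlucky split from dominating the reported score.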
The main objective of Discriminant Analysis is to derive a predictive equation that classifies instances and clarifies the relationship among the features. The SVM is a machine learning algorithm used for regression and classification of both linear and nonlinear problems in health care research. RF is among the most commonly used ensemble machine learning algorithms. RF builds many decision trees, each using a random subset of features to select the optimal split.
The performance of the applied methods was assessed with the following confusion matrix scores: Matthews correlation coefficient (MCC), accuracy, F1 score, sensitivity, and specificity. The best algorithm was selected by MCC score because it considers all four categories of the confusion matrix: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The accuracy, F1 score, sensitivity, specificity, and MCC formulas are the following:
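$$\text{Accuracy} = \frac{TP + TN}{TP + FN + TN + FP}$$
$$\text{F1 score} = \frac{2 \times TP}{2 \times TP + FN + FP}$$
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
$$\text{Specificity} = \frac{TN}{TN + FP}$$
$$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP) \times (TP + FN) \times (TN + FP) \times (TN + FN)}}$$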
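The five scores defined above can be computed directly from the four confusion matrix counts. The sketch below is an illustration in Python, not the authors' R code; the function name `confusion_metrics` and the example counts are hypothetical, and returning 0 when a denominator is zero is a common convention assumed here rather than something stated in the paper.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Accuracy, F1, sensitivity, specificity, and MCC from raw counts."""

    def safe_div(num, den):
        # Assumed convention: report 0.0 when the denominator is zero.
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, tp + fn + tn + fp),
        "f1": safe_div(2 * tp, 2 * tp + fn + fp),
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts for a 62-patient test set (not taken from the paper).
for name, value in confusion_metrics(tp=4, tn=50, fp=2, fn=6).items():
    print(f"{name}: {value:.3f}")
```

Note how, with these made-up counts, accuracy stays high while sensitivity is low; MCC sits in between because it uses all four cells of the matrix.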
Aggregate feature rankings
Feature selection reduces the complexity of the prediction by selecting the most relevant features of the dataset. To identify and rank the most important features of the COVID-19 patients' dataset, the following procedure was used: first, each feature was ranked by RF feature ranking (see "Machine learning analysis" below); second, the features were ranked using traditional statistical analysis (P values of the statistical tests described below); finally, the geometric mean (GM) of the rank positions was calculated for each feature, and the features were ranked according to GM. Features with a smaller GM were considered more important.
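The aggregation step can be illustrated with a short Python sketch. The function name `aggregate_ranks`, the feature names chosen, and the rank positions are hypothetical placeholders, not the values from [Table 3]; each feature is assumed to have one rank per ranking method, with 1 meaning most important.

```python
import math


def aggregate_ranks(rankings):
    """Order features by the geometric mean (GM) of their rank positions.

    rankings -- dict mapping feature name to a list of ranks (1 = best)
    Returns a list of (feature, GM) pairs, smallest GM first.
    """
    gm = {
        name: math.exp(sum(math.log(r) for r in ranks) / len(ranks))
        for name, ranks in rankings.items()
    }
    return sorted(gm.items(), key=lambda item: item[1])


# Hypothetical ranks from three methods (two RF rankings and P values).
example = {"AST": [1, 2, 1], "BUN": [2, 1, 2], "Age": [3, 3, 4]}
for feature, score in aggregate_ranks(example):
    print(f"{feature}: GM = {score:.2f}")
```

Compared with an arithmetic mean, the geometric mean rewards features that rank near the top in every method and is less influenced by a single poor rank.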
Machine learning analysis
The two RF feature ranking methods in R, one statistical (mean square error accuracy decrease) and one information-based (Gini impurity decrease), are effective machine learning techniques in the context of health informatics. In both methods, the more important features are those whose removal causes a larger drop in accuracy. All feature importance ranks were assessed through the related commands of the “randomForest” package in R.
Traditional statistical tests were also performed to express the relationship between each feature and mortality as a P value. A statistically significant difference was defined as P < 0.05. First, a quantile-quantile plot was drawn for each feature to check its distribution. The Mann–Whitney U-test (or Wilcoxon rank-sum test) was applied to the real-valued features and the Fisher exact test to the binary feature (gender) to compare the distribution of each feature between the two groups (survived and deceased patients). A low P value in these tests indicates a strong relationship between the analyzed feature and mortality, while a high P value indicates the opposite. A violin plot of the significant nondemographic data was drawn with the R “ggplot2” package. Finally, the obtained P values were used to rank the features from the most death-related to the least death-related.
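As an illustration of the Fisher exact test used for the binary feature, the following Python sketch computes a two-sided P value for a 2 × 2 table from the hypergeometric distribution. It is not the authors' R code. The function name is hypothetical, and the gender-by-outcome counts are an assumption: they are reconstructed from the reported group sizes (112 women, 93 men, 28 deaths) and the reported death percentages, not taken from a published table.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


# Assumed counts: women 16 deceased / 96 survived, men 12 deceased / 81 survived.
p_gender = fisher_exact_two_sided(16, 96, 12, 81)
print(f"P = {p_gender:.2f}")
```

A P value well above 0.05 for these counts would be consistent with treating gender and mortality as independent in this dataset.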
Prediction using two most important features
The two most relevant features in the dataset were selected to predict mortality from these two features exclusively. Accordingly, a classification and regression tree (CART) was built with the R “rpart” and “rpart.plot” packages. The results of this methodology are easy for a medical doctor to understand and interpret and would be useful in critical decision-making situations. To verify the predictive power of this decision tree method, one rule prediction through the R “OneR” package was also applied to the two selected features.
Results
Performance evaluation of prediction algorithms using all features
The study developed machine learning methods for predicting the mortality of COVID-19 patients from the demographic data and blood tests believed to be involved in the outcomes. The results of the three applied methods (RF, SVM, and LDA) are presented in [Table 2]. The performance of the models was evaluated using a confusion matrix. The prediction results showed that RF outperformed the other methods on the performance metrics, while LDA was better than SVM. RF was considered the best model with regard to correctly predicting the majority of deceased patients compared with the two other algorithms, obtaining the top MCC (+0.514), accuracy (0.887), and specificity. LDA attained the highest true positive rate (sensitivity = 0.4) and F1 score (0.5). SVM obtained a very high TN rate (specificity = 1).
Table 2: Performance results of machine learning algorithms using all features
Machine learning and biostatistics clinical feature ranking
The two ranking methods (machine learning and statistics) showed high similarity in feature selection. In the RF feature ranking, which measured the importance of each feature by the mean square error decrease and the Gini impurity decrease [Figure 1], AST and BUN were detected as the most predictive of all dataset features [Table 3]. They were selected as the top features because they occupy the first and second positions in both RF ranking methods, so their removal from the dataset would influence the prediction results more than the removal of any other feature.
In the statistical analysis, the normality assumptions were not fulfilled. Wilcoxon rank-sum tests (continuous variables) and the Fisher exact test (categorical variables) were therefore applied to the dataset to find features that differ statistically between the survived and deceased groups, at a significance level of 0.05.
Figure 1: Random Forests feature selection. The mean square error decrease (left) and Gini impurity decrease (right) for each feature removal
Among the demographic features, age was identified as relevant to mortality (P = 0.0034). The distribution of COVID-19 patients' age groups and survival outcomes is shown in [Figure 2]. The largest group of patients, 36% (74), was in the age range of 61–75 years. The highest share of deaths, 25%, was seen in patients aged 76–90 years. As for gender, 14.2% of female patients and 12.9% of male patients died. The Fisher exact test indicated that gender and patient mortality are independent (P = 0.84).
In this study, WBC, PLT, NEUT, RDW-SD, RDW-CV, BUN, Cr, PT, K, AST, and ALT were identified as blood biomarkers that differ significantly between the survived and deceased groups. Violin plots showing the relative distribution of the significant blood features are presented in [Figure 3].
Figure 3: The distribution of the significant blood biomarker data among survived and deceased groups
Ranking the features according to their P values detected AST and BUN as the most significant features [Table 3].
Two selected features-based prediction
The capability of machine learning to predict patients' mortality precisely using the top two ranked features alone was evaluated with the CART decision tree algorithm. The one rule method was also applied to verify the obtained results [Table 4]. The results showed high MCC prediction scores (0.53 and 0.44), confirming the importance of AST and BUN in the dataset. Decision tree plot analysis with “rpart.plot” also determined that the mortality risk of patients with BUN >47 mg/dL and AST >44 U/L is 50%.
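The reported decision boundary can be written as a simple screening rule. The Python sketch below is only an illustration of the two cutoffs stated above; the function name `is_high_risk` and the example values are hypothetical, and the rule is not a substitute for the fitted tree or for clinical judgment.

```python
def is_high_risk(bun_mg_dl, ast_u_l, bun_cutoff=47.0, ast_cutoff=44.0):
    """Flag a patient whose BUN and AST both exceed the reported cutoffs.

    In the study's decision tree, patients with BUN > 47 mg/dL and
    AST > 44 U/L had a reported mortality risk of 50%.
    """
    return bun_mg_dl > bun_cutoff and ast_u_l > ast_cutoff


# Hypothetical patients: (BUN in mg/dL, AST in U/L)
for bun, ast in [(60.0, 50.0), (60.0, 30.0), (20.0, 80.0)]:
    print(bun, ast, is_high_risk(bun, ast))
```

Both conditions must hold, so a patient with a high BUN but a normal AST (or the reverse) is not flagged by this rule.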
Discussion
Health systems have struggled to allocate hospital facilities to patients efficiently during the COVID-19 pandemic. Machine learning helps to analyze data in a timely manner, identify hidden patterns, and compute rankings of the factors in a dataset. Therefore, researchers take advantage of these methods to analyze the health records of COVID-19 patients and predict mortality risk among them.
The study used demographics, blood biomarker examination results, and survival information of 205 COVID-19 patients to find important features involved in their mortality. Three different algorithms (RF, SVM, and LDA) were used. RF was the best performer with MCC = 0.514 [Table 2].
Because of the imbalance of the dataset (86.3% survived and 13.7% deceased), the algorithms encounter more negative instances during training, and consequently, they are better trained to recognize survived patients during testing. Therefore, the methods obtained better prediction scores on the negative elements (specificity) than on the positive elements (sensitivity). Applying appropriate evaluation metrics addresses this problem. Unlike accuracy, MCC (which ranges from −1 to +1) is an appropriate metric for imbalanced data because it produces a high score only if the classifier correctly predicts the majority of both positives and negatives. Hence, it was the main performance indicator for the methods used.
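A short worked example shows why accuracy is misleading here. Assume, purely for illustration, a trivial classifier that labels all 205 patients in the dataset as survived, with deceased taken as the positive class:

```latex
\begin{align*}
TP &= 0, \quad FN = 28, \quad TN = 177, \quad FP = 0 \\
\text{Accuracy} &= \frac{TP + TN}{TP + FN + TN + FP} = \frac{177}{205} \approx 0.863 \\
\text{Sensitivity} &= \frac{TP}{TP + FN} = \frac{0}{28} = 0 \\
TP \times TN - FP \times FN &= 0, \qquad TP + FP = 0
\end{align*}
```

Accuracy looks good even though no deceased patient is detected. The MCC numerator is zero and its denominator contains the factor (TP + FP) = 0, so MCC is undefined and is conventionally reported as 0, which correctly signals a classifier with no discriminative ability.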
In the second part of the project, the features most relevant to the mortality of patients were investigated. Since RF achieved the best performance in predicting mortality, its feature selection methods were used to rank the clinical features of the dataset. Serum AST and BUN were the two most important features of the data [Table 3]. Increased BUN and serum Cr are two laboratory findings that indicate kidney injury. Although COVID-19 mainly affects the lungs, it can also affect the kidneys and liver. Kidney involvement has frequently been seen in severe COVID-19 patients. The association of elevated BUN and serum Cr levels with the mortality risk of patients has also been reported.
Clinical investigations have demonstrated virus-mediated liver injury and abnormally increased levels of ALT and AST as a common laboratory finding in severe COVID-19 patients with unfavorable outcomes. Furthermore, the probability of death is high in patients with preexisting liver diseases. Elevated concentrations of ALT and AST have also been reported as predictive markers of recurrence risk in COVID-19 patients.
The statistical results demonstrated a strong association of age with the survival outcome (P = 0.0034).
The relationship between COVID-19 fatality and age varies notably across countries because of differences in population health and clinical care standards. The results of this study indicated the necessity of allocating more clinical facilities to patients over 70 years old. It was also found that death in younger age groups (<30 years of age) is uncommon, with a log-linear increase in age groups older than 30 years. The association of older age with a higher rate of death has previously been reported by many researchers. The results also indicated independence of gender and survival outcome. Several studies have emphasized that the severity and mortality of COVID-19 are lower in females than in males. However, some earlier studies have found that sex has no significant effect on patient outcomes.
The deceased patients showed significantly decreased PLT but increased WBC and NEUT compared with surviving patients. Neutrophils help clear severe viral lung infections by producing neutrophil extracellular traps. An increased neutrophil count has been shown to be indicative of severe COVID-19 disease, the need to transfer the patient to the ICU, and an increased mortality rate. Reduced platelet count is another common hematological change in COVID-19 patients. SARS-CoV-2 causes thrombocytopenia by different mechanisms, such as reducing platelet production in the bone marrow, increasing platelet destruction by the immune system, and platelet aggregation in the lung. A significantly lower number of platelets in severe COVID-19 cases has been reported as a marker of poor prognosis. Monitoring the PLT count during hospitalization has also been suggested.
Elevated RDW-related parameters (RDW-CV and RDW-SD) were associated with mortality in this study. A progressive increase of RDW is a prognostic parameter in many infectious diseases, including COVID-19, and its routine assessment in patients has been suggested. An association between increased RDW and a higher mortality rate has also been found. Significantly increased plasma K levels have been reported in deceased COVID-19 patients, which is in line with the findings of this study.
The aggregate feature ranking [Table 3] identified AST and BUN as the two features most relevant to patients' death. CART was then trained and tested using these two features and all 205 patients. The decision tree method was employed instead of RF in this phase because RF relies on drawing random subsets of features, which is not meaningful when only two features are available. The prediction was re-examined with the one rule model, and the performance results were compared. The results showed high MCC prediction scores [Table 4]. The performance results of these two prediction models were very similar because one rule is a simple decision tree method based on a single split. On an imbalanced dataset with few features, decision trees led to better predictions.
Strong correlations between features are very common in clinical datasets. However, tree-like graph models (decision trees, one rule, and RF) are not affected by the statistical correlation between features, and therefore, their application to patient clinical datasets can be efficient, as in this study.
Conclusions
The results showed the capabilities of machine learning methods in predicting COVID-19 patients' survival outcomes from simple routine blood test parameters. The study also demonstrates that COVID-19 patients with increased levels of AST and BUN have a substantially reduced chance of survival (a mortality risk of 50% for BUN >47 mg/dL and AST >44 U/L in this dataset). These results suggest that these two informative features can help medical doctors quickly quantify the death risk when analyzing the health records of COVID-19 patients and deciding on the allocation of hospital facilities.
Although the research has reached its aims, there were some unavoidable limitations. First, because of the time limit, this research was conducted on only a small population. Second, our study did not investigate the impact of SARS-CoV-2 mutations over the course of the pandemic. Finally, some variation may exist in the results of studies across racial groups. Therefore, care should be taken in extending the results to patients in other countries.
The authors would like to thank Tabriz University of Medical Sciences for supporting this work.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
References
Ali N. Relationship between COVID-19 infection and liver injury: A review of recent data. Front Med (Lausanne) 2020;7:458.
Azhar M, Thomas PA. Comparative Review of Feature Selection and Classification Modeling. 2019 International Conference on Advances in Computing, Communication and Control (ICAC3); 2019. p. 1-9.
Baj J, Karakuła-Juchnowicz H, Teresiński G, Buszewicz G, Ciesielka M, Sitarz E, et al. COVID-19: Specific and non-specific clinical manifestations and symptoms: The current state of knowledge. J Clin Med 2020;9:1753.
Bashash D, Olfatifar M, Hadaegh F, Asadzadeh Aghdaei H, Zali MR. COVID-19 prognosis: What we know of the significance and prognostic value of liver-related laboratory parameters in SARS-CoV-2 infection. Gastroenterol Hepatol Bed Bench 2020;13:313-20.
Biswas M, Rahaman S, Biswas TK, Haque Z, Ibrahim B. Association of sex, age, and comorbidities with mortality in COVID-19 patients: A systematic review and meta-analysis. Intervirology 2020;64:36-47.
Bonanad C, García-Blas S, Tarazona-Santabalbina F, Sanchis J, Bertomeu-González V, Fácila L, et al. The effect of age on mortality in patients with COVID-19: A meta-analysis with 611,583 subjects. J Am Med Dir Assoc 2020;21:915-8.
Borges L, Pithon-Curi TC, Curi R, Hatanaka E. COVID-19 and neutrophils: The relationship between hyperinflammation and neutrophil extracellular traps. Mediators Inflamm 2020;2020:8829674.
Breiman L. Random forests. Mach Learn 2001;45:5-32.
Cabitza F, Campagner A, Ferrari D, Di Resta C, Ceriotti D, Sabetta E, et al. Development, evaluation, and validation of machine learning models for COVID-19 detection based on routine blood tests. Clin Chem Lab Med 2020;59:421-31.
Chen D, Pan X, Xiao P, Farwell MA, Zhang B. Evaluation and identification of reliable reference genes for pharmacogenomics, toxicogenomics, and small RNA expression analysis. J Cell Physiol 2011;226:2469-77.
Chen LZ, Lin ZH, Chen J, Liu SS, Shi T, Xin YN. Can elevated concentrations of ALT and AST predict the risk of 'recurrence' of COVID-19? Epidemiol Infect 2020;148:e218.
Cheng A, Hu L, Wang Y, Huang L, Zhao L, Zhang C, et al. Diagnostic performance of initial blood urea nitrogen combined with D-dimer levels for predicting in-hospital mortality in COVID-19 patients. Int J Antimicrob Agents 2020;56:106110.
Chicco D, Jurman G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics 2020;21:6.
Chicco D, Rovelli C. Computational prediction of diagnosis and feature selection on mesothelioma patient health records. PLoS One 2019;14:e0208737.
Chicco D, Tötsch N, Jurman G. The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation. BioData Min 2021;14:13.
Chowdhury ME, Rahman T, Khandakar A, Al-Madeed S, Zughaier SM, Doi SA, et al. An early warning tool for predicting mortality risk of COVID-19 patients using machine learning. Cognit Comput 2021 Apr 21:1-6.
Cucinotta D, Vanelli M. WHO declares COVID-19 a pandemic. Acta Biomed 2020;91:157-60.
Debnath S, Barnaby DP, Coppa K, Makhnevich A, Kim EJ, Chatterjee S, et al. Machine learning to assist clinical decision-making during the COVID-19 pandemic. Bioelectron Med 2020;6:1-8.
Farghaly S, Makboul M. Correlation between age, sex, and severity of coronavirus disease-19 based on chest computed tomography severity scoring system. Egypt J Radiol Nucl Med 2021;52:1-8.
Foy BH, Carlson JC, Reinertsen E, Padros I Valls R, Pallares Lopez R, Palanques-Tost E, et al. Association of red blood cell distribution width with mortality risk in hospitalized adults with SARS-CoV-2 infection. JAMA Netw Open 2020;3:e2022058.
Ghahramani S, Tabrizi R, Lankarani KB, Kashani SM, Rezaei S, Zeidi N, et al. Laboratory features of severe vs. non-severe COVID-19 patients in Asian populations: A systematic review and meta-analysis. Eur J Med Res 2020;25:30.
Gomez JM, Du-Fay-de-Lavallaz JM, Fugar S, Sarau A, Simmons JA, Clark B, et al. Sex differences in coronavirus disease 2019 (COVID-19) hospitalization and mortality. J Women's Health 2021;30:646-53.
Hendren NS, de Lemos JA, Ayers C, Das SR, Rao A, Carter S, et al. Association of body mass index and age with morbidity and mortality in patients hospitalized with COVID-19: Results from the American Heart Association COVID-19 Cardiovascular Disease Registry. Circulation 2021;143:135-44.
Henry BM, Benoit JL, Benoit S, Pulvino C, Berger BA, Olivera MH, et al. Red blood cell distribution width (RDW) predicts COVID-19 severity: A prospective, observational study from the Cincinnati SARS-CoV-2 Emergency Department Cohort. Diagnostics (Basel) 2020;10:618.
Hintze JL, Nelson RD. Violin plots: A box plot-density trace synergism. Am Stat 1998;52:181-4.
Hjerpe A. Computing random forests variable importance measures (vim) on mixed numerical and categorical data. Stockholm, Sweden: KTH Royal Institute of Technology School of Computer Science and Communication; 2016.
Janardhanan P, Sabika F. Effectiveness of support vector machines in medical data mining. J Commun Softw Syst 2015;11:25-30.
Jin JM, Bai P, He W, Wu F, Liu XF, Han DM, et al. Gender differences in patients with COVID-19: Focus on severity and mortality. Front Public Health 2020;8:152.
Kaye AD, Okeagu CN, Pham AD, Silva RA, Hurley JJ, Arron BL, et al. Economic impact of COVID-19 pandemic on healthcare facilities and systems: International perspectives. Best Pract Res Clin Anaesthesiol 2021;35:293-306.
Kim HY. Statistical notes for clinical researchers: Chi-squared test and Fisher's exact test. Restor Dent Endod 2017;42:152-5.
Kong M, Zhang H, Cao X, Mao X, Lu Z. Higher level of neutrophil-to-lymphocyte is associated with severe COVID-19. Epidemiol Infect 2020;148:e139.
Lashari SA, Ibrahim R, Senan N, Taujuddin N. Application of Data Mining Techniques for Medical Data Classification: A Review. Vol. 150. MATEC Web of Conferences; 2018. p. 06003.
Lewis RJ. An Introduction to Classification and Regression Tree (CART) Analysis. Annual Meeting of the Society for Academic Emergency Medicine in San Francisco, California; 2000. p. 14.
Liu S, Zhang L, Weng H, Yang F, Jin H, Fan F, et al. Association between average plasma potassium levels and 30-day mortality during hospitalization in patients with COVID-19 in Wuhan, China. Int J Med Sci 2021;18:736-43.
Liu YM, Xie J, Chen MM, Zhang X, Cheng X, Li H, et al. Kidney function indicators predict adverse outcomes of COVID-19. Med (N Y) 2021;2:38-48.e2.
Liu Y, Sun W, Guo Y, Chen L, Zhang L, Zhao S, et al. Association between platelet parameters and mortality in coronavirus disease 2019: Retrospective cohort study. Platelets 2020;31:490-6.
Lodder RA, Hieftje GM. Quantile analysis: A method for characterizing data distributions. Appl Spectrosc 1988;42:1512-20.
Lorente L, Martín MM, Argueso M, Solé-Violán J, Perez A, Marcos Y Ramos JA, et al. Association between red blood cell distribution width and mortality of COVID-19 patients. Anaesth Crit Care Pain Med 2021;40:100777.
McKnight PE, Najab J. Mann-Whitney U test. In: The Corsini Encyclopedia of Psychology. 2010 Jan 30:1.
Meizlish ML, Pine AB, Bishai JD, Goshua G, Nadelmann ER, Simonov M, et al. A neutrophil activation signature predicts critical illness and mortality in COVID-19. Blood Adv 2021;5:1164-77.
Meyer D, Dimitriadou E, Hornik K, Weingessel A, Leisch F, Chang CC, et al. Package 'e1071'. R J 2019.
Milborrow S, Milborrow MS. Package 'rpart.plot'; 2020.
Minghim R, Huancapaza L, Artur E, Telles GP, Belizario IV. Graphs from features: Tree-based graph layout for feature analysis. Algorithms 2020;13:302.
Molnar C. Interpretable Machine Learning. Lulu.com; 2020.
Nogueira SÁ, Oliveira SC, Carvalho AF, Neves JM, Silva LS, Silva Junior GB, et al. Renal changes and acute kidney injury in covid-19: A systematic review. Rev Assoc Med Bras (1992) 2020;66 Suppl 2:112-7.
Osi AA, Dikko HG, Abdu M, Ibrahim A, Isma'il LA, Sarki H, et al. A classification approach for predicting COVID-19 patient survival outcome with machine learning techniques. medRxiv 2020.
Park SE. Epidemiology, virology, and clinical features of severe acute respiratory syndrome -coronavirus-2 (SARS-CoV-2; Coronavirus Disease-19). Clin Exp Pediatr 2020;63:119-24.
Pecoraro F, Clemente F, Luzi D. The efficiency in the ordinary hospital bed management in Italy: An in-depth analysis of intensive care unit in the areas affected by COVID-19 before the outbreak. PLoS One 2020;15:e0239249.
Pijls BG, Jolani S, Atherley A, Derckx RT, Dijkstra JI, Franssen GH, et al. Demographic risk factors for COVID-19 infection, severity, ICU admission and death: A meta-analysis of 59 studies. BMJ Open 2021;11:e044640.
Ponti G, Maccaferri M, Ruini C, Tomasi A, Ozben T. Biomarkers associated with COVID-19 disease progression. Crit Rev Clin Lab Sci 2020;57:389-99.
Pourbagheri-Sigaroodi A, Bashash D, Fateh F, Abolghasemi H. Laboratory findings in COVID-19 diagnosis and prognosis. Clin Chim Acta 2020;510:475-82.
Pradhan A, Olsson PE. Sex differences in severity and mortality from COVID-19: Are males more vulnerable? Biol Sex Differ 2020;11:53.
RColorBrewer S, Liaw MA. Package 'randomForest'. Berkeley, CA, USA: University of California; 2018.
Ripley B, Venables B, Bates DM, Hornik K, Gebhardt A, Firth D, et al. Package 'mass'. Cran r. 2013;538:113-20.
Rocca B. Handling imbalanced datasets in machine learning. Towards Data Science 2019.
Ruan Q, Yang K, Wang W, Jiang L, Song J. Clinical predictors of mortality due to COVID-19 based on an analysis of data of 150 patients from Wuhan, China. Intensive Care Med 2020;46:846-8.
Sasson I. Age and COVID-19 mortality: A comparison of Gompertz doubling time across countries and causes of death. Demogr Res 2021;44:379-96.
Sun DW, Zhang D, Tian RH, Li Y, Wang YS, Cao J, et al. The underlying changes and predicting role of peripheral blood inflammatory cells in severe COVID-19 patients: A sentinel? Clin Chim Acta 2020;508:122-9.
von Jouanne-Diedrich H. OneR: One rule machine learning classification algorithm with enhancements. R package version. 2017;2:2.
Wang Q, Zhao H, Liu LG, Wang YB, Zhang T, Li MH, et al. Pattern of liver injury in adult patients with COVID-19: A retrospective analysis of 105 patients. Mil Med Res 2020;7:28.
Wickham H, Chang W, Wickham MH. Package 'ggplot2'. Create Elegant Data Visualisations Using the Grammar of Graphics. Vol. 2. Version; 2016. p. 1-189.
Xanthopoulos P, Pardalos PM, Trafalis TB. Robust data mining. Springer Science & Business Media; 2012.