COVID-19 Patient Health Prediction using Artificial Intelligence Boosted Random Forest Algorithm

Authors

  • Abdul Subhan, Masters Scholar, Economics Department, COMSATS University Islamabad
  • Tuba Rasheed, Masters Scholar, Economics Department, COMSATS University Islamabad
  • Zarwa Shah, Masters Scholar, Economics Department, COMSATS University Islamabad
  • Sadia Noor, Masters Scholar, Economics Department, COMSATS University Islamabad
  • Muhammad Aamir Khan, Assistant Professor, Economics Department, COMSATS University Islamabad
  • Usman Shakoor, Assistant Professor, Economics Department, COMSATS University Islamabad

DOI:

https://doi.org/10.48165/sajssh.2021.2313

Keywords:

COVID-19, Predictions, Boosted Random Forest Algorithm, Pakistan

Abstract

Demand is currently high for artificial intelligence (AI) techniques that integrate with real-time data collection, wireless infrastructure, and processing on end-user devices. AI can now be used for the detection and prediction of large-scale pandemics. The coronavirus disease 2019 (COVID-19) pandemic began in Wuhan, China, and has caused 175,694 deaths worldwide, while the number of active patients stands at 2,544,792. In Pakistan, from January 2020 to March 2021, there were 658,132 positive cases, 603,512 recoveries, and 16,208 deaths, as reported by the World Health Organization. The rapid, exponential increase in COVID-19 patients makes it necessary to predict each patient’s possible outcome quickly and efficiently with AI techniques so that suitable treatment can be given. This paper proposes a fine-tuned random forest model boosted by the AdaBoost algorithm. The COVID-19 patient’s health, geographical area, gender, and marital status are used to predict the severity of the case and the possible outcome, either recovery or no recovery (i.e., death). The model is 90% accurate and has an F1 score of 0.76 on the dataset used. Analysis of the data shows a positive correlation between patient gender and death. It also shows that most of the patients were between twenty and seventy years old.
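
The gap between the reported accuracy and F1 score is typical of an imbalanced outcome such as recovery versus death. The sketch below shows how both metrics are derived from a binary confusion matrix. The counts are hypothetical, chosen only so that the arithmetic lands near the reported figures; they are not the paper's confusion matrix, and the names `ConfusionCounts`, `accuracy`, and `f1_score` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary outcome counts, treating death as the positive class."""
    tp: int  # predicted death, patient died
    fp: int  # predicted death, patient recovered
    fn: int  # predicted recovery, patient died
    tn: int  # predicted recovery, patient recovered


def accuracy(c: ConfusionCounts) -> float:
    """Share of all patients whose outcome was predicted correctly."""
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total


def f1_score(c: ConfusionCounts) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


# Hypothetical counts for 100 patients (assumption, not data from the paper).
example = ConfusionCounts(tp=16, fp=4, fn=6, tn=74)

print(f"accuracy={accuracy(example):.2f}  F1={f1_score(example):.2f}")
```

Because recoveries dominate the hypothetical sample, the many correctly predicted recoveries lift accuracy, while F1 depends only on how well the rarer death class is identified.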

Published

2021-06-03