Assessing The COVID-19 Trends In Pakistan Using Predictive Machine Learning Techniques: An Empirical Study
Keywords:
COVID-19, Pakistan, COVID-19 trends in Pakistan, machine learning, Predictive Machine Learning Technique
Abstract
The Coronavirus (also referred to as COVID-19), which emerged in Wuhan, China in December 2019, has taken the world by storm. Scientists across the globe have used epidemiological models to predict the spread of the virus and its death rate and to make outbreak predictions. Prediction models have also been used to inform new policies for controlling the spread of the virus. Because of the complex and irregular nature of the virus, it has been hard to forecast trends in different nations, especially with conventional mathematical models such as the SIR (Susceptible-Infected-Recovered) model. Therefore, this study analyzes the five waves of COVID-19 that have hit Pakistan since February 2020 using Machine Learning models. Predictive models such as Logistic Growth and the Autoregressive Integrated Moving Average (ARIMA) model are used to predict and model contagion spread trends. The study uses these models to capture the variation in the incidence of daily cases in each province of Pakistan. The time-series data used for this study were obtained from the official website of the government of Pakistan and consist of the daily caseload for each region of Pakistan. The paper makes two main contributions. First, it compares the modeling accuracy of two widely used disease growth models, ARIMA and the Logistic Growth Model, in the case of Pakistan. Second, it recommends the model better suited to datasets that, like Pakistan's, show fluctuating case numbers. One of the main limitations of this research is that, although Machine Learning predictive techniques offer one way to handle this uncertainty, limited data is available for Pakistan. The findings indicate that the logistic model could not model daily COVID-19 case numbers effectively over an entire pandemic wave: in trying to minimize the overall error, it produced an inaccurate curve. However, it showed better results when the waves were divided into smaller sections, where its RMSE values were lower than those of the ARIMA model. Lastly, the researcher recommends other models that could be used for further modeling of COVID-19 trends in Pakistan.
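The contrast between fitting a logistic curve to a whole multi-wave series and fitting it to smaller sections can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the study's data or code: the series is a synthetic two-wave cumulative case curve, the split point (day 60) is chosen by hand, and the fit is a crude grid search over the growth rate and midpoint with a closed-form least-squares estimate of the final size. ARIMA is not included.

```python
import math

def logistic(t, K, r, t0):
    """Logistic growth curve: K = final size, r = growth rate, t0 = midpoint day."""
    return K / (1.0 + math.exp(-r * (t - t0)))

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def fit_logistic(ts, ys):
    """Crude fit (illustrative only): grid search over r and t0.

    For fixed r and t0 the curve is linear in K, so the least-squares K
    has the closed form sum(y * g) / sum(g * g). Returns fitted values.
    """
    best_err, best_pred = None, None
    for i in range(1, 13):                      # r = 0.05, 0.10, ..., 0.60
        r = 0.05 * i
        for t0 in range(ts[0], ts[-1] + 1):     # midpoint anywhere in the window
            g = [logistic(t, 1.0, r, t0) for t in ts]
            K = sum(y * gi for y, gi in zip(ys, g)) / sum(gi * gi for gi in g)
            pred = [K * gi for gi in g]
            err = rmse(ys, pred)
            if best_err is None or err < best_err:
                best_err, best_pred = err, pred
    return best_pred

# Synthetic cumulative cases: two overlapping waves (assumed parameters).
days = list(range(120))
cases = [logistic(t, 1000, 0.2, 30) + logistic(t, 3000, 0.15, 90) for t in days]

# One logistic curve fitted to the whole series.
whole_rmse = rmse(cases, fit_logistic(days, cases))

# Separate curves per section; the second section is re-based to start near zero.
split = 60
seg1_pred = fit_logistic(days[:split], cases[:split])
base = cases[split - 1]
seg2_fit = fit_logistic(days[split:], [c - base for c in cases[split:]])
seg2_pred = [base + p for p in seg2_fit]
seg_rmse = rmse(cases, seg1_pred + seg2_pred)

print(f"whole-series RMSE: {whole_rmse:.1f}")
print(f"segmented RMSE:    {seg_rmse:.1f}")
```

On real caseload data, the grid search would normally be replaced by a proper nonlinear least-squares routine, and the split points would have to be chosen from the data rather than fixed in advance.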