Manuscript Number:
Title: Estimating the number of deaths due to COVID-19 in Lima and Peru
during March and April 2020 using ARIMA time series and modeling
Order of Authors: Eduardo Gonzalo Villarreyes Peña, M.D; Ana E Luna, PhD;
Andres J Soriano, M.D
B.U. Park
Co-Editors
Computational Statistics & Data Analysis
Seoul National University Department of Statistics, 1 Gwanak-ro, Gwanak-gu, 08826, Seoul,
Korea, Republic of
June 27, 2020
I am writing to submit our original research article entitled "Estimating the number of deaths
due to COVID-19 in Lima and Peru during March and April 2020 using ARIMA time series and
modeling" for consideration by Computational Statistics & Data Analysis.
The death toll of the COVID-19 pandemic is of the utmost importance worldwide. In this paper,
we use time series analysis together with ARIMA predictions to estimate the number of deaths
due to COVID-19 in Peru and in the city of Lima during March and April 2020.
We believe that the findings presented in our paper will attract the attention of the statisticians
and researchers who read Computational Statistics & Data Analysis. Our findings will also
contribute to the development of new tools for quantifying the underreporting of deaths caused
by the COVID-19 pandemic.
Each of the authors confirms that this manuscript is original, has not been previously
published, and is not currently under consideration by any other journal. Additionally, all of the
authors have approved the contents of this paper and have agreed to the Computational Statistics
& Data Analysis submission policies.
Each named author has substantially contributed to conducting the underlying research
and drafting this manuscript. Additionally, to the best of our knowledge, the named authors have
no conflict of interest, financial or otherwise.
The SARS-CoV-2 virus, which causes COVID-19, belongs to the coronaviruses, a large family of
viruses that cause respiratory disease and complications in humans. The pandemic is currently
studied through the rates of infection and mortality in different countries, and its fast spread has
caught the attention of researchers worldwide. By using inference modeling techniques and time
series analysis together with ARIMA predictions, it is possible to estimate deaths from COVID-19.
Our investigation covered Peru and the city of Lima, with a detailed analysis of March and April
2020. When we compared our estimates with the death toll reported by the Ministry of Health
(MINSA), we obtained a difference of approximately 185.9% in the number of deaths due to
COVID-19.
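As a rough illustration of the forecasting idea summarized above, the sketch below fits an ARIMA(1,1,0) model with drift by ordinary least squares on first differences and extrapolates it. The model order, the function names `fit_arima_110` and `forecast`, and the toy series are assumptions made for illustration only: the text above does not state the order of the model used in the study, and the series is not SINADEF or MINSA data.

```python
def fit_arima_110(series):
    """Fit d_t = c + phi * d_(t-1) by least squares, where d_t is the first difference."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    x, y = diffs[:-1], diffs[1:]
    n = len(x)
    if n < 2:
        raise ValueError("need at least four observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("first differences are constant; phi is not identified")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast(series, c, phi, steps):
    """Iterate the fitted recursion on the differences and re-integrate to levels."""
    level = series[-1]
    diff = series[-1] - series[-2]
    predictions = []
    for _ in range(steps):
        diff = c + phi * diff
        level += diff
        predictions.append(level)
    return predictions


# Toy series (hypothetical), built so that d_t = 1 + 0.5 * d_(t-1) holds exactly.
toy_series = [0.0, 0.0, 1.0, 2.5, 4.25, 6.125]
c, phi = fit_arima_110(toy_series)
next_two = forecast(toy_series, c, phi, steps=2)
print(c, phi, next_two)
```

In practice, a dedicated time series library would also select the model order and report prediction intervals; this sketch only shows the differencing and autoregression steps.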
Corresponding author.
E-mail addresses: eduardo.villarreyes@unmsm.edu.pe (E. Villarreyes),
ae.lunaa@up.edu.pe (A. Luna), jose.soriano@unmsm.edu.pe (A. Soriano).
modeling is the best univariate model to predict the number of infant deaths caused by Acute
Respiratory Infections (ARI). The effectiveness of ARIMA modeling was also reflected in a study
whose aim was to predict cancer mortality in Spain [15]. Therefore, a correct implementation of
the model allows adequate inferences to be made about unknown or unexplored phenomena in the
field of biomedical science [16]. The authors of [17] highlight the predictive performance of
ARIMA models with a seasonal component, and the certainty of their prediction periods, for use as
a management tool for diverse queries.
In this study, we propose the use of official data obtained from the National Death Registry
Information System (SINADEF) regarding deaths in Peru during the last three years [18] and the
official information on deaths due to coronavirus from the Ministry of Health (MINSA) during
March and April 2020 [3, 19]. Consequently, we will combine the techniques of time series
analysis and ARIMA modeling.
3. ARIMA models
throughout Peru with the number obtained from previous years. During March 2020, the difference
reached 222 deaths in Peru and 1006 deaths for the particular case of Lima. In April, the
difference increased to 3,763 deaths in Peru and 3,202 in Lima. When calculating the standard
deviation and the coefficient of variation (C.V) for the city of Lima, the values obtained were
525.38 and 20.09%, respectively, for March and 1616.56 and 67.29% for April. Thus, data that were
homogeneous have become heterogeneous, which shows that the arithmetic mean for April is not
reliable as an analysis tool (C.V > 30%). Likewise, in the case of Peru, the standard deviation and
the coefficient of variation are 599.10 and 6.57% for March, rising in April to 1958 and 23.28%,
respectively (Fig. 3 and Fig. 4).
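The 30% rule applied to the coefficient of variation in the text can be checked directly from the reported figures. The sketch below takes each standard deviation and C.V quoted for Lima and Peru, recovers the mean they imply (mean = standard deviation / C.V), and flags a series as heterogeneous when its C.V exceeds 30%. The function names and the dictionary layout are illustrative assumptions, and the implied means are derived values, not figures reported by the authors.

```python
CV_THRESHOLD = 30.0  # percent; the cut-off quoted in the text (C.V > 30%)

# (standard deviation, C.V in percent) as quoted in the text
reported = {
    ("Lima", "March"): (525.38, 20.09),
    ("Lima", "April"): (1616.56, 67.29),
    ("Peru", "March"): (599.10, 6.57),
    ("Peru", "April"): (1958.0, 23.28),
}


def implied_mean(std_dev, cv_percent):
    """Invert C.V = 100 * std_dev / mean to recover the mean."""
    return std_dev / (cv_percent / 100.0)


def is_heterogeneous(cv_percent, threshold=CV_THRESHOLD):
    """A series counts as heterogeneous when its C.V exceeds the threshold."""
    return cv_percent > threshold


summary = {
    key: (implied_mean(sd, cv), is_heterogeneous(cv))
    for key, (sd, cv) in reported.items()
}

for (place, month), (mean, flag) in summary.items():
    label = "heterogeneous" if flag else "homogeneous"
    print(f"{place} {month}: implied mean {mean:.1f}, {label}")
```

Under this rule only the Lima series for April crosses the threshold, which matches the statement that the arithmetic mean for that month is not reliable.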
Highlights
- A correct implementation of the ARIMA model allows adequate inferences to be made about
unknown or unexplored phenomena in the fields of biomedical science and statistics.
- Our model estimates that 3,213 deaths had occurred in Peru by April 30. MINSA reported 1,124
deaths, which means that our model detected 185.9% more deaths.
- To date, we have found no studies in Peru that have used these models to predict death rates
from COVID-19.
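The 185.9% figure follows from the two counts given in the highlights. A minimal check, assuming the percentage is the relative excess of the model estimate over the MINSA count (the function name `percent_excess` is illustrative):

```python
def percent_excess(estimated, reported):
    """Relative excess of the estimate over the reported count, in percent."""
    if reported <= 0:
        raise ValueError("reported count must be positive")
    return (estimated - reported) / reported * 100.0


model_estimate = 3213  # deaths estimated by the model, from the highlights
minsa_reported = 1124  # deaths reported by MINSA, from the highlights

excess = percent_excess(model_estimate, minsa_reported)
print(f"{excess:.1f}%")  # 185.9%
```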
Supplementary Material for online publication only: Dead Peru.xlsx