Statistical modelling for estimating and forecasting Covid-19 incidence in Spain
Abstract
Introduction: Basing decisions on data that contain errors and inaccuracies is unavoidable in many situations. COVID-19 pandemic data are a clear example: the information provided by official sources was often unreliable because of the data collection mechanisms and the number of asymptomatic cases. Objectives: To estimate the amount of misreported data in a time series, to reconstruct the most probable evolution of the underlying process, and to discuss which statistical methods are most appropriate for producing reliable forecasts in this context. Methods: A model based on autoregressive conditional heteroskedastic (ARCH) time series is proposed, with parameters estimated by Bayesian synthetic likelihood. Results: Only around 51% of the COVID-19 cases that occurred in Spain between February 23rd, 2020 and February 27th, 2022 were observed, with remarkable differences in reporting between Autonomous Communities. Conclusion: The presented method can generate realistic predictions under different possible scenarios, making it a valuable tool for policy makers assessing how a situation is evolving.
Unless otherwise stated, all articles published in this journal are distributed under the terms of the Creative Commons Attribution-NoDerives (CC-BY-ND 3.0 ES) Spain 3.0 License, which allows others to copy, distribute and publicly transmit the articles as long as they credit the author(s), the journal and the institution that publishes them, and provided that the articles are not altered or modified. The complete license can be consulted at: https://creativecommons.org/licenses/by/3.0/es/deed/.es
Copyright belongs to the manuscript's author by the mere fact of having created the work:
- Moral rights are undeniable and inalienable.
- Economic or exploitation rights can be transferred to third parties, as occurs when articles are published and authors partially or totally transfer their exploitation rights to publishers.
Authors can archive their own articles in an institutional repository as long as the original publication in this journal is cited.