Statistical modelling for estimating and forecasting Covid-19 incidence in Spain

  • David Moriña, Department of Econometrics, Statistics and Applied Economics, Riskcenter-IREA, Universitat de Barcelona, Spain
  • Alessandra Ybargüen, Department of Econometrics, Statistics and Applied Economics, Riskcenter-IREA, Universitat de Barcelona, Spain
Keywords: statistical modelling, time series, underregistered data, infectious diseases, covid-19

Abstract

Introduction: Decision-making must often rely on data that contain errors and inaccuracies. Data on the COVID-19 pandemic are a clear example: the information provided by official sources was often unreliable because of the data collection mechanisms and the number of asymptomatic cases. Objectives: To estimate the amount of misreported data in a time series, to reconstruct the most probable evolution of the process, and to discuss the statistical methods most appropriate for yielding reliable forecasts in this context. Methods: A model based on autoregressive conditional heteroskedastic (ARCH) time series is proposed, with parameters estimated by Bayesian synthetic likelihood. Results: Only around 51% of the COVID-19 cases in Spain in the period from February 23rd, 2020 to February 27th, 2022 were observed, and remarkable differences in reporting between Autonomous Communities were detected. Conclusion: The presented method can generate realistic predictions under different possible scenarios, making it a valuable tool for policy makers evaluating how a situation evolves.
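
To make the headline figure concrete, the following is a minimal back-of-the-envelope sketch of what a reporting coverage of about 51% implies. It is not the authors' ARCH model or their Bayesian synthetic likelihood estimation. The function name `estimate_true_cases`, the assumption of a single constant coverage over the whole period, and the reported count used below are all hypothetical and chosen only for illustration.

```python
def estimate_true_cases(reported_cases: float, coverage: float) -> float:
    """Rescale a reported count by the fraction of cases assumed to be observed.

    Assumption (not from the article): coverage is constant over the period,
    whereas the article reports differences between Autonomous Communities.
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in the interval (0, 1]")
    return reported_cases / coverage


# Hypothetical reported count, used only to illustrate the arithmetic.
reported = 5_100
coverage = 0.51  # "around 51%" of cases observed, as stated in the abstract

estimated_total = estimate_true_cases(reported, coverage)
unobserved = estimated_total - reported
print(f"Estimated total cases: {estimated_total:.0f}")
print(f"Estimated unobserved cases: {unobserved:.0f}")
```

Under this simplification, roughly one unobserved case would correspond to each reported case. A region-specific or time-varying coverage would need the full model described in the article.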

Published
2024-01-06
How to Cite
Moriña, D., & Ybargüen, A. (2024). Statistical modelling for estimating and forecasting Covid-19 incidence in Spain. Revista Española De Comunicación En Salud (RECS), 54-59. https://doi.org/10.20318/recs.2024.7951
Section
Articles