Modeling the incidence of COVID-19 infection in the metropolitan area of Santiago de Cali, in terms of socioeconomic, demographic and health variables, using statistical methods, spatial econometrics and machine learning, in the period from March 2020 to June 2021

dc.contributor.advisor: Bohorquez Castañeda, Martha Patricia
dc.contributor.author: Saenz Perilla, Juan Pablo
dc.coverage.city: Cali
dc.coverage.temporal: 2021 - 2021
dc.date.accessioned: 2023-08-31T20:57:02Z
dc.date.available: 2023-08-31T20:57:02Z
dc.date.issued: 2023-02
dc.description: illustrations, diagrams, maps
dc.description.abstract: The objective of this document is to model the incidence of COVID-19 in Cali in terms of the socioeconomic, demographic, and health factors of those infected. Poisson generalized linear models and geographically weighted Poisson regression models are estimated. In addition, machine learning techniques are fitted and evaluated using the Random Forest, eXtreme Gradient Boosting, Neural Network, and Support Vector Machine algorithms. In all cases, the spatial component is included. The most influential variables are selected based on correlation and on the Lasso regularization technique. Certain preexisting health conditions (comorbidities), the type of vaccine, age, and the health insurance regime are found to be significantly associated with COVID-19 cases by neighborhood in the city of Cali. (Text taken from the source)
dc.description.degreelevel: Master's
dc.description.degreename: Magíster en Ciencias - Estadística (Master of Science in Statistics)
dc.format.extent: x, 99 pages
dc.format.mimetype: application/pdf
dc.identifier.instname: Universidad Nacional de Colombia
dc.identifier.reponame: Repositorio Institucional Universidad Nacional de Colombia
dc.identifier.repourl: https://repositorio.unal.edu.co/
dc.identifier.uri: https://repositorio.unal.edu.co/handle/unal/84623
dc.language.iso: spa
dc.publisher: Universidad Nacional de Colombia
dc.publisher.branch: Universidad Nacional de Colombia - Sede Bogotá
dc.publisher.faculty: Facultad de Ciencias
dc.publisher.place: Bogotá, Colombia
dc.publisher.program: Bogotá - Ciencias - Maestría en Ciencias - Estadística
dc.rights.accessrights: info:eu-repo/semantics/openAccess
dc.rights.license: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject.decs: COVID-19
dc.subject.decs: Coronavirus-19 disease
dc.subject.proposal: COVID-19
dc.subject.proposal: SARS-CoV-2
dc.subject.proposal: Spatial econometrics
dc.subject.proposal: Generalized Linear Models
dc.subject.proposal: Machine learning
dc.subject.proposal: Deep learning
dc.title: Modelar la incidencia de la infección por COVID-19 en el área metropolitana de Santiago de Cali, en términos de variables socioeconómicas, demográficas y de salud, usando métodos estadísticos, de econometría espacial y machine learning, en el periodo comprendido de marzo 2020 - junio 2021
dc.title.translated: Model the incidence of COVID-19 infection in the metropolitan area of Santiago de Cali, in terms of socioeconomic, demographic and health variables, using statistical methods, spatial econometrics and machine learning, in the period from March 2020 - June 2021
dc.type: Master's thesis
dc.type.coar: http://purl.org/coar/resource_type/c_bdcc
dc.type.coarversion: http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.content: Text
dc.type.redcol: http://purl.org/redcol/resource_type/TM
dc.type.version: info:eu-repo/semantics/acceptedVersion
oaire.accessrights: http://purl.org/coar/access_right/c_abf2

Files

Original bundle

Name: 1110583247.2023.pdf
Size: 4.27 MB
Format: Adobe Portable Document Format
Description: Master's thesis in Science - Statistics

License bundle

Name: license.txt
Size: 5.74 KB
Format: Item-specific license agreed upon to submission