Modelo predictivo para la ocurrencia de leishmaniasis cutánea en Colombia, a partir de variables ambientales y socioeconómicas
dc.rights.license | Attribution 4.0 International |
dc.contributor.advisor | Niño Vásquez, Luis Fernando |
dc.contributor.advisor | Gutiérrez Torres, Juan David |
dc.contributor.author | Salazar Mora, José Daniel |
dc.date.accessioned | 2021-10-08T14:40:28Z |
dc.date.available | 2021-10-08T14:40:28Z |
dc.date.issued | 2021-10-07 |
dc.identifier.uri | https://repositorio.unal.edu.co/handle/unal/80442 |
dc.description | illustrations, graphs, tables |
dc.description.abstract | Several predictive models were built for the occurrence of cutaneous leishmaniasis in Colombia from a set of socioeconomic and environmental variables. The dataset was analyzed following the KDD (Knowledge Discovery in Databases) process, going through each of its stages. Specifically, the dataset was collected and organized, described and reviewed, and subjected to a descriptive statistical analysis. The data were then preprocessed and transformed, and dimensionality reduction techniques were applied. Next, different machine learning techniques were used for both classification and regression. For classification, several methods were implemented: naive Bayes, neural networks (multilayer perceptron), decision trees, and Bayesian networks. These produced a predictive classification model, with the best results obtained using the XGBoost algorithm on a municipal dataset with monthly reporting. Likewise, a regression model was built with neural networks and XGBoost; XGBoost again gave the best results, this time on a departmental dataset with monthly periodicity. Finally, a time series analysis was carried out with neural network and XGBoost regression algorithms, and the best metrics were obtained with XGBoost for a departmental model with weekly temporal resolution. For each model, the most important variables for prediction were identified; all models took into account at least the following: total population, precipitation, temperature, enhanced vegetation index (EVI), and month. In addition, to make the time series regression model usable, a web page was created that takes as input the independent variables together with their lags and predicts the number of future cases at 1, 2, and 4 weeks. (Text taken from the source.) |
dc.format.extent | 117 pages |
dc.format.mimetype | application/pdf |
dc.language.iso | spa |
dc.publisher | Universidad Nacional de Colombia |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ |
dc.subject.ddc | 000 - Computer science, information and general works::004 - Data processing, computer science |
dc.title | Modelo predictivo para la ocurrencia de leishmaniasis cutánea en Colombia, a partir de variables ambientales y socioeconómicas |
dc.type | Master's thesis |
dc.type.driver | info:eu-repo/semantics/masterThesis |
dc.type.version | info:eu-repo/semantics/acceptedVersion |
dc.publisher.program | Bogotá - Ingeniería - Maestría en Ingeniería - Ingeniería de Sistemas y Computación |
dc.description.notes | Includes appendices |
dc.description.notes | Contains a research component in data science, epidemiology, artificial intelligence, and software development. |
dc.contributor.researchgroup | Laboratorio de investigación en sistemas inteligentes (LISI) |
dc.coverage.country | Colombia |
dc.description.degreelevel | Master's |
dc.description.degreename | Magíster en Ingeniería - Ingeniería de Sistemas y Computación |
dc.description.methods | KDD (Knowledge Discovery in Databases) process |
dc.description.researcharea | Data science and artificial intelligence |
dc.identifier.instname | Universidad Nacional de Colombia |
dc.identifier.reponame | Repositorio Institucional Universidad Nacional de Colombia |
dc.identifier.repourl | https://repositorio.unal.edu.co/ |
dc.publisher.department | Departamento de Ingeniería de Sistemas e Industrial |
dc.publisher.faculty | Facultad de Ingeniería |
dc.publisher.place | Bogotá, Colombia |
dc.publisher.branch | Universidad Nacional de Colombia - Sede Bogotá |
dc.relation.indexed | Bireme |
dc.rights.accessrights | info:eu-repo/semantics/openAccess |
dc.rights.references | [1] Agudelo, J. Informe de evento de leishmaniasis. Colombia, 2017. |
dc.rights.references | [2] Zambrano, P. Leishmaniasis. Colombia: Bogotá, Instituto Nacional de Salud 2014. |
dc.rights.references | [3] Sociedad colombiana de infectología. Guía 2I. Guía de atención de la leishmaniasis. Ministerio de la protección social, Colombia. |
dc.rights.references | [4] Medical Care Development International. Leishmaniosis: ciclo biológico. Available at: https://www.mcdinternational.org/trainings/malaria/spanish/dpdx/HTML/Frames/G-L/Leishmaniasis/body_Leishmaniasis_pg1#Life%20Cycle |
dc.rights.references | [5] Leishmaniosis: ciclo biológico de la leishmania y transmisión. 20 AV 16. Leishmaniosis. Available at: http://axonveterinaria.net/web_axoncomunicacion/auxiliarveterinario/20/AV_20_16-19_Leishmaniosis_ciclo_transmision.pdf |
dc.rights.references | [6] Hernández, A., Gutiérrez, J., Xiao, Y., Branscum, A. & Cuadros, D. Spatial epidemiology of cutaneous leishmaniasis in Colombia: socioeconomic and demographic factors associated with a growing epidemic. The royal society tropical medicine & hygiene. 2019. |
dc.rights.references | [7] Russell, S. & Norvig, P. Artificial Intelligence: A Modern Approach. Prentice Hall, Third Edition. 2010. |
dc.rights.references | [8] Alberto, J. Introducción al Análisis de series temporales. Universidad Complutense de Madrid, March 2007. |
dc.rights.references | [9] Brownlee, J. Hyperparameter Optimization With Random Search and Grid Search. Machine Learning Mastery. (2020, Sep. 19). [Online]. Available at: https://machinelearningmastery.com/hyperparameter-optimization-with-random-search-and-grid-search/ |
dc.rights.references | [10] Dong, W., Huang, Y., Lehane, B. & Ma, G. XGBoost algorithm-based prediction of concrete electrical resistivity for structural health monitoring. ELSEVIER, 2020. |
dc.rights.references | [11] Sitiobigdata. Gentle Introduction of XGBoost Library. (2019, Jan. 19). [Online]. Available at: https://sitiobigdata.com/2019/01/20/gentle-introduction-of-xgboost-library/ |
dc.rights.references | [12] Towards data science. The Ultimate Guide to AdaBoost, random forests and XGBoost. (2020, Mar. 16). [Online]. Available at: https://towardsdatascience.com/the-ultimate-guide-to-adaboost-random-forests-and-xgboost-7f9327061c4f |
dc.rights.references | [13] King, R., Campbell-Lendrum, D., & Davies, C. Predicting Geographic Variation in Cutaneous Leishmaniasis, Colombia, 2004. |
dc.rights.references | [14] Chaves, L. & Pascual, M. Climate Cycles and Forecasts of Cutaneous Leishmaniasis, a Nonstationary Vector-Borne Disease. PLoS Medicine, August 2006. |
dc.rights.references | [15] Chaves, L. Climate and recruitment limitation of hosts: the dynamics of American cutaneous leishmaniasis seen through semi-mechanistic seasonal models. Annals of Tropical Medicine & Parasitology, December 2008. |
dc.rights.references | [16] Lewnard, J. A., Jirmanus, L., Júnior, N. N., Machado, P. R., Glesby, M. J., Ko, A. I. & Weinberger, D. M. Forecasting Temporal Dynamics of Cutaneous Leishmaniasis in Northeast Brazil. PLoS Neglected Tropical Diseases, 2014. |
dc.rights.references | [17] Sharafi, M., Ghaem, H., Tabatabaee, H. R., & Faramarzi, H. Forecasting the number of zoonotic cutaneous leishmaniasis cases in south of Fars province, Iran using seasonal ARIMA time series method. Asian Pacific Journal of Tropical Medicine, December 2016. |
dc.rights.references | [18] Valderrama, C., Alexander, N., Ferro, C., Cadena, H., Marín, D., Holford, T., Munstermann, L. & Ocampo, C. Environmental Risk Factors for the Incidence of American Cutaneous Leishmaniasis in a Sub-Andean Zone of Colombia (Chaparral, Tolima). Am. J. Trop. Med. Hyg., 82(2), 2010, pp. 243–250. |
dc.rights.references | [19] Chaves, L. F., Calzada, J. E., Valderrama, A., & Saldaña, A. Cutaneous Leishmaniasis and Sand Fly Fluctuations Are Associated with El Niño in Panamá. PLoS Neglected Tropical Diseases, October 2014. |
dc.rights.references | [20] Pérez, M., Ocampo, C., Valderrama, C. & Alexander, N. Spatial modeling of cutaneous leishmaniasis in the Andean region of Colombia. Mem Inst Oswaldo Cruz, Rio de Janeiro, Vol. 111(7): 433-442, July 2016. |
dc.rights.references | [21] Gutiérrez, J., Martínez, R., Ramoni, J., Diaz, F., Gutiérrez, R., Ruiz, F., Botello, H., Gil, M., González, J. & Palencia, M. Environmental and socio-economic determinants associated with the occurrence of cutaneous leishmaniasis in the northeast of Colombia. Trans R Soc Trop Med Hyg 2018; 00: 1–8. |
dc.rights.references | [22] Yuexin, W., Nishiura, H, Yiming, Y. & M. Saitoh. Deep Learning for Epidemiological Predictions. Short Research Papers II. MI, USA, July 2018. |
dc.rights.references | [23] Zhao, N., Charland, K., Carabali, M., Nsoesie, E., Maher-Giroux, M., Rees, E., Yuan, M., Garcia C., Ramirez, G. & Zinszer, K. Machine learning and dengue forecasting: Comparing random forests and artificial neural networks for predicting dengue burdens at the national sub-national scale in Colombia. BioRxiv, January 2020. |
dc.rights.references | [24] Fayyad, U. & Stolorz, P. Data mining and KDD: Promise and challenges. Future Generation Computer Systems. ELSEVIER, pp. 99-104, 1997 |
dc.rights.references | [25] Shafique, U., & Qaiser, H. A Comparative Study of Data Mining Process Models (KDD, CRISP-DM and SEMMA). International Journal of Innovation and Scientific Research, Vol. 12 No. 1, pp. 217-22, November 2014. |
dc.rights.references | [26] Maimon, O. & Rokach, L. Data Mining and Knowledge Discovery Handbook. Springer-Verlag New York, Inc., 2nd ed., February 2018. |
dc.subject.decs | Leishmaniasis, Cutaneous |
dc.subject.decs | Leishmaniasis Cutánea |
dc.subject.lemb | Digital computer simulation |
dc.subject.lemb | Simulación por computadores digitales |
dc.subject.lemb | Simulation methods |
dc.subject.lemb | Métodos de simulación |
dc.subject.proposal | Modelo predictivo leishmaniasis |
dc.subject.proposal | Ciencia de datos |
dc.subject.proposal | Machine learning |
dc.subject.proposal | Inteligencia artificial |
dc.subject.proposal | Predictive model |
dc.subject.proposal | Leishmaniasis cutánea |
dc.subject.proposal | Cutaneous leishmaniasis |
dc.subject.proposal | Forecasting model |
dc.subject.proposal | Epidemiología |
dc.subject.proposal | Epidemiology |
dc.subject.proposal | Data science |
dc.subject.proposal | Artificial intelligence |
dc.subject.proposal | Time series |
dc.subject.proposal | Series de tiempo |
dc.title.translated | Predictive model for the occurrence of cutaneous leishmaniasis in Colombia, based on environmental and socioeconomic variables |
dc.type.coar | http://purl.org/coar/resource_type/c_bdcc |
dc.type.coarversion | http://purl.org/coar/version/c_ab4af688f83e57aa |
dc.type.content | Text |
dc.type.redcol | http://purl.org/redcol/resource_type/TM |
oaire.accessrights | http://purl.org/coar/access_right/c_abf2 |
dcterms.audience.professionaldevelopment | Students |
dcterms.audience.professionaldevelopment | Researchers |
dcterms.audience.professionaldevelopment | Teachers |