Variable selection in logistic regression models using regularization
dc.contributor.advisor | Vanegas Penagos, Luis Hernando | spa |
dc.contributor.author | Agudelo Rico, Harold Daniel | spa |
dc.date.accessioned | 2025-09-12T16:09:55Z | |
dc.date.available | 2025-09-12T16:09:55Z | |
dc.date.issued | 2025 | |
dc.description | ilustraciones, diagramas | spa |
dc.description.abstract | Con el propósito de desarrollar un mecanismo probabilístico que modele el comportamiento natural de un fenómeno dicotómico, garantizando estabilidad, generalización, interpretabilidad, precisión y la estimación de parámetros, mientras selecciona variables y evita problemas de multicolinealidad, se implementan modelos lineales generalizados regularizados con respuesta binomial y enlace logístico. Para validar estas características, en particular la capacidad de selección de variables, se propone comparar esta metodología en términos de AUC con las técnicas clásicas. Estas últimas incluyen el modelo lineal generalizado con respuesta binomial y enlace logit, utilizando métodos de ajuste como el AIC (Criterio de Información de Akaike) y estrategias de selección de variables como forward, backward y stepwise. (Texto tomado de la fuente). | spa |
dc.description.abstract | To develop a probabilistic mechanism that models the natural behavior of a dichotomous phenomenon while ensuring stability, generalization, interpretability, precision, and parameter estimation, and that also selects variables and avoids multicollinearity problems, regularized generalized linear models with a binomial response and logit link are implemented. To validate these properties, in particular the ability to select variables, this methodology is compared in terms of AUC with classical techniques. The latter include the generalized linear model with binomial response and logit link, using model-fit criteria such as the AIC (Akaike Information Criterion) and variable selection strategies such as forward, backward, and stepwise selection. | eng |
dc.description.degreelevel | Maestría | spa |
dc.description.degreename | Magíster en Ciencias - Estadística | spa |
dc.description.researcharea | Modelos lineales generalizados | spa |
dc.format.extent | vi, 55 páginas | spa |
dc.format.mimetype | application/pdf | |
dc.identifier.instname | Universidad Nacional de Colombia | spa |
dc.identifier.reponame | Repositorio Institucional Universidad Nacional de Colombia | spa |
dc.identifier.repourl | https://repositorio.unal.edu.co/ | spa |
dc.identifier.uri | https://repositorio.unal.edu.co/handle/unal/88739 | |
dc.language.iso | spa | |
dc.publisher | Universidad Nacional de Colombia | spa |
dc.publisher.branch | Universidad Nacional de Colombia - Sede Bogotá | spa |
dc.publisher.department | Departamento de Estadística | spa |
dc.publisher.faculty | Facultad de Ciencias | spa |
dc.publisher.place | Bogotá, Colombia | spa |
dc.publisher.program | Bogotá - Ciencias - Maestría en Ciencias - Estadística | spa |
dc.relation.references | A. Agresti. Categorical Data Analysis. Wiley Series in Probability and Statistics. Wiley-Interscience, 2nd edition, 2002. ISBN 0471360937, 9780471360933, 9780471458760. URL http://gen.lib.rus.ec/book/index.php?md5=68B63FC9EA34F5691AD8434727373D5E | |
dc.relation.references | Z. Y. Algamal and M. H. Lee. High dimensional logistic regression model using adjusted elastic net penalty. Pakistan Journal of Statistics and Operation Research, 11(4):667-676, 2015 | |
dc.relation.references | G. E. Box and G. M. Jenkins. Time series analysis: Forecasting and control. San Francisco, CA: Holden-Day, 1976 | |
dc.relation.references | J. Friedman, T. Hastie, and R. Tibshirani. Regularization paths for generalized linear models via coordinate descent. Journal of statistical software, 33(1):1, 2010 | |
dc.relation.references | D. N. Gujarati and D. C. Porter. Basic econometrics. McGraw-hill, 2009 | |
dc.relation.references | T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics. Springer, New York, 2nd edition, 2009. ISBN 978-0-387-84857-0 | |
dc.relation.references | H. He and E. A. Garcia. Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9):1263-1284, 2009. doi: 10.1109/TKDE.2008.239 | |
dc.relation.references | D. W. Hosmer Jr, S. Lemeshow, and R. X. Sturdivant. Applied logistic regression. John Wiley & Sons, 2013 | |
dc.relation.references | G. James, D. Witten, T. Hastie, R. Tibshirani, et al. An introduction to statistical learning, volume 112. Springer, 2013 | |
dc.relation.references | P. McCullagh and J. A. Nelder. Generalized linear models. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. Chapman and Hall/CRC, 2nd edition, 1989. ISBN 9780412317606, 0412317605. URL http://gen.lib.rus.ec/book/index.php?md5=F0CE6B1E52E967D164DF73DE4CFDC567 | |
dc.relation.references | J. A. Nelder and R. W. Wedderburn. Generalized linear models. Journal of the Royal Statistical Society: Series A (General), 135(3):370-384, 1972 | |
dc.relation.references | B. Nguyen, C. Morell, and B. De Baets. Un método eficiente para resolver un problema de programación cuadrática derivado del aprendizaje de funciones de distancia. Investigación Operacional, 37(2):124-136, 2016 | |
dc.relation.references | R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B: Statistical Methodology, 58(1):267-288, 1996 | |
dc.relation.references | H. Zou and T. Hastie. Regularization and variable selection via the elastic net. Journal of the royal statistical society: series B (statistical methodology), 67(2):301-320, 2005 | |
dc.rights.accessrights | info:eu-repo/semantics/openAccess | |
dc.rights.license | Atribución-NoComercial 4.0 Internacional | |
dc.rights.uri | http://creativecommons.org/licenses/by-nc/4.0/ | |
dc.subject.ddc | 510 - Matemáticas::515 - Análisis | spa |
dc.subject.proposal | Área bajo la curva | spa |
dc.subject.proposal | Dimensionalidad | spa |
dc.subject.proposal | Modelos lineales generalizados | spa |
dc.subject.proposal | Multicolinealidad | spa |
dc.subject.proposal | Regresión logística | spa |
dc.subject.proposal | Regularización | spa |
dc.subject.proposal | Selección de variables | spa |
dc.subject.proposal | Area under the curve | eng |
dc.subject.proposal | Dimensionality | eng |
dc.subject.proposal | Generalized linear models | eng |
dc.subject.proposal | Multicollinearity | eng |
dc.subject.proposal | Logistic regression | eng |
dc.subject.proposal | Regularization | eng |
dc.subject.proposal | Variable selection | eng |
dc.subject.unesco | Análisis de variancia | spa |
dc.subject.unesco | Variance analysis | eng |
dc.subject.wikidata | regularization | eng |
dc.subject.wikidata | modelo estadístico | spa |
dc.subject.wikidata | statistical model | eng |
dc.subject.wikidata | regularización | spa |
dc.title | Selección de variables en modelos de regresión logística usando regularización | spa |
dc.title.translated | Variable selection in logistic regression models using regularization | eng |
dc.type | Trabajo de grado - Maestría | spa |
dc.type.coar | http://purl.org/coar/resource_type/c_bdcc | |
dc.type.coarversion | http://purl.org/coar/version/c_ab4af688f83e57aa | |
dc.type.content | Text | |
dc.type.driver | info:eu-repo/semantics/masterThesis | |
dc.type.redcol | http://purl.org/redcol/resource_type/TM | |
dc.type.version | info:eu-repo/semantics/acceptedVersion | |
dcterms.audience.professionaldevelopment | Investigadores | spa |
oaire.accessrights | http://purl.org/coar/access_right/c_abf2 |