Pronóstico de la pérdida crediticia esperada de los clientes con mayor nivel de riesgo de un banco por medio de modelos paramétricos y no paramétricos

López Avendaño, Brandon

dc.rights.license	Atribución-NoComercial-SinDerivadas 4.0 Internacional
dc.contributor.advisor	González Álvarez, Nelfi Gertrudis
dc.contributor.author	López Avendaño, Brandon
dc.date.accessioned	2023-11-08T13:56:55Z
dc.date.available	2023-11-08T13:56:55Z
dc.date.issued	2023
dc.identifier.uri	https://repositorio.unal.edu.co/handle/unal/84912
dc.description	ilustraciones, diagramas
dc.description.abstract	La pérdida crediticia esperada (ECL) permite establecer bajo la normatividad IFRS 9 el nivel de provisión y el cálculo de reservas esperadas de una entidad financiera, donde a mayor riesgo percibido, existirá un mayor nivel de provisión en los balances del banco. Se ha encontrado en la literatura que, por medio de indicadores macroeconómicos, información transaccional y sectorial, índices financieros y medidas de riesgo, es posible prever la pérdida crediticia esperada en diferentes periodos de tiempo, por lo tanto, en el presente trabajo se proponen 437 variables que han resultado ser significativas en diferentes estudios, a las cuales, se les realizó una reducción de dimensionalidad y selección de variables, resultando 10 de éstas las que mejor explican la ECL. Adicionalmente, se propusieron modelos paramétricos y no paramétricos como: Regresión Lineal Múltiple, Lasso, Ridge, Bosques Aleatorios, entre otros para pronosticar la pérdida crediticia esperada, siendo el modelo Extremely Randomized Trees (Extra Trees) el que mejor desempeño tuvo en las medidas MSE, MAE y coeficiente de determinación, con valores de 0.0078, 0.0564 y 0.9199, respectivamente. Se encontró que gran parte de los predictores presentaban relaciones no lineales con la variable respuesta que el modelo era capaz de capturar, y por medio de los valores de SHAP (Shapley Additive Explanation) se pudo evidenciar que las relaciones de las variables independientes con la ECL guardaban sentido con la teoría económica. (Texto tomado de la fuente)
dc.description.abstract	Expected credit loss (ECL) enables financial institutions to determine the provision level and calculate expected reserves under IFRS 9: the higher the perceived risk, the higher the provision level on the bank's balance sheet. Prior research has shown that macroeconomic indicators, transactional and sectoral information, financial ratios, and risk measures can be used to forecast the expected credit loss across different time horizons. In this study, a set of 437 variables identified as significant in previous research underwent dimensionality reduction and variable selection, yielding the 10 predictors that best explain the ECL. A range of parametric and non-parametric models, including Multiple Linear Regression, Lasso, Ridge, and Random Forests, among others, were then evaluated for their ability to forecast the expected credit loss. Among these, the Extremely Randomized Trees (Extra Trees) model performed best in terms of MSE, MAE, and coefficient of determination, with values of 0.0078, 0.0564, and 0.9199, respectively. Many of the predictors exhibited non-linear relationships with the response variable, which the Extra Trees model was able to capture. SHAP (Shapley Additive Explanation) values showed that the relationships between the independent variables and the ECL were consistent with economic theory.
dc.format.extent	133 páginas
dc.format.mimetype	application/pdf
dc.language.iso	spa
dc.publisher	Universidad Nacional de Colombia
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject.ddc	510 - Matemáticas::519 - Probabilidades y matemáticas aplicadas
dc.subject.ddc	330 - Economía::332 - Economía financiera
dc.title	Pronóstico de la pérdida crediticia esperada de los clientes con mayor nivel de riesgo de un banco por medio de modelos paramétricos y no paramétricos
dc.type	Trabajo de grado - Maestría
dc.type.driver	info:eu-repo/semantics/masterThesis
dc.type.version	info:eu-repo/semantics/acceptedVersion
dc.publisher.program	Medellín - Ciencias - Maestría en Ciencias - Estadística
dc.coverage.country	Colombia
dc.description.degreelevel	Maestría
dc.description.degreename	Magister en Ciencias-Estadística
dc.identifier.instname	Universidad Nacional de Colombia
dc.identifier.reponame	Repositorio Institucional Universidad Nacional de Colombia
dc.identifier.repourl	https://repositorio.unal.edu.co/
dc.publisher.faculty	Facultad de Ciencias
dc.publisher.place	Medellín, Colombia
dc.publisher.branch	Universidad Nacional de Colombia - Sede Medellín
dc.relation.indexed	RedCol
dc.relation.indexed	LaReferencia
dc.relation.references	Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723. https://doi.org/10.1109/TAC.1974.1100705
dc.relation.references	Altman, E. I. (1968). Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy. The Journal of Finance, 23(4), 589–609. https://doi.org/10.1111/j.1540-6261.1968.tb00843
dc.relation.references	Antonsson, H. (2018). Macroeconomic factors in Probability of Default A study applied to a Swedish credit portfolio [KTH Royal Institute of Technology]. https://www.diva-portal.org/smash/record.jsf?pid=diva2%3A1264976&dswid=3197
dc.relation.references	Apergis, E., Apergis, I., & Apergis, N. (2019). A new macro stress testing approach for financial realignment in the Eurozone. Journal of International Financial Markets, Institutions and Money, 61(4), 52–80. https://doi.org/10.1016/j.intfin.2019.02.002
dc.relation.references	Banco de la República de Colombia. (2022). Sectorización Monetaria y Económica. https://www.banrep.gov.co/sites/default/files/paginas/sectormon.pdf
dc.relation.references	Bandyopadhyay, A. (2022). Loan level loss given default (LGD) study of Indian banks. IIMB Management Review, 34(2), 168–177. https://doi.org/10.1016/J.IIMB.2022.06.003
dc.relation.references	Banerjee, R., & Venkateshwaran, S. (2017, July). Demystifying Expected Credit Loss (ECL). KPMG. https://assets.kpmg/content/dam/kpmg/in/pdf/2017/07/Demystifying-Expected-Credit-Loss.pdf
dc.relation.references	Bastos, J. A. (2010). Forecasting bank loans loss-given-default. Journal of Banking & Finance, 34(10), 2510–2517. https://doi.org/10.1016/J.JBANKFIN.2010.04.011
dc.relation.references	BCBS. (2000). Principles for the Management of Credit Risk. Basel Committee on Banking Supervision. https://www.bis.org/publ/bcbs75.htm
dc.relation.references	Bemister-Buffington, J., Wolf, A. J., Raschka, S., & Kuhn, L. A. (2020). Machine Learning to Identify Flexibility Signatures of Class A GPCR Inhibition. Biomolecules, 10(3). https://doi.org/10.3390/biom10030454
dc.relation.references	Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324
dc.relation.references	Cheng, J., Sun, J., Yao, K., Xu, M., & Cao, Y. (2022). A variable selection method based on mutual information and variance inflation factor. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 268(6), 1–7. https://doi.org/10.1016/j.saa.2021.120652
dc.relation.references	Chen, J. (2022, September 6). Default: What It Means, What Happens When You Default, Examples. Investopedia. https://www.investopedia.com/terms/d/default2.asp
dc.relation.references	Chen, M. (2011). Bankruptcy prediction in firms with statistical and intelligent techniques and a comparison of evolutionary computation approaches. Computers & Mathematics with Applications, 62(12), 4514–4524. https://doi.org/10.1016/J.CAMWA.2011.10.030
dc.relation.references	Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794. https://doi.org/10.1145/2939672.2939785
dc.relation.references	Dupré la Tour, T., Eickenberg, M., Nunez-Elizalde, A. O., & Gallant, J. L. (2022). Feature-space selection with banded ridge regression. NeuroImage, 264(19), 1–19. https://doi.org/10.1016/J.NEUROIMAGE.2022.119728
dc.relation.references	ECB. (2023, June 30). What are haircuts? European Central Bank. https://www.ecb.europa.eu/ecb/educational/explainers/tell-me-more/html/haircuts.en.html
dc.relation.references	Fernando, J. (2023a, March 27). Inventory Turnover Ratio: What It Is, How It Works, and Formula. Investopedia. https://www.investopedia.com/terms/i/inventoryturnover.asp
dc.relation.references	Fernando, J. (2023b, May 24). Return on Equity (ROE) Calculation and What It Means. Investopedia. https://www.investopedia.com/terms/r/returnonequity.asp
dc.relation.references	Maggi, F., Natale, A., Pepanides, T., Risso, E., & Schröck, G. (2017). IFRS 9: A silent revolution in banks’ business models. McKinsey. https://www.mckinsey.com/capabilities/risk-and-resilience/our-insights/ifrs-9-a-silent-revolution-in-banks-business-models
dc.relation.references	Fischler, M. A., & Bolles, R. C. (1981). Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Commun. ACM, 24(6), 381–395. https://doi.org/10.1145/358669.358692
dc.relation.references	Fonti, V., & Belitser, E. (2017). Feature selection using lasso. In VU Amsterdam research paper in business analytics. Vrije Universiteit Amsterdam.
dc.relation.references	Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29(5), 1189–1232. https://doi.org/10.1214/aos/1013203451
dc.relation.references	Genuer, R., Poggi, J. M., & Tuleau-Malot, C. (2010). Variable selection using random forests. Pattern Recognition Letters, 31(14), 2225–2236. https://doi.org/10.1016/J.PATREC.2010.03.014
dc.relation.references	Geurts, P., Ernst, D., & Wehenkel, L. (2006). Extremely randomized trees. Machine Learning, 63(1), 3–42. https://doi.org/10.1007/s10994-006-6226-1
dc.relation.references	Giesecke, K., Longstaff, F. A., Schaefer, S., & Strebulaev, I. (2011). Corporate bond default risk: A 150-year perspective. Journal of Financial Economics, 102(2), 233–250. https://doi.org/10.1016/J.JFINECO.2011.01.011
dc.relation.references	Giraud, C. (2021). Introduction to High-Dimensional Statistics (CRC Press, Ed.; 2nd, illustrated ed.). https://www.imo.universite-paris-saclay.fr/~christophe.giraud/Orsay/Bookv3.pdf
dc.relation.references	Gitman, L. J., & Zutter, C. J. (2012). Principios de Administración financiera (12th ed.). Pearson Educación de México, S.A. de C.V. https://economicas.unsa.edu.ar/afinan/informacion_general/book/pcipios-adm-finan-12edi-gitman.pdf
dc.relation.references	Granitto, P. M., Furlanello, C., Biasioli, F., & Gasperi, F. (2006). Recursive feature elimination with random forest for PTR-MS analysis of agroindustrial products. Chemometrics and Intelligent Laboratory Systems, 83(2), 83–90. https://doi.org/10.1016/J.CHEMOLAB.2006.01.007
dc.relation.references	Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene Selection for Cancer Classification using Support Vector Machines. Machine Learning, 46(1), 389–422. https://doi.org/10.1023/A:1012487302797
dc.relation.references	Härdle, W. K., & Prastyo, D. D. (2013). Default Risk Calculation based on Predictor Selection for the Southeast Asian Industry. SSRN Electronic Journal, SFB 649(Discussion Paper 2013-037), 1–24. https://doi.org/10.2139/ssrn.2892650
dc.relation.references	Hastie, T., Tibshirani, R., & Friedman, J. (2001). The Elements of Statistical Learning (2nd ed.). Springer New York Inc
dc.relation.references	Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics). https://doi.org/10.1007/978-0-387-84858-7
dc.relation.references	Hayes, A. (2022, August 10). EBITDA: Meaning, Formula, and History. Investopedia. https://www.investopedia.com/terms/e/ebitda.asp
dc.relation.references	Heo, J., & Yang, J. Y. (2014). AdaBoost based bankruptcy forecasting of Korean construction companies. Applied Soft Computing, 24(13), 494–499. https://doi.org/10.1016/J.ASOC.2014.08.009
dc.relation.references	Hoerl, A. E., & Kennard, R. W. (1970). Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67. https://doi.org/10.1080/00401706.1970.10488634
dc.relation.references	Jiménez, G., & Mencía, J. (2009). Modelling the distribution of credit losses with observable and latent factors. Journal of Empirical Finance, 16(2), 235–253. https://doi.org/10.1016/j.jempfin.2008.10.003
dc.relation.references	Kendall, M. G. (1948). Rank correlation methods. In Rank correlation methods. Griffin & Co. https://doi.org/10.1017/S0020268100013019
dc.relation.references	Khamis, H. (2008). Measures of Association: How to Choose? Journal of Diagnostic Medical Sonography, 24(3), 155–162. https://doi.org/10.1177/8756479308317006
dc.relation.references	Khandani, A. E., Kim, A. J., & Lo, A. W. (2010). Consumer credit-risk models via machine-learning algorithms. Journal of Banking & Finance, 34(11), 2767–2787. https://doi.org/10.1016/J.JBANKFIN.2010.06.001
dc.relation.references	Leow, M., & Mues, C. (2012). Predicting loss given default (LGD) for residential mortgage loans: A two-stage model and empirical evidence for UK bank data. International Journal of Forecasting, 28(1), 183–195. https://doi.org/10.1016/J.IJFORECAST.2011.01.010
dc.relation.references	Liu, J., & Xu, X. E. (2003). The Predictive Power of Economic Indicators in Consumer Credit Risk Management. The Rma Journal, 86, 40–45. https://acortar.link/7x2mfu
dc.relation.references	Loh, W.-Y. (2011). Classification and Regression Trees. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(1), 14–23. https://doi.org/10.1002/widm.8
dc.relation.references	Louppe, G. (2014). Understanding Random Forests: From Theory to Practice [Université de Liège]. https://doi.org/10.48550/arXiv.1407.7502
dc.relation.references	Lundberg, S., Erion, G., & Lee, S.-I. (2018). Consistent Individualized Feature Attribution for Tree Ensembles. Arxiv. https://doi.org/10.48550/arXiv.1802.03888
dc.relation.references	Lundberg, S., & Lee, S.-I. (2017). A Unified Approach to Interpreting Model Predictions. https://www.researchgate.net/publication/317062430
dc.relation.references	Luong, T. M., & Scheule, H. (2022). Benchmarking forecast approaches for mortgage credit risk for forward periods. European Journal of Operational Research, 299(2), 750–767. https://doi.org/10.1016/j.ejor.2021.09.026
dc.relation.references	Maulud, D., & Abdulazeez, A. M. (2020). A Review on Linear Regression Comprehensive in Machine Learning. Journal of Applied Science and Technology Trends, 1(4), 140–147. https://doi.org/10.38094/jastt1457
dc.relation.references	Mazibaş, M., & Tuna, Y. (2017). Understanding the Recent Growth in Consumer Loans and Credit Cards in Emerging Markets: Evidence from Turkey. Emerging Markets Finance and Trade, 53(10), 2333–2346. https://doi.org/10.1080/1540496X.2016.1196895
dc.relation.references	Melkumova, L. E., & Shatskikh, S. Y. (2017). Comparing Ridge and LASSO estimators for data analysis. Procedia Engineering, 201(31), 746–755. https://doi.org/10.1016/J.PROENG.2017.09.615
dc.relation.references	Meng, Y., Yang, N., Qian, Z., & Zhang, G. (2021). What Makes an Online Review More Helpful: An Interpretation Framework Using XGBoost and SHAP Values. Journal of Theoretical and Applied Electronic Commerce Research, 16(3), 466–490. https://doi.org/10.3390/jtaer16030029
dc.relation.references	Nalluri, M., Pentela, M., & Eluri, N. R. (2020). A Scalable Tree Boosting System: XG Boost. Int. J. Res. Stud. Sci. Eng. Technol, 7(12), 36–51. https://doi.org/10.22259/2349-476X.0712005
dc.relation.references	Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, 2825–2830. http://scikit-learn.sourceforge.net.
dc.relation.references	Pereira, J. M., Basto, M., & Silva, A. F. da. (2016). The Logistic Lasso and Ridge Regression in Predicting Corporate Failure. Procedia Economics and Finance, 39(5), 634–641. https://doi.org/10.1016/S2212-5671(16)30310-0
dc.relation.references	Peterdy, K. (2023, June 14). Credit Risk. CFI. https://corporatefinanceinstitute.com/resources/knowledge/finance/credit-risk
dc.relation.references	Robles, C. L. (2012). Fundamentos de administración financiera (M. E. Buendía, Ed.; 1st ed.). Red Tercer Milenio S.C. http://biblioteca.udgvirtual.udg.mx/jspui/handle/123456789/3175
dc.relation.references	Mitchell, R., Adinets, A., Rao, T., & Frank, E. (2018). XGBoost: Scalable GPU Accelerated Learning. Cornell University. https://doi.org/10.48550/arXiv.1806.11248
dc.relation.references	Ross, S. A., Westerfield, R. W., & Jordan, B. D. (2010). Fundamentos de finanzas corporativas (9th ed.). McGraw-Hill/Interamericana Editores, S.A. de C.V. https://www.mheducation.com.co/fundamentos-de-finanzas-corporativas-9781456291136-col-group
dc.relation.references	Rubaszek, M., & Serwa, D. (2014). Determinants of credit to households: An approach using the life-cycle model. Economic Systems, 38(4), 572–587. https://doi.org/10.1016/J.ECOSYS.2014.05.004
dc.relation.references	Stoppiglia, H., Dreyfus, G., Dubois, R., & Oussar, Y. (2003). Ranking a Random Feature For Variable And Feature Selection. Journal of Machine Learning Research, 3, 1399–1414. https://doi.org/10.1162/153244303322753733
dc.relation.references	Taghiyeh, S., Lengacher, D. C., & Handfield, R. B. (2021). Loss rate forecasting framework based on macroeconomic changes: Application to US credit card industry. Expert Systems with Applications, 165(3), 113954. https://doi.org/10.1016/J.ESWA.2020.113954
dc.relation.references	Temim, J. (2016, November). The IFRS 9 Impairment Model and its Interaction with the Basel Framework. Moody’s Analytics. Risk Perspectives. https://acortar.link/3HQHP5
dc.relation.references	Theil, H. (1949). A Rank-Invariant Method of Linear and Polynomial Regression Analysis. In Advanced Studies in Theoretical and Applied Econometrics (Vol. 23). Springer, Dordrecht. https://doi.org/10.1007/978-94-011-2546-8_20
dc.relation.references	Vipond, T. (2022, June). Net Working Capital. CFI. https://corporatefinanceinstitute.com/resources/knowledge/finance/what-is-net-working-capital/
dc.relation.references	Wang, X., Wang, X., Ma, B., Li, Q., Wang, C., & Shi, Y. (2023). High-performance reversible data hiding based on ridge regression prediction algorithm. Signal Processing, 204(3), 108818. https://doi.org/10.1016/J.SIGPRO.2022.108818
dc.relation.references	Wang, Y. (2011). Corporate Default Prediction: Models, Drivers and Measurements [Doctoral thesis, University of Exeter]. http://hdl.handle.net/10036/3457
dc.relation.references	Xia, Y., Zhao, J., He, L., Li, Y., & Yang, X. (2021). Forecasting loss given default for peer-to-peer loans via heterogeneous stacking ensemble approach. International Journal of Forecasting, 37(4), 1590–1613. https://doi.org/10.1016/J.IJFORECAST.2021.03.002
dc.relation.references	Yeh, C. C., Chi, D. J., & Lin, Y. R. (2014). Going-concern prediction using hybrid random forests and rough set approach. Information Sciences, 254(1), 98–110. https://doi.org/10.1016/J.INS.2013.07.011
dc.relation.references	Zhang, Y., & Chen, L. (2021). A Study on Forecasting the Default Risk of Bond Based on XGboost Algorithm and Over-Sampling Method. Theoretical Economics Letters, 11(2), 258–267. https://doi.org/10.4236/tel.2021.112019
dc.rights.accessrights	info:eu-repo/semantics/openAccess
dc.subject.lemb	Riesgo (Finanzas)
dc.subject.lemb	Bank Loans
dc.subject.lemb	Préstamos bancarios
dc.subject.proposal	Pérdida crediticia esperada
dc.subject.proposal	Provisión
dc.subject.proposal	Entidades financieras
dc.subject.proposal	Riesgo de default
dc.subject.proposal	Extremely Randomized Trees
dc.subject.proposal	Extra Trees
dc.subject.proposal	Expected Credit Loss
dc.subject.proposal	ECL
dc.subject.proposal	Provision
dc.subject.proposal	Financial institutions
dc.subject.proposal	Default risk
dc.subject.proposal	Default
dc.title.translated	Forecasting expected credit loss of high-risk bank clients using parametric and non-parametric models
dc.type.coar	http://purl.org/coar/resource_type/c_bdcc
dc.type.coarversion	http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.content	Text
dc.type.redcol	http://purl.org/redcol/resource_type/TM
oaire.accessrights	http://purl.org/coar/access_right/c_abf2
dcterms.audience.professionaldevelopment	Estudiantes
dcterms.audience.professionaldevelopment	Investigadores
dcterms.audience.professionaldevelopment	Maestros
dc.description.curriculararea	Área Curricular Estadística
dc.contributor.orcid	González Álvarez, Nelfi Gertrudis [0000-0003-0129-1316]
dc.contributor.cvlac	González Álvarez, Nelfi Gertrudis [0000063002]
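
The abstract reports three performance measures for the Extra Trees model: MSE, MAE, and the coefficient of determination. As a minimal sketch of how these measures are defined, the block below computes them in plain Python. The function names and the sample values are illustrative assumptions and are not taken from the thesis or its data.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: the average of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted ECL values (made up for illustration).
observed = [0.10, 0.40, 0.35, 0.80]
predicted = [0.15, 0.35, 0.40, 0.70]

print(f"MSE = {mse(observed, predicted):.6f}")
print(f"MAE = {mae(observed, predicted):.6f}")
print(f"R2  = {r2(observed, predicted):.4f}")
```

Lower MSE and MAE and an R2 closer to 1 indicate a better fit, which is the sense in which the abstract ranks the candidate models.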

Files in this item

Name: 1035438564.2023 .pdf
Size: 2.834Mb
Format: PDF
Description: Tesis de Maestría en Ciencias - ...

This item appears in the following collection(s)

Maestría en Ciencias - Estadística [136]
