Estudio de la reducción del sobreajuste en arquitecturas de redes neuronales residuales ResNet en un escenario de clasificación de patrones

dc.contributor.advisorRiaño Rojas, Juan Carlos
dc.contributor.advisorGallego Restrepo, Fernando Andrés
dc.contributor.authorChacón Chamorro, Manuela Viviana
dc.contributor.cvlacChacón Chamorro, Manuela [0000166834]spa
dc.contributor.researchgroupPcm Computational Applicationsspa
dc.date.accessioned2023-07-18T19:29:38Z
dc.date.available2023-07-18T19:29:38Z
dc.date.issued2023
dc.descriptiongráficas, tablasspa
dc.description.abstractLas redes neuronales artificiales son una técnica de aprendizaje automático inspirada en el funcionamiento biológico de las neuronas; actualmente soportan gran parte de la denominada Inteligencia Artificial. Pese a su notable evolución, estos algoritmos presentan el problema del sobreajuste ("memorización" de los datos de entrenamiento), lo cual disminuye la capacidad de generalización. En este trabajo se estudió el sobreajuste en un escenario de clasificación de patrones y se determinó un método para resolver el problema. Este estudio se realizó para la arquitectura de red neuronal residual (ResNet) y se sustentó en el análisis de las propiedades matemáticas de la función que representa esta estructura, en particular, la continuidad de Lipschitz. La validación del método se realizó comparando su desempeño con las técnicas convencionales de reducción de sobreajuste: la regularización L1, L2 y Dropout. Variando la profundidad de la red se realizaron dos experimentos de clasificación con los conjuntos de datos Digits y Fashion de MNIST. También se efectuaron pruebas en arquitecturas definidas para 3 conjuntos de datos convencionales y 3 de datos sintéticos. Adicionalmente, se realizaron dos experimentos que incluyeron imágenes adversarias. El método desarrollado presentó un desempeño destacable, logrando: comportamiento similar en las curvas de aprendizaje para entrenamiento y prueba, menor variabilidad del modelo al cambiar el conjunto de entrenamiento, reducción de la cota de Lipschitz y tolerancia a las pruebas adversarias. En síntesis, el método propuesto resultó idóneo en la reducción del sobreajuste en las arquitecturas residuales de los experimentos y tolera de manera sobresaliente los ataques adversarios. (Texto tomado de la fuente)spa
dc.description.abstractArtificial neural networks are a machine learning technique inspired by the biological functioning of neurons and currently support a significant portion of so-called Artificial Intelligence. Despite their notable evolution, these algorithms present the problem of overfitting ("memorization" of the training data), which reduces their generalization capacity. In this work, overfitting was studied in a pattern classification scenario and a method to solve the problem was determined. The study was carried out for the Residual Neural Network (ResNet) architecture and was based on the analysis of the mathematical properties of the function that this structure represents, in particular its Lipschitz continuity. The method was validated by comparing its performance with conventional overfitting reduction techniques: L1, L2 and Dropout regularization. Varying the depth of the network, two classification experiments were performed with the MNIST Digits and Fashion data sets. Tests were also performed on architectures defined for 3 conventional data sets and 3 synthetic data sets. Additionally, two experiments that included adversarial images were conducted. The developed method showed remarkable performance, achieving similar behavior in the training and test learning curves, lower variability of the model when the training set changes, reduction of the Lipschitz bound, and tolerance to adversarial tests. In summary, the proposed method proved suitable for reducing overfitting in the residual architectures of the experiments and tolerates adversarial attacks in an outstanding way.eng
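To make the Lipschitz bound mentioned in the abstracts concrete, the following is a minimal illustrative sketch, not the implementation developed in the thesis. It assumes a toy fully connected residual block y = x + W2·relu(W1·x); the weight matrices W1 and W2, the spectral_norm power-iteration helper and the block sizes are all hypothetical. Because ReLU is 1-Lipschitz, each block's Lipschitz constant is at most 1 + ||W1||·||W2||, and the whole toy ResNet is bounded by the product of the per-block bounds; keeping this product small is one way a Lipschitz-based regularizer can restrict how sharply the network can vary.

# Minimal illustrative sketch (not the thesis's implementation): upper bound on the
# Lipschitz constant of a toy fully connected ResNet with blocks y = x + W2 @ relu(W1 @ x).
import numpy as np

def spectral_norm(W, n_iters=50):
    """Estimate ||W||_2 (largest singular value) with power iteration."""
    v = np.random.randn(W.shape[1])
    for _ in range(n_iters):
        u = W @ v
        u /= np.linalg.norm(u)
        v = W.T @ u
        v /= np.linalg.norm(v)
    return float(u @ (W @ v))

def block_bound(W1, W2):
    """Lipschitz upper bound of x -> x + W2 @ relu(W1 @ x); relu is 1-Lipschitz."""
    return 1.0 + spectral_norm(W1) * spectral_norm(W2)

rng = np.random.default_rng(0)
blocks = [(0.1 * rng.standard_normal((16, 16)), 0.1 * rng.standard_normal((16, 16)))
          for _ in range(5)]

# The composition of the blocks is bounded by the product of the per-block bounds.
network_bound = float(np.prod([block_bound(W1, W2) for W1, W2 in blocks]))
print(f"Lipschitz upper bound of the 5-block toy ResNet: {network_bound:.3f}")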
dc.description.curricularareaMatemáticas y Estadística. Sede Manizalesspa
dc.description.degreelevelMaestríaspa
dc.description.degreenameMagíster en Ciencias - Matemática Aplicadaspa
dc.format.extentxx, 150 páginasspa
dc.format.mimetypeapplication/pdfspa
dc.identifier.instnameUniversidad Nacional de Colombiaspa
dc.identifier.reponameRepositorio Institucional Universidad Nacional de Colombiaspa
dc.identifier.repourlhttps://repositorio.unal.edu.co/spa
dc.identifier.urihttps://repositorio.unal.edu.co/handle/unal/84211
dc.language.isospaspa
dc.publisherUniversidad Nacional de Colombiaspa
dc.publisher.branchUniversidad Nacional de Colombia - Sede Manizalesspa
dc.publisher.facultyFacultad de Ciencias Exactas y Naturalesspa
dc.publisher.programManizales - Ciencias Exactas y Naturales - Maestría en Ciencias - Matemática Aplicadaspa
dc.relation.referencesC. C. Aggarwal, Neural Networks and Deep Learning. Springer, 2018.spa
dc.relation.referencesZ.-Q. Zhao, P. Zheng, S.-t. Xu, and X. Wu, “Object detection with deep learning: A review,” 2018. [Online]. Available: https://arxiv.org/abs/1807.05511spa
dc.relation.referencesA. Kamilaris and F. X. Prenafeta-Boldú, “A review of the use of convolutional neural networks in agriculture,” The Journal of Agricultural Science, vol. 156, no. 3, pp. 312–322, 2018. [Online]. Available: https://doi.org/10.1017/S0021859618000436spa
dc.relation.referencesA. Creswell, T. White, V. Dumoulin, K. Arulkumaran, B. Sengupta, and A. A. Bharath, “Generative adversarial networks: An overview,” IEEE Signal Processing Magazine, vol. 35, no. 1, pp. 53–65, Jan. 2018. [Online]. Available: https://doi.org/10.1109/MSP.2017.2765202spa
dc.relation.referencesA. Fadaeddini, M. Eshghi, and B. Majidi, “A deep residual neural network for low altitude remote sensing image classification,” in 2018 6th Iranian Joint Congress on Fuzzy and Intelligent Systems (CFIS), 2018, pp. 43–46. [Online]. Available: https://ieeexplore.ieee.org/document/8336623spa
dc.relation.referencesL. Massidda, M. Marrocu, and S. Manca, “Non-intrusive load disaggregation by convolutional neural network and multilabel classification,” Applied Sciences, vol. 10, no. 4, 2020. [Online]. Available: https://www.mdpi.com/2076-3417/10/4/1454spa
dc.relation.referencesK. Muralitharan, R. Sakthivel, and R. Vishnuvarthan, “Neural network based optimization approach for energy demand prediction in smart grid,” Neurocomputing, vol. 273, pp. 199–208, 2018. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0925231217313681spa
dc.relation.referencesO. Al-Salman, J. Mustafina, and G. Shahoodh, “A systematic review of artificial neural networks in medical science and applications,” in 2020 13th International Conference on Developments in eSystems Engineering (DeSE), 2020, pp. 279–282. [Online]. Available: https://ieeexplore.ieee.org/document/9450245spa
dc.relation.referencesM. M. Bejani and M. Ghatee, “A systematic review on overfitting control in shallow and deep neural networks,” Artificial Intelligence Review, vol. 54, no. 8, pp. 6391–6438, 2021. [Online]. Available: https://doi.org/10.1007/s10462-021-09975-1spa
dc.relation.referencesS. Salman and X. Liu, “Overfitting mechanism and avoidance in deep neural networks,” arXiv preprint arXiv:1901.06566, 2019. [Online]. Available: https://arxiv.org/abs/1901.06566spa
dc.relation.referencesX. Ying, “An overview of overfitting and its solutions,” in Journal of physics: Conference series, vol. 1168, no. 2. IOP Publishing, 2019, p. 022022. [Online]. Available: https://iopscience.iop.org/article/10.1088/1742-6596/1168/2/022022spa
dc.relation.referencesI. Bilbao and J. Bilbao, “Overfitting problem and the over-training in the era of data: Particularly for artificial neural networks,” in 2017 eighth international conference on intelligent computing and information systems (ICICIS). IEEE, 2017, pp. 173–177. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8260032spa
dc.relation.referencesR. T. Chen, Y. Rubanova, J. Bettencourt, and D. K. Duvenaud, “Neural ordinary differential equations,” Advances in neural information processing systems, vol. 31, 2018. [Online]. Available: https://proceedings.neurips.cc/paper/2018/file/69386f6bb1dfed68692a24c8686939b9-Paper.pdfspa
dc.relation.referencesD. Karlsson and O. Svanström, “Modelling dynamical systems using neural ordinary differential equations,” Master’s thesis, Chalmers University of Technology, 2019. [Online]. Available: https://odr.chalmers.se/server/api/core/bitstreams/83a4c17f-35e7-43ce-ac60-43a0b799f82f/contentspa
dc.relation.referencesE. Weinan, “A proposal on machine learning via dynamical systems,” Communications in Mathematics and Statistics, vol. 5, no. 1, pp. 1–11, 2017. [Online]. Available: https://link.springer.com/article/10.1007/s40304-017-0103-zspa
dc.relation.referencesM. Benning, E. Celledoni, M. J. Ehrhardt, B. Owren, and C.-B. Schonlieb, “Deep learning as optimal control problems: Models and numerical methods,” arXiv preprint arXiv:1904.05657, 2019. [Online]. Available: https://arxiv.org/abs/1904.05657spa
dc.relation.referencesQ. Li, L. Chen, C. Tai et al., “Maximum principle based algorithms for deep learning,” Journal of Machine Learning Research, pp. 1–29, 2018. [Online]. Available: https://www.jmlr.org/papers/volume18/17-653/17-653.pdfspa
dc.relation.referencesQ. Li and S. Hao, “An optimal control approach to deep learning and applications to discrete-weight neural networks,” in International Conference on Machine Learning. PMLR, 2018, pp. 2985–2994. [Online]. Available: http://proceedings.mlr.press/v80/li18b/li18b.pdfspa
dc.relation.referencesJ. Han, Q. Li et al., “A mean-field optimal control formulation of deep learning,” Research in the Mathematical Sciences, vol. 6, no. 1, pp. 1–41, 2019. [Online]. Available: https://link.springer.com/article/10.1007/s40687-018-0172-yspa
dc.relation.referencesE. Haber and L. Ruthotto, “Stable architectures for deep neural networks,” Inverse problems, vol. 34, no. 1, p. 014004, 2017. [Online]. Available: https://iopscience.iop.org/article/10.1088/1361-6420/aa9a90/metaspa
dc.relation.referencesB. Chang, L. Meng, E. Haber, F. Tung, and D. Begert, “Multi-level residual networks from dynamical systems view,” arXiv preprint arXiv:1710.10348, 2017. [Online]. Available: https://arxiv.org/abs/1710.10348spa
dc.relation.referencesM. Ciccone, M. Gallieri, J. Masci, C. Osendorfer, and F. Gomez, “Nais-net: Stable deep networks from non-autonomous differential equations,” Advances in Neural Information Processing Systems, vol. 31, 2018. [Online]. Available: https://proceedings.neurips.cc/paper/2018/file/7bd28f15a49d5e5848d6ec70e584e625-Paper.pdfspa
dc.relation.referencesC. Finlay, J. Calder, B. Abbasi, and A. Oberman, “Lipschitz regularized deep neural networks generalize and are adversarially robust,” arXiv preprint arXiv:1808.09540, 2018. [Online]. Available: https://arxiv.org/abs/1808.09540spa
dc.relation.referencesP. Pauli, A. Koch, J. Berberich, P. Kohler, and F. Allgöwer, “Training robust neural networks using lipschitz bounds,” IEEE Control Systems Letters, vol. 6, pp. 121–126, 2021. [Online]. Available: https://ieeexplore.ieee.org/document/9319198spa
dc.relation.referencesH. Gouk, E. Frank, B. Pfahringer, and M. J. Cree, “Regularisation of neural networks by enforcing lipschitz continuity,” Machine Learning, vol. 110, no. 2, pp. 393–416, 2021. [Online]. Available: https://link.springer.com/article/10.1007/s10994-020-05929-wspa
dc.relation.referencesB. Dherin, M. Munn, M. Rosca, and D. G. Barrett, “Why neural networks find simple solutions: the many regularizers of geometric complexity,” arXiv preprint arXiv:2209.13083, 2022. [Online]. Available: https://arxiv.org/abs/2209.13083spa
dc.relation.referencesT. Zhou, Q. Li, H. Lu, Q. Cheng, and X. Zhang, “Gan review: Models and medical image fusion applications,” Information Fusion, vol. 91, pp. 134–148, 2023. [Online]. Available: https://doi.org/10.1016/j.inffus.2022.10.017spa
dc.relation.referencesC. C. Aggarwal, Neural Networks and Deep Learning: A Textbook. Springer, 2018.spa
dc.relation.referencesS. Theodoridis, “Neural networks and deep learning,” Machine Learning, pp. 875–936, 2015.spa
dc.relation.referencesF. Rosenblatt, “The perceptron: a probabilistic model for information storage and organization in the brain.” Psychological review, vol. 65, no. 6, p. 386, 1958. [Online]. Available: https://psycnet.apa.org/record/1959-09865-001spa
dc.relation.referencesB. Pang, E. Nijkamp, and Y. N. Wu, “Deep learning with tensorflow: A review,” Journal of Educational and Behavioral Statistics, vol. 45, no. 2, pp. 227–248, 2020. [Online]. Available: https://doi.org/10.3102/1076998619872761spa
dc.relation.referencesK. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778. [Online]. Available: https://ieeexplore.ieee.org/document/7780459spa
dc.relation.referencesD. Hunter, H. Yu, M. S. Pukish III, J. Kolbusz, and B. M. Wilamowski, “Selection of proper neural network sizes and architectures: a comparative study,” IEEE Transactions on Industrial Informatics, vol. 8, no. 2, pp. 228–240, 2012. [Online]. Available: https://ieeexplore.ieee.org/document/6152147spa
dc.relation.referencesMATLAB, “Statistics and machine learning toolbox.” [Online]. Available: https://la.mathworks.com/products/statistics.htmlspa
dc.relation.referencesR, “neuralnet: Training of neural networks.” [Online]. Available: https://www.rdocumentation.org/packages/neuralnet/versions/1.44.2/topics/neuralnetspa
dc.relation.referencesTensorFlow, “Tensorflow 2.10.0.” [Online]. Available: https://www.tensorflow.org/spa
dc.relation.referencesJ. Reunanen, “Overfitting in making comparisons between variable selection methods,” Journal of Machine Learning Research, vol. 3, pp. 1371–1382, 2003. [Online]. Available: https://www.jmlr.org/papers/volume3/reunanen03a/reunanen03a.pdfspa
dc.relation.referencesI. Guyon and A. Elisseeff, “An introduction to variable and feature selection,” Journal of machine learning research, vol. 3, pp. 1157–1182, 2003. [Online]. Available: https://www.jmlr.org/papers/volume3/guyon03a/guyon03a.pdfspa
dc.relation.referencesF. Johnson, A. Valderrama, C. Valle, B. Crawford, R. Soto, and R. Nanculef, “Automating configuration of convolutional neural network hyperparameters using genetic algorithm,” IEEE Access, vol. 8, pp. 156139–156152, 2020. [Online]. Available: https://ieeexplore.ieee.org/document/9177040spa
dc.relation.referencesY. Zhu, G. Li, R. Wang, S. Tang, H. Su, and K. Cao, “Intelligent fault diagnosis of hydraulic piston pump combining improved lenet-5 and pso hyperparameter optimization,” Applied Acoustics, vol. 183, p. 108336, 2021. [Online]. Available: https://www.sciencedirect.com/science/article/abs/pii/S0003682X21004308spa
dc.relation.referencesA. Gaspar, D. Oliva, E. Cuevas, D. Zaldívar, M. Pérez, and G. Pajares, “Hyperparameter optimization in a convolutional neural network using metaheuristic algorithms,” in Metaheuristics in Machine Learning: Theory and Applications. Springer, 2021, pp. 37–59. [Online]. Available: https://link.springer.com/chapter/10.1007/978-3-030-70542-8_2spa
dc.relation.referencesO. S. Steinholtz, “A comparative study of black-box optimization algorithms for tuning of hyper-parameters in deep neural networks,” Luleå University of Technology, 2018. [Online]. Available: https://ltu.diva-portal.org/smash/get/diva2:1223709/FULLTEXT01.pdfspa
dc.relation.referencesL. Lugo, “A recurrent neural network approach for whole genome bacteria classification,” Master’s thesis, Universidad Nacional de Colombia, Bogotá, Colombia, 2018.spa
dc.relation.referencesA. T. Sarmiento and O. Soto, “New product forecasting demand by using neural networks and similar product analysis,” Master’s thesis, Universidad Nacional de Colombia, Medellín, Colombia, 2014.spa
dc.relation.referencesA. E. Casas Fajardo, “Propuesta metodológica para calcular el avalúo de un predio empleando redes neuronales artificiales,” Master’s thesis, Universidad Nacional de Colombia, Bogotá, Colombia, 2014.spa
dc.relation.referencesS. Ortega Alzate, “Exploración de las redes neuronales para la proyección de la máxima pérdida esperada de una póliza de seguros: aplicación para un seguro previsionales,” Master’s thesis, Universidad Nacional de Colombia, Medellín, Colombia, 2021.spa
dc.relation.referencesD. Collazos, “Kernel-based enhancement of general stochastic network for supervised learning,” Master’s thesis, Universidad Nacional de Colombia, Manizales, Colombia, 2016.spa
dc.relation.referencesY. Lu, A. Zhong, Q. Li, and B. Dong, “Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations,” in International Conference on Machine Learning. PMLR, 2018, pp. 3276–3285. [Online]. Available: http://proceedings.mlr.press/v80/lu18d/lu18d.pdfspa
dc.relation.referencesB. Geshkovski and E. Zuazua, “Turnpike in optimal control of pdes, resnets, and beyond,” Acta Numerica, vol. 31, pp. 135–263, 2022. [Online]. Available: https://doi.org/10.1017/S0962492922000046spa
dc.relation.referencesD. Ruiz-Balet and E. Zuazua, “Neural ode control for classification, approximation and transport,” arXiv preprint arXiv:2104.05278, 2021. [Online]. Available: https://arxiv.org/abs/2104.05278spa
dc.relation.referencesD. Ruiz-Balet, E. Affili, and E. Zuazua, “Interpolation and approximation via momentum resnets and neural odes,” Systems & Control Letters, vol. 162, p. 105182, 2022. [Online]. Available: https://doi.org/10.1016/j.sysconle.2022.105182spa
dc.relation.referencesM. Fazlyab, A. Robey, H. Hassani, M. Morari, and G. Pappas, “Efficient and accurate estimation of lipschitz constants for deep neural networks,” Advances in Neural Information Processing Systems, vol. 32, 2019. [Online]. Available: https://proceedings.neurips.cc/paper/2019/file/95e1533eb1b20a97777749fb94fdb944-Paper.pdfspa
dc.relation.referencesA. Xue, L. Lindemann, A. Robey, H. Hassani, G. J. Pappas, and R. Alur, “Chordal sparsity for lipschitz constant estimation of deep neural networks,” arXiv preprint arXiv:2204.00846, 2022. [Online]. Available: https://arxiv.org/abs/2204.00846spa
dc.relation.referencesR. Lu and S. H. Hong, “Incentive-based demand response for smart grid with reinforcement learning and deep neural network,” Applied energy, vol. 236, pp. 937–949, 2019. [Online]. Available: https://www.sciencedirect.com/science/article/abs/pii/S0306261918318798spa
dc.relation.referencesA. P. Marugán, F. P. G. Márquez, J. M. P. Pérez, and D. Ruiz-Hernández, “A survey of artificial neural network in wind energy systems,” Applied Energy, vol. 228, pp. 1822–1836, 2018. [Online]. Available: https://doi.org/10.1016/j.apenergy.2018.07.084spa
dc.relation.referencesF. Saeed, M. A. Khan, M. Sharif, M. Mittal, L. M. Goyal, and S. Roy, “Deep neural network features fusion and selection based on pls regression with an application for crops diseases classification,” Applied Soft Computing, vol. 103, p. 107164, 2021. [Online]. Available: https://doi.org/10.1016/j.asoc.2021.107164spa
dc.relation.referencesM. Loey, A. ElSawy, and M. Afify, “Deep learning in plant diseases detection for agricultural crops: A survey,” International Journal of Service Science, Management, Engineering, and Technology (IJSSMET), vol. 11, no. 2, pp. 41–58, 2020. [Online]. Available: https://www.igi-global.com/article/deep-learning-in-plant-diseases-detection-for-agricultural-crops/248499spa
dc.relation.referencesB. Pandey, D. K. Pandey, B. P. Mishra, and W. Rhmann, “A comprehensive survey of deep learning in the field of medical imaging and medical natural language processing: Challenges and research directions,” Journal of King Saud University-Computer and Information Sciences, 2021.spa
dc.relation.referencesA. Nogales, A. J. Garcia-Tejedor, D. Monge, J. S. Vara, and C. Antón, “A survey of deep learning models in medical therapeutic areas,” Artificial Intelligence in Medicine, vol. 112, p. 102020, 2021. [Online]. Available: https://doi.org/10.1016/j.artmed.2021.102020spa
dc.relation.referencesColciencias, “Plan Nacional de CTeI para el desarrollo del sector Tecnologías de la Información TIC 2017 - 2022,” Bogotá, Colombia, 2017.spa
dc.relation.referencesRepública de Colombia, “Plan de Desarrollo Nacional 2018-2022 ‘Pacto por Colombia, pacto por la equidad’,” Bogotá, Colombia, 2018.spa
dc.relation.referencesColciencias, “Política Nacional de Ciencia e Innovación para el Desarrollo Sostenible Libro Verde 2030,” Bogotá, Colombia, 2018.spa
dc.relation.referencesJ. C. Riaño Rojas, “Desarrollo de una metodología como soporte para la detección de enfermedades vasculares del tejido conectivo a través de imágenes capilaroscópicas,” Ph.D. dissertation, Universidad Nacional de Colombia, Bogotá, Colombia, 2010.spa
dc.relation.referencesT. T. Tang, J. A. Zawaski, K. N. Francis, A. A. Qutub, and M. W. Gaber, “Image-based classification of tumor type and growth rate using machine learning: a preclinical study,” Scientific reports, vol. 9, no. 1, pp. 1–10, 2019. [Online]. Available: https://www.nature.com/articles/s41598-019-48738-5spa
dc.relation.referencesC. A. Pedraza Bonilla and L. Rodríguez Mújica, “Método para la estimación de maleza en cultivos de lechuga utilizando aprendizaje profundo e imágenes multiespectrales,” Master’s thesis, Universidad Nacional de Colombia, Bogotá, Colombia, 2016.spa
dc.relation.referencesC. Barrios Pérez, “Zonificación agroecológica para el cultivo de arroz de riego (Oryza Sativa L.) en Colombia,” Master’s thesis, Universidad Católica de Colombia, Palmira, Colombia, 2016.spa
dc.relation.referencesA. F. Montenegro and C. D. Parada, “Diseño e implementación de un sistema de detección de malezas en cultivos cundiboyacenses,” Master’s thesis, Universidad Católica de Colombia, Bogotá, Colombia, 2015.spa
dc.relation.referencesM. Minsky and S. Papert, Perceptrons: An Introduction to Computational Geometry. Cambridge, MA: MIT Press, 1969.spa
dc.relation.referencesG. Cybenko, “Approximation by superpositions of a sigmoidal function,” Mathematics of control, signals and systems, vol. 2, no. 4, pp. 303–314, 1989. [Online]. Available: https://link.springer.com/article/10.1007/BF02551274spa
dc.relation.referencesD. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” nature, vol. 323, no. 6088, pp. 533–536, 1986. [Online]. Available: https://www.nature.com/articles/323533a0spa
dc.relation.referencesY. Yu, X. Si, C. Hu, and J. Zhang, “A review of recurrent neural networks: LSTM cells and network architectures,” Neural computation, vol. 31, no. 7, pp. 1235–1270, 2019. [Online]. Available: https://ieeexplore.ieee.org/document/8737887spa
dc.relation.referencesN. Li, S. Liu, Y. Liu, S. Zhao, and M. Liu, “Neural speech synthesis with transformer network,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, 2019, pp. 6706–6713.spa
dc.relation.referencesG. Parascandolo, H. Huttunen, and T. Virtanen, “Taming the waves: sine as activation function in deep neural networks,” 2017. [Online]. Available: https://openreview.net/forum?id=Sks3zF9egspa
dc.relation.referencesJ. Heredia-Juesas and J. A. Martínez-Lorenzo, “Consensus function from an Lp-norm regularization term for its use as adaptive activation functions in neural networks,” arXiv preprint arXiv:2206.15017, 2022. [Online]. Available: https://arxiv.org/abs/2206.15017spa
dc.relation.referencesA. D. Jagtap, Y. Shin, K. Kawaguchi, and G. E. Karniadakis, “Deep kronecker neural networks: A general framework for neural networks with adaptive activation functions,” Neurocomputing, vol. 468, pp. 165–180, 2022. [Online]. Available: https://doi.org/10.1016/j.neucom.2021.10.036spa
dc.relation.referencesD. Devikanniga, K. Vetrivel, and N. Badrinath, “Review of meta-heuristic optimization based artificial neural networks and its applications,” in Journal of Physics: Conference Series, vol. 1362, no. 1. IOP Publishing, 2019, p. 012074. [Online]. Available: https://iopscience.iop.org/article/10.1088/1742-6596/1362/1/012074/metaspa
dc.relation.referencesN. Gupta, M. Khosravy, N. Patel, S. Gupta, and G. Varshney, “Evolutionary artificial neural networks: comparative study on state-of-the-art optimizers,” in Frontier Applications of Nature Inspired Computation. Springer, 2020, pp. 302–318. [Online]. Available: https://link.springer.com/chapter/10.1007/978-981-15-2133-1_14spa
dc.relation.referencesJ. Duchi, E. Hazan, and Y. Singer, “Adaptive subgradient methods for online learning and stochastic optimization.” Journal of machine learning research, vol. 12, no. 7, 2011. [Online]. Available: https://jmlr.org/papers/volume12/duchi11a/duchi11a.pdfspa
dc.relation.referencesD. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014. [Online]. Available: https://arxiv.org/abs/1412.6980spa
dc.relation.referencesM. D. Zeiler, “ADADELTA: an adaptive learning rate method,” arXiv preprint arXiv:1212.5701, 2012. [Online]. Available: https://arxiv.org/abs/1212.5701spa
dc.relation.referencesS. Ruder, “An overview of gradient descent optimization algorithms,” arXiv preprint arXiv:1609.04747, 2016. [Online]. Available: https://arxiv.org/abs/1609.04747spa
dc.relation.referencesP. Netrapalli, “Stochastic gradient descent and its variants in machine learning,” Journal of the Indian Institute of Science, vol. 99, no. 2, pp. 201–213, 2019. [Online]. Available: https://link.springer.com/article/10.1007/s41745-019-0098-4spa
dc.relation.referencesS. Lawrence, C. L. Giles, and A. C. Tsoi, “Lessons in neural network training: Overfitting may be harder than expected,” in Proceedings of the Fourteenth National Conference on Artificial Intelligence, 1997, pp. 540–545. [Online]. Available: https://clgiles.ist.psu.edu/papers/AAAI-97.overfitting.hard_to_do.pdfspa
dc.relation.referencesX.-x. Wu and J.-g. Liu, “A new early stopping algorithm for improving neural network generalization,” in 2009 Second International Conference on Intelligent Computation Technology and Automation, vol. 1. IEEE, 2009, pp. 15–18. [Online]. Available: https://ieeexplore.ieee.org/document/5287721spa
dc.relation.referencesH. Liang, S. Zhang, J. Sun, X. He, W. Huang, K. Zhuang, and Z. Li, “Darts+: Improved differentiable architecture search with early stopping,” arXiv preprint arXiv:1909.06035, 2019. [Online]. Available: https://arxiv.org/abs/1909.06035spa
dc.relation.referencesL. Prechelt, “Early stopping-but when?” in Neural Networks: Tricks of the Trade. Springer, 1998, pp. 55–69. [Online]. Available: https://link.springer.com/chapter/10.1007/978-3-642-35289-8_5spa
dc.relation.referencesM. Mahsereci, L. Balles, C. Lassner, and P. Hennig, “Early stopping without a validation set,” arXiv preprint arXiv:1703.09580, 2017. [Online]. Available: https://arxiv.org/abs/1703.09580spa
dc.relation.referencesC. Shorten and T. M. Khoshgoftaar, “A survey on image data augmentation for deep learning,” Journal of big data, vol. 6, no. 1, pp. 1–48, 2019. [Online]. Available: https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0197-0spa
dc.relation.referencesA. Mikolajczyk and M. Grochowski, “Data augmentation for improving deep learning in image classification problem,” in 2018 International Interdisciplinary PhD Workshop (IIPhDW). IEEE, 2018, pp. 117–122. [Online]. Available: https://ieeexplore.ieee.org/document/8388338spa
dc.relation.referencesK. el Hindi and A.-A. Mousa, “Smoothing decision boundaries to avoid overfitting in neural network training,” Neural Network World, vol. 21, no. 4, p. 311, 2011. [Online]. Available: https://www.researchgate.net/publication/272237391_Smoothing_decision_boundaries_to_avoid_overfitting_in_neural_network_trainingspa
dc.relation.referencesH. Jabbar and R. Z. Khan, “Methods to avoid over-fitting and under-fitting in supervised machine learning (comparative study),” Computer Science, Communication and Instrumentation Devices, vol. 70, 2015.spa
dc.relation.referencesK.-j. Kim, “Artificial neural networks with evolutionary instance selection for financial forecasting,” Expert Systems with Applications, vol. 30, no. 3, pp. 519–526, 2006. [Online]. Available: https://doi.org/10.1016/j.eswa.2005.10.007spa
dc.relation.referencesN. Srivastava, “Improving neural networks with dropout,” Master’s thesis, University of Toronto, 2013. [Online]. Available: http://www.cs.toronto.edu/~nitish/msc_thesis.pdfspa
dc.relation.referencesN. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: a simple way to prevent neural networks from overfitting,” The journal of machine learning research, vol. 15, no. 1, pp. 1929–1958, 2014. [Online]. Available: https://jmlr.org/papers/v15/srivastava14a.htmlspa
dc.relation.referencesJ. Ba and B. Frey, “Adaptive dropout for training deep neural networks,” Advances in neural information processing systems, vol. 26, 2013. [Online]. Available: https://proceedings.neurips.cc/paper/2013/file/7b5b23f4aadf9513306bcd59afb6e4c9-Paper.pdfspa
dc.relation.referencesB. Ko, H.-G. Kim, K.-J. Oh, and H.-J. Choi, “Controlled dropout: A different approach to using dropout on deep neural network,” in 2017 IEEE International Conference on Big Data and Smart Computing (BigComp). IEEE, 2017, pp. 358–362. [Online]. Available: https://ieeexplore.ieee.org/document/7881693spa
dc.relation.referencesD. Molchanov, A. Ashukha, and D. Vetrov, “Variational dropout sparsifies deep neural networks,” in International Conference on Machine Learning. PMLR, 2017, pp. 2498–2507. [Online]. Available: https://arxiv.org/abs/1701.05369spa
dc.relation.referencesG. Zhang, C. Wang, B. Xu, and R. Grosse, “Three mechanisms of weight decay regularization,” in International Conference on Learning Representations, 2018. [Online]. Available: https://www.researchgate.net/publication/328598833_Three_Mechanisms_of_Weight_Decay_Regularizationspa
dc.relation.referencesS. J. Nowlan and G. E. Hinton, “Simplifying neural networks by soft weight sharing,” in The Mathematics of Generalization. CRC Press, 2018, pp. 373–394. [Online]. Available: https://ieeexplore.ieee.org/document/6796174spa
dc.relation.referencesR. Ghosh and M. Motani, “Network-to-network regularization: Enforcing occam’s razor to improve generalization,” Advances in Neural Information Processing Systems, vol. 34, pp. 6341–6352, 2021. [Online]. Available: https://proceedings.neurips.cc/paper/2021/file/321cf86b4c9f5ddd04881a44067c2a5a-Paper.pdfspa
dc.relation.referencesB. Neal, S. Mittal, A. Baratin, V. Tantia, M. Scicluna, S. Lacoste-Julien, and I. Mitliagkas, “A modern take on the bias-variance tradeoff in neural networks,” arXiv preprint arXiv:1810.08591, 2018. [Online]. Available: https://arxiv.org/abs/1810.08591spa
dc.relation.referencesP. Nakkiran, G. Kaplun, Y. Bansal, T. Yang, B. Barak, and I. Sutskever, “Deep double descent: Where bigger models and more data hurt,” Journal of Statistical Mechanics: Theory and Experiment, vol. 2021, no. 12, p. 124003, 2021. [Online]. Available: https://iopscience.iop.org/article/10.1088/1742-5468/ac3a74/metaspa
dc.relation.referencesZ. Yang, Y. Yu, C. You, J. Steinhardt, and Y. Ma, “Rethinking bias-variance trade-off for generalization of neural networks,” in International Conference on Machine Learning. PMLR, 2020, pp. 10767–10777. [Online]. Available: http://proceedings.mlr.press/v119/yang20j/yang20j.pdfspa
dc.relation.referencesY. Dar, V. Muthukumar, and R. G. Baraniuk, “A farewell to the bias-variance tradeoff? an overview of the theory of overparameterized machine learning,” arXiv preprint arXiv:2109.02355, 2021. [Online]. Available: https://arxiv.org/abs/2109.02355spa
dc.relation.referencesB. Ghojogh and M. Crowley, “The theory behind overfitting, cross validation, regularization, bagging, and boosting: tutorial,” arXiv preprint arXiv:1905.12787, 2019. [Online]. Available: https://arxiv.org/abs/1905.12787spa
dc.relation.referencesY. Yoshida and T. Miyato, “Spectral norm regularization for improving the generalizability of deep learning,” arXiv preprint arXiv:1705.10941, 2017. [Online]. Available: https://arxiv.org/abs/1705.10941spa
dc.relation.referencesY. Tsuzuku, I. Sato, and M. Sugiyama, “Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks,” Advances in neural information processing systems, vol. 31, 2018. [Online]. Available: https://proceedings.neurips.cc/paper/2018/file/485843481a7edacbfce101ecb1e4d2a8-Paper.pdfspa
dc.relation.referencesH. Li, J. Li, X. Guan, B. Liang, Y. Lai, and X. Luo, “Research on overfitting of deep learning,” in 2019 15th International Conference on Computational Intelligence and Security (CIS). IEEE, 2019, pp. 78–81. [Online]. Available: https://ieeexplore.ieee.org/document/9023664spa
dc.relation.referencesA. Gavrilov, A. Jordache, M. Vasdani, and J. Deng, “Convolutional neural networks: Estimating relations in the ising model on overfitting,” in 2018 IEEE 17th International Conference on Cognitive Informatics & Cognitive Computing (ICCI* CC). IEEE, 2018, pp. 154–158. [Online]. Available: https://ieeexplore.ieee.org/document/8482067spa
dc.relation.referencesL. Deng, “The mnist database of handwritten digit images for machine learning research,” IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 141–142, 2012. [Online]. Available: https://ieeexplore.ieee.org/document/6296535spa
dc.relation.referencesH. Xiao, K. Rasul, and R. Vollgraf, “Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms,” arXiv preprint arXiv:1708.07747, 2017. [Online]. Available: https://arxiv.org/abs/1708.07747spa
dc.relation.referencesF. Hutter, H. H. Hoos, and K. Leyton-Brown, “Sequential model-based optimization for general algorithm configuration,” in International conference on learning and intelligent optimization. Springer, 2011, pp. 507–523. [Online]. Available: https://ml.informatik.uni-freiburg.de/wp-content/uploads/papers/11-LION5-SMAC.pdfspa
dc.relation.referencesM. López-Ibáñez, J. Dubois-Lacoste, L. Pérez Cáceres, M. Birattari, and T. Stützle, “The irace package: Iterated racing for automatic algorithm configuration,” Operations Research Perspectives, vol. 3, pp. 43–58, 2016. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2214716015300270spa
dc.relation.referencesR. A. Fisher, “The use of multiple measurements in taxonomic problems,” Annals of Eugenics, vol. 7, no. 2, pp. 179–188, 1936.spa
dc.relation.referencesD. Dua and C. Graff, “UCI machine learning repository,” 2017. [Online]. Available: http://archive.ics.uci.edu/mlspa
dc.relation.referencesI. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” in International Conference on Learning Representations, 2015. [Online]. Available: https://arxiv.org/abs/1412.6572spa
dc.relation.referencesA. Kurakin, I. J. Goodfellow, and S. Bengio, “Adversarial examples in the physical world,” in Artificial intelligence safety and security. Chapman and Hall/CRC, 2018, pp. 99–112. [Online]. Available: https://openreview.net/pdf?id=S1OufnIlxspa
dc.relation.referencesZ. Chen, Q. Li, and Z. Zhang, “Towards robust neural networks via close-loop control,” arXiv preprint arXiv:2102.01862, 2021. [Online]. Available: https://arxiv.org/abs/2102.01862spa
dc.relation.referencesL. Böttcher, N. Antulov-Fantulin, and T. Asikis, “AI Pontryagin or how artificial neural networks learn to control dynamical systems,” Nature Communications, vol. 13, no. 1, pp. 1–9, 2022. [Online]. Available: https://www.nature.com/articles/s41467-021-27590-0spa
dc.relation.referencesJ. Zhuang, N. C. Dvornek, S. Tatikonda, and J. S. Duncan, “MALI: A memory efficient and reverse accurate integrator for neural ODEs,” arXiv preprint arXiv:2102.04668, 2021. [Online]. Available: https://arxiv.org/abs/2102.04668spa
dc.relation.referencesC. Rackauckas, M. Innes, Y. Ma, J. Bettencourt, L. White, and V. Dixit, “Diffeqflux.jl-a julia library for neural differential equations,” arXiv preprint arXiv:1902.02376, 2019. [Online]. Available: https://arxiv.org/abs/1902.02376spa
dc.relation.referencesD. M. Grobman, “Homeomorphism of systems of differential equations,” Doklady Akademii Nauk SSSR, vol. 128, no. 5, pp. 880–881, 1959.spa
dc.relation.referencesP. Hartman, “A lemma in the theory of structural stability of differential equations,” Proceedings of the American Mathematical Society, vol. 11, no. 4, pp. 610–620, 1960. [Online]. Available: https://www.ams.org/journals/proc/1960-011-04/S0002-9939-1960-0121542-7/S0002-9939-1960-0121542-7.pdfspa
dc.relation.referencesMinisterio de Ciencia, Tecnología e Innovación (Minciencias), “Guía técnica para el reconocimiento de actores del SNCTeI,” 2021. [Online]. Available: https://minciencias.gov.co/sites/default/files/upload/reconocimiento/m601pr05g07_guia_tecnica_para_el_reconocimiento_del_centro_de_desarrollo_tecnologico_cdt_v00_0.pdfspa
dc.rights.accessrightsinfo:eu-repo/semantics/openAccessspa
dc.rights.licenseAtribución-CompartirIgual 4.0 Internacionalspa
dc.rights.urihttp://creativecommons.org/licenses/by-sa/4.0/spa
dc.subject.ddc510 - Matemáticas::519 - Probabilidades y matemáticas aplicadasspa
dc.subject.proposalContinuidad Lipschitzspa
dc.subject.proposalGeneralizaciónspa
dc.subject.proposalODENetspa
dc.subject.proposalRegularizaciónspa
dc.subject.proposalRedes neuronalesspa
dc.subject.proposalResNetspa
dc.subject.proposalSobreajustespa
dc.subject.proposalGeneralizationeng
dc.subject.proposalLipschitz continuityeng
dc.subject.proposalNeural Networkseng
dc.subject.proposalOverfittingeng
dc.subject.proposalRegularizationeng
dc.subject.proposalResNeteng
dc.titleEstudio de la reducción del sobreajuste en arquitecturas de redes neuronales residuales ResNet en un escenario de clasificación de patronesspa
dc.title.translatedStudy of overfitting reduction in residual neural network architectures (ResNet) in a pattern classification scenarioeng
dc.typeTrabajo de grado - Maestríaspa
dc.type.coarhttp://purl.org/coar/resource_type/c_bdccspa
dc.type.coarversionhttp://purl.org/coar/version/c_ab4af688f83e57aaspa
dc.type.contentImagespa
dc.type.contentTextspa
dc.type.driverinfo:eu-repo/semantics/masterThesisspa
dc.type.versioninfo:eu-repo/semantics/acceptedVersionspa
dcterms.audience.professionaldevelopmentBibliotecariosspa
dcterms.audience.professionaldevelopmentEstudiantesspa
dcterms.audience.professionaldevelopmentInvestigadoresspa
dcterms.audience.professionaldevelopmentMaestrosspa
dcterms.audience.professionaldevelopmentPúblico generalspa
oaire.accessrightshttp://purl.org/coar/access_right/c_abf2spa

Archivos

Bloque original
Nombre: 1085325637.2023.pdf
Tamaño: 3.74 MB
Formato: Adobe Portable Document Format
Descripción: Tesis de Maestría en Ciencias - Matemática Aplicada

Bloque de licencias
Nombre: license.txt
Tamaño: 5.74 KB
Formato: Item-specific license agreed upon to submission