A comparison between statistical models and Machine Learning for the prediction of multivariate time series

dc.contributor.advisor: Calderón Villanueva, Sergio Andrés
dc.contributor.author: Mariño Villalba, Javier Andrés
dc.contributor.orcid: 0000-0003-2968-391X
dc.date.accessioned: 2023-08-10T16:17:21Z
dc.date.available: 2023-08-10T16:17:21Z
dc.date.issued: 2023
dc.description: illustrations, diagrams
dc.description.abstract: Time series prediction is a classical area of study in statistics and a growing field in Machine Learning. This work compares the one-step-ahead prediction performance on multivariate time series of classical statistical models, namely the Vector Autoregressive (VAR) model and the Vector Error Correction Model (VECM), with that of Machine Learning models: Multilayer Perceptron (MLP), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). As an alternative to both types of models, combinations of forecasts from several statistical and Machine Learning models were considered as a way to improve predictions. Prediction performance was examined using the following error metrics: Mean Squared Error (MSE), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). In addition, the statistical difference between predictions was tested using multivariate Diebold-Mariano and encompassing tests. The empirical results were built on seven multivariate time series from different domains (economy, finance, epidemiology, meteorology, violence, and society). (Text taken from the source)
dc.description.degreelevel: Master's
dc.description.degreename: Master of Science in Statistics
dc.description.methods: This research aims to compare the one-step-ahead prediction performance on multivariate time series of classical statistical models, namely the Vector Autoregressive (VAR) model and the Vector Error Correction Model (VECM), against ML models: Multilayer Perceptron (MLP), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). The comparison uses the error metrics Mean Squared Error (MSE), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE), together with multivariate Diebold-Mariano and encompassing tests. Forecast combinations built from the statistical and ML forecasts are also included in the comparison.
dc.description.researcharea: Time series
dc.format.extent: xiii, 47 pages
dc.format.mimetype: application/pdf
dc.identifier.instname: Universidad Nacional de Colombia
dc.identifier.reponame: Repositorio Institucional Universidad Nacional de Colombia
dc.identifier.repourl: https://repositorio.unal.edu.co/
dc.identifier.uri: https://repositorio.unal.edu.co/handle/unal/84522
dc.language.iso: spa
dc.publisher: Universidad Nacional de Colombia
dc.publisher.branch: Universidad Nacional de Colombia - Sede Bogotá
dc.publisher.faculty: Facultad de Ciencias
dc.publisher.place: Bogotá, Colombia
dc.publisher.program: Bogotá - Ciencias - Maestría en Ciencias - Estadística
dc.rights.accessrights: info:eu-repo/semantics/openAccess
dc.rights.license: Attribution 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject.ddc: 510 - Mathematics::519 - Probabilities and applied mathematics
dc.subject.lemb: Time-series analysis - mathematical models
dc.subject.lemb: Mathematical statistics
dc.subject.lemb: Probabilities
dc.subject.proposal: Multivariate time series
dc.subject.proposal: Time series prediction
dc.subject.proposal: Machine Learning models
dc.subject.proposal: Statistical models
dc.subject.proposal: Diebold-Mariano test
dc.subject.proposal: Multivariate encompassing test
dc.subject.proposal: Mean Squared Error (MSE)
dc.title: Una comparación entre modelos estadísticos y de Machine Learning para la predicción de series de tiempo multivariadas
dc.title.translated: A comparison between statistical models and Machine Learning for the prediction of multivariate time series
dc.type: Master's thesis
dc.type.coar: http://purl.org/coar/resource_type/c_bdcc
dc.type.coarversion: http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.content: Text
dc.type.driver: info:eu-repo/semantics/masterThesis
dc.type.redcol: http://purl.org/redcol/resource_type/TM
dc.type.version: info:eu-repo/semantics/acceptedVersion
dcterms.audience.professionaldevelopment: Federal funds recipients and applicants
oaire.accessrights: http://purl.org/coar/access_right/c_abf2
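
As an illustration of the error metrics and the forecast combination named in the abstract, the following minimal Python sketch computes MSE, MAE and RMSE for a multivariate forecast and averages two forecasts with equal weights. The function names, the list-of-rows data layout, the pooling of errors across all series, the equal weights, and the toy numbers are assumptions made for this example only; the thesis may compute the metrics per series and use other combination schemes.

```python
import math


def error_metrics(actual, forecast):
    """Return MSE, MAE and RMSE pooled over all time steps and series.

    `actual` and `forecast` are lists of rows: one row per time step,
    one value per series (a hypothetical layout chosen for this sketch).
    """
    errors = [a - f
              for row_a, row_f in zip(actual, forecast)
              for a, f in zip(row_a, row_f)]
    mse = sum(e * e for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return {"MSE": mse, "MAE": mae, "RMSE": math.sqrt(mse)}


def combine_equal_weights(forecasts):
    """Average several forecasts element-wise (equal-weight combination)."""
    n = len(forecasts)
    return [[sum(values) / n for values in zip(*rows)]
            for rows in zip(*forecasts)]


# Toy data, invented for illustration: two time steps, two series.
actual = [[1.0, 2.0], [3.0, 4.0]]
model_a = [[1.0, 3.0], [2.0, 4.0]]  # e.g. forecasts from a statistical model
model_b = [[1.0, 1.0], [4.0, 4.0]]  # e.g. forecasts from an ML model

combined = combine_equal_weights([model_a, model_b])

for name, forecast in [("A", model_a), ("B", model_b), ("A+B", combined)]:
    print(name, error_metrics(actual, forecast))
```

In this contrived case the errors of the two models cancel exactly, so the combined forecast has zero error; with real data the gain from combining, if any, is usually much smaller.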

Files

Original bundle

Name: Trabajo_Final_de_Maestría_Javier_Mariño__Versión_final_.pdf
Size: 543.63 KB
Format: Adobe Portable Document Format
Description: Master's thesis, Maestría en Ciencias - Estadística

License bundle

Name: license.txt
Size: 5.74 KB
Format: Item-specific license agreed upon to submission