Una comparación entre modelos estadísticos y de Machine Learning para la predicción de series de tiempo multivariadas
dc.contributor.advisor | Calderón Villanueva, Sergio Andrés | |
dc.contributor.author | Mariño Villalba, Javier Andrés | |
dc.contributor.orcid | 0000-0003-2968-391X | spa |
dc.date.accessioned | 2023-08-10T16:17:21Z | |
dc.date.available | 2023-08-10T16:17:21Z | |
dc.date.issued | 2023 | |
dc.description | illustrations, diagrams | eng |
dc.description.abstract | La predicción de series de tiempo es un área clásica de estudio en estadística y un campo en crecimiento del Aprendizaje Automático (o Machine Learning en inglés). El presente trabajo pretende comparar el desempeño para predecir un paso adelante en series de tiempo multivariadas entre los modelos estadísticos clásicos —Vector Autorregresivo (VAR) y Modelo de Corrección de Errores Vectoriales (VECM)— y los modelos de Machine Learning —Perceptrón Multicapa (MLP), Long Short Term Memory (LSTM) y Gated Recurrent Unit (GRU). Como una alternativa a ambos tipos de modelos se consideró la combinación de pronósticos de varios modelos estadísticos y de Machine Learning para mejorar las predicciones. El desempeño de las predicciones se examinó mediante las métricas del error observado: Error Cuadrático Medio (ECM), Error Absoluto Medio (EAM) y la Raíz del Error Cuadrático Medio (RECM). Además, se contrastó la diferencia estadística entre las predicciones mediante una prueba Diebold-Mariano y de abarcamiento (o encompassing en inglés) multivariada. Para la construcción de los resultados empíricos se utilizaron siete series de tiempo multivariadas de diferentes ámbitos (economía, finanzas, epidemiología, meteorología, violencia y sociedad). (Texto tomado de la fuente) | spa |
dc.description.abstract | Time series prediction is a classical area of study in statistics and a growing field in Machine Learning. This work compares the one-step-ahead predictive performance on multivariate time series of classical statistical models, namely the Vector Autoregressive (VAR) model and the Vector Error Correction Model (VECM), with that of Machine Learning models: the Multilayer Perceptron (MLP), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). As an alternative to both types of models, combinations of forecasts from several statistical and Machine Learning models were considered as a way to improve predictions. Prediction performance was assessed with three error metrics: Mean Squared Error (MSE), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). In addition, the statistical significance of differences between predictions was tested with multivariate Diebold-Mariano and encompassing tests. The empirical results are based on seven multivariate time series from different domains (economics, finance, epidemiology, meteorology, violence, and society). | eng |
dc.description.degreelevel | Master's | eng |
dc.description.degreename | Magister en Ciencias-Estadística | spa |
dc.description.methods | This research aims to compare one-step-ahead prediction performance on multivariate time series between classical statistical models, namely the Vector Autoregressive (VAR) model and the Vector Error Correction Model (VECM), and ML models: the Multilayer Perceptron (MLP), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). The comparison uses the error metrics Mean Squared Error (MSE), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE), together with multivariate Diebold-Mariano and encompassing tests. Forecast combinations built from the forecasts of the statistical and ML models are also included in the comparison. | eng |
dc.description.researcharea | Time series | eng |
dc.format.extent | xiii, 47 pages | eng |
dc.format.mimetype | application/pdf | spa |
dc.identifier.instname | Universidad Nacional de Colombia | spa |
dc.identifier.reponame | Repositorio Institucional Universidad Nacional de Colombia | spa |
dc.identifier.repourl | https://repositorio.unal.edu.co/ | spa |
dc.identifier.uri | https://repositorio.unal.edu.co/handle/unal/84522 | |
dc.language.iso | spa | spa |
dc.publisher | Universidad Nacional de Colombia | spa |
dc.publisher.branch | Universidad Nacional de Colombia - Sede Bogotá | spa |
dc.publisher.faculty | Facultad de Ciencias | spa |
dc.publisher.place | Bogotá, Colombia | spa |
dc.publisher.program | Bogotá - Ciencias - Maestría en Ciencias - Estadística | spa |
dc.relation.references | Aji, A., & Surjandari, I. (2020). Hybrid vector autoregression–recurrent neural networks to forecast multivariate time series jet fuel transaction price. IOP Conference Series: Materials Science and Engineering. | spa |
dc.relation.references | Baquero, O., Reis, L., & Chiaravalloti-Neto, F. (2018). Dengue forecasting in São Paulo city with generalized additive models, artificial neural networks and seasonal autoregressive integrated moving average models. PLoS ONE. | spa |
dc.relation.references | Calderón, S., & Nieto, F. (2021). Forecasting with Multivariate Threshold Autoregressive Models. Revista Colombiana de Estadística, 44(2), 369-383. | spa |
dc.relation.references | Chung, J., et al. (2016). Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv. | spa |
dc.relation.references | Dissanayake, B., Hemachandra, O., Lakshitha, N., Haputhanthri, D., & Wijayasiri, A. (2021). A Comparison of ARIMAX, VAR and LSTM on Multivariate Short-Term Traffic Volume Forecasting. FRUCT Oy, 564-570. | spa |
dc.relation.references | Clements, M., & Hendry, D. (1993). On the Limitations of Comparing Mean Square Forecast Errors. Journal of Forecasting, 617-637. | spa |
dc.relation.references | Diebold, F., & Mariano, R. (1995). Comparing Predictive Accuracy. Journal of Business and Economic Statistics, 253-265. | spa |
dc.relation.references | Enders, W. (2006). Applied Econometric Time Series. Wiley. | spa |
dc.relation.references | Gal, Y., & Ghahramani, Z. (2016). A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. arXiv. | spa |
dc.relation.references | Goel, H., Melnyk, I., Oza, N., Matthews, B., & Banerjee, A. (2016). Multivariate Aviation Time Series Modeling: VARs vs. LSTMs. | spa |
dc.relation.references | Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735-1780. | spa |
dc.relation.references | Hornik, K. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 359-366. | spa |
dc.relation.references | Hoyos, M. (2019). Series de tiempo multivariada. Notas de clase Econometría 2. | spa |
dc.relation.references | Hyndman, R. (2021, August 28). Detecting time series outliers. Rob J Hyndman. Retrieved from https://robjhyndman.com/hyndsight/tsoutliers/ | spa |
dc.relation.references | James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning. San Francisco: Springer. | spa |
dc.relation.references | Kohler, M., & Krzyżak, A. (2020). On the rate of convergence of a deep recurrent neural network estimate in a regression problem with dependent data. arXiv. | spa |
dc.relation.references | Kolekar, V. (2020). Prediction of Suspended Particulate Matter Using Machine Learning. Dublín: National College of Ireland. | spa |
dc.relation.references | Laurent, S., Rombouts, J., & Violante, F. (2011). On the forecasting accuracy of multivariate GARCH models. Journal of applied econometrics. | spa |
dc.relation.references | Li, L., Jamieson, K., DeSalvo, G., Rostamizadeh, A., & Talwalkar, A. (2018). Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. arXiv. | spa |
dc.relation.references | Montenegro, A. (2021). Notas de Clase: Diplomado Ciencia de datos. Universidad Nacional de Colombia. | spa |
dc.relation.references | Papoila, A. L., Rocha, C., Brás-Geraldes, C., & Xufre, P. (2013). Generalized Linear Models, Generalized Additive Models and Neural Networks: Comparative Study in Medical Applications. Advances in Regression, Survival Analysis, Extreme Values, Markov Processes and Other Statistical Applications, 317-324. | spa |
dc.relation.references | Pathak, N., Ba, A., Ploennings, J., & Roy, N. (2018). Forecasting Gas Usage for Big Buildings Using Generalized Additive Models and Deep Learning. 2018 IEEE International Conference on Smart Computing (SMARTCOMP). | spa |
dc.relation.references | Schafer, A., & Zimmermann, H. (2006). Recurrent Neural Networks Are Universal Approximators. International Conference on Artificial Neural Networks. | spa |
dc.relation.references | Shafique, A. (2019). Comparison of Deep Learning and Classical Regression Approaches for Multivariate and Multi Step Time Series Forecasting. Semantic Scholar. | spa |
dc.relation.references | Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A Simple Way to Prevent Neural Networks from Overfitting. JMLR, 1929-1958. | spa |
dc.relation.references | Takagi, T., Kurokawa, E., Miyata, K., Okamoto, K., Tanaka, Y., Kurokawa, K., & Yasunaga, T. (2002). The Comparison of Generalized Additive Model with Artificial Hierarchical Neural Network in the Analysis of Pharmaceutical Data. Journal of Computer Aided Chemistry, 36-52. | spa |
dc.relation.references | The Royal Society. (2017, April). Machine learning: the power and promise of computers that learn by example. Retrieved from The Royal Society: https://royalsociety.org/topics-policy/projects/machine-learning/ | spa |
dc.relation.references | Timmermann, A. (2005, August 27). Forecast Combinations. Retrieved from https://rady.ucsd.edu/faculty/directory/timmermann/pub/docs/forecast-combinations.pdf | spa |
dc.relation.references | Tsay, R. (1998). Testing and Modeling Multivariate Threshold Models. Journal of the American Statistical Association, 1188-1202. | spa |
dc.relation.references | Wang, Z., & Bessler, D. (2004). Forecasting performance of multivariate time series models with full and reduced rank: an empirical examination. International Journal of Forecasting, 683-695. | spa |
dc.relation.references | Weiss, C., Raviv, E., & Roetzer, G. F. (2018). Forecast Combinations in R using the ForecastComb Package. The R Journal, 10(2), 261-281. | spa |
dc.relation.references | Yang, Z., Zhang, A., & Sudjianto, A. (2021). GAMI-Net: An Explainable Neural Network based on Generalized Additive Models with Structured Interactions. arXiv. | spa |
dc.relation.references | Yamak, P., Yujian, L., & Gadosey, P. (2019). A Comparison between ARIMA, LSTM, and GRU for Time Series Forecasting. ACAI '19, 49-55. | spa |
dc.relation.references | Zhang, P., Patuwo, E., & Hu, M. (1999). A simulation study of artificial neural networks for nonlinear time-series forecasting. Computers & Operations Research, 381-396. | spa |
dc.rights.accessrights | info:eu-repo/semantics/openAccess | spa |
dc.rights.license | Reconocimiento 4.0 Internacional | spa |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | spa |
dc.subject.ddc | 510 - Matemáticas::519 - Probabilidades y matemáticas aplicadas | spa |
dc.subject.lemb | Análisis de series de tiempo - modelos matemáticos | spa |
dc.subject.lemb | Time-series analysis - mathematical models | eng |
dc.subject.lemb | Estadística matemática | spa |
dc.subject.lemb | Mathematical statistics | eng |
dc.subject.lemb | Probabilidades | spa |
dc.subject.lemb | Probabilities | eng |
dc.subject.proposal | Multivariate time series | eng |
dc.subject.proposal | Time series prediction | eng |
dc.subject.proposal | Machine Learning models | eng |
dc.subject.proposal | Statistical models | eng |
dc.subject.proposal | Diebold-Mariano test | eng |
dc.subject.proposal | Multivariate encompassing test | eng |
dc.subject.proposal | Mean Squared Error (MSE) | eng |
dc.title | Una comparación entre modelos estadísticos y de Machine Learning para la predicción de series de tiempo multivariadas | spa |
dc.title.translated | A comparison between statistical models and Machine Learning for the prediction of multivariate time series | eng |
dc.type | Master's thesis | eng |
dc.type.coar | http://purl.org/coar/resource_type/c_bdcc | spa |
dc.type.coarversion | http://purl.org/coar/version/c_ab4af688f83e57aa | spa |
dc.type.content | Text | spa |
dc.type.driver | info:eu-repo/semantics/masterThesis | spa |
dc.type.redcol | http://purl.org/redcol/resource_type/TM | spa |
dc.type.version | info:eu-repo/semantics/acceptedVersion | spa |
dcterms.audience.professionaldevelopment | Federal funds recipients and applicants | eng |
oaire.accessrights | http://purl.org/coar/access_right/c_abf2 | spa |
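The abstract describes the evaluation only in words, so the following is a minimal Python sketch of the three error metrics, an equal-weight forecast combination, and a Diebold-Mariano statistic for one-step-ahead forecasts. It is an illustration under stated assumptions, not the thesis's implementation: the function names, the equal weights, the use of summed squared errors across variables as the loss, and the toy data are all hypothetical, and the record does not say which combination scheme or multivariate test variant the author used.

```python
import math
import numpy as np


def error_metrics(y_true, y_pred):
    """Return MSE, MAE and RMSE averaged over all time steps and variables."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    return {"MSE": mse, "MAE": mae, "RMSE": math.sqrt(mse)}


def combine_equal_weights(forecasts):
    """Average several (T, k) forecast arrays with equal weights (an assumption)."""
    stacked = np.stack([np.asarray(f, dtype=float) for f in forecasts])
    return np.mean(stacked, axis=0)


def diebold_mariano(y_true, pred_a, pred_b):
    """Diebold-Mariano statistic and two-sided p-value for one-step-ahead forecasts.

    Assumption: the loss at each time step is the sum of squared errors
    across the k variables. With horizon 1 no autocovariance correction
    is applied, and the statistic is compared with a standard normal.
    """
    y = np.asarray(y_true, dtype=float)
    loss_a = np.sum((y - np.asarray(pred_a, dtype=float)) ** 2, axis=1)
    loss_b = np.sum((y - np.asarray(pred_b, dtype=float)) ** 2, axis=1)
    d = loss_a - loss_b                      # loss differential series
    n = d.size
    stat = d.mean() / math.sqrt(d.var(ddof=1) / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))  # 2 * (1 - Phi(|stat|))
    return float(stat), float(p_value)


# Toy data (made up for illustration): 50 time steps, 2 variables.
rng = np.random.default_rng(0)
y = rng.normal(size=(50, 2))
forecast_a = y + rng.normal(scale=0.5, size=y.shape)   # stand-in for one model
forecast_b = y + rng.normal(scale=1.0, size=y.shape)   # stand-in for another
combined = combine_equal_weights([forecast_a, forecast_b])

for name, pred in [("A", forecast_a), ("B", forecast_b), ("combined", combined)]:
    print(name, error_metrics(y, pred))
print("DM (A vs B):", diebold_mariano(y, forecast_a, forecast_b))
```

A negative statistic indicates that the first forecast has the lower average loss. For horizons beyond one step, the variance of the loss differential would need a long-run (autocovariance-corrected) estimate instead of the plain sample variance used here.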
Files
Original bundle
- Name: Trabajo_Final_de_Maestría_Javier_Mariño__Versión_final_.pdf
- Size: 543.63 KB
- Format: Adobe Portable Document Format
- Description: Master's thesis in Sciences - Statistics