Generación de series de tiempo financieras sintéticas para "data augmentation" usando redes neuronales generativas adversarias (GAN)
dc.contributor.advisor | Villa Garzón, Fernán Alonso | |
dc.contributor.author | Villarraga Ossa, Edwin Fernando | |
dc.date.accessioned | 2021-03-25T20:36:10Z | |
dc.date.available | 2021-03-25T20:36:10Z | |
dc.date.issued | 2021-03-24 | |
dc.description.abstract | Los modelos GAN se han usado de forma exitosa para realizar aumento de datos en problemas relacionados con imágenes, audio y video, pues logran representar adecuadamente las propiedades de los datos reales, pero incorporando suficiente diversidad en los datos sintéticos generados como para poder mejorar el desempeño de los modelos de machine learning y deep learning en las evaluaciones por fuera de muestra. Las series de tiempo financieras se requieren para la modelación y solución de problemas en finanzas, sin embargo, dada la escasez de datos históricos, no solo originados por problemas de recolección de datos, sino también porque una serie de tiempo es solamente la realización de un proceso estocástico y por ende se presenta un sub muestreo. En este trabajo se generaron series de tiempo sintéticas usando DCGAN y cCGAN para generar datos de rendimientos, volúmenes, bid-ask spread, y precios con transformación fraccional, de acciones de Estados Unidos de América, con periodicidad diaria e intradiaria. Se pudo verificar que estos modelos GAN logran generar series simuladas que representan adecuadamente las propiedades distribucionales de las series históricas. Estas series sintéticas generadas pueden servir como insumo del tipo data augmentation en modelos de machine learning y deep learning para mejorar su desempeño con datos por fuera de muestra. | spa |
dc.description.abstract | GAN models have been used successfully for data augmentation in problems involving images, audio, and video, because they adequately represent the properties of the real data while incorporating enough diversity in the generated synthetic data to improve the out-of-sample performance of machine learning and deep learning models. Financial time series are required for modeling and solving problems in finance; however, historical data are scarce, not only because of data collection problems but also because a time series is only one realization of a stochastic process, which amounts to undersampling. In this work, synthetic time series were generated using DCGAN and cCGAN to produce data on returns, volumes, bid-ask spreads, and prices under a fractional transformation, for United States stocks at daily and intraday frequencies. It was verified that these GAN models generate simulated series that adequately represent the distributional properties of the historical series. These synthetic series can serve as data augmentation input for machine learning and deep learning models to improve their out-of-sample performance. | eng |
dc.description.degreelevel | Maestría | spa |
dc.format.extent | 76 páginas | spa |
dc.format.mimetype | application/pdf | spa |
dc.identifier.instname | Universidad Nacional de Colombia | spa |
dc.identifier.reponame | Repositorio Universidad Nacional de Colombia | spa |
dc.identifier.repourl | https://repositorio.unal.edu.co/ | spa |
dc.identifier.uri | https://repositorio.unal.edu.co/handle/unal/79374 | |
dc.language.iso | spa | spa |
dc.publisher | Universidad Nacional de Colombia | spa |
dc.publisher.branch | Universidad Nacional de Colombia - Sede Medellín | spa |
dc.publisher.faculty | Minas | spa |
dc.publisher.place | Medellín | spa |
dc.publisher.program | Medellín - Minas - Maestría en Ingeniería - Analítica | spa |
dc.relation.references | 1. Sezer OB, Gudelek MU, Ozbayoglu AM. Financial time series forecasting with deep learning: A systematic literature review: 2005–2019. Applied Soft Computing. 2020. p. 106181. doi:10.1016/j.asoc.2020.106181 | spa |
dc.relation.references | 2. Kakushadze Z, Serur JA. 151 Trading Strategies. 2018. doi:10.1007/978-3-030-02792-6 | spa |
dc.relation.references | 3. Brooks C, Hoepner AGF, McMillan DG, Vivian A, Simen CW. Financial Data Science: The Birth of a New Financial Research Paradigm Complementing Econometrics? SSRN Electronic Journal. doi:10.2139/ssrn.3580729 | spa |
dc.relation.references | 4. White H. A Reality Check for Data Snooping. Econometrica. 2000. pp. 1097–1126. doi:10.1111/1468-0262.00152 | spa |
dc.relation.references | 5. de Prado ML. Advances in Financial Machine Learning: Lecture 3/10. SSRN Electronic Journal. doi:10.2139/ssrn.3257419 | spa |
dc.relation.references | 6. De Gooijer JG, Hyndman RJ. 25 years of time series forecasting. International Journal of Forecasting. 2006. pp. 443–473. doi:10.1016/j.ijforecast.2006.01.001 | spa |
dc.relation.references | 7. Box GEP, Jenkins GM. Time Series Analysis: Forecasting and Control, Revised Ed. 1976. | spa |
dc.relation.references | 8. Cont R. Empirical properties of asset returns: stylized facts and statistical issues. Quant Finance. 2001;1: 223–236. | spa |
dc.relation.references | 9. Tsay RS. Analysis of Financial Time Series. Wiley Series in Probability and Statistics. 2005. doi:10.1002/0471746193 | spa |
dc.relation.references | 10. Zhou X, Pan Z, Hu G, Tang S, Zhao C. Stock Market Prediction on High-Frequency Data Using Generative Adversarial Nets. Mathematical Problems in Engineering. 2018. pp. 1–11. doi:10.1155/2018/4907423 | spa |
dc.relation.references | 11. Dacorogna M et al. An Introduction to High-Frequency Finance. 2001. doi:10.1016/b978-0-12-279671-5.x5000-x | spa |
dc.relation.references | 12. Rydberg TH. Realistic Statistical Modelling of Financial Data. Int Stat Rev. 2000;68: 233–258. | spa |
dc.relation.references | 13. Cartea Á, Jaimungal S, Ricci J. Algorithmic Trading, Stochastic Control, and Mutually Exciting Processes. SIAM Review. 2018. pp. 673–703. doi:10.1137/18m1176968 | spa |
dc.relation.references | 14. Cartea Á, Jaimungal S. Modeling Asset Prices for Algorithmic and High Frequency Trading. SSRN Electronic Journal. doi:10.2139/ssrn.1722202 | spa |
dc.relation.references | 15. Agudelo DA, Giraldo S, Villarraga E. Does PIN measure information? Informed trading effects on returns and liquidity in six emerging markets. International Review of Economics & Finance. 2015. pp. 149–161. doi:10.1016/j.iref.2015.04.002 | spa |
dc.relation.references | 16. Glosten LR, Milgrom PR. Bid, ask and transaction prices in a specialist market with heterogeneously informed traders. Journal of Financial Economics. 1985. pp. 71–100. doi:10.1016/0304-405x(85)90044-3 | spa |
dc.relation.references | 17. Kyle AS. Continuous Auctions and Insider Trading. Econometrica. 1985. p. 1315. doi:10.2307/1913210 | spa |
dc.relation.references | 18. Aldridge I. High-Frequency Trading: A Practical Guide to Algorithmic Strategies and Trading Systems. John Wiley and Sons; 2009. | spa |
dc.relation.references | 19. Lee CMC, Ready MJ. Inferring Trade Direction from Intraday Data. The Journal of Finance. 1991. pp. 733–746. doi:10.1111/j.1540-6261.1991.tb02683.x | spa |
dc.relation.references | 20. Easley D, de Prado MML, O’Hara M. Flow Toxicity and Liquidity in a High-frequency World. Review of Financial Studies. 2012. pp. 1457–1493. doi:10.1093/rfs/hhs053 | spa |
dc.relation.references | 21. Vanstone B, Hahn T. Data Characteristics for High-Frequency Trading Systems. The Handbook of High Frequency Trading. 2015. pp. 47–57. doi:10.1016/b978-0-12-802205-4.00003-8 | spa |
dc.relation.references | 22. Sirignano J, Cont R. Universal Features of Price Formation in Financial Markets: Perspectives From Deep Learning. SSRN Electronic Journal. doi:10.2139/ssrn.3141294 | spa |
dc.relation.references | 23. Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative Adversarial Networks. arXiv [stat.ML]. 2014. Available: http://arxiv.org/abs/1406.2661 | spa |
dc.relation.references | 24. Creswell A, White T, Dumoulin V, Arulkumaran K, Sengupta B, Bharath AA. Generative Adversarial Networks: An Overview. IEEE Signal Processing Magazine. 2018. pp. 53–65. doi:10.1109/msp.2017.2765202 | spa |
dc.relation.references | 25. Donahue J, Krähenbühl P, Darrell T. Adversarial Feature Learning. arXiv [cs.LG]. 2016. Available: http://arxiv.org/abs/1605.09782 | spa |
dc.relation.references | 26. Salehi P, Chalechale A, Taghizadeh M. Generative Adversarial Networks (GANs): An Overview of Theoretical Model, Evaluation Metrics, and Recent Developments. arXiv [cs.CV]. 2020. Available: http://arxiv.org/abs/2005.13178 | spa |
dc.relation.references | 27. Radford A, Metz L, Chintala S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv [cs.LG]. 2015. Available: http://arxiv.org/abs/1511.06434 | spa |
dc.relation.references | 28. Mirza M, Osindero S. Conditional Generative Adversarial Nets. arXiv [cs.LG]. 2014. Available: http://arxiv.org/abs/1411.1784 | spa |
dc.relation.references | 29. Chen X, Duan Y, Houthooft R, Schulman J, Sutskever I, Abbeel P. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. arXiv [cs.LG]. 2016. Available: http://arxiv.org/abs/1606.03657 | spa |
dc.relation.references | 30. Odena A, Olah C, Shlens J. Conditional Image Synthesis With Auxiliary Classifier GANs. arXiv [stat.ML]. 2016. Available: http://arxiv.org/abs/1610.09585 | spa |
dc.relation.references | 31. Odena A. Semi-Supervised Learning with Generative Adversarial Networks. arXiv [stat.ML]. 2016. Available: http://arxiv.org/abs/1606.01583 | spa |
dc.relation.references | 32. Brownlee J. Generative Adversarial Networks with Python: Deep Learning Generative Models for Image Synthesis and Image Translation. Machine Learning Mastery; 2019. | spa |
dc.relation.references | 33. Makhzani A, Shlens J, Jaitly N, Goodfellow I, Frey B. Adversarial Autoencoders. arXiv [cs.LG]. 2015. Available: http://arxiv.org/abs/1511.05644 | spa |
dc.relation.references | 34. Dumoulin V, Belghazi I, Poole B, Mastropietro O, Lamb A, Arjovsky M, et al. Adversarially Learned Inference. arXiv [stat.ML]. 2016. Available: http://arxiv.org/abs/1606.00704 | spa |
dc.relation.references | 35. Larsen ABL, Sønderby SK, Larochelle H, Winther O. Autoencoding beyond pixels using a learned similarity metric. arXiv [cs.LG]. 2015. Available: http://arxiv.org/abs/1512.09300 | spa |
dc.relation.references | 36. Metz L, Poole B, Pfau D, Sohl-Dickstein J. Unrolled Generative Adversarial Networks. arXiv [cs.LG]. 2016. Available: http://arxiv.org/abs/1611.02163 | spa |
dc.relation.references | 37. Arjovsky M, Chintala S, Bottou L. Wasserstein Generative Adversarial Networks. In: Precup D, Teh YW, editors. International Convention Centre, Sydney, Australia: PMLR; 2017. pp. 214–223. | spa |
dc.relation.references | 38. Petzka H, Fischer A, Lukovnikov D. On the regularization of Wasserstein GANs. arXiv [stat.ML]. 2017. Available: http://arxiv.org/abs/1709.08894 | spa |
dc.relation.references | 39. Zhu J-Y, Park T, Isola P, Efros AA. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. arXiv [cs.CV]. 2017. Available: http://arxiv.org/abs/1703.10593 | spa |
dc.relation.references | 40. Isola P, Zhu J-Y, Zhou T, Efros AA. Image-to-Image Translation with Conditional Adversarial Networks. arXiv [cs.CV]. 2016. Available: http://arxiv.org/abs/1611.07004 | spa |
dc.relation.references | 41. Ledig C, Theis L, Huszar F, Caballero J, Cunningham A, Acosta A, et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. arXiv [cs.CV]. 2016. Available: http://arxiv.org/abs/1609.04802 | spa |
dc.relation.references | 42. Wang X, Yu K, Wu S, Gu J, Liu Y, Dong C, et al. ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks. arXiv [cs.CV]. 2018. Available: http://arxiv.org/abs/1809.00219 | spa |
dc.relation.references | 43. Such FP, Rawal A, Lehman J, Stanley KO, Clune J. Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data. arXiv [cs.LG]. 2019. Available: http://arxiv.org/abs/1912.07768 | spa |
dc.relation.references | 44. Borji A. Pros and Cons of GAN Evaluation Measures. arXiv [cs.CV]. 2018. Available: http://arxiv.org/abs/1802.03446 | spa |
dc.relation.references | 45. Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A, Chen X, et al. Improved Techniques for Training GANs. In: Lee DD, Sugiyama M, Luxburg UV, Guyon I, Garnett R, editors. Advances in Neural Information Processing Systems 29. Curran Associates, Inc.; 2016. pp. 2234–2242. | spa |
dc.relation.references | 46. Heusel M, Ramsauer H, Unterthiner T, Nessler B, Hochreiter S. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. arXiv [cs.LG]. 2017. Available: http://arxiv.org/abs/1706.08500 | spa |
dc.relation.references | 47. Frid-Adar M, Klang E, Amitai M, Goldberger J, Greenspan H. Synthetic Data Augmentation using GAN for Improved Liver Lesion Classification. arXiv [cs.CV]. 2018. Available: http://arxiv.org/abs/1801.02385 | spa |
dc.relation.references | 48. Shrivastava A, Pfister T, Tuzel O, Susskind J, Wang W, Webb R. Learning from Simulated and Unsupervised Images through Adversarial Training. arXiv [cs.CV]. 2016. Available: http://arxiv.org/abs/1612.07828 | spa |
dc.relation.references | 49. Antoniou A, Storkey A, Edwards H. Data Augmentation Generative Adversarial Networks. arXiv [stat.ML]. 2017. Available: http://arxiv.org/abs/1711.04340 | spa |
dc.relation.references | 50. Motamed S, Khalvati F. Inception Augmentation Generative Adversarial Network. arXiv [cs.CV]. 2020. Available: http://arxiv.org/abs/2006.03622 | spa |
dc.relation.references | 51. Lee H, Kim J, Kim EK, Kim S. Wasserstein Generative Adversarial Networks Based Data Augmentation for Radar Data Analysis. NATO Adv Sci Inst Ser E Appl Sci. 2020;10: 1449 | spa |
dc.relation.references | 52. Zhang K, Zhong G, Dong J, Wang S, Wang Y. Stock Market Prediction Based on Generative Adversarial Network. Procedia Comput Sci. 2019;147: 400–406. | spa |
dc.relation.references | 53. Koshiyama A, Firoozye N, Treleaven P. Generative Adversarial Networks for Financial Trading Strategies Fine-Tuning and Combination. arXiv [cs.LG]. 2019. Available: http://arxiv.org/abs/1901.01751 | spa |
dc.relation.references | 54. Wiese M, Knobloch R, Korn R, Kretschmer P. Quant GANs: deep generation of financial time series. Quantitative Finance. 2020. pp. 1–22. doi:10.1080/14697688.2020.1730426 | spa |
dc.relation.references | 55. Yoon J, Jarrett D, van der Schaar M. Time-series Generative Adversarial Networks. In: Wallach H, Larochelle H, Beygelzimer A, d'Alché-Buc F, Fox E, Garnett R, editors. Advances in Neural Information Processing Systems 32. Curran Associates, Inc.; 2019. pp. 5508–5518. | spa |
dc.relation.references | 56. Esteban C, Hyland SL, Rätsch G. Real-valued (Medical) Time Series Generation with Recurrent Conditional GANs. arXiv [stat.ML]. 2017. Available: http://arxiv.org/abs/1706.02633 | spa |
dc.relation.references | 57. Hartmann KG, Schirrmeister RT, Ball T. EEG-GAN: Generative adversarial networks for electroencephalograhic (EEG) brain signals. arXiv [eess.SP]. 2018. Available: http://arxiv.org/abs/1806.01875 | spa |
dc.relation.references | 58. Wiese M, Bai L, Wood B, Buehler H. Deep Hedging: Learning to Simulate Equity Option Markets. arXiv [q-fin.CP]. 2019. Available: http://arxiv.org/abs/1911.01700 | spa |
dc.relation.references | 59. Takahashi S, Chen Y, Tanaka-Ishii K. Modeling financial time-series with generative adversarial networks. Physica A: Statistical Mechanics and its Applications. 2019;527: 121261. | spa |
dc.relation.references | 60. Guo Z, Wan Y, Ye H. A data imputation method for multivariate time series based on generative adversarial network. Neurocomputing. 2019;360: 185–197. | spa |
dc.relation.references | 61. Marti G. CORRGAN: Sampling Realistic Financial Correlation Matrices Using Generative Adversarial Networks. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2020. doi:10.1109/icassp40776.2020.9053276 | spa |
dc.relation.references | 62. Li D, Chen D, Jin B, Shi L, Goh J, Ng S-K. MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks. Artificial Neural Networks and Machine Learning – ICANN 2019: Text and Time Series. 2019. pp. 703–716. doi:10.1007/978-3-030-30490-4_56 | spa |
dc.rights | Derechos reservados - Universidad Nacional de Colombia | spa |
dc.rights.accessrights | info:eu-repo/semantics/openAccess | spa |
dc.rights.license | Atribución-NoComercial-SinDerivadas 4.0 Internacional | spa |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | spa |
dc.subject.ddc | 000 - Ciencias de la computación, información y obras generales::006 - Métodos especiales de computación | spa |
dc.subject.lemb | Finanzas - Modelos estocásticos | |
dc.subject.lemb | Análisis de series de tiempo | |
dc.subject.lemb | Análisis estocástico | |
dc.subject.proposal | Redes Neuronales | spa |
dc.subject.proposal | GAN | eng |
dc.subject.proposal | Simulación | spa |
dc.subject.proposal | Data Augmentation | eng |
dc.subject.proposal | Overfitting | eng |
dc.subject.proposal | Modelo generativo | spa |
dc.subject.proposal | Deep Learning | eng |
dc.title | Generación de series de tiempo financieras sintéticas para "data augmentation" usando redes neuronales generativas adversarias (GAN) | spa |
dc.title.translated | Generation of synthetic financial time series for "data augmentation" using generative adversarial networks (GAN) | |
dc.type | Trabajo de grado - Maestría | spa |
dc.type.coar | http://purl.org/coar/resource_type/c_bdcc | spa |
dc.type.coarversion | http://purl.org/coar/version/c_ab4af688f83e57aa | spa |
dc.type.content | Text | spa |
dc.type.driver | info:eu-repo/semantics/masterThesis | spa |
dc.type.redcol | http://purl.org/redcol/resource_type/TM | spa |
dc.type.version | info:eu-repo/semantics/acceptedVersion | spa |
oaire.accessrights | http://purl.org/coar/access_right/c_abf2 | spa |
Files
Original bundle
1 - 1 of 1
- Name: 71783907.2021.pdf
- Size: 3.02 MB
- Format: Adobe Portable Document Format
- Description: Maestría en Ingeniería - Analítica