Generación de series de tiempo financieras sintéticas para "data augmentation" usando redes neuronales generativas adversarias (GAN)

dc.contributor.advisorVilla Garzón, Fernán Alonso
dc.contributor.authorVillarraga Ossa, Edwin Fernando
dc.date.accessioned2021-03-25T20:36:10Z
dc.date.available2021-03-25T20:36:10Z
dc.date.issued2021-03-24
dc.description.abstractLos modelos GAN se han usado de forma exitosa para realizar aumento de datos en problemas relacionados con imágenes, audio y video, pues logran representar adecuadamente las propiedades de los datos reales, pero incorporando suficiente diversidad en los datos sintéticos generados como para poder mejorar el desempeño de los modelos de machine learning y deep learning en las evaluaciones por fuera de muestra. Las series de tiempo financieras se requieren para la modelación y solución de problemas en finanzas, sin embargo, dada la escasez de datos históricos, no solo originados por problemas de recolección de datos, sino también porque una serie de tiempo es solamente la realización de un proceso estocástico y por ende se presenta un sub muestreo. En este trabajo se generaron series de tiempo sintéticas usando DCGAN y cCGAN para generar datos de rendimientos, volúmenes, bid-ask spread, y precios con transformación fraccional, de acciones de Estados Unidos de América, con periodicidad diaria e intradiaria. Se pudo verificar que estos modelos GAN logran generar series simuladas que representan adecuadamente las propiedades distribucionales de las series históricas. Estas series sintéticas generadas pueden servir como insumo del tipo data augmentation en modelos de machine learning y deep learning para mejorar su desempeño con datos por fuera de muestra.spa
dc.description.abstractGAN models have been used successfully for data augmentation in problems involving images, audio, and video, because they adequately represent the properties of the real data while incorporating enough diversity in the generated synthetic data to improve the out-of-sample performance of machine learning and deep learning models. Financial time series are required for modeling and solving problems in finance; however, historical data are scarce, not only because of data collection problems but also because a time series is only a single realization of a stochastic process, which results in undersampling. In this work, synthetic time series were generated using DCGAN and cCGAN to produce data on returns, volumes, bid-ask spreads, and prices with a fractional transformation for United States stocks, at daily and intraday frequencies. The results verify that these GAN models generate simulated series that adequately represent the distributional properties of the historical time series. These synthetic time series can serve as data augmentation input for machine learning and deep learning models to improve their out-of-sample performance.eng
dc.description.degreelevelMaestríaspa
dc.format.extent76 páginasspa
dc.format.mimetypeapplication/pdfspa
dc.identifier.instnameUniversidad Nacional de Colombiaspa
dc.identifier.reponameRepositorio Universidad Nacional de Colombiaspa
dc.identifier.repourlhttps://repositorio.unal.edu.co/spa
dc.identifier.urihttps://repositorio.unal.edu.co/handle/unal/79374
dc.language.isospaspa
dc.publisherUniversidad Nacional de Colombiaspa
dc.publisher.branchUniversidad Nacional de Colombia - Sede Medellínspa
dc.publisher.facultyMinasspa
dc.publisher.placeMedellínspa
dc.publisher.programMedellín - Minas - Maestría en Ingeniería - Analíticaspa
dc.relation.references1. Sezer OB, Gudelek MU, Ozbayoglu AM. Financial time series forecasting with deep learning : A systematic literature review: 2005–2019. Applied Soft Computing. 2020. p. 106181. doi:10.1016/j.asoc.2020.106181spa
dc.relation.references2. Kakushadze Z, Serur JA. 151 Trading Strategies. 2018. doi:10.1007/978-3-030-02792-6spa
dc.relation.references3. Brooks C, Hoepner AGF, McMillan DG, Vivian A, Simen CW. Financial Data Science: The Birth of a New Financial Research Paradigm Complementing Econometrics? SSRN Electronic Journal. doi:10.2139/ssrn.3580729spa
dc.relation.references4. White H. A Reality Check for Data Snooping. Econometrica. 2000. pp. 1097–1126. doi:10.1111/1468-0262.00152spa
dc.relation.references5. de Prado ML. Advances in Financial Machine Learning: Lecture 3/10. SSRN Electronic Journal. doi:10.2139/ssrn.3257419spa
dc.relation.references6. De Gooijer JG, Hyndman RJ. 25 years of time series forecasting. International Journal of Forecasting. 2006. pp. 443–473. doi:10.1016/j.ijforecast.2006.01.001spa
dc.relation.references7. Box GEP, Jenkins GM. Time Series Analysis: Forecasting and Control, Revised Ed. 1976.spa
dc.relation.references8. Cont R. Empirical properties of asset returns: stylized facts and statistical issues. Quant Finance. 2001;1: 223–236.spa
dc.relation.references9. Tsay RS. Analysis of Financial Time Series. Wiley Series in Probability and Statistics. 2005. doi:10.1002/0471746193spa
dc.relation.references10. Zhou X, Pan Z, Hu G, Tang S, Zhao C. Stock Market Prediction on High-Frequency Data Using Generative Adversarial Nets. Mathematical Problems in Engineering. 2018. pp. 1–11. doi:10.1155/2018/4907423spa
dc.relation.references11. Dacorogna M et al. An Introduction to High-Frequency Finance. 2001. doi:10.1016/b978-0-12-279671-5.x5000-xspa
dc.relation.references12. Rydberg TH. Realistic Statistical Modelling of Financial Data. Int Stat Rev. 2000;68: 233–258.spa
dc.relation.references13. Cartea Á, Jaimungal S, Ricci J. Algorithmic Trading, Stochastic Control, and Mutually Exciting Processes. SIAM Review. 2018. pp. 673–703. doi:10.1137/18m1176968spa
dc.relation.references14. Cartea Á, Jaimungal S. Modeling Asset Prices for Algorithmic and High Frequency Trading. SSRN Electronic Journal. doi:10.2139/ssrn.1722202spa
dc.relation.references15. Agudelo DA, Giraldo S, Villarraga E. Does PIN measure information? Informed trading effects on returns and liquidity in six emerging markets. International Review of Economics & Finance. 2015. pp. 149–161. doi:10.1016/j.iref.2015.04.002spa
dc.relation.references16. Glosten LR, Milgrom PR. Bid, ask and transaction prices in a specialist market with heterogeneously informed traders. Journal of Financial Economics. 1985. pp. 71–100. doi:10.1016/0304-405x(85)90044-3spa
dc.relation.references17. Kyle AS. Continuous Auctions and Insider Trading. Econometrica. 1985. p. 1315. doi:10.2307/1913210spa
dc.relation.references18. Aldridge I. High-Frequency Trading: A Practical Guide to Algorithmic Strategies and Trading Systems. John Wiley and Sons; 2009.spa
dc.relation.references19. Lee CMC, Ready MJ. Inferring Trade Direction from Intraday Data. The Journal of Finance. 1991. pp. 733–746. doi:10.1111/j.1540-6261.1991.tb02683.xspa
dc.relation.references20. Easley D, de Prado MML, O’Hara M. Flow Toxicity and Liquidity in a High-frequency World. Review of Financial Studies. 2012. pp. 1457–1493. doi:10.1093/rfs/hhs053spa
dc.relation.references21. Vanstone B, Hahn T. Data Characteristics for High-Frequency Trading Systems. The Handbook of High Frequency Trading. 2015. pp. 47–57. doi:10.1016/b978-0-12-802205-4.00003-8spa
dc.relation.references22. Sirignano J, Cont R. Universal Features of Price Formation in Financial Markets: Perspectives From Deep Learning. SSRN Electronic Journal. doi:10.2139/ssrn.3141294spa
dc.relation.references23. Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative Adversarial Networks. arXiv [stat.ML]. 2014. Available: http://arxiv.org/abs/1406.2661spa
dc.relation.references24. Creswell A, White T, Dumoulin V, Arulkumaran K, Sengupta B, Bharath AA. Generative Adversarial Networks: An Overview. IEEE Signal Processing Magazine. 2018. pp. 53–65. doi:10.1109/msp.2017.2765202spa
dc.relation.references25. Donahue J, Krähenbühl P, Darrell T. Adversarial Feature Learning. arXiv [cs.LG]. 2016. Available: http://arxiv.org/abs/1605.09782spa
dc.relation.references26. Salehi P, Chalechale A, Taghizadeh M. Generative Adversarial Networks (GANs): An Overview of Theoretical Model, Evaluation Metrics, and Recent Developments. arXiv [cs.CV]. 2020. Available: http://arxiv.org/abs/2005.13178spa
dc.relation.references27. Radford A, Metz L, Chintala S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv [cs.LG]. 2015. Available: http://arxiv.org/abs/1511.06434spa
dc.relation.references28. Mirza M, Osindero S. Conditional Generative Adversarial Nets. arXiv [cs.LG]. 2014. Available: http://arxiv.org/abs/1411.1784spa
dc.relation.references29. Chen X, Duan Y, Houthooft R, Schulman J, Sutskever I, Abbeel P. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. arXiv [cs.LG]. 2016. Available: http://arxiv.org/abs/1606.03657spa
dc.relation.references30. Odena A, Olah C, Shlens J. Conditional Image Synthesis With Auxiliary Classifier GANs. arXiv [stat.ML]. 2016. Available: http://arxiv.org/abs/1610.09585spa
dc.relation.references31. Odena A. Semi-Supervised Learning with Generative Adversarial Networks. arXiv [stat.ML]. 2016. Available: http://arxiv.org/abs/1606.01583spa
dc.relation.references32. Brownlee J. Generative Adversarial Networks with Python: Deep Learning Generative Models for Image Synthesis and Image Translation. Machine Learning Mastery; 2019.spa
dc.relation.references33. Makhzani A, Shlens J, Jaitly N, Goodfellow I, Frey B. Adversarial Autoencoders. arXiv [cs.LG]. 2015. Available: http://arxiv.org/abs/1511.05644spa
dc.relation.references34. Dumoulin V, Belghazi I, Poole B, Mastropietro O, Lamb A, Arjovsky M, et al. Adversarially Learned Inference. arXiv [stat.ML]. 2016. Available: http://arxiv.org/abs/1606.00704spa
dc.relation.references35. Larsen ABL, Sønderby SK, Larochelle H, Winther O. Autoencoding beyond pixels using a learned similarity metric. arXiv [cs.LG]. 2015. Available: http://arxiv.org/abs/1512.09300spa
dc.relation.references36. Metz L, Poole B, Pfau D, Sohl-Dickstein J. Unrolled Generative Adversarial Networks. arXiv [cs.LG]. 2016. Available: http://arxiv.org/abs/1611.02163spa
dc.relation.references37. Arjovsky M, Chintala S, Bottou L. Wasserstein Generative Adversarial Networks. In: Precup D, Teh YW, editors. International Convention Centre, Sydney, Australia: PMLR; 2017. pp. 214–223.spa
dc.relation.references38. Petzka H, Fischer A, Lukovnicov D. On the regularization of Wasserstein GANs. arXiv [stat.ML]. 2017. Available: http://arxiv.org/abs/1709.08894spa
dc.relation.references39. Zhu J-Y, Park T, Isola P, Efros AA. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. arXiv [cs.CV]. 2017. Available: http://arxiv.org/abs/1703.10593spa
dc.relation.references40. Isola P, Zhu J-Y, Zhou T, Efros AA. Image-to-Image Translation with Conditional Adversarial Networks. arXiv [cs.CV]. 2016. Available: http://arxiv.org/abs/1611.07004spa
dc.relation.references41. Ledig C, Theis L, Huszar F, Caballero J, Cunningham A, Acosta A, et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. arXiv [cs.CV]. 2016. Available: http://arxiv.org/abs/1609.04802spa
dc.relation.references42. Wang X, Yu K, Wu S, Gu J, Liu Y, Dong C, et al. ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks. arXiv [cs.CV]. 2018. Available: http://arxiv.org/abs/1809.00219spa
dc.relation.references43. Such FP, Rawal A, Lehman J, Stanley KO, Clune J. Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data. arXiv [cs.LG]. 2019. Available: http://arxiv.org/abs/1912.07768spa
dc.relation.references44. Borji A. Pros and Cons of GAN Evaluation Measures. arXiv [cs.CV]. 2018. Available: http://arxiv.org/abs/1802.03446spa
dc.relation.references45. Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A, Chen X, et al. Improved Techniques for Training GANs. In: Lee DD, Sugiyama M, Luxburg UV, Guyon I, Garnett R, editors. Advances in Neural Information Processing Systems 29. Curran Associates, Inc.; 2016. pp. 2234–2242.spa
dc.relation.references46. Heusel M, Ramsauer H, Unterthiner T, Nessler B, Hochreiter S. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. arXiv [cs.LG]. 2017. Available: http://arxiv.org/abs/1706.08500spa
dc.relation.references47. Frid-Adar M, Klang E, Amitai M, Goldberger J, Greenspan H. Synthetic Data Augmentation using GAN for Improved Liver Lesion Classification. arXiv [cs.CV]. 2018. Available: http://arxiv.org/abs/1801.02385spa
dc.relation.references48. Shrivastava A, Pfister T, Tuzel O, Susskind J, Wang W, Webb R. Learning from Simulated and Unsupervised Images through Adversarial Training. arXiv [cs.CV]. 2016. Available: http://arxiv.org/abs/1612.07828spa
dc.relation.references49. Antoniou A, Storkey A, Edwards H. Data Augmentation Generative Adversarial Networks. arXiv [stat.ML]. 2017. Available: http://arxiv.org/abs/1711.04340spa
dc.relation.references50. Motamed S, Khalvati F. Inception Augmentation Generative Adversarial Network. arXiv [cs.CV]. 2020. Available: http://arxiv.org/abs/2006.03622spa
dc.relation.references51. Lee H, Kim J, Kim EK, Kim S. Wasserstein Generative Adversarial Networks Based Data Augmentation for Radar Data Analysis. NATO Adv Sci Inst Ser E Appl Sci. 2020;10: 1449spa
dc.relation.references52. Zhang K, Zhong G, Dong J, Wang S, Wang Y. Stock Market Prediction Based on Generative Adversarial Network. Procedia Comput Sci. 2019;147: 400–406.spa
dc.relation.references53. Koshiyama A, Firoozye N, Treleaven P. Generative Adversarial Networks for Financial Trading Strategies Fine-Tuning and Combination. arXiv [cs.LG]. 2019. Available: http://arxiv.org/abs/1901.01751spa
dc.relation.references54. Wiese M, Knobloch R, Korn R, Kretschmer P. Quant GANs: deep generation of financial time series. Quantitative Finance. 2020. pp. 1–22. doi:10.1080/14697688.2020.1730426spa
dc.relation.references55. Yoon J, Jarrett D, van der Schaar M. Time-series Generative Adversarial Networks. In: Wallach H, Larochelle H, Beygelzimer A, d'Alché-Buc F, Fox E, Garnett R, editors. Advances in Neural Information Processing Systems 32. Curran Associates, Inc.; 2019. pp. 5508–5518.spa
dc.relation.references56. Esteban C, Hyland SL, Rätsch G. Real-valued (Medical) Time Series Generation with Recurrent Conditional GANs. arXiv [stat.ML]. 2017. Available: http://arxiv.org/abs/1706.02633spa
dc.relation.references57. Hartmann KG, Schirrmeister RT, Ball T. EEG-GAN: Generative adversarial networks for electroencephalographic (EEG) brain signals. arXiv [eess.SP]. 2018. Available: http://arxiv.org/abs/1806.01875spa
dc.relation.references58. Wiese M, Bai L, Wood B, Buehler H. Deep Hedging: Learning to Simulate Equity Option Markets. arXiv [q-fin.CP]. 2019. Available: http://arxiv.org/abs/1911.01700spa
dc.relation.references59. Takahashi S, Chen Y, Tanaka-Ishii K. Modeling financial time-series with generative adversarial networks. Physica A: Statistical Mechanics and its Applications. 2019;527: 121261.spa
dc.relation.references60. Guo Z, Wan Y, Ye H. A data imputation method for multivariate time series based on generative adversarial network. Neurocomputing. 2019;360: 185–197.spa
dc.relation.references61. Marti G. CORRGAN: Sampling Realistic Financial Correlation Matrices Using Generative Adversarial Networks. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2020. doi:10.1109/icassp40776.2020.9053276spa
dc.relation.references62. Li D, Chen D, Jin B, Shi L, Goh J, Ng S-K. MAD-GAN: Multivariate Anomaly Detection for Time Series Data with Generative Adversarial Networks. Artificial Neural Networks and Machine Learning – ICANN 2019: Text and Time Series. 2019. pp. 703–716. doi:10.1007/978-3-030-30490-4_56spa
dc.rightsDerechos reservados - Universidad Nacional de Colombiaspa
dc.rights.accessrightsinfo:eu-repo/semantics/openAccessspa
dc.rights.licenseAtribución-NoComercial-SinDerivadas 4.0 Internacionalspa
dc.rights.urihttp://creativecommons.org/licenses/by-nc-nd/4.0/spa
dc.subject.ddc000 - Ciencias de la computación, información y obras generales::006 - Métodos especiales de computaciónspa
dc.subject.lembFinanzas - Modelos estocásticos
dc.subject.lembAnálisis de series de tiempo
dc.subject.lembAnálisis estocástico
dc.subject.proposalRedes Neuronalesspa
dc.subject.proposalGANeng
dc.subject.proposalSimulaciónspa
dc.subject.proposalData Augmentationeng
dc.subject.proposalOverfittingeng
dc.subject.proposalModelo generativospa
dc.subject.proposalDeep Learningeng
dc.titleGeneración de series de tiempo financieras sintéticas para "data augmentation" usando redes neuronales generativas adversarias (GAN)spa
dc.title.translatedGeneration of synthetic financial time series for "data augmentation" using generative adversarial networks (GAN)
dc.typeTrabajo de grado - Maestríaspa
dc.type.coarhttp://purl.org/coar/resource_type/c_bdccspa
dc.type.coarversionhttp://purl.org/coar/version/c_ab4af688f83e57aaspa
dc.type.contentTextspa
dc.type.driverinfo:eu-repo/semantics/masterThesisspa
dc.type.redcolhttp://purl.org/redcol/resource_type/TMspa
dc.type.versioninfo:eu-repo/semantics/acceptedVersionspa
oaire.accessrightshttp://purl.org/coar/access_right/c_abf2spa

Files

Original bundle

Name: 71783907.2021.pdf
Size: 3.02 MB
Format: Adobe Portable Document Format
Description: Maestría en Ingeniería - Analítica
