Show simple item record

dc.rights.license: Attribution-NonCommercial 4.0 International
dc.contributor.advisor: Camargo Mendoza, Jorge Eliécer
dc.contributor.author: Sáenz Imbacuán, Rigoberto
dc.date.accessioned: 2020-11-06T14:33:26Z
dc.date.available: 2020-11-06T14:33:26Z
dc.date.issued: 2020-07-07
dc.identifier.uri: https://repositorio.unal.edu.co/handle/unal/78592
dc.description.abstract: We want to measure the impact of the curriculum learning technique on a reinforcement learning training setup. Several experiments were designed with different training curricula adapted to the video game chosen as a case study, and all were executed on a selected game simulation platform, using two reinforcement learning algorithms and the mean cumulative reward as the performance measure. Results suggest that curriculum learning has a significant impact on the training process, increasing training times in some cases and decreasing them by up to 40% in others.
dc.description.abstract: We want to measure the impact of the curriculum learning technique on the training time of an intelligent agent that is learning to play a video game using reinforcement learning. To do so, several experiments with different curricula adapted to the video game selected as the case study were designed and run on a selected game simulation platform, using two reinforcement learning algorithms and measuring performance with the mean cumulative reward. The results suggest that curriculum learning has a significant impact on the training process, lengthening training times in some cases and shortening them by up to 40% in others.
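
Both abstracts use the mean cumulative reward as the yardstick for comparing training runs. As a minimal sketch of how that metric is commonly computed (an illustration assuming per-episode returns averaged over a sliding window; the class name and window size are ours, not code from the thesis):

    # Minimal sketch (not from the thesis): tracking the mean cumulative
    # reward, i.e. the running mean of per-episode returns, the metric
    # used to compare curriculum and non-curriculum training runs.
    from collections import deque

    class MeanCumulativeReward:
        def __init__(self, window: int = 100):
            self.returns = deque(maxlen=window)  # returns of recent episodes
            self.current = 0.0                   # return of the episode in progress

        def step(self, reward: float, done: bool) -> None:
            self.current += reward
            if done:  # episode ended: record its cumulative reward
                self.returns.append(self.current)
                self.current = 0.0

        @property
        def mean(self) -> float:
            return sum(self.returns) / len(self.returns) if self.returns else 0.0

Plotting this statistic against training steps for both setups is what makes a claim like "up to 40% less training time" measurable: one checks when each curve first reaches a target reward level.
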
dc.format.extent: 97
dc.format.mimetype: application/pdf
dc.language.iso: eng
dc.rights: All rights reserved - Universidad Nacional de Colombia
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/
dc.subject.ddc: 000 - Computer science, information and general works
dc.title: Evaluating the impact of curriculum learning on the training process for an intelligent agent in a video game
dc.title.alternative: Evaluando el impacto del aprendizaje por currículos en el proceso de entrenamiento de un agente inteligente en un videojuego
dc.type: Other
dc.rights.spa: Open access
dc.description.project: Evaluating the impact of curriculum learning on the training process for an intelligent agent in a video game
dc.description.additional: Research line: Reinforcement learning in video games. In this document we present the results of several experiments with curriculum learning applied to a game AI learning process, measuring its effect on learning time. Specifically, we trained an agent using a reinforcement learning algorithm to play a video game running on a game simulation platform, then trained another agent under the same conditions but with a training curriculum: a set of rules that modify the learning environment at specific times so that it is easier for the agent to master at the beginning. We then compared both results. Our initial hypothesis is that in some cases a training curriculum allows the agent to learn faster, reducing the required training time. We describe in detail all the main elements of our work, including the choice of the game simulation platform used to run the training experiments, the review of the reinforcement learning algorithms used to train the agent, the description of the video game selected as the case study, the parameters used to design the training curricula, and the discussion of the results obtained.
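
The description above defines a training curriculum as a set of rules that modify the learning environment at specific times. The sketch below illustrates that idea with a lesson-based schedule, where a difficulty parameter advances once the agent's mean cumulative reward crosses a threshold; the parameter names and threshold values are illustrative assumptions, not the thesis's actual configuration (the experiments themselves configure curricula through the Unity ML-Agents Toolkit):

    # Hedged sketch of a lesson-based training curriculum. Parameter names
    # and thresholds are hypothetical; the real experiments configure this
    # through the Unity ML-Agents Toolkit instead.
    from dataclasses import dataclass

    @dataclass
    class Lesson:
        name: str
        difficulty: float        # environment parameter fixed during this lesson
        reward_threshold: float  # mean cumulative reward required to advance

    class Curriculum:
        def __init__(self, lessons: list):
            self.lessons = lessons
            self.index = 0  # start at the easiest lesson

        @property
        def current(self) -> Lesson:
            return self.lessons[self.index]

        def update(self, mean_cumulative_reward: float) -> None:
            # Advance to the next lesson once the current one is mastered.
            if (self.index < len(self.lessons) - 1
                    and mean_cumulative_reward >= self.current.reward_threshold):
                self.index += 1

    # Example schedule: start easy, finish at the full game difficulty.
    curriculum = Curriculum([
        Lesson("warm-up", difficulty=0.2, reward_threshold=0.5),
        Lesson("intermediate", difficulty=0.6, reward_threshold=0.8),
        Lesson("full-game", difficulty=1.0, reward_threshold=float("inf")),
    ])

Training without a curriculum corresponds to running only the final lesson from the start; the hypothesis is that the staged schedule reaches the same reward level in fewer steps.
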
dc.type.driver: info:eu-repo/semantics/other
dc.type.version: info:eu-repo/semantics/acceptedVersion
dc.publisher.program: Bogotá - Ingeniería - Maestría en Ingeniería - Ingeniería de Sistemas y Computación
dc.description.degreelevel: Master's
dc.publisher.branch: Universidad Nacional de Colombia - Sede Bogotá
dc.relation.references: Bengio, Y., Louradour, J., Collobert, R., & Weston, J. (2009). Curriculum learning. Proceedings of the 26th International Conference on Machine Learning, ICML 2009, 41–48. https://dl.acm.org/doi/10.1145/1553374.1553380
dc.relation.references: Elman, J. L. (1993). Learning and development in neural networks: The importance of starting small. Cognition, 48, 71–99. https://doi.org/10.1016/S0010-0277(02)00106-3
dc.relation.references: Harris, C. (1991). Parallel distributed processing models and metaphors for language and development. Ph.D. dissertation, University of California, San Diego. https://elibrary.ru/item.asp?id=5839109
dc.relation.references: Juliani, A. (2017, December 8). Introducing ML-Agents Toolkit v0.2: Curriculum Learning, new environments, and more. https://blogs.unity3d.com/2017/12/08/introducing-ml-agents-v0-2-curriculum-learning-new-environments-and-more/
dc.relation.references: Gulcehre, C., Moczulski, M., Visin, F., & Bengio, Y. (2017). Mollifying networks. 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings. http://arxiv.org/abs/1608.04980
dc.relation.references: Allgower, E. L., & Georg, K. (2003). Introduction to numerical continuation methods. Classics in Applied Mathematics (Vol. 45). https://doi.org/10.1137/1.9780898719154
dc.relation.references: Justesen, N., Bontrager, P., Togelius, J., & Risi, S. (2017). Deep Learning for Video Game Playing. IEEE Transactions on Games, 12(1), 1–20. https://doi.org/10.1109/tg.2019.2896986
dc.relation.references: Bellemare, M. G., Naddaf, Y., Veness, J., & Bowling, M. (2013). The arcade learning environment: An evaluation platform for general agents. IJCAI International Joint Conference on Artificial Intelligence, 2013, 4148–4152. https://doi.org/10.1613/jair.3912
dc.relation.references: Montfort, N., & Bogost, I. (2009). Racing the beam: The Atari video computer system. MIT Press, Cambridge, Massachusetts. https://pdfs.semanticscholar.org/2e91/086740f228934e05c3de97f01bc58368d313.pdf
dc.relation.references: Bhonker, N., Rozenberg, S., & Hubara, I. (2017). Playing SNES in the Retro Learning Environment. https://arxiv.org/pdf/1611.02205.pdf
dc.relation.references: Buşoniu, L., Babuška, R., & De Schutter, B. (2010). Multi-agent reinforcement learning: An overview. Studies in Computational Intelligence, 310, 183–221. https://doi.org/10.1007/978-3-642-14435-6_7
dc.relation.references: Kempka, M., Wydmuch, M., Runc, G., Toczek, J., & Jaskowski, W. (2016). ViZDoom: A Doom-based AI research platform for visual reinforcement learning. IEEE Conference on Computational Intelligence and Games, CIG. https://doi.org/10.1109/CIG.2016.7860433
dc.relation.references: Beattie, C., Leibo, J. Z., Teplyashin, D., Ward, T., Wainwright, M., Küttler, H., Lefrancq, A., Green, S., Valdés, V., Sadik, A., Schrittwieser, J., Anderson, K., York, S., Cant, M., Cain, A., Bolton, A., Gaffney, S., King, H., Hassabis, D., … Petersen, S. (2016). DeepMind Lab. https://arxiv.org/pdf/1612.03801.pdf
dc.relation.references: Johnson, M., Hofmann, K., Hutton, T., & Bignell, D. (2016). The Malmo platform for artificial intelligence experimentation. Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16), 4246–4247.
dc.relation.references: Synnaeve, G., Nardelli, N., Auvolat, A., Chintala, S., Lacroix, T., Lin, Z., Richoux, F., & Usunier, N. (2016). TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games. https://arxiv.org/pdf/1611.00625.pdf
dc.relation.references: Silva, V. do N., & Chaimowicz, L. (2017). MOBA: a New Arena for Game AI. https://arxiv.org/pdf/1705.10443.pdf
dc.relation.references: Karpov, I. V., Sheblak, J., & Miikkulainen, R. (2008). OpenNERO: A game platform for AI research and education. Proceedings of the 4th Artificial Intelligence and Interactive Digital Entertainment Conference, AIIDE 2008, 220–221. https://www.aaai.org/Papers/AIIDE/2008/AIIDE08-038.pdf
dc.relation.references: Juliani, A., Berges, V.-P., Teng, E., Cohen, A., Harper, J., Elion, C., Goy, C., Gao, Y., Henry, H., Mattar, M., & Lange, D. (2020). Unity: A General Platform for Intelligent Agents. https://arxiv.org/pdf/1809.02627.pdf
dc.relation.references: Juliani, A. (2017). Introducing: Unity Machine Learning Agents Toolkit. https://blogs.unity3d.com/2017/09/19/introducing-unity-machine-learning-agents/
dc.relation.references: Alpaydin, E. (2010). Introduction to Machine Learning (Second Edition). The MIT Press. https://kkpatel7.files.wordpress.com/2015/04/alppaydin_machinelearning_2010
dc.relation.references: Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction (Second Edition). The MIT Press. http://incompleteideas.net/sutton/book/RLbook2018.pdf
dc.relation.references: Wolfshaar, J. Van De. (2017). Deep Reinforcement Learning of Video Games [University of Groningen, The Netherlands]. http://fse.studenttheses.ub.rug.nl/15851/1/Artificial_Intelligence_Deep_R_1.pdf
dc.relation.references: Legg, S., & Hutter, M. (2007). Universal intelligence: A definition of machine intelligence. Minds and Machines, 17(4), 391–444. https://doi.org/10.1007/s11023-007-9079-x
dc.relation.references: Schaul, T., Togelius, J., & Schmidhuber, J. (2011). Measuring Intelligence through Games. https://arxiv.org/pdf/1109.1314.pdf
dc.relation.references: Ortega, D. B., & Alonso, J. B. (2015). Machine Learning Applied to Pac-Man [Barcelona School of Informatics]. https://upcommons.upc.edu/bitstream/handle/2099.1/26448/108745.pdf
dc.relation.references: Lample, G., & Chaplot, D. S. (2016). Playing FPS Games with Deep Reinforcement Learning. https://arxiv.org/pdf/1609.05521.pdf
dc.relation.references: Adil, K., Jiang, F., Liu, S., Grigorev, A., Gupta, B. B., & Rho, S. (2017). Training an Agent for FPS Doom Game using Visual Reinforcement Learning and VizDoom. International Journal of Advanced Computer Science and Applications (IJACSA), 8(12). https://pdfs.semanticscholar.org/74c3/5bb13e71cdd8b5a553a7e65d9ed125ce958e.pdf
dc.relation.references: Wang, E., Kosson, A., & Mu, T. (2017). Deep Action Conditional Neural Network for Frame Prediction in Atari Games. http://cs231n.stanford.edu/reports/2017/pdfs/602.pdf
dc.relation.references: Karttunen, J., Kanervisto, A., Kyrki, V., & Hautamäki, V. (2020). From Video Game to Real Robot: The Transfer between Action Spaces. https://arxiv.org/pdf/1905.00741.pdf
dc.relation.references: Martinez, M., Sitawarin, C., Finch, K., Meincke, L., Yablonski, A., & Kornhauser, A. (2017). Beyond Grand Theft Auto V for Training, Testing and Enhancing Deep Learning in Self Driving Cars [Princeton University]. https://arxiv.org/pdf/1712.01397.pdf
dc.relation.references: Singh, S., Barto, A. G., & Chentanez, N. (2005). Intrinsically Motivated Reinforcement Learning. http://www.cs.cornell.edu/~helou/IMRL.pdf
dc.relation.references: Rockstar Games. (2020). https://www.rockstargames.com/
dc.relation.references: Mattar, M., Shih, J., Berges, V.-P., Elion, C., & Goy, C. (2020). Announcing ML-Agents Unity Package v1.0! Unity Blog. https://blogs.unity3d.com/2020/05/12/announcing-ml-agents-unity-package-v1-0/
dc.relation.references: Bertsekas, D., & Tsitsiklis, J. (1996). Neuro-Dynamic Programming. In Encyclopedia of Optimization. Springer US. https://doi.org/10.1007/978-0-387-74759-0_440
dc.relation.references: Shao, K., Tang, Z., Zhu, Y., Li, N., & Zhao, D. (2019). A Survey of Deep Reinforcement Learning in Video Games. https://arxiv.org/pdf/1912.10944.pdf
dc.relation.references: Wu, Y., & Tian, Y. (2017). Training agent for first-person shooter game with actor-critic curriculum learning. ICLR 2017. https://openreview.net/pdf?id=Hk3mPK5gg
dc.relation.references: Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T. P., Harley, T., Silver, D., & Kavukcuoglu, K. (2016). Asynchronous Methods for Deep Reinforcement Learning. 33rd International Conference on Machine Learning. https://arxiv.org/pdf/1602.01783.pdf
dc.relation.references: Arulkumaran, K., Deisenroth, M. P., Brundage, M., & Bharath, A. A. (2017). A Brief Survey of Deep Reinforcement Learning. IEEE Signal Processing Magazine. https://doi.org/10.1109/MSP.2017.2743240
dc.relation.references: Narvekar, S., Peng, B., Leonetti, M., Sinapov, J., Taylor, M. E., & Stone, P. (2020). Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey. https://arxiv.org/pdf/2003.04960.pdf
dc.relation.references: Juliani, A., Berges, V.-P., Teng, E., Cohen, A., Harper, J., Elion, C., Goy, C., Gao, Y., Henry, H., Mattar, M., & Lange, D. (2020). Unity ML-Agents Toolkit. https://github.com/Unity-Technologies/ml-agents
dc.relation.references: Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://arxiv.org/pdf/1707.06347.pdf
dc.relation.references: Haarnoja, T., Zhou, A., Abbeel, P., & Levine, S. (2018). Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. https://arxiv.org/pdf/1801.01290.pdf
dc.relation.references: Weng, L. (2018). A (Long) Peek into Reinforcement Learning. Lil Log. https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-learning.html
dc.relation.references: Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529–533. https://doi.org/10.1038/nature14236
dc.relation.references: Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., & Riedmiller, M. (2014). Deterministic Policy Gradient Algorithms. Proceedings of the 31st International Conference on Machine Learning. https://hal.inria.fr/file/index/docid/938992/filename/dpg-icml2014.pdf
dc.relation.references: Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., & Wierstra, D. (2016). Continuous control with deep reinforcement learning. ICLR 2016. https://arxiv.org/pdf/1509.02971.pdf
dc.relation.references: Barth-Maron, G., Hoffman, M. W., Budden, D., Dabney, W., Horgan, D., Tb, D., Muldal, A., Heess, N., & Lillicrap, T. (2018). Distributed distributional deterministic policy gradients. ICLR 2018. https://openreview.net/pdf?id=SyZipzbCb
dc.relation.references: Schulman, J., Levine, S., Moritz, P., Jordan, M. I., & Abbeel, P. (2015). Trust Region Policy Optimization. Proceedings of the 32nd International Conference on Machine Learning. https://arxiv.org/pdf/1502.05477.pdf
dc.relation.references: Wang, Z., Bapst, V., Heess, N., Mnih, V., Munos, R., Kavukcuoglu, K., & de Freitas, N. (2017). Sample Efficient Actor-Critic with Experience Replay. ICLR 2017. https://arxiv.org/pdf/1611.01224.pdf
dc.relation.references: Wu, Y., Mansimov, E., Liao, S., Grosse, R., & Ba, J. (2017). Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation. https://arxiv.org/pdf/1708.05144.pdf
dc.relation.references: Fujimoto, S., van Hoof, H., & Meger, D. (2018). Addressing Function Approximation Error in Actor-Critic Methods. Proceedings of the 35th International Conference on Machine Learning. https://arxiv.org/pdf/1802.09477.pdf
dc.relation.references: Liu, Y., Ramachandran, P., Liu, Q., & Peng, J. (2017). Stein Variational Policy Gradient. https://arxiv.org/pdf/1704.02399.pdf
dc.relation.references: Espeholt, L., Soyer, H., Munos, R., Simonyan, K., Mnih, V., Ward, T., Doron, Y., Firoiu, V., Harley, T., Dunning, I., Legg, S., & Kavukcuoglu, K. (2018). IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. https://arxiv.org/pdf/1802.01561.pdf
dc.relation.references: Schulman, J., Klimov, O., Wolski, F., Dhariwal, P., & Radford, A. (2017). Proximal Policy Optimization. OpenAI Blog. https://openai.com/blog/openai-baselines-ppo/
dc.relation.references: Haarnoja, T., Zhou, A., Hartikainen, K., Tucker, G., Ha, S., Tan, J., Kumar, V., Zhu, H., Gupta, A., Abbeel, P., & Levine, S. (2019). Soft Actor-Critic Algorithms and Applications. https://arxiv.org/pdf/1812.05905.pdf
dc.relation.references: Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
dc.relation.references: Wydmuch, M., Kempka, M., & Jaskowski, W. (2018). ViZDoom Competitions: Playing Doom from Pixels. IEEE Transactions on Games, 11(3), 248–259. https://doi.org/10.1109/tg.2018.2877047
dc.rights.accessrights: info:eu-repo/semantics/openAccess
dc.subject.proposal: Aprendizaje por Currículos
dc.subject.proposal: Curriculum Learning
dc.subject.proposal: Aprendizaje por Refuerzo
dc.subject.proposal: Reinforcement Learning
dc.subject.proposal: Training Curriculum
dc.subject.proposal: Currículo de Entrenamiento
dc.subject.proposal: Media de Recompensa Acumulada
dc.subject.proposal: Mean Cumulative Reward
dc.subject.proposal: Proximal Policy Optimization
dc.subject.proposal: Optimización por Política Próxima
dc.subject.proposal: Videojuegos
dc.subject.proposal: Video Games
dc.subject.proposal: Game AI
dc.subject.proposal: Inteligencia Artificial en Videojuegos
dc.subject.proposal: Unity Machine Learning Agents
dc.subject.proposal: Agentes de Aprendizaje Automático de Unity
dc.subject.proposal: Kit de Herramientas de Aprendizaje Automático de Unity
dc.subject.proposal: Unity ML-Agents Toolkit
dc.subject.proposal: Unity Engine
dc.subject.proposal: Motor de Videojuegos Unity
dc.type.coar: http://purl.org/coar/resource_type/c_1843
dc.type.coarversion: http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.content: Text
oaire.accessrights: http://purl.org/coar/access_right/c_abf2

