Evaluating the impact of curriculum learning on the training process for an intelligent agent in a video game

dc.contributor.advisor: Camargo Mendoza, Jorge Eliécer
dc.contributor.author: Sáenz Imbacuán, Rigoberto
dc.date.accessioned: 2020-11-06T14:33:26Z
dc.date.available: 2020-11-06T14:33:26Z
dc.date.issued: 2020-07-07
dc.description.abstract: We want to measure the impact of the curriculum learning technique on a reinforcement learning training setup. Several experiments were designed with different training curriculums adapted to the video game chosen as a case study; all were then executed on a selected game simulation platform, using two reinforcement learning algorithms and the mean cumulative reward as the performance measure. Results suggest that curriculum learning has a significant impact on the training process, increasing training times in some cases and decreasing them by up to 40% in others.
dc.description.abstract: Se desea medir el impacto de la técnica de aprendizaje por currículos sobre el tiempo de entrenamiento de un agente inteligente que está aprendiendo a jugar un videojuego usando aprendizaje por refuerzo. Para esto se diseñaron varios experimentos con diferentes currículos adaptados al videojuego seleccionado como caso de estudio, y se ejecutaron en una plataforma de simulación de juegos seleccionada, usando dos algoritmos de aprendizaje por refuerzo y midiendo su desempeño con la recompensa media acumulada. Los resultados sugieren que usar aprendizaje por currículos tiene un impacto significativo sobre el proceso de entrenamiento, en algunos casos alargando los tiempos de entrenamiento y en otros disminuyéndolos hasta en un 40%.
dc.description.additional: Research line: Reinforcement learning in video games. In this document we present the results of several experiments with curriculum learning applied to a game AI learning process, in order to measure its effects on learning time. Specifically, we trained an agent with a reinforcement learning algorithm to play a video game running on a game simulation platform; we then trained another agent under the same conditions but with a training curriculum, a set of rules that modify the learning environment at specific times so that it is easier for the agent to master at the beginning, and compared both results. Our initial hypothesis is that in some cases a training curriculum allows the agent to learn faster, reducing the required training time. We describe in detail the main elements of our work: the choice of the game simulation platform used to run the training experiments, the review of the reinforcement learning algorithms used to train the agent, the description of the video game selected as the case study, the parameters used to design the training curriculums, and the discussion of the results obtained.
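The description above characterizes a training curriculum as a set of rules that change the learning environment at specific times, with the mean cumulative reward as the performance measure. As a minimal illustrative sketch only (hypothetical names, not the thesis code and not the Unity ML-Agents API), such a lesson controller could advance the curriculum once the mean cumulative reward over a sliding window of episodes crosses a per-lesson threshold:

```python
from collections import deque

class CurriculumController:
    """Hypothetical sketch: advance to the next curriculum lesson when the
    mean cumulative reward over the last `window` episodes reaches the
    threshold set for the current lesson."""

    def __init__(self, thresholds, window=100):
        # thresholds[i]: mean cumulative reward required to leave lesson i
        self.thresholds = thresholds
        self.rewards = deque(maxlen=window)  # sliding window of episode returns
        self.lesson = 0

    def record_episode(self, cumulative_reward):
        self.rewards.append(cumulative_reward)
        window_full = len(self.rewards) == self.rewards.maxlen
        if (self.lesson < len(self.thresholds) and window_full
                and sum(self.rewards) / len(self.rewards) >= self.thresholds[self.lesson]):
            self.lesson += 1          # make the environment harder
            self.rewards.clear()      # restart the window for the new lesson
        return self.lesson
```

The returned lesson index would then drive the environment-modification rules (e.g., obstacle density or level size), which is the mechanism the experiments vary.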
dc.description.degreelevel: Maestría
dc.description.project: Evaluating the impact of curriculum learning on the training process for an intelligent agent in a video game
dc.format.extent: 97
dc.format.mimetype: application/pdf
dc.identifier.uri: https://repositorio.unal.edu.co/handle/unal/78592
dc.language.iso: eng
dc.publisher.branch: Universidad Nacional de Colombia - Sede Bogotá
dc.publisher.program: Bogotá - Ingeniería - Maestría en Ingeniería - Ingeniería de Sistemas y Computación
dc.relation.references: Bengio, Y., Louradour, J., Collobert, R., & Weston, J. (2009). Curriculum learning. Proceedings of the 26th International Conference on Machine Learning, ICML 2009, 41–48. https://dl.acm.org/doi/10.1145/1553374.1553380
dc.relation.references: Elman, J. L. (1993). Learning and development in neural networks: The importance of starting small. Cognition, 48, 71–99. https://doi.org/10.1016/S0010-0277(02)00106-3
dc.relation.references: Harris, C. (1991). Parallel distributed processing models and metaphors for language and development. Ph.D. dissertation, University of California, San Diego. https://elibrary.ru/item.asp?id=5839109
dc.relation.references: Juliani, A. (2017, December 8). Introducing ML-Agents Toolkit v0.2: Curriculum Learning, new environments, and more. https://blogs.unity3d.com/2017/12/08/introducing-ml-agents-v0-2-curriculum-learning-new-environments-and-more/
dc.relation.references: Gulcehre, C., Moczulski, M., Visin, F., & Bengio, Y. (2017). Mollifying networks. 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings. http://arxiv.org/abs/1608.04980
dc.relation.references: Allgower, E. L., & Georg, K. (2003). Introduction to numerical continuation methods. In Classics in Applied Mathematics (Vol. 45). Colorado State University. https://doi.org/10.1137/1.9780898719154
dc.relation.references: Justesen, N., Bontrager, P., Togelius, J., & Risi, S. (2017). Deep Learning for Video Game Playing. IEEE Transactions on Games, 12(1), 1–20. https://doi.org/10.1109/tg.2019.2896986
dc.relation.references: Bellemare, M. G., Naddaf, Y., Veness, J., & Bowling, M. (2013). The arcade learning environment: An evaluation platform for general agents. IJCAI International Joint Conference on Artificial Intelligence, 2013, 4148–4152. https://doi.org/10.1613/jair.3912
dc.relation.references: Montfort, N., & Bogost, I. (2009). Racing the beam: The Atari video computer system. MIT Press, Cambridge, Massachusetts. https://pdfs.semanticscholar.org/2e91/086740f228934e05c3de97f01bc58368d313.pdf
dc.relation.references: Bhonker, N., Rozenberg, S., & Hubara, I. (2017). Playing SNES in the Retro Learning Environment. https://arxiv.org/pdf/1611.02205.pdf
dc.relation.references: Buşoniu, L., Babuška, R., & De Schutter, B. (2010). Multi-agent reinforcement learning: An overview. Studies in Computational Intelligence, 310, 183–221. https://doi.org/10.1007/978-3-642-14435-6_7
dc.relation.references: Kempka, M., Wydmuch, M., Runc, G., Toczek, J., & Jaskowski, W. (2016). ViZDoom: A Doom-based AI research platform for visual reinforcement learning. IEEE Conference on Computational Intelligence and Games, CIG. https://doi.org/10.1109/CIG.2016.7860433
dc.relation.references: Beattie, C., Leibo, J. Z., Teplyashin, D., Ward, T., Wainwright, M., Küttler, H., Lefrancq, A., Green, S., Valdés, V., Sadik, A., Schrittwieser, J., Anderson, K., York, S., Cant, M., Cain, A., Bolton, A., Gaffney, S., King, H., Hassabis, D., … Petersen, S. (2016). DeepMind Lab. https://arxiv.org/pdf/1612.03801.pdf
dc.relation.references: Johnson, M., Hofmann, K., Hutton, T., & Bignell, D. (2016). The Malmo platform for artificial intelligence experimentation. Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16), 4246–4247. http://stella.sourceforge.net/
dc.relation.references: Synnaeve, G., Nardelli, N., Auvolat, A., Chintala, S., Lacroix, T., Lin, Z., Richoux, F., & Usunier, N. (2016). TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games. https://arxiv.org/pdf/1611.00625.pdf
dc.relation.references: Silva, V. do N., & Chaimowicz, L. (2017). MOBA: a New Arena for Game AI. https://arxiv.org/pdf/1705.10443.pdf
dc.relation.references: Karpov, I. V., Sheblak, J., & Miikkulainen, R. (2008). OpenNERO: A game platform for AI research and education. Proceedings of the 4th Artificial Intelligence and Interactive Digital Entertainment Conference, AIIDE 2008, 220–221. https://www.aaai.org/Papers/AIIDE/2008/AIIDE08-038.pdf
dc.relation.references: Juliani, A., Berges, V.-P., Teng, E., Cohen, A., Harper, J., Elion, C., Goy, C., Gao, Y., Henry, H., Mattar, M., & Lange, D. (2020). Unity: A General Platform for Intelligent Agents. https://arxiv.org/pdf/1809.02627.pdf
dc.relation.references: Juliani, A. (2017). Introducing: Unity Machine Learning Agents Toolkit. https://blogs.unity3d.com/2017/09/19/introducing-unity-machine-learning-agents/
dc.relation.references: Alpaydin, E. (2010). Introduction to Machine Learning (Second Edition). The MIT Press. https://kkpatel7.files.wordpress.com/2015/04/alppaydin_machinelearning_2010
dc.relation.references: Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction (Second Edition). The MIT Press. http://incompleteideas.net/sutton/book/RLbook2018.pdf
dc.relation.references: Wolfshaar, J. van de. (2017). Deep Reinforcement Learning of Video Games [University of Groningen, The Netherlands]. http://fse.studenttheses.ub.rug.nl/15851/1/Artificial_Intelligence_Deep_R_1.pdf
dc.relation.references: Legg, S., & Hutter, M. (2007). Universal intelligence: A definition of machine intelligence. Minds and Machines, 17(4), 391–444. https://doi.org/10.1007/s11023-007-9079-x
dc.relation.references: Schaul, T., Togelius, J., & Schmidhuber, J. (2011). Measuring Intelligence through Games. https://arxiv.org/pdf/1109.1314.pdf
dc.relation.references: Ortega, D. B., & Alonso, J. B. (2015). Machine Learning Applied to Pac-Man [Barcelona School of Informatics]. https://upcommons.upc.edu/bitstream/handle/2099.1/26448/108745.pdf
dc.relation.references: Lample, G., & Chaplot, D. S. (2016). Playing FPS Games with Deep Reinforcement Learning. https://arxiv.org/pdf/1609.05521.pdf
dc.relation.references: Adil, K., Jiang, F., Liu, S., Grigorev, A., Gupta, B. B., & Rho, S. (2017). Training an Agent for FPS Doom Game using Visual Reinforcement Learning and VizDoom. (IJACSA) International Journal of Advanced Computer Science and Applications, 8(12). https://pdfs.semanticscholar.org/74c3/5bb13e71cdd8b5a553a7e65d9ed125ce958e.pdf
dc.relation.references: Wang, E., Kosson, A., & Mu, T. (2017). Deep Action Conditional Neural Network for Frame Prediction in Atari Games. http://cs231n.stanford.edu/reports/2017/pdfs/602.pdf
dc.relation.references: Karttunen, J., Kanervisto, A., Kyrki, V., & Hautamäki, V. (2020). From Video Game to Real Robot: The Transfer between Action Spaces. https://arxiv.org/pdf/1905.00741.pdf
dc.relation.references: Martinez, M., Sitawarin, C., Finch, K., Meincke, L., Yablonski, A., & Kornhauser, A. (2017). Beyond Grand Theft Auto V for Training, Testing and Enhancing Deep Learning in Self Driving Cars [Princeton University]. https://arxiv.org/pdf/1712.01397.pdf
dc.relation.references: Singh, S., Barto, A. G., & Chentanez, N. (2005). Intrinsically Motivated Reinforcement Learning. http://www.cs.cornell.edu/~helou/IMRL.pdf
dc.relation.references: Rockstar Games. (2020). https://www.rockstargames.com/
dc.relation.references: Mattar, M., Shih, J., Berges, V.-P., Elion, C., & Goy, C. (2020). Announcing ML-Agents Unity Package v1.0! Unity Blog. https://blogs.unity3d.com/2020/05/12/announcing-ml-agents-unity-package-v1-0/
dc.relation.references: Bertsekas, D., & Tsitsiklis, J. (1996). Neuro-Dynamic Programming. In Encyclopedia of Optimization. Springer US. https://doi.org/10.1007/978-0-387-74759-0_440
dc.relation.references: Shao, K., Tang, Z., Zhu, Y., Li, N., & Zhao, D. (2019). A Survey of Deep Reinforcement Learning in Video Games. https://arxiv.org/pdf/1912.10944.pdf
dc.relation.references: Wu, Y., & Tian, Y. (2017). Training agent for first-person shooter game with actor-critic curriculum learning. ICLR 2017. https://openreview.net/pdf?id=Hk3mPK5gg
dc.relation.references: Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T. P., Harley, T., Silver, D., & Kavukcuoglu, K. (2016). Asynchronous Methods for Deep Reinforcement Learning. 33rd International Conference on Machine Learning. https://arxiv.org/pdf/1602.01783.pdf
dc.relation.references: Arulkumaran, K., Deisenroth, M. P., Brundage, M., & Bharath, A. A. (2017). A Brief Survey of Deep Reinforcement Learning. IEEE Signal Processing Magazine. https://doi.org/10.1109/MSP.2017.2743240
dc.relation.references: Narvekar, S., Peng, B., Leonetti, M., Sinapov, J., Taylor, M. E., & Stone, P. (2020). Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey. https://arxiv.org/pdf/2003.04960.pdf
dc.relation.references: Juliani, A., Berges, V.-P., Teng, E., Cohen, A., Harper, J., Elion, C., Goy, C., Gao, Y., Henry, H., Mattar, M., & Lange, D. (2020). Unity ML-Agents Toolkit. https://github.com/Unity-Technologies/ml-agents
dc.relation.references: Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://arxiv.org/pdf/1707.06347.pdf
dc.relation.references: Haarnoja, T., Zhou, A., Abbeel, P., & Levine, S. (2018). Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. https://arxiv.org/pdf/1801.01290.pdf
dc.relation.references: Weng, L. (2018). A (Long) Peek into Reinforcement Learning. Lil'Log. https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-learning.html
dc.relation.references: Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529–533. https://doi.org/10.1038/nature14236
dc.relation.references: Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., & Riedmiller, M. (2014). Deterministic Policy Gradient Algorithms. Proceedings of the 31st International Conference on Machine Learning. https://hal.inria.fr/file/index/docid/938992/filename/dpg-icml2014.pdf
dc.relation.references: Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., & Wierstra, D. (2016). Continuous control with deep reinforcement learning. ICLR 2016. https://arxiv.org/pdf/1509.02971.pdf
dc.relation.references: Barth-Maron, G., Hoffman, M. W., Budden, D., Dabney, W., Horgan, D., Tb, D., Muldal, A., Heess, N., & Lillicrap, T. (2018). Distributed distributional deterministic policy gradients. ICLR 2018. https://openreview.net/pdf?id=SyZipzbCb
dc.relation.references: Schulman, J., Levine, S., Moritz, P., Jordan, M. I., & Abbeel, P. (2015). Trust Region Policy Optimization. Proceedings of the 31st International Conference on Machine Learning. https://arxiv.org/pdf/1502.05477.pdf
dc.relation.references: Wang, Z., Bapst, V., Heess, N., Mnih, V., Munos, R., Kavukcuoglu, K., & de Freitas, N. (2017). Sample Efficient Actor-Critic with Experience Replay. ICLR 2017. https://arxiv.org/pdf/1611.01224.pdf
dc.relation.references: Wu, Y., Mansimov, E., Liao, S., Grosse, R., & Ba, J. (2017). Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation. https://arxiv.org/pdf/1708.05144.pdf
dc.relation.references: Fujimoto, S., van Hoof, H., & Meger, D. (2018). Addressing Function Approximation Error in Actor-Critic Methods. Proceedings of the 35th International Conference on Machine Learning. https://arxiv.org/pdf/1802.09477.pdf
dc.relation.references: Liu, Y., Ramachandran, P., Liu, Q., & Peng, J. (2017). Stein Variational Policy Gradient. https://arxiv.org/pdf/1704.02399.pdf
dc.relation.references: Espeholt, L., Soyer, H., Munos, R., Simonyan, K., Mnih, V., Ward, T., Doron, Y., Firoiu, V., Harley, T., Dunning, I., Legg, S., & Kavukcuoglu, K. (2018). IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. https://arxiv.org/pdf/1802.01561.pdf
dc.relation.references: Schulman, J., Klimov, O., Wolski, F., Dhariwal, P., & Radford, A. (2017). Proximal Policy Optimization. https://openai.com/blog/openai-baselines-ppo/
dc.relation.references: Haarnoja, T., Zhou, A., Hartikainen, K., Tucker, G., Ha, S., Tan, J., Kumar, V., Zhu, H., Gupta, A., Abbeel, P., & Levine, S. (2019). Soft Actor-Critic Algorithms and Applications. https://arxiv.org/pdf/1812.05905.pdf
dc.relation.references: Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
dc.relation.references: Wydmuch, M., Kempka, M., & Jaskowski, W. (2018). ViZDoom Competitions: Playing Doom from Pixels. IEEE Transactions on Games, 11(3), 248–259. https://doi.org/10.1109/tg.2018.2877047
dc.rights: Derechos reservados - Universidad Nacional de Colombia
dc.rights.accessrights: info:eu-repo/semantics/openAccess
dc.rights.license: Atribución-NoComercial 4.0 Internacional
dc.rights.spa: Acceso abierto
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/
dc.subject.ddc: 000 - Ciencias de la computación, información y obras generales
dc.subject.proposal: Aprendizaje por Currículos
dc.subject.proposal: Curriculum Learning
dc.subject.proposal: Aprendizaje por Refuerzo
dc.subject.proposal: Reinforcement Learning
dc.subject.proposal: Training Curriculum
dc.subject.proposal: Currículo de Entrenamiento
dc.subject.proposal: Media de Recompensa Acumulada
dc.subject.proposal: Mean Cumulative Reward
dc.subject.proposal: Proximal Policy Optimization
dc.subject.proposal: Optimización por Política Próxima
dc.subject.proposal: Videojuegos
dc.subject.proposal: Video Games
dc.subject.proposal: Game AI
dc.subject.proposal: Inteligencia Artificial en Videojuegos
dc.subject.proposal: Unity Machine Learning Agents
dc.subject.proposal: Agentes de Aprendizaje Automático de Unity
dc.subject.proposal: Kit de Herramientas de Aprendizaje Automático de Unity
dc.subject.proposal: Unity ML-Agents Toolkit
dc.subject.proposal: Unity Engine
dc.subject.proposal: Motor de Videojuegos Unity
dc.title: Evaluating the impact of curriculum learning on the training process for an intelligent agent in a video game
dc.title.alternative: Evaluando el impacto del aprendizaje por currículos en el proceso de entrenamiento de un agente inteligente en un videojuego
dc.type: Trabajo de grado - Maestría
dc.type.coar: http://purl.org/coar/resource_type/c_bdcc
dc.type.coarversion: http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.content: Text
dc.type.driver: info:eu-repo/semantics/masterThesis
dc.type.version: info:eu-repo/semantics/acceptedVersion
oaire.accessrights: http://purl.org/coar/access_right/c_abf2

Files

Original bundle
Showing 1 - 2 of 2

Name: 1018417749.2020.pdf
Size: 11.16 MB
Format: Adobe Portable Document Format

Name: 1018417749.2020.paper.pdf
Size: 1.25 MB
Format: Adobe Portable Document Format

License bundle
Showing 1 - 1 of 1

Name: license.txt
Size: 3.8 KB
Format: Item-specific license agreed upon to submission