Show simple item record

dc.rights.license: Attribution-NonCommercial 4.0 International
dc.contributor.advisor: Camargo Mendoza, Jorge Eliécer
dc.contributor.author: Sáenz Imbacuán, Rigoberto
dc.date.accessioned: 2020-11-06T14:33:26Z
dc.date.available: 2020-11-06T14:33:26Z
dc.date.issued: 2020-07-07
dc.identifier.uri: https://repositorio.unal.edu.co/handle/unal/78592
dc.description.abstract: We want to measure the impact of the curriculum learning technique on a reinforcement learning training setup. Several experiments were designed with different training curricula adapted to the video game chosen as a case study, and all were executed on a selected game simulation platform, using two reinforcement learning algorithms and the mean cumulative reward as the performance measure. Results suggest that curriculum learning has a significant impact on the training process, increasing training times in some cases and decreasing them by up to 40% in others.
dc.description.abstract: We want to measure the impact of the curriculum learning technique on the training time of an intelligent agent that is learning to play a video game using reinforcement learning. To do so, several experiments with different curricula adapted to the video game selected as the case study were designed and run on a selected game simulation platform, using two reinforcement learning algorithms and measuring performance with the mean cumulative reward. The results suggest that curriculum learning has a significant impact on the training process, lengthening training times in some cases and shortening them by up to 40% in others.
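
Both abstracts use the mean cumulative reward as the yardstick for comparing training runs. As a minimal sketch of how that metric is commonly computed (an illustration assuming per-episode returns averaged over a sliding window; the class name and window size are ours, not code from the thesis):

    # Minimal sketch (not from the thesis): tracking the mean cumulative
    # reward, i.e. the running mean of per-episode returns, the metric
    # used to compare curriculum and non-curriculum training runs.
    from collections import deque

    class MeanCumulativeReward:
        def __init__(self, window: int = 100):
            self.returns = deque(maxlen=window)  # returns of recent episodes
            self.current = 0.0                   # return of the episode in progress

        def step(self, reward: float, done: bool) -> None:
            self.current += reward
            if done:  # episode ended: record its cumulative reward
                self.returns.append(self.current)
                self.current = 0.0

        @property
        def mean(self) -> float:
            return sum(self.returns) / len(self.returns) if self.returns else 0.0

Plotting this statistic against training steps for both setups is what makes a claim like "up to 40% less training time" measurable: one checks when each curve first reaches a target reward level.
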
dc.format.extent: 97
dc.format.mimetype: application/pdf
dc.language.iso: eng
dc.rights: All rights reserved - Universidad Nacional de Colombia
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/
dc.subject.ddc: 000 - Computer science, information and general works
dc.title: Evaluating the impact of curriculum learning on the training process for an intelligent agent in a video game
dc.title.alternative: Evaluando el impacto del aprendizaje por currículos en el proceso de entrenamiento de un agente inteligente en un videojuego
dc.type: Other
dc.rights.spa: Open access
dc.description.project: Evaluating the impact of curriculum learning on the training process for an intelligent agent in a video game
dc.description.additional: Research line: Reinforcement learning in video games. In this document we present the results of several experiments with curriculum learning applied to a game AI learning process, measuring its effect on learning time. Specifically, we trained an agent using a reinforcement learning algorithm to play a video game running on a game simulation platform, then trained another agent under the same conditions but with a training curriculum: a set of rules that modify the learning environment at specific times so that it is easier for the agent to master at the beginning. We then compared both results. Our initial hypothesis is that in some cases a training curriculum allows the agent to learn faster, reducing the required training time. We describe in detail all the main elements of our work, including the choice of the game simulation platform used to run the training experiments, the review of the reinforcement learning algorithms used to train the agent, the description of the video game selected as the case study, the parameters used to design the training curricula, and the discussion of the results obtained.
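
The description above defines a training curriculum as a set of rules that modify the learning environment at specific times. The sketch below illustrates that idea with a lesson-based schedule, where a difficulty parameter advances once the agent's mean cumulative reward crosses a threshold; the parameter names and threshold values are illustrative assumptions, not the thesis's actual configuration (the experiments themselves configure curricula through the Unity ML-Agents Toolkit):

    # Hedged sketch of a lesson-based training curriculum. Parameter names
    # and thresholds are hypothetical; the real experiments configure this
    # through the Unity ML-Agents Toolkit instead.
    from dataclasses import dataclass

    @dataclass
    class Lesson:
        name: str
        difficulty: float        # environment parameter fixed during this lesson
        reward_threshold: float  # mean cumulative reward required to advance

    class Curriculum:
        def __init__(self, lessons: list):
            self.lessons = lessons
            self.index = 0  # start at the easiest lesson

        @property
        def current(self) -> Lesson:
            return self.lessons[self.index]

        def update(self, mean_cumulative_reward: float) -> None:
            # Advance to the next lesson once the current one is mastered.
            if (self.index < len(self.lessons) - 1
                    and mean_cumulative_reward >= self.current.reward_threshold):
                self.index += 1

    # Example schedule: start easy, finish at the full game difficulty.
    curriculum = Curriculum([
        Lesson("warm-up", difficulty=0.2, reward_threshold=0.5),
        Lesson("intermediate", difficulty=0.6, reward_threshold=0.8),
        Lesson("full-game", difficulty=1.0, reward_threshold=float("inf")),
    ])

Training without a curriculum corresponds to running only the final lesson from the start; the hypothesis is that the staged schedule reaches the same reward level in fewer steps.
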
dc.type.driver: info:eu-repo/semantics/other
dc.type.version: info:eu-repo/semantics/acceptedVersion
dc.publisher.program: Bogotá - Ingeniería - Maestría en Ingeniería - Ingeniería de Sistemas y Computación
dc.description.degreelevel: Master's
dc.publisher.branch: Universidad Nacional de Colombia - Sede Bogotá
dc.relation.references: Bengio, Y., Louradour, J., Collobert, R., & Weston, J. (2009). Curriculum learning. Proceedings of the 26th International Conference on Machine Learning, ICML 2009, 41–48. https://dl.acm.org/doi/10.1145/1553374.1553380
dc.relation.references: Elman, J. L. (1993). Learning and development in neural networks: The importance of starting small. Cognition, 48, 71–99. https://doi.org/10.1016/S0010-0277(02)00106-3
dc.relation.references: Harris, C. (1991). Parallel distributed processing models and metaphors for language and development. Ph.D. dissertation, University of California, San Diego. https://elibrary.ru/item.asp?id=5839109
dc.relation.references: Juliani, A. (2017, December 8). Introducing ML-Agents Toolkit v0.2: Curriculum Learning, new environments, and more. https://blogs.unity3d.com/2017/12/08/introducing-ml-agents-v0-2-curriculum-learning-new-environments-and-more/
dc.relation.references: Gulcehre, C., Moczulski, M., Visin, F., & Bengio, Y. (2017). Mollifying networks. 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings. http://arxiv.org/abs/1608.04980
dc.relation.references: Allgower, E. L., & Georg, K. (2003). Introduction to numerical continuation methods. Classics in Applied Mathematics (Vol. 45). https://doi.org/10.1137/1.9780898719154
dc.relation.references: Justesen, N., Bontrager, P., Togelius, J., & Risi, S. (2017). Deep Learning for Video Game Playing. IEEE Transactions on Games, 12(1), 1–20. https://doi.org/10.1109/tg.2019.2896986
dc.relation.references: Bellemare, M. G., Naddaf, Y., Veness, J., & Bowling, M. (2013). The arcade learning environment: An evaluation platform for general agents. IJCAI International Joint Conference on Artificial Intelligence, 2013, 4148–4152. https://doi.org/10.1613/jair.3912
dc.relation.references: Montfort, N., & Bogost, I. (2009). Racing the beam: The Atari video computer system. MIT Press, Cambridge, Massachusetts. https://pdfs.semanticscholar.org/2e91/086740f228934e05c3de97f01bc58368d313.pdf
dc.relation.references: Bhonker, N., Rozenberg, S., & Hubara, I. (2017). Playing SNES in the Retro Learning Environment. https://arxiv.org/pdf/1611.02205.pdf
dc.relation.references: Buşoniu, L., Babuška, R., & De Schutter, B. (2010). Multi-agent reinforcement learning: An overview. Studies in Computational Intelligence, 310, 183–221. https://doi.org/10.1007/978-3-642-14435-6_7
dc.relation.references: Kempka, M., Wydmuch, M., Runc, G., Toczek, J., & Jaskowski, W. (2016). ViZDoom: A Doom-based AI research platform for visual reinforcement learning. IEEE Conference on Computational Intelligence and Games, CIG. https://doi.org/10.1109/CIG.2016.7860433
dc.relation.references: Beattie, C., Leibo, J. Z., Teplyashin, D., Ward, T., Wainwright, M., Küttler, H., Lefrancq, A., Green, S., Valdés, V., Sadik, A., Schrittwieser, J., Anderson, K., York, S., Cant, M., Cain, A., Bolton, A., Gaffney, S., King, H., Hassabis, D., … Petersen, S. (2016). DeepMind Lab. https://arxiv.org/pdf/1612.03801.pdf
dc.relation.references: Johnson, M., Hofmann, K., Hutton, T., & Bignell, D. (2016). The Malmo platform for artificial intelligence experimentation. Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16), 4246–4247.
dc.relation.references: Synnaeve, G., Nardelli, N., Auvolat, A., Chintala, S., Lacroix, T., Lin, Z., Richoux, F., & Usunier, N. (2016). TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games. https://arxiv.org/pdf/1611.00625.pdf
dc.relation.references: Silva, V. do N., & Chaimowicz, L. (2017). MOBA: a New Arena for Game AI. https://arxiv.org/pdf/1705.10443.pdf
dc.relation.references: Karpov, I. V., Sheblak, J., & Miikkulainen, R. (2008). OpenNERO: A game platform for AI research and education. Proceedings of the 4th Artificial Intelligence and Interactive Digital Entertainment Conference, AIIDE 2008, 220–221. https://www.aaai.org/Papers/AIIDE/2008/AIIDE08-038.pdf
dc.relation.references: Juliani, A., Berges, V.-P., Teng, E., Cohen, A., Harper, J., Elion, C., Goy, C., Gao, Y., Henry, H., Mattar, M., & Lange, D. (2020). Unity: A General Platform for Intelligent Agents. https://arxiv.org/pdf/1809.02627.pdf
dc.relation.references: Juliani, A. (2017). Introducing: Unity Machine Learning Agents Toolkit. https://blogs.unity3d.com/2017/09/19/introducing-unity-machine-learning-agents/
dc.relation.references: Alpaydin, E. (2010). Introduction to Machine Learning (Second Edition). The MIT Press. https://kkpatel7.files.wordpress.com/2015/04/alppaydin_machinelearning_2010
dc.relation.references: Sutton, R. S., & Barto, A. G. (2018). Reinforcement Learning: An Introduction (Second Edition). The MIT Press. http://incompleteideas.net/sutton/book/RLbook2018.pdf
dc.relation.references: Wolfshaar, J. Van De. (2017). Deep Reinforcement Learning of Video Games [University of Groningen, The Netherlands]. http://fse.studenttheses.ub.rug.nl/15851/1/Artificial_Intelligence_Deep_R_1.pdf
dc.relation.references: Legg, S., & Hutter, M. (2007). Universal intelligence: A definition of machine intelligence. Minds and Machines, 17(4), 391–444. https://doi.org/10.1007/s11023-007-9079-x
dc.relation.references: Schaul, T., Togelius, J., & Schmidhuber, J. (2011). Measuring Intelligence through Games. https://arxiv.org/pdf/1109.1314.pdf
dc.relation.references: Ortega, D. B., & Alonso, J. B. (2015). Machine Learning Applied to Pac-Man [Barcelona School of Informatics]. https://upcommons.upc.edu/bitstream/handle/2099.1/26448/108745.pdf
dc.relation.references: Lample, G., & Chaplot, D. S. (2016). Playing FPS Games with Deep Reinforcement Learning. https://arxiv.org/pdf/1609.05521.pdf
dc.relation.references: Adil, K., Jiang, F., Liu, S., Grigorev, A., Gupta, B. B., & Rho, S. (2017). Training an Agent for FPS Doom Game using Visual Reinforcement Learning and VizDoom. International Journal of Advanced Computer Science and Applications (IJACSA), 8(12). https://pdfs.semanticscholar.org/74c3/5bb13e71cdd8b5a553a7e65d9ed125ce958e.pdf
dc.relation.references: Wang, E., Kosson, A., & Mu, T. (2017). Deep Action Conditional Neural Network for Frame Prediction in Atari Games. http://cs231n.stanford.edu/reports/2017/pdfs/602.pdf
dc.relation.references: Karttunen, J., Kanervisto, A., Kyrki, V., & Hautamäki, V. (2020). From Video Game to Real Robot: The Transfer between Action Spaces. https://arxiv.org/pdf/1905.00741.pdf
dc.relation.references: Martinez, M., Sitawarin, C., Finch, K., Meincke, L., Yablonski, A., & Kornhauser, A. (2017). Beyond Grand Theft Auto V for Training, Testing and Enhancing Deep Learning in Self Driving Cars [Princeton University]. https://arxiv.org/pdf/1712.01397.pdf
dc.relation.references: Singh, S., Barto, A. G., & Chentanez, N. (2005). Intrinsically Motivated Reinforcement Learning. http://www.cs.cornell.edu/~helou/IMRL.pdf
dc.relation.references: Rockstar Games. (2020). https://www.rockstargames.com/
dc.relation.references: Mattar, M., Shih, J., Berges, V.-P., Elion, C., & Goy, C. (2020). Announcing ML-Agents Unity Package v1.0! Unity Blog. https://blogs.unity3d.com/2020/05/12/announcing-ml-agents-unity-package-v1-0/
dc.relation.references: Bertsekas, D., & Tsitsiklis, J. (1996). Neuro-Dynamic Programming. In Encyclopedia of Optimization. Springer US. https://doi.org/10.1007/978-0-387-74759-0_440
dc.relation.references: Shao, K., Tang, Z., Zhu, Y., Li, N., & Zhao, D. (2019). A Survey of Deep Reinforcement Learning in Video Games. https://arxiv.org/pdf/1912.10944.pdf
dc.relation.references: Wu, Y., & Tian, Y. (2017). Training agent for first-person shooter game with actor-critic curriculum learning. ICLR 2017. https://openreview.net/pdf?id=Hk3mPK5gg
dc.relation.references: Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T. P., Harley, T., Silver, D., & Kavukcuoglu, K. (2016). Asynchronous Methods for Deep Reinforcement Learning. 33rd International Conference on Machine Learning. https://arxiv.org/pdf/1602.01783.pdf
dc.relation.references: Arulkumaran, K., Deisenroth, M. P., Brundage, M., & Bharath, A. A. (2017). A Brief Survey of Deep Reinforcement Learning. IEEE Signal Processing Magazine. https://doi.org/10.1109/MSP.2017.2743240
dc.relation.references: Narvekar, S., Peng, B., Leonetti, M., Sinapov, J., Taylor, M. E., & Stone, P. (2020). Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey. https://arxiv.org/pdf/2003.04960.pdf
dc.relation.references: Juliani, A., Berges, V.-P., Teng, E., Cohen, A., Harper, J., Elion, C., Goy, C., Gao, Y., Henry, H., Mattar, M., & Lange, D. (2020). Unity ML-Agents Toolkit. https://github.com/Unity-Technologies/ml-agents
dc.relation.references: Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal Policy Optimization Algorithms. https://arxiv.org/pdf/1707.06347.pdf
dc.relation.references: Haarnoja, T., Zhou, A., Abbeel, P., & Levine, S. (2018). Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. https://arxiv.org/pdf/1801.01290.pdf
dc.relation.references: Weng, L. (2018). A (Long) Peek into Reinforcement Learning. Lil Log. https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-learning.html
dc.relation.references: Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529–533. https://doi.org/10.1038/nature14236
dc.relation.references: Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., & Riedmiller, M. (2014). Deterministic Policy Gradient Algorithms. Proceedings of the 31st International Conference on Machine Learning. https://hal.inria.fr/file/index/docid/938992/filename/dpg-icml2014.pdf
dc.relation.references: Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., & Wierstra, D. (2016). Continuous control with deep reinforcement learning. ICLR 2016. https://arxiv.org/pdf/1509.02971.pdf
dc.relation.references: Barth-Maron, G., Hoffman, M. W., Budden, D., Dabney, W., Horgan, D., Tb, D., Muldal, A., Heess, N., & Lillicrap, T. (2018). Distributed distributional deterministic policy gradients. ICLR 2018. https://openreview.net/pdf?id=SyZipzbCb
dc.relation.references: Schulman, J., Levine, S., Moritz, P., Jordan, M. I., & Abbeel, P. (2015). Trust Region Policy Optimization. Proceedings of the 32nd International Conference on Machine Learning. https://arxiv.org/pdf/1502.05477.pdf
dc.relation.references: Wang, Z., Bapst, V., Heess, N., Mnih, V., Munos, R., Kavukcuoglu, K., & de Freitas, N. (2017). Sample Efficient Actor-Critic with Experience Replay. ICLR 2017. https://arxiv.org/pdf/1611.01224.pdf
dc.relation.references: Wu, Y., Mansimov, E., Liao, S., Grosse, R., & Ba, J. (2017). Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation. https://arxiv.org/pdf/1708.05144.pdf
dc.relation.references: Fujimoto, S., van Hoof, H., & Meger, D. (2018). Addressing Function Approximation Error in Actor-Critic Methods. Proceedings of the 35th International Conference on Machine Learning. https://arxiv.org/pdf/1802.09477.pdf
dc.relation.references: Liu, Y., Ramachandran, P., Liu, Q., & Peng, J. (2017). Stein Variational Policy Gradient. https://arxiv.org/pdf/1704.02399.pdf
dc.relation.references: Espeholt, L., Soyer, H., Munos, R., Simonyan, K., Mnih, V., Ward, T., Doron, Y., Firoiu, V., Harley, T., Dunning, I., Legg, S., & Kavukcuoglu, K. (2018). IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. https://arxiv.org/pdf/1802.01561.pdf
dc.relation.references: Schulman, J., Klimov, O., Wolski, F., Dhariwal, P., & Radford, A. (2017). Proximal Policy Optimization. OpenAI Blog. https://openai.com/blog/openai-baselines-ppo/
dc.relation.references: Haarnoja, T., Zhou, A., Hartikainen, K., Tucker, G., Ha, S., Tan, J., Kumar, V., Zhu, H., Gupta, A., Abbeel, P., & Levine, S. (2019). Soft Actor-Critic Algorithms and Applications. https://arxiv.org/pdf/1812.05905.pdf
dc.relation.references: Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
dc.relation.references: Wydmuch, M., Kempka, M., & Jaskowski, W. (2018). ViZDoom Competitions: Playing Doom from Pixels. IEEE Transactions on Games, 11(3), 248–259. https://doi.org/10.1109/tg.2018.2877047
dc.rights.accessrights: info:eu-repo/semantics/openAccess
dc.subject.proposal: Aprendizaje por Currículos
dc.subject.proposal: Curriculum Learning
dc.subject.proposal: Aprendizaje por Refuerzo
dc.subject.proposal: Reinforcement Learning
dc.subject.proposal: Training Curriculum
dc.subject.proposal: Currículo de Entrenamiento
dc.subject.proposal: Media de Recompensa Acumulada
dc.subject.proposal: Mean Cumulative Reward
dc.subject.proposal: Proximal Policy Optimization
dc.subject.proposal: Optimización por Política Próxima
dc.subject.proposal: Videojuegos
dc.subject.proposal: Video Games
dc.subject.proposal: Game AI
dc.subject.proposal: Inteligencia Artificial en Videojuegos
dc.subject.proposal: Unity Machine Learning Agents
dc.subject.proposal: Agentes de Aprendizaje Automático de Unity
dc.subject.proposal: Kit de Herramientas de Aprendizaje Automático de Unity
dc.subject.proposal: Unity ML-Agents Toolkit
dc.subject.proposal: Unity Engine
dc.subject.proposal: Motor de Videojuegos Unity
dc.type.coar: http://purl.org/coar/resource_type/c_1843
dc.type.coarversion: http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.content: Text
oaire.accessrights: http://purl.org/coar/access_right/c_abf2

