Impacto de la inteligencia artificial generativa en la implementación de Test-Driven Development: evaluación de la calidad del código y su aplicabilidad en entornos académicos

dc.contributor.advisor: Aponte Melo, Jairo Hernán
dc.contributor.author: Álvarez Rodríguez, Óscar Eduardo
dc.contributor.researchgroup: Colectivo de Investigación en Ingeniería de Software Colswe
dc.date.accessioned: 2026-01-22T19:51:59Z
dc.date.available: 2026-01-22T19:51:59Z
dc.date.issued: 2025-12-01
dc.description: illustrations, diagrams
dc.description.abstract: Generative artificial intelligence (GenAI) has accelerated code writing, but there is still uncertainty regarding its risks for solution quality, security, and appropriateness. This work examines the integration of GenAI into Test-Driven Development (TDD) in a controlled academic setting. A two-condition design (with and without GenAI support) was employed in a TDD-oriented programming challenge, using homogeneous prior instruction, the same task description, and an equivalent time window. Product quality was evaluated using standard metrics (cyclomatic complexity and maintainability index), static analysis tools (Pylint/Flake8), security analysis (Bandit), and test coverage. In addition, participants’ perceptions were collected: before the exercise, regarding their own programming level and academic stage; and after the exercise, regarding perceived difficulty and the degree of completion achieved. The results do not show statistically significant differences in cyclomatic complexity or maintainability between the GenAI and non-GenAI groups. Some descriptive variations are observable between groups in graphical form, but these do not reach statistical significance and should be interpreted with caution. Test coverage does not exhibit a uniform pattern and depends on the quality of the test cases. For the security metric, Bandit did not report vulnerabilities in any of the projects, which is consistent with the limited scope of the challenge and constrains the possibility of drawing general conclusions about software security. Qualitative evidence suggests that GenAI may support faster progress and the initial structuring of code, but it requires prompting guidelines and supervision to avoid subtle errors and excessive dependence. Threats to validity are discussed, and practical guidelines are proposed for programming courses that wish to incorporate GenAI without displacing students’ own reasoning. (Text taken from the source.)
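Note on the cyclomatic complexity metric named in the abstract: the record does not say which tool computed it, so the following is only a minimal sketch of the idea, assuming the common simplification "1 + number of decision points". It uses Python's standard `ast` module; the function name `cyclomatic_complexity`, the set of node types counted, and the `SAMPLE` snippet are all hypothetical choices made for illustration, not the thesis's actual tooling.

```python
import ast

# Node types treated as decision points in this simplified count.
# This selection is an assumption; dedicated tools apply more detailed rules.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity of a snippet: 1 + decision points."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds one extra path per additional operand.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical student-style function used only to exercise the counter.
SAMPLE = """
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
"""

print(cyclomatic_complexity(SAMPLE))
```

The sample has an `if`, an `elif` (parsed as a nested `if`), a `for`, an inner `if`, and one `and`, so the sketch counts five decision points on top of the base path. A per-group comparison like the one the abstract describes would apply such a measure to each submitted project before any statistical test.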
dc.description.degreelevel: Master's
dc.description.degreename: Magíster en Ingeniería de Sistemas y Computación
dc.description.researcharea: Software engineering
dc.format.extent: 77 pages
dc.format.mimetype: application/pdf
dc.identifier.instname: Universidad Nacional de Colombia
dc.identifier.reponame: Repositorio Institucional Universidad Nacional de Colombia
dc.identifier.repourl: https://repositorio.unal.edu.co/
dc.identifier.uri: https://repositorio.unal.edu.co/handle/unal/89300
dc.language.iso: spa
dc.publisher: Universidad Nacional de Colombia
dc.publisher.branch: Universidad Nacional de Colombia - Sede Bogotá
dc.publisher.faculty: Facultad de Ingeniería
dc.publisher.place: Bogotá, Colombia
dc.publisher.program: Bogotá - Ingeniería - Maestría en Ingeniería - Ingeniería de Sistemas y Computación
dc.rights.accessrights: info:eu-repo/semantics/openAccess
dc.rights.license: Attribution 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject.ddc: 000 - Computer science, information and general works::005 - Computer programming, programs, data
dc.subject.proposal: Generative AI
dc.subject.proposal: TDD
dc.subject.proposal: Software quality
dc.subject.proposal: Maintainability
dc.subject.proposal: Cyclomatic complexity
dc.subject.proposal: Static analysis
dc.subject.proposal: Test coverage
dc.subject.unesco: Artificial intelligence
dc.subject.unesco: Computer science education
dc.subject.unesco: Computer programming
dc.subject.wikidata: software development
dc.title: Impacto de la inteligencia artificial generativa en la implementación de Test-Driven Development: evaluación de la calidad del código y su aplicabilidad en entornos académicos
dc.title.translated: Impact of generative artificial intelligence on the implementation of Test-Driven Development: evaluating code quality and its applicability in academic settings
dc.type: Master's thesis
dc.type.coar: http://purl.org/coar/resource_type/c_bdcc
dc.type.coarversion: http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.content: Text
dc.type.driver: info:eu-repo/semantics/masterThesis
dc.type.redcol: http://purl.org/redcol/resource_type/TM
dc.type.version: info:eu-repo/semantics/acceptedVersion
dcterms.audience.professionaldevelopment: Researchers
oaire.accessrights: http://purl.org/coar/access_right/c_abf2

Files

Original bundle

Name: Impacto de la inteligencia artificial generativa en la implementación de Test-Driven Development: evaluación de la calidad del código y su aplicabilidad en entornos académicos.pdf
Size: 6.48 MB
Format: Adobe Portable Document Format
Description: Master's thesis in Engineering - Systems and Computing Engineering
