Método basado en aprendizaje automático para la calificación de ensayos cortos en inglés de una muestra de estudiantes de bachillerato

dc.contributor.advisorNiño Vásquez, Luis Fernandospa
dc.contributor.authorBofill Barrera, Joan Gabrielspa
dc.contributor.refereeLeón Guzmán, Elizabethspa
dc.contributor.researchgroupLaboratorio de Investigación en Sistemas Inteligentes LISIspa
dc.date.accessioned2024-05-28T22:15:47Z
dc.date.available2024-05-28T22:15:47Z
dc.date.issued2024
dc.descriptionilustraciones, diagramasspa
dc.description.abstractEste trabajo aborda el desafío de la calificación automática de ensayos argumentativos en inglés escritos por estudiantes de bachillerato que están aprendiendo el inglés como segunda lengua. El objetivo general es implementar un método automático basado en aprendizaje supervisado que permita resolver esta tarea para 6 indicadores en simultáneo: Cohesión, Sintaxis, Vocabulario, Gramática, Fraseología y Convenciones, en escala de 1 a 5. Para lograrlo, se realiza un análisis descriptivo de los datos, se aplican procedimientos de preprocesamiento y se extraen características relevantes; se exploran diferentes estrategias, técnicas de representación y modelos, desde algunos clásicos hasta aquellos con mejor desempeño en la actualidad, evaluando en cada iteración su rendimiento y contrastándolo con las calificaciones humanas. Luego, se presenta el modelo con menor error, basado principalmente en DeBERTa, al cual se le aplican distintas técnicas para mejorar su desempeño y se combina con un modelo SVR que toma como características los embeddings de los textos concatenados de 10 modelos preentrenados sin fine-tuning. Con esta estrategia, el resultado se acerca bastante a las calificaciones humanas, presentando un RMSE de 0.45 sobre todos los indicadores. (Texto tomado de la fuente).spa
dc.description.abstractThis work addresses the challenge of automatically grading argumentative essays in English written by high school students who are learning English as a second language. The general objective is to implement an automatic method based on supervised learning that solves this task for 6 indicators simultaneously: Cohesion, Syntax, Vocabulary, Grammar, Phraseology and Conventions, rated on a scale from 1 to 5. To achieve this, a descriptive analysis of the data is conducted, preprocessing procedures are applied and relevant features are extracted; different strategies, representation techniques and models are explored, from some classic ones to the currently best-performing models, evaluating their performance in each iteration by contrasting it with human ratings using a chosen measure. Then, the best-performing method is presented: it is based mainly on DeBERTa V3 Large, to which different techniques are applied to improve its performance, and it is combined with an SVR regressor that takes as features the concatenated embeddings of the texts from 10 different pretrained models. With this strategy, the result is quite close to human ratings, presenting a root mean square error of 0.45 over all indicators.eng
dc.description.degreelevelMaestríaspa
dc.description.degreenameMagíster en Ingeniería - Ingeniería de Sistemas y Computaciónspa
dc.description.researchareaSistemas inteligentesspa
dc.format.extentvii, 61 páginasspa
dc.format.mimetypeapplication/pdfspa
dc.identifier.instnameUniversidad Nacional de Colombiaspa
dc.identifier.reponameRepositorio Institucional Universidad Nacional de Colombiaspa
dc.identifier.repourlhttps://repositorio.unal.edu.co/spa
dc.identifier.urihttps://repositorio.unal.edu.co/handle/unal/86173
dc.language.isospaspa
dc.publisherUniversidad Nacional de Colombiaspa
dc.publisher.branchUniversidad Nacional de Colombia - Sede Bogotáspa
dc.publisher.facultyFacultad de Ingenieríaspa
dc.publisher.placeBogotá, Colombiaspa
dc.publisher.programBogotá - Ingeniería - Maestría en Ingeniería - Ingeniería de Sistemas y Computaciónspa
dc.relation.referencesP. Kline, The New Psychometrics: Science, Psychology and Measurement. Routledge, 1 ed., 1999.spa
dc.relation.referencesT. N. Fitria, “Artificial intelligence (AI) technology in OpenAI ChatGPT application: A review of ChatGPT in writing English essay.,” ELT Forum: Journal of English Language Teaching, vol. 12, no. 1, pp. 44–58, 2023.spa
dc.relation.referencesE. B. Page, “Grading Essays by Computer: Progress Report. Proceedings of the 1966 Invitational Conference on Testing Problems.,” Princeton, N.J. Educational Testing Service, pp. 87–100, 1967.spa
dc.relation.referencesE. Page, “The use of the computer in analyzing student essays,” Int Rev Educ, pp. 210–225, 1968.spa
dc.relation.referencesK. L. Gwet, “Handbook of inter-rater reliability: The definitive guide to measuring the extent of agreement among raters,” Advanced Analytics, LLC, 2014.spa
dc.relation.referencesM. D. Shermis, “Contrasting State-of-the-Art in the Machine Scoring of Short-Form Constructed Responses,” Educational Assessment, vol. 20, no. 1, pp. 46–65, 2015.spa
dc.relation.referencesA. Franklin, N. Rambis, M. Benner, P. Baffour, R. Holbrook, and S. Crossley, “Feedback Prize - English Language Learning,” Kaggle, 2022.spa
dc.relation.referencesS. A. Crossley, K. Kyle, and D. S. Mcnamara, “To Aggregate or Not? Linguistic Features in Automatic Essay Scoring and Feedback Systems,” Grantee Submission, vol. 8, no. 1, 2015.spa
dc.relation.referencesC. Ramineni and D. M. Williamson, “Automated essay scoring: Psychometric guidelines and practices,” Assessing Writing, pp. 25–39, 2013.spa
dc.relation.referencesS. P. Balfour, “Assessing Writing in MOOCs: Automated Essay Scoring and Calibrated PeerReview™,” Research & Practice in Assessment, pp. 40–48, 2013.spa
dc.relation.referencesS. Cushing Weigle, “Validation of automated scores of TOEFL iBT tasks against non-test indicators of writing ability,” Language Testing, vol. 27, no. 3, pp. 335–353, 2010.spa
dc.relation.referencesK. Taghipour, Robust trait-specific essay scoring using neural networks and density estimators. PhD thesis, National University of Singapore, Singapore, 2017.spa
dc.relation.referencesH. Shi and V. Aryadoust, “Correction to: A systematic review of automated writing evaluation systems,” Education and Information Technologies, vol. 28, pp. 6189–6190, 5 2023.spa
dc.relation.referencesP. C. Jackson, Toward human-level artificial intelligence: Representation and computation of meaning in natural language. Dover Publications, 11 2019.spa
dc.relation.referencesE. Mayfield and C. P. Rosé, “LightSIDE,” in Handbook of Automated Essay Evaluation, Routledge, 1 ed., 2013.spa
dc.relation.referencesM. Shermis and J. Burstein, Handbook of Automated Essay Evaluation: Current Applications and New Directions. Routledge, 1 ed., 2013.spa
dc.relation.referencesS. Burrows, I. Gurevych, and B. Stein, “The Eras and Trends of Automatic Short Answer Grading,” Int J Artif Intell Educ, vol. 25, pp. 60–117, 2015.spa
dc.relation.referencesK. Zupanc and Z. Bosnic, “Automated essay evaluation with semantic analysis,” Elsevier, vol. 120, pp. 118–132, 2017.spa
dc.relation.referencesD. Yan, A. A. Rupp, and P. W. Foltz, Handbook of Automated Scoring; Theory into Practice. Chapman and Hall/CRC., 1 ed., 2020.spa
dc.relation.referencesB. Kitchenham, O. Pearl Brereton, D. Budgen, M. Turner, J. Bailey, and S. Linkman, “Systematic literature reviews in software engineering – A systematic literature review,” Information and Software Technology, vol. 51, no. 1, pp. 7–15, 2009.spa
dc.relation.referencesY.-Y. Chen, C.-L. Liu, C.-H. Lee, and T.-H. Chang, “An unsupervised automated essay scoring system,” IEEE Intelligent Systems, vol. 25, no. 5, pp. 61–67, 2010.spa
dc.relation.referencesY. Wang, Z. Wei, Y. Zhou, and X. Huang, “Automatic essay scoring incorporating rating schema via reinforcement learning,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP, pp. 791–797, 2018.spa
dc.relation.referencesC. Lu and M. Cutumisu, “Integrating Deep Learning into An Automated Feedback Generation System for Automated Essay Scoring,” International Educational Data Mining Society, 2021.spa
dc.relation.referencesK. S. McCarthy, R. D. Roscoe, L. K. Allen, A. D. Likens, and D. S. McNamara, “Automated writing evaluation: Does spelling and grammar feedback support high-quality writing and revision?,” Assessing Writing, vol. 52, 4 2022.spa
dc.relation.referencesA. Sharma and D. B. Jayagopi, “Automated grading of handwritten essays,” in Proceedings of International Conference on Frontiers in Handwriting Recognition, ICFHR, vol. 2018-August, pp. 279–284, 2018.spa
dc.relation.referencesA. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention Is All You Need,” Neural Information Processing Systems, 2017.spa
dc.relation.referencesT. Pedersen, S. Patwardhan, and J. Michelizzi, “WordNet::Similarity-Measuring the Relatedness of Concepts,” AAAI, vol. 4, pp. 25–29, 7 2004.spa
dc.relation.referencesF. Dong and Y. Zhang, “Automatic Features for Essay Scoring – An Empirical Study. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing,” Association for Computational Linguistics., pp. 1072–1077, 11 2016.spa
dc.relation.referencesY. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “RoBERTa: A Robustly Optimized BERT Pretraining Approach,” arXiv preprint arXiv:1907.11692, 2019.spa
dc.relation.referencesE. Mayfield and A. W. Black, “Should You Fine-Tune BERT for Automated Essay Scoring?,” Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, pp. 151–162, 7 2020.spa
dc.relation.referencesD. Ramesh and S. K. Sanampudi, “An automated essay scoring systems: a systematic literature review,” Artificial Intelligence Review, vol. 55, pp. 2495–2527, 3 2022.spa
dc.relation.referencesH. Yannakoudakis and R. Cummins, “Evaluating the performance of automated text scoring systems,” Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 213–223, 2015.spa
dc.relation.referencesR. Bhatt, M. Patel, G. Srivastava, and V. Mago, “A Graph Based Approach to Automate Essay Evaluation,” in Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics, vol. 2020-October, pp. 4379–4385, 2020.spa
dc.relation.referencesZ. Ke and V. Ng, “Automated essay scoring: A survey of the state of the art,” in IJCAI International Joint Conference on Artificial Intelligence, vol. 2019-August, pp. 6300–6308, 2019.spa
dc.relation.referencesJ. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” arXiv preprint arXiv:1810.04805, 2018.spa
dc.relation.referencesD. Castro-Castro, R. Lannes-Losada, M. Maritxalar, I. Niebla, C. Pérez-Marqués, N. Álamo-Suárez, and A. Pons-Porrata, “A multilingual application for automated essay scoring,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5290 LNAI, pp. 243–251, 2008.spa
dc.relation.referencesP. U. Rodriguez, A. Jafari, and C. M. Ormerod, “Language models and Automated Essay Scoring,” ArXiv, 9 2019.spa
dc.relation.referencesS. Ghannay, B. Favre, Y. Estève, and N. Camelin, “Word Embeddings Evaluation and Combination,” Proceedings of the Tenth International Conference on Language Resources and Evaluation, pp. 300–305, 5 2016.spa
dc.relation.referencesM. Mars, “From Word Embeddings to Pre-Trained Language Models: A State-of-the-Art Walkthrough,” Applied Sciences (Switzerland), vol. 12, 9 2022.spa
dc.relation.referencesY. Zhang, R. Jin, and Z.-H. Zhou, “Understanding bag-of-words model: a statistical framework,” International journal of machine learning and cybernetics, vol. 1, pp. 43–52, 12 2010.spa
dc.relation.referencesK. W. Church, “Word2Vec,” Natural Language Engineering, vol. 23, pp. 155–162, 1 2017.spa
dc.relation.referencesJ. Pennington, R. Socher, and C. D. Manning, “GloVe: Global Vectors for Word Representation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1535–1543, 2014.spa
dc.relation.referencesV. Kumar and B. Subba, “A TfidfVectorizer and SVM based sentiment analysis framework for text data corpus,” in 2020 National Conference on Communications (NCC), pp. 1–6, IEEE, 2 2020.spa
dc.relation.referencesMicrosoft, “GitHub - Microsoft/LightGBM:Light Gradient Boosting Machine.”spa
dc.relation.referencesG. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu, “LightGBM: A Highly Efficient Gradient Boosting Decision Tree,” 31st Conference on Neural Information Processing Systems (NIPS 2017), pp. 3149–3157, 2017.spa
dc.relation.referencesT. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. 13-17-August-2016, pp. 785–794, Association for Computing Machinery, 8 2016.spa
dc.relation.referencesJ. J. Espinosa-Zúñiga, “Aplicación de algoritmos Random Forest y XGBoost en una base de solicitudes de tarjetas de crédito,” Ingeniería Investigación y Tecnología, vol. 21, no. 3, pp. 1–16, 2020.spa
dc.relation.referencesC. Cortes, V. Vapnik, and L. Saitta, “Support-Vector Networks,” Machine Learning, vol. 20, pp. 273–297, 1995.spa
dc.relation.referencesM. Awad and R. Khanna, Support vector regression. Efficient learning machines: Theories, concepts, and applications for engineers and system designers. Apress, 2015.spa
dc.relation.referencesM.-C. Popescu, V. E. Balas, L. Perescu-Popescu, and N. Mastorakis, “Multilayer Perceptron and Neural Networks,” WSEAS Transactions on Circuits and Systems, vol. 8, no. 7, pp. 579–588, 2009.spa
dc.relation.referencesT. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, and D. Amodei, “Language Models are Few-Shot Learners,” ArXiv, 5 2020.spa
dc.relation.referencesT. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, J. Davison, S. Shleifer, P. Von Platen, C. Ma, Y. Jernite, J. Plu, C. Xu, T. Le Scao, S. Gugger, M. Drame, Q. Lhoest, and A. M. Rush, “Transformers: State-of-the-Art Natural Language Processing,” Association for Computational Linguistics, pp. 38–45, 2020.spa
dc.relation.referencesP. He, X. Liu, J. Gao, and W. Chen, “DeBERTa: Decoding-enhanced BERT with disentangled attention,” Conference paper at ICLR, 2021.spa
dc.relation.referencesP. Zhang, “Longformer-based Automated Writing Assessment for English Language Learners Stanford CS224N Custom Project,” tech. rep., Stanford, 2023.spa
dc.relation.referencesK. K. Y. Chan, T. Bond, and Z. Yan, “Application of an Automated Essay Scoring engine to English writing assessment using Many-Facet Rasch Measurement,” Language Testing, vol. 40, pp. 61–85, 1 2023.spa
dc.relation.referencesA. Mizumoto and M. Eguchi, “Exploring the potential of using an AI language model for automated essay scoring,” Research Methods in Applied Linguistics, vol. 2, 8 2023.spa
dc.relation.referencesV. Mohan, M. J. Ilamathi, and M. Nithya, “Preprocessing Techniques for Text Mining-An Overview,” International Journal of Computer Science & Communication Networks, vol. 5, no. 1, pp. 7–16, 2015.spa
dc.relation.referencesM. Siino, I. Tinnirello, and M. La Cascia, “Is text preprocessing still worth the time? A comparative survey on the influence of popular preprocessing methods on Transformers and traditional classifiers,” Information Systems, vol. 121, p. 102342, 3 2024.spa
dc.relation.referencesS. A. Crossley, D. B. Allen, and D. S. McNamara, “Text readability and intuitive simplification: A comparison of readability formulas,” Reading in a foreign language, vol. 23, no. 1, pp. 84–101, 2011.spa
dc.relation.referencesF. Scarselli and A. C. Tsoi, “Universal Approximation Using Feedforward Neural Networks: A Survey of Some Existing Methods, and Some New Results,” Neural Networks, vol. 11, no. 1, pp. 15–37, 1998.spa
dc.relation.referencesP. He, J. Gao, and W. Chen, “DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing,” ArXiv, 11 2021.spa
dc.relation.referencesD. Wu, S.-t. Xia, and Y. Wang, “Adversarial Weight Perturbation Helps Robust Generalization,” ArXiv, 4 2020.spa
dc.relation.referencesH. Inoue, “Multi-Sample Dropout for Accelerated Training and Better Generalization,” ArXiv, 5 2019.spa
dc.relation.referencesA. Stanciu, I. Cristescu, E. M. Ciuperca, and C. E. Cîrnu, “Using an ensemble of transformer-based models for automated writing evaluation of essays,” in 14th International Conference on Education and New Learning Technologies, (Palma, Spain), pp. 5276–5282, IATED, 7 2022.spa
dc.relation.referencesH. Zhang, Y. Gong, Y. Shen, W. Li, J. Lv, N. Duan, and W. Chen, “Poolingformer: Long Document Modeling with Pooling Attention,” ArXiv, 2021.spa
dc.relation.referencesA. Aziz, M. Akram Hossain, and A. Nowshed Chy, “CSECU-DSG at SemEval-2023 Task 4: Fine-tuning DeBERTa Transformer Model with Cross-fold Training and Multisample Dropout for Human Values Identification,” tech. rep., Department of Computer Science and Engineering University of Chittagong, Chattogram, Bangladesh, 2023.spa
dc.rights.accessrightsinfo:eu-repo/semantics/openAccessspa
dc.rights.licenseAtribución-NoComercial 4.0 Internacionalspa
dc.rights.urihttp://creativecommons.org/licenses/by-nc/4.0/spa
dc.subject.ddc000 - Ciencias de la computación, información y obras generales::004 - Procesamiento de datos Ciencia de los computadoresspa
dc.subject.ddc370 - Educación::373 - Educación secundariaspa
dc.subject.proposalPLNspa
dc.subject.proposalAprendizaje supervisadospa
dc.subject.proposalTransformersspa
dc.subject.proposalEnsamble de modelosspa
dc.subject.proposalSVRspa
dc.subject.proposalNLPeng
dc.subject.proposalAutomatic essay gradingeng
dc.subject.proposalSupervised learningeng
dc.subject.proposalSVReng
dc.subject.proposalKaggle contesteng
dc.subject.proposalCalificación automática de ensayosspa
dc.subject.proposalEnsemble of modelseng
dc.subject.unescoMétodo de evaluaciónspa
dc.subject.unescoEvaluation methodseng
dc.subject.unescoProcesamiento de datosspa
dc.subject.unescoData processingeng
dc.subject.wikidataaprendizaje supervisadospa
dc.subject.wikidatasupervised learningeng
dc.titleMétodo basado en aprendizaje automático para la calificación de ensayos cortos en inglés de una muestra de estudiantes de bachilleratospa
dc.title.translatedMachine learning-based method for scoring short English essays from a high school student sampleeng
dc.typeTrabajo de grado - Maestríaspa
dc.type.coarhttp://purl.org/coar/resource_type/c_bdccspa
dc.type.coarversionhttp://purl.org/coar/version/c_ab4af688f83e57aaspa
dc.type.contentTextspa
dc.type.driverinfo:eu-repo/semantics/masterThesisspa
dc.type.redcolhttp://purl.org/redcol/resource_type/TMspa
dc.type.versioninfo:eu-repo/semantics/acceptedVersionspa
dcterms.audience.professionaldevelopmentEstudiantesspa
dcterms.audience.professionaldevelopmentInvestigadoresspa
dcterms.audience.professionaldevelopmentMaestrosspa
dcterms.audience.professionaldevelopmentPúblico generalspa
oaire.accessrightshttp://purl.org/coar/access_right/c_abf2spa

Files

Original bundle
Name: 1032469305.2024.pdf
Size: 1.99 MB
Format: Adobe Portable Document Format
Description: Master's thesis in Engineering - Systems and Computing Engineering

License bundle
Name: license.txt
Size: 5.74 KB
Format: Item-specific license agreed upon to submission