Show simple item record

dc.rights.license	Atribución-NoComercial 4.0 Internacional
dc.contributor.advisor	Niño Vásquez, Luis Fernando
dc.contributor.author	Bofill Barrera, Joan Gabriel
dc.date.accessioned	2024-05-28T22:15:47Z
dc.date.available	2024-05-28T22:15:47Z
dc.date.issued	2024
dc.identifier.uri	https://repositorio.unal.edu.co/handle/unal/86173
dc.description	ilustraciones, diagramas
dc.description.abstract	Este trabajo aborda el desafío de la calificación automática de ensayos argumentativos en inglés escritos por estudiantes de bachillerato que están aprendiendo el inglés como segunda lengua. El objetivo general es implementar un método automático basado en aprendizaje supervisado que permita resolver esta tarea para 6 indicadores en simultáneo: Cohesión, Sintaxis, Vocabulario, Gramática, Fraseología y Convenciones, en escala de 1 a 5. Para lograrlo, se realiza un análisis descriptivo de los datos, se aplican procedimientos de preprocesamiento y se extraen características relevantes; se exploran diferentes estrategias, técnicas de representación y modelos, desde algunos clásicos hasta aquellos con mejor desempeño en la actualidad, evaluando en cada iteración su rendimiento y contrastándolo con las calificaciones humanas. Luego, se presenta el modelo con menor error, basado principalmente en DeBERTa, al cual se le aplican distintas técnicas para mejorar su desempeño y se combina con un modelo SVR que toma como características los embeddings concatenados de los textos en 10 modelos preentrenados sin fine-tuning. Con esta estrategia, el resultado se acerca bastante a las calificaciones humanas, presentando un RMSE de 0.45 sobre todos los indicadores. (Texto tomado de la fuente).
dc.description.abstract	This work addresses the challenge of automatically grading argumentative essays in English written by high school students who are learning English as a second language. The general objective is to implement an automatic method based on supervised learning that solves this task for 6 indicators simultaneously: Cohesion, Syntax, Vocabulary, Grammar, Phraseology and Conventions, rated on a scale from 1 to 5. To achieve this, a descriptive analysis of the data is conducted, preprocessing procedures are applied and relevant features are extracted; different strategies, representation techniques and models are explored, from some classic ones to the currently best performing models, evaluating their performance in each iteration by contrasting it with human ratings using a chosen measure. Then, the best performing method is presented; it is based mainly on DeBERTa V3 Large, to which different techniques are applied to improve its performance. Finally, it is combined with an SVR regressor that takes as features the concatenated embeddings of the texts in 10 different pretrained models without fine-tuning. With this strategy, the result comes quite close to the human ratings, with a root mean square error (RMSE) of 0.45 over all indicators.
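As a rough illustration of the strategy summarized in the abstract, the sketch below shows one plausible wiring of the embedding-concatenation ensemble: each essay is embedded by several frozen pretrained encoders, the embeddings are concatenated into a single feature vector for a per-indicator SVR, and agreement with human ratings is measured as the RMSE averaged over the six indicators. This is a minimal sketch under stated assumptions, not the thesis implementation; the encoder names, helper functions, and toy data are illustrative.

```python
# Hypothetical sketch of the ensemble described in the abstract: frozen
# pretrained encoders produce embeddings that are concatenated and fed
# to an SVR per indicator; quality is the mean per-indicator RMSE.
# Encoder names, helpers, and data below are illustrative assumptions.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

INDICATORS = ["cohesion", "syntax", "vocabulary", "grammar", "phraseology", "conventions"]
MODEL_NAMES = ["microsoft/deberta-v3-base", "roberta-base"]  # the thesis uses 10 such encoders

def embed(texts, model_name):
    """Mean-pooled token embeddings from a frozen (no fine-tuning) encoder."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name).eval()
    vecs = []
    with torch.no_grad():
        for text in texts:
            enc = tok(text, truncation=True, max_length=512, return_tensors="pt")
            hidden = model(**enc).last_hidden_state        # (1, seq_len, dim)
            mask = enc["attention_mask"].unsqueeze(-1)     # (1, seq_len, 1)
            pooled = (hidden * mask).sum(1) / mask.sum(1)  # masked mean pooling
            vecs.append(pooled.squeeze(0).numpy())
    return np.vstack(vecs)                                 # (n_texts, dim)

def concat_features(texts):
    """Concatenate the embeddings from all frozen encoders, per essay."""
    return np.hstack([embed(texts, name) for name in MODEL_NAMES])

def mean_rmse(y_true, y_pred):
    """RMSE computed per indicator (column), then averaged over the six."""
    return float(np.mean(np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))))

# Toy usage: two essays scored on the 1-5 scale for the six indicators.
train_texts = ["An argumentative essay ...", "Another essay ..."]
y_train = np.array([[3.0, 3.5, 3.0, 2.5, 3.0, 3.5],
                    [4.0, 4.0, 3.5, 4.0, 3.5, 4.0]])
svr = MultiOutputRegressor(SVR(C=1.0, epsilon=0.1))
svr.fit(concat_features(train_texts), y_train)
preds = svr.predict(concat_features(train_texts))
print("mean RMSE:", mean_rmse(y_train, preds))
```

Applied per indicator and averaged, this mean_rmse measure corresponds to the "RMSE over all indicators" the abstract reports (0.45 for the final DeBERTa plus SVR combination).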
dc.format.extent	vii, 61 páginas
dc.format.mimetype	application/pdf
dc.language.iso	spa
dc.publisher	Universidad Nacional de Colombia
dc.rights.uri	http://creativecommons.org/licenses/by-nc/4.0/
dc.subject.ddc	000 - Ciencias de la computación, información y obras generales::004 - Procesamiento de datos Ciencia de los computadores
dc.subject.ddc	370 - Educación::373 - Educación secundaria
dc.title	Método basado en aprendizaje automático para la calificación de ensayos cortos en inglés de una muestra de estudiantes de bachillerato
dc.type	Trabajo de grado - Maestría
dc.type.driver	info:eu-repo/semantics/masterThesis
dc.type.version	info:eu-repo/semantics/acceptedVersion
dc.publisher.program	Bogotá - Ingeniería - Maestría en Ingeniería - Ingeniería de Sistemas y Computación
dc.contributor.referee	León Guzmán, Elizabeth
dc.contributor.researchgroup	Laboratorio de Investigación en Sistemas Inteligentes (LISI)
dc.description.degreelevel	Maestría
dc.description.degreename	Magíster en Ingeniería - Ingeniería de Sistemas y Computación
dc.description.researcharea	Sistemas inteligentes
dc.identifier.instname	Universidad Nacional de Colombia
dc.identifier.reponame	Repositorio Institucional Universidad Nacional de Colombia
dc.identifier.repourl	https://repositorio.unal.edu.co/
dc.publisher.faculty	Facultad de Ingeniería
dc.publisher.place	Bogotá, Colombia
dc.publisher.branch	Universidad Nacional de Colombia - Sede Bogotá
dc.relation.references	P. Kline, The New Psychometrics: Science, Psychology and Measurement. Routledge, 1 ed., 1999.
dc.relation.references	T. N. Fitria, “Artificial intelligence (AI) technology in OpenAI ChatGPT application: A review of ChatGPT in writing English essay,” ELT Forum: Journal of English Language Teaching, vol. 12, no. 1, pp. 44–58, 2023.
dc.relation.references	E. B. Page, “Grading Essays by Computer: Progress Report,” in Proceedings of the 1966 Invitational Conference on Testing Problems, Princeton, NJ: Educational Testing Service, pp. 87–100, 1967.
dc.relation.references	E. Page, “The use of the computer in analyzing student essays,” International Review of Education, pp. 210–225, 1968.
dc.relation.references	K. L. Gwet, Handbook of Inter-Rater Reliability: The Definitive Guide to Measuring the Extent of Agreement Among Raters. Advanced Analytics, LLC, 2014.
dc.relation.references	M. D. Shermis, “Contrasting State-of-the-Art in the Machine Scoring of Short-Form Constructed Responses,” Educational Assessment, vol. 20, no. 1, pp. 46–65, 2015.
dc.relation.references	A. Franklin, N. Rambis, M. M. Benner, P. Baffour, R. Holbrook, and S. Crossley, “Feedback Prize - English Language Learning,” Kaggle, 2022.
dc.relation.references	S. A. Crossley, K. Kyle, and D. S. McNamara, “To Aggregate or Not? Linguistic Features in Automatic Essay Scoring and Feedback Systems,” Grantee Submission, vol. 8, no. 1, 2015.
dc.relation.references	C. Ramineni and D. M. Williamson, “Automated essay scoring: Psychometric guidelines and practices,” Assessing Writing, pp. 25–39, 2013.
dc.relation.references	S. P. Balfour, “Assessing Writing in MOOCs: Automated Essay Scoring and Calibrated Peer Review™,” Research & Practice in Assessment, pp. 40–48, 2013.
dc.relation.references	S. Cushing Weigle, “Validation of automated scores of TOEFL iBT tasks against non-test indicators of writing ability,” Language Testing, vol. 27, no. 3, pp. 335–353, 2010.
dc.relation.references	K. Taghipour, Robust trait-specific essay scoring using neural networks and density estimators. PhD thesis, National University of Singapore, Singapore, 2017.
dc.relation.references	H. Shi and V. Aryadoust, “Correction to: A systematic review of automated writing evaluation systems,” Education and Information Technologies, vol. 28, pp. 6189–6190, 5 2023.
dc.relation.references	P. C. Jackson, Toward Human-Level Artificial Intelligence: Representation and Computation of Meaning in Natural Language. Dover Publications, 11 2019.
dc.relation.references	E. Mayfield and C. P. Rosé, “LightSIDE,” in Handbook of Automated Essay Evaluation, Routledge, 1 ed., 2013.
dc.relation.references	M. Shermis and J. Burstein, Handbook of Automated Essay Evaluation: Current Applications and New Directions. Routledge, 1 ed., 2013.
dc.relation.references	S. Burrows, I. Gurevych, and B. Stein, “The Eras and Trends of Automatic Short Answer Grading,” International Journal of Artificial Intelligence in Education, vol. 25, pp. 60–117, 2015.
dc.relation.references	K. Zupanc and Z. Bosnic, “Automated essay evaluation with semantic analysis,” Knowledge-Based Systems (Elsevier), vol. 120, pp. 118–132, 2017.
dc.relation.references	D. Yan, A. A. Rupp, and P. W. Foltz, Handbook of Automated Scoring: Theory into Practice. Chapman and Hall/CRC, 1 ed., 2020.
dc.relation.references	B. Kitchenham, O. Pearl Brereton, D. Budgen, M. Turner, J. Bailey, and S. Linkman, “Systematic literature reviews in software engineering – A systematic literature review,” Information and Software Technology, vol. 51, no. 1, pp. 7–15, 2009.
dc.relation.references	Y.-Y. Chen, C.-L. Liu, C.-H. Lee, and T.-H. Chang, “An unsupervised automated essay scoring system,” IEEE Intelligent Systems, vol. 25, no. 5, pp. 61–67, 2010.
dc.relation.references	Y. Wang, Z. Wei, Y. Zhou, and X. Huang, “Automatic essay scoring incorporating rating schema via reinforcement learning,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 791–797, 2018.
dc.relation.references	C. Lu and M. Cutumisu, “Integrating Deep Learning into An Automated Feedback Generation System for Automated Essay Scoring,” International Educational Data Mining Society, 2021.
dc.relation.references	K. S. McCarthy, R. D. Roscoe, L. K. Allen, A. D. Likens, and D. S. McNamara, “Automated writing evaluation: Does spelling and grammar feedback support high-quality writing and revision?,” Assessing Writing, vol. 52, 4 2022.
dc.relation.references	A. Sharma and D. B. Jayagopi, “Automated grading of handwritten essays,” in Proceedings of the International Conference on Frontiers in Handwriting Recognition (ICFHR), vol. 2018-August, pp. 279–284, 2018.
dc.relation.references	A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention Is All You Need,” in Advances in Neural Information Processing Systems, 2017.
dc.relation.references	T. Pedersen, S. Patwardhan, and J. Michelizzi, “WordNet::Similarity - Measuring the Relatedness of Concepts,” AAAI, vol. 4, pp. 25–29, 7 2004.
dc.relation.references	F. Dong and Y. Zhang, “Automatic Features for Essay Scoring – An Empirical Study,” in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, pp. 1072–1077, 11 2016.
dc.relation.references	Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “RoBERTa: A Robustly Optimized BERT Pretraining Approach,” arXiv preprint arXiv:1907.11692, 2019.
dc.relation.references	E. Mayfield and A. W. Black, “Should You Fine-Tune BERT for Automated Essay Scoring?,” in Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, Association for Computational Linguistics, pp. 151–162, 7 2020.
dc.relation.references	D. Ramesh and S. K. Sanampudi, “An automated essay scoring systems: a systematic literature review,” Artificial Intelligence Review, vol. 55, pp. 2495–2527, 3 2022.
dc.relation.references	H. Yannakoudakis and R. Cummins, “Evaluating the performance of automated text scoring systems,” in Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 213–223, 2015.
dc.relation.references	R. Bhatt, M. Patel, G. Srivastava, and V. Mago, “A Graph Based Approach to Automate Essay Evaluation,” in Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics, vol. 2020-October, pp. 4379–4385, 2020.
dc.relation.references	Z. Ke and V. Ng, “Automated essay scoring: A survey of the state of the art,” in IJCAI International Joint Conference on Artificial Intelligence, vol. 2019-August, pp. 6300–6308, 2019.
dc.relation.references	J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” arXiv preprint arXiv:1810.04805, 2018.
dc.relation.references	D. Castro-Castro, R. Lannes-Losada, M. Maritxalar, I. Niebla, C. Pérez-Marqués, N. Álamo-Suárez, and A. Pons-Porrata, “A multilingual application for automated essay scoring,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5290 LNAI, pp. 243–251, 2008.
dc.relation.references	P. U. Rodriguez, A. Jafari, and C. M. Ormerod, “Language models and Automated Essay Scoring,” arXiv, 9 2019.
dc.relation.references	S. Ghannay, B. Favre, Y. Estève, and N. Camelin, “Word Embeddings Evaluation and Combination,” in Proceedings of the Tenth International Conference on Language Resources and Evaluation, pp. 300–305, 5 2016.
dc.relation.references	M. Mars, “From Word Embeddings to Pre-Trained Language Models: A State-of-the-Art Walkthrough,” Applied Sciences (Switzerland), vol. 12, 9 2022.
dc.relation.references	Y. Zhang, R. Jin, and Z.-H. Zhou, “Understanding bag-of-words model: a statistical framework,” International Journal of Machine Learning and Cybernetics, vol. 1, pp. 43–52, 12 2010.
dc.relation.references	K. W. Church, “Word2Vec,” Natural Language Engineering, vol. 23, pp. 155–162, 1 2017.
dc.relation.references	J. Pennington, R. Socher, and C. D. Manning, “GloVe: Global Vectors for Word Representation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543, 2014.
dc.relation.references	V. Kumar and B. Subba, “A TfidfVectorizer and SVM based sentiment analysis framework for text data corpus,” in 2020 National Conference on Communications (NCC), pp. 1–6, IEEE, 2 2020.
dc.relation.references	Microsoft, “GitHub - microsoft/LightGBM: Light Gradient Boosting Machine.”
dc.relation.references	G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu, “LightGBM: A Highly Efficient Gradient Boosting Decision Tree,” in 31st Conference on Neural Information Processing Systems (NIPS 2017), pp. 3149–3157, 2017.
dc.relation.references	T. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. 13-17-August-2016, pp. 785–794, Association for Computing Machinery, 8 2016.
dc.relation.references	J. J. Espinosa-Zúñiga, “Aplicación de algoritmos Random Forest y XGBoost en una base de solicitudes de tarjetas de crédito,” Ingeniería Investigación y Tecnología, vol. 21, no. 3, pp. 1–16, 2020.
dc.relation.references	C. Cortes and V. Vapnik, “Support-Vector Networks,” Machine Learning, vol. 20, pp. 273–297, 1995.
dc.relation.references	M. Awad and R. Khanna, “Support Vector Regression,” in Efficient Learning Machines: Theories, Concepts, and Applications for Engineers and System Designers, Apress, 2015.
dc.relation.references	M.-C. Popescu, V. E. Balas, L. Perescu-Popescu, and N. Mastorakis, “Multilayer Perceptron and Neural Networks,” WSEAS Transactions on Circuits and Systems, vol. 8, no. 7, pp. 579–588, 2009.
dc.relation.references	T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, and D. Amodei, “Language Models are Few-Shot Learners,” arXiv, 5 2020.
dc.relation.references	T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, J. Davison, S. Shleifer, P. von Platen, C. Ma, Y. Jernite, J. Plu, C. Xu, T. Le Scao, S. Gugger, M. Drame, Q. Lhoest, and A. M. Rush, “Transformers: State-of-the-Art Natural Language Processing,” Association for Computational Linguistics, pp. 38–45, 2020.
dc.relation.references	P. He, X. Liu, J. Gao, and W. Chen, “DeBERTa: Decoding-enhanced BERT with Disentangled Attention,” conference paper at ICLR, 2021.
dc.relation.references	P. Zhang, “Longformer-based Automated Writing Assessment for English Language Learners (Stanford CS224N Custom Project),” tech. rep., Stanford, 2023.
dc.relation.references	K. K. Y. Chan, T. Bond, and Z. Yan, “Application of an Automated Essay Scoring engine to English writing assessment using Many-Facet Rasch Measurement,” Language Testing, vol. 40, pp. 61–85, 1 2023.
dc.relation.references	A. Mizumoto and M. Eguchi, “Exploring the potential of using an AI language model for automated essay scoring,” Research Methods in Applied Linguistics, vol. 2, 8 2023.
dc.relation.references	V. Mohan, M. J. Ilamathi, and M. Nithya, “Preprocessing Techniques for Text Mining - An Overview,” International Journal of Computer Science & Communication Networks, vol. 5, no. 1, pp. 7–16, 2015.
dc.relation.references	M. Siino, I. Tinnirello, and M. La Cascia, “Is text preprocessing still worth the time? A comparative survey on the influence of popular preprocessing methods on Transformers and traditional classifiers,” Information Systems, vol. 121, p. 102342, 3 2024.
dc.relation.references	S. A. Crossley, D. B. Allen, and D. S. McNamara, “Text readability and intuitive simplification: A comparison of readability formulas,” Reading in a Foreign Language, vol. 23, no. 1, pp. 84–101, 2011.
dc.relation.references	F. Scarselli and A. C. Tsoi, “Universal Approximation Using Feedforward Neural Networks: A Survey of Some Existing Methods, and Some New Results,” Neural Networks, vol. 11, no. 1, pp. 15–37, 1998.
dc.relation.references	P. He, J. Gao, and W. Chen, “DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing,” arXiv, 11 2021.
dc.relation.references	D. Wu, S.-T. Xia, and Y. Wang, “Adversarial Weight Perturbation Helps Robust Generalization,” arXiv, 4 2020.
dc.relation.references	H. Inoue, “Multi-Sample Dropout for Accelerated Training and Better Generalization,” arXiv, 5 2019.
dc.relation.references	A. Stanciu, I. Cristescu, E. M. Ciuperca, and C. E. Cîrnu, “Using an ensemble of transformer-based models for automated writing evaluation of essays,” in 14th International Conference on Education and New Learning Technologies, (Palma, Spain), pp. 5276–5282, IATED, 7 2022.
dc.relation.references	H. Zhang, Y. Gong, Y. Shen, W. Li, J. Lv, N. Duan, and W. Chen, “Poolingformer: Long Document Modeling with Pooling Attention,” arXiv, 2021.
dc.relation.references	A. Aziz, M. Akram Hossain, and A. Nowshed Chy, “CSECU-DSG at SemEval-2023 Task 4: Fine-tuning DeBERTa Transformer Model with Cross-fold Training and Multi-sample Dropout for Human Values Identification,” tech. rep., Department of Computer Science and Engineering, University of Chittagong, Chattogram, Bangladesh, 2023.
dc.rights.accessrights	info:eu-repo/semantics/openAccess
dc.subject.proposal	PLN
dc.subject.proposal	Aprendizaje supervisado
dc.subject.proposal	Transformers
dc.subject.proposal	Ensamble de modelos
dc.subject.proposal	SVR
dc.subject.proposal	NLP
dc.subject.proposal	Automatic essay grading
dc.subject.proposal	Supervised learning
dc.subject.proposal	SVR
dc.subject.proposal	Kaggle contest
dc.subject.proposal	Calificación automática de ensayos
dc.subject.proposal	Ensemble of models
dc.subject.unesco	Método de evaluación
dc.subject.unesco	Evaluation methods
dc.subject.unesco	Procesamiento de datos
dc.subject.unesco	Data processing
dc.title.translated	Machine learning-based method for scoring short English essays from a high school student sample
dc.type.coar	http://purl.org/coar/resource_type/c_bdcc
dc.type.coarversion	http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.content	Text
dc.type.redcol	http://purl.org/redcol/resource_type/TM
oaire.accessrights	http://purl.org/coar/access_right/c_abf2
dcterms.audience.professionaldevelopment	Estudiantes
dcterms.audience.professionaldevelopment	Investigadores
dcterms.audience.professionaldevelopment	Maestros
dcterms.audience.professionaldevelopment	Público general
dc.subject.wikidata	aprendizaje supervisado
dc.subject.wikidata	supervised learning



