Show simple item record

dc.rights.license	Atribución-NoComercial 4.0 Internacional
dc.contributor.advisor	Niño Vásquez, Luis Fernando
dc.contributor.author	Bofill Barrera, Joan Gabriel
dc.date.accessioned	2024-05-28T22:15:47Z
dc.date.available	2024-05-28T22:15:47Z
dc.date.issued	2024
dc.identifier.uri	https://repositorio.unal.edu.co/handle/unal/86173
dc.description	ilustraciones, diagramas
dc.description.abstract	Este trabajo aborda el desafío de la calificación automática de ensayos argumentativos en inglés escritos por estudiantes de bachillerato que están aprendiendo el inglés como segunda lengua. El objetivo general es implementar un método automático basado en aprendizaje supervisado que permita resolver esta tarea para 6 indicadores en simultáneo: Cohesión, Sintaxis, Vocabulario, Gramática, Fraseología y Convenciones, en escala de 1 a 5. Para lograrlo, se realiza un análisis descriptivo de los datos, se aplican procedimientos de preprocesamiento y se extraen características relevantes; se exploran diferentes estrategias, técnicas de representación y modelos, desde algunos clásicos hasta aquellos con mejor desempeño en la actualidad, evaluando en cada iteración su rendimiento y contrastándolo con las calificaciones humanas. Luego, se presenta el modelo con menor error, basado principalmente en DeBERTa, al cual se le aplican distintas técnicas para mejorar su desempeño y se combina con un modelo SVR que toma como características los embeddings concatenados de los textos en 10 modelos preentrenados sin fine-tuning. Con esta estrategia, el resultado se acerca bastante a las calificaciones humanas, presentando un RMSE de 0.45 sobre todos los indicadores. (Texto tomado de la fuente).
dc.description.abstract	This work addresses the challenge of automatically grading argumentative essays in English written by high school students who are learning English as a second language. The general objective is to implement an automatic method based on supervised learning that solves this task for 6 indicators simultaneously: Cohesion, Syntax, Vocabulary, Grammar, Phraseology and Conventions, rated on a scale from 1 to 5. To achieve this, a descriptive analysis of the data is conducted, preprocessing procedures are applied and relevant features are extracted; different strategies, representation techniques and models are explored, from some classic ones to the currently best performing models, evaluating their performance in each iteration by contrasting it with human ratings using a chosen measure. Then, the best performing method is presented; it is based mainly on DeBERTa V3 Large, to which different techniques are applied to improve its performance. Finally, it is combined with an SVR regressor that takes as features the concatenated embeddings of the texts in 10 different pretrained models without fine-tuning. With this strategy, the result comes quite close to the human ratings, with a root mean square error (RMSE) of 0.45 over all indicators.
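As a rough illustration of the strategy summarized in the abstract, the sketch below shows one plausible wiring of the embedding-concatenation ensemble: each essay is embedded by several frozen pretrained encoders, the embeddings are concatenated into a single feature vector for a per-indicator SVR, and agreement with human ratings is measured as the RMSE averaged over the six indicators. This is a minimal sketch under stated assumptions, not the thesis implementation; the encoder names, helper functions, and toy data are illustrative.

```python
# Hypothetical sketch of the ensemble described in the abstract: frozen
# pretrained encoders produce embeddings that are concatenated and fed
# to an SVR per indicator; quality is the mean per-indicator RMSE.
# Encoder names, helpers, and data below are illustrative assumptions.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

INDICATORS = ["cohesion", "syntax", "vocabulary", "grammar", "phraseology", "conventions"]
MODEL_NAMES = ["microsoft/deberta-v3-base", "roberta-base"]  # the thesis uses 10 such encoders

def embed(texts, model_name):
    """Mean-pooled token embeddings from a frozen (no fine-tuning) encoder."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name).eval()
    vecs = []
    with torch.no_grad():
        for text in texts:
            enc = tok(text, truncation=True, max_length=512, return_tensors="pt")
            hidden = model(**enc).last_hidden_state        # (1, seq_len, dim)
            mask = enc["attention_mask"].unsqueeze(-1)     # (1, seq_len, 1)
            pooled = (hidden * mask).sum(1) / mask.sum(1)  # masked mean pooling
            vecs.append(pooled.squeeze(0).numpy())
    return np.vstack(vecs)                                 # (n_texts, dim)

def concat_features(texts):
    """Concatenate the embeddings from all frozen encoders, per essay."""
    return np.hstack([embed(texts, name) for name in MODEL_NAMES])

def mean_rmse(y_true, y_pred):
    """RMSE computed per indicator (column), then averaged over the six."""
    return float(np.mean(np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))))

# Toy usage: two essays scored on the 1-5 scale for the six indicators.
train_texts = ["An argumentative essay ...", "Another essay ..."]
y_train = np.array([[3.0, 3.5, 3.0, 2.5, 3.0, 3.5],
                    [4.0, 4.0, 3.5, 4.0, 3.5, 4.0]])
svr = MultiOutputRegressor(SVR(C=1.0, epsilon=0.1))
svr.fit(concat_features(train_texts), y_train)
preds = svr.predict(concat_features(train_texts))
print("mean RMSE:", mean_rmse(y_train, preds))
```

Applied per indicator and averaged, this mean_rmse measure corresponds to the "RMSE over all indicators" the abstract reports (0.45 for the final DeBERTa plus SVR combination).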
dc.format.extent	vii, 61 páginas
dc.format.mimetype	application/pdf
dc.language.iso	spa
dc.publisher	Universidad Nacional de Colombia
dc.rights.uri	http://creativecommons.org/licenses/by-nc/4.0/
dc.subject.ddc	000 - Ciencias de la computación, información y obras generales::004 - Procesamiento de datos Ciencia de los computadores
dc.subject.ddc	370 - Educación::373 - Educación secundaria
dc.title	Método basado en aprendizaje automático para la calificación de ensayos cortos en inglés de una muestra de estudiantes de bachillerato
dc.type	Trabajo de grado - Maestría
dc.type.driver	info:eu-repo/semantics/masterThesis
dc.type.version	info:eu-repo/semantics/acceptedVersion
dc.publisher.program	Bogotá - Ingeniería - Maestría en Ingeniería - Ingeniería de Sistemas y Computación
dc.contributor.referee	León Guzmán, Elizabeth
dc.contributor.researchgroup	Laboratorio de Investigación en Sistemas Inteligentes (LISI)
dc.description.degreelevel	Maestría
dc.description.degreename	Magíster en Ingeniería - Ingeniería de Sistemas y Computación
dc.description.researcharea	Sistemas inteligentes
dc.identifier.instname	Universidad Nacional de Colombia
dc.identifier.reponame	Repositorio Institucional Universidad Nacional de Colombia
dc.identifier.repourl	https://repositorio.unal.edu.co/
dc.publisher.faculty	Facultad de Ingeniería
dc.publisher.place	Bogotá, Colombia
dc.publisher.branch	Universidad Nacional de Colombia - Sede Bogotá
dc.relation.references	P. Kline, The New Psychometrics: Science, Psychology and Measurement. Routledge, 1 ed., 1999.
dc.relation.references	T. N. Fitria, “Artificial intelligence (AI) technology in OpenAI ChatGPT application: A review of ChatGPT in writing English essay,” ELT Forum: Journal of English Language Teaching, vol. 12, no. 1, pp. 44–58, 2023.
dc.relation.references	E. B. Page, “Grading Essays by Computer: Progress Report,” in Proceedings of the 1966 Invitational Conference on Testing Problems, Princeton, NJ: Educational Testing Service, pp. 87–100, 1967.
dc.relation.references	E. Page, “The use of the computer in analyzing student essays,” International Review of Education, pp. 210–225, 1968.
dc.relation.references	K. L. Gwet, Handbook of Inter-Rater Reliability: The Definitive Guide to Measuring the Extent of Agreement Among Raters. Advanced Analytics, LLC, 2014.
dc.relation.references	M. D. Shermis, “Contrasting State-of-the-Art in the Machine Scoring of Short-Form Constructed Responses,” Educational Assessment, vol. 20, no. 1, pp. 46–65, 2015.
dc.relation.references	A. Franklin, N. Rambis, M. M. Benner, P. Baffour, R. Holbrook, and S. Crossley, “Feedback Prize - English Language Learning,” Kaggle, 2022.
dc.relation.references	S. A. Crossley, K. Kyle, and D. S. McNamara, “To Aggregate or Not? Linguistic Features in Automatic Essay Scoring and Feedback Systems,” Grantee Submission, vol. 8, no. 1, 2015.
dc.relation.references	C. Ramineni and D. M. Williamson, “Automated essay scoring: Psychometric guidelines and practices,” Assessing Writing, pp. 25–39, 2013.
dc.relation.references	S. P. Balfour, “Assessing Writing in MOOCs: Automated Essay Scoring and Calibrated Peer Review™,” Research & Practice in Assessment, pp. 40–48, 2013.
dc.relation.references	S. Cushing Weigle, “Validation of automated scores of TOEFL iBT tasks against non-test indicators of writing ability,” Language Testing, vol. 27, no. 3, pp. 335–353, 2010.
dc.relation.references	K. Taghipour, Robust trait-specific essay scoring using neural networks and density estimators. PhD thesis, National University of Singapore, Singapore, 2017.
dc.relation.references	H. Shi and V. Aryadoust, “Correction to: A systematic review of automated writing evaluation systems,” Education and Information Technologies, vol. 28, pp. 6189–6190, 5 2023.
dc.relation.references	P. C. Jackson, Toward Human-Level Artificial Intelligence: Representation and Computation of Meaning in Natural Language. Dover Publications, 11 2019.
dc.relation.references	E. Mayfield and C. P. Rosé, “LightSIDE,” in Handbook of Automated Essay Evaluation, Routledge, 1 ed., 2013.
dc.relation.references	M. Shermis and J. Burstein, Handbook of Automated Essay Evaluation: Current Applications and New Directions. Routledge, 1 ed., 2013.
dc.relation.references	S. Burrows, I. Gurevych, and B. Stein, “The Eras and Trends of Automatic Short Answer Grading,” International Journal of Artificial Intelligence in Education, vol. 25, pp. 60–117, 2015.
dc.relation.references	K. Zupanc and Z. Bosnic, “Automated essay evaluation with semantic analysis,” Knowledge-Based Systems (Elsevier), vol. 120, pp. 118–132, 2017.
dc.relation.references	D. Yan, A. A. Rupp, and P. W. Foltz, Handbook of Automated Scoring: Theory into Practice. Chapman and Hall/CRC, 1 ed., 2020.
dc.relation.references	B. Kitchenham, O. Pearl Brereton, D. Budgen, M. Turner, J. Bailey, and S. Linkman, “Systematic literature reviews in software engineering – A systematic literature review,” Information and Software Technology, vol. 51, no. 1, pp. 7–15, 2009.
dc.relation.references	Y.-Y. Chen, C.-L. Liu, C.-H. Lee, and T.-H. Chang, “An unsupervised automated essay scoring system,” IEEE Intelligent Systems, vol. 25, no. 5, pp. 61–67, 2010.
dc.relation.references	Y. Wang, Z. Wei, Y. Zhou, and X. Huang, “Automatic essay scoring incorporating rating schema via reinforcement learning,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 791–797, 2018.
dc.relation.references	C. Lu and M. Cutumisu, “Integrating Deep Learning into An Automated Feedback Generation System for Automated Essay Scoring,” International Educational Data Mining Society, 2021.
dc.relation.references	K. S. McCarthy, R. D. Roscoe, L. K. Allen, A. D. Likens, and D. S. McNamara, “Automated writing evaluation: Does spelling and grammar feedback support high-quality writing and revision?,” Assessing Writing, vol. 52, 4 2022.
dc.relation.references	A. Sharma and D. B. Jayagopi, “Automated grading of handwritten essays,” in Proceedings of the International Conference on Frontiers in Handwriting Recognition (ICFHR), vol. 2018-August, pp. 279–284, 2018.
dc.relation.references	A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention Is All You Need,” in Advances in Neural Information Processing Systems, 2017.
dc.relation.references	T. Pedersen, S. Patwardhan, and J. Michelizzi, “WordNet::Similarity - Measuring the Relatedness of Concepts,” AAAI, vol. 4, pp. 25–29, 7 2004.
dc.relation.references	F. Dong and Y. Zhang, “Automatic Features for Essay Scoring – An Empirical Study,” in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, pp. 1072–1077, 11 2016.
dc.relation.references	Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “RoBERTa: A Robustly Optimized BERT Pretraining Approach,” arXiv preprint arXiv:1907.11692, 2019.
dc.relation.references	E. Mayfield and A. W. Black, “Should You Fine-Tune BERT for Automated Essay Scoring?,” in Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, Association for Computational Linguistics, pp. 151–162, 7 2020.
dc.relation.references	D. Ramesh and S. K. Sanampudi, “An automated essay scoring systems: a systematic literature review,” Artificial Intelligence Review, vol. 55, pp. 2495–2527, 3 2022.
dc.relation.references	H. Yannakoudakis and R. Cummins, “Evaluating the performance of automated text scoring systems,” in Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 213–223, 2015.
dc.relation.references	R. Bhatt, M. Patel, G. Srivastava, and V. Mago, “A Graph Based Approach to Automate Essay Evaluation,” in Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics, vol. 2020-October, pp. 4379–4385, 2020.
dc.relation.references	Z. Ke and V. Ng, “Automated essay scoring: A survey of the state of the art,” in IJCAI International Joint Conference on Artificial Intelligence, vol. 2019-August, pp. 6300–6308, 2019.
dc.relation.references	J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” arXiv preprint arXiv:1810.04805, 2018.
dc.relation.references	D. Castro-Castro, R. Lannes-Losada, M. Maritxalar, I. Niebla, C. Pérez-Marqués, N. Álamo-Suárez, and A. Pons-Porrata, “A multilingual application for automated essay scoring,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5290 LNAI, pp. 243–251, 2008.
dc.relation.references	P. U. Rodriguez, A. Jafari, and C. M. Ormerod, “Language models and Automated Essay Scoring,” arXiv, 9 2019.
dc.relation.references	S. Ghannay, B. Favre, Y. Estève, and N. Camelin, “Word Embeddings Evaluation and Combination,” in Proceedings of the Tenth International Conference on Language Resources and Evaluation, pp. 300–305, 5 2016.
dc.relation.references	M. Mars, “From Word Embeddings to Pre-Trained Language Models: A State-of-the-Art Walkthrough,” Applied Sciences (Switzerland), vol. 12, 9 2022.
dc.relation.references	Y. Zhang, R. Jin, and Z.-H. Zhou, “Understanding bag-of-words model: a statistical framework,” International Journal of Machine Learning and Cybernetics, vol. 1, pp. 43–52, 12 2010.
dc.relation.references	K. W. Church, “Word2Vec,” Natural Language Engineering, vol. 23, pp. 155–162, 1 2017.
dc.relation.references	J. Pennington, R. Socher, and C. D. Manning, “GloVe: Global Vectors for Word Representation,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1532–1543, 2014.
dc.relation.references	V. Kumar and B. Subba, “A TfidfVectorizer and SVM based sentiment analysis framework for text data corpus,” in 2020 National Conference on Communications (NCC), pp. 1–6, IEEE, 2 2020.
dc.relation.references	Microsoft, “GitHub - microsoft/LightGBM: Light Gradient Boosting Machine.”
dc.relation.references	G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu, “LightGBM: A Highly Efficient Gradient Boosting Decision Tree,” in 31st Conference on Neural Information Processing Systems (NIPS 2017), pp. 3149–3157, 2017.
dc.relation.references	T. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. 13-17-August-2016, pp. 785–794, Association for Computing Machinery, 8 2016.
dc.relation.references	J. J. Espinosa-Zúñiga, “Aplicación de algoritmos Random Forest y XGBoost en una base de solicitudes de tarjetas de crédito,” Ingeniería Investigación y Tecnología, vol. 21, no. 3, pp. 1–16, 2020.
dc.relation.references	C. Cortes and V. Vapnik, “Support-Vector Networks,” Machine Learning, vol. 20, pp. 273–297, 1995.
dc.relation.references	M. Awad and R. Khanna, “Support Vector Regression,” in Efficient Learning Machines: Theories, Concepts, and Applications for Engineers and System Designers, Apress, 2015.
dc.relation.references	M.-C. Popescu, V. E. Balas, L. Perescu-Popescu, and N. Mastorakis, “Multilayer Perceptron and Neural Networks,” WSEAS Transactions on Circuits and Systems, vol. 8, no. 7, pp. 579–588, 2009.
dc.relation.references	T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, and D. Amodei, “Language Models are Few-Shot Learners,” arXiv, 5 2020.
dc.relation.references	T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, J. Davison, S. Shleifer, P. von Platen, C. Ma, Y. Jernite, J. Plu, C. Xu, T. Le Scao, S. Gugger, M. Drame, Q. Lhoest, and A. M. Rush, “Transformers: State-of-the-Art Natural Language Processing,” Association for Computational Linguistics, pp. 38–45, 2020.
dc.relation.references	P. He, X. Liu, J. Gao, and W. Chen, “DeBERTa: Decoding-enhanced BERT with Disentangled Attention,” conference paper at ICLR, 2021.
dc.relation.references	P. Zhang, “Longformer-based Automated Writing Assessment for English Language Learners (Stanford CS224N Custom Project),” tech. rep., Stanford, 2023.
dc.relation.references	K. K. Y. Chan, T. Bond, and Z. Yan, “Application of an Automated Essay Scoring engine to English writing assessment using Many-Facet Rasch Measurement,” Language Testing, vol. 40, pp. 61–85, 1 2023.
dc.relation.references	A. Mizumoto and M. Eguchi, “Exploring the potential of using an AI language model for automated essay scoring,” Research Methods in Applied Linguistics, vol. 2, 8 2023.
dc.relation.references	V. Mohan, M. J. Ilamathi, and M. Nithya, “Preprocessing Techniques for Text Mining - An Overview,” International Journal of Computer Science & Communication Networks, vol. 5, no. 1, pp. 7–16, 2015.
dc.relation.references	M. Siino, I. Tinnirello, and M. La Cascia, “Is text preprocessing still worth the time? A comparative survey on the influence of popular preprocessing methods on Transformers and traditional classifiers,” Information Systems, vol. 121, p. 102342, 3 2024.
dc.relation.references	S. A. Crossley, D. B. Allen, and D. S. McNamara, “Text readability and intuitive simplification: A comparison of readability formulas,” Reading in a Foreign Language, vol. 23, no. 1, pp. 84–101, 2011.
dc.relation.references	F. Scarselli and A. C. Tsoi, “Universal Approximation Using Feedforward Neural Networks: A Survey of Some Existing Methods, and Some New Results,” Neural Networks, vol. 11, no. 1, pp. 15–37, 1998.
dc.relation.references	P. He, J. Gao, and W. Chen, “DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing,” arXiv, 11 2021.
dc.relation.references	D. Wu, S.-T. Xia, and Y. Wang, “Adversarial Weight Perturbation Helps Robust Generalization,” arXiv, 4 2020.
dc.relation.references	H. Inoue, “Multi-Sample Dropout for Accelerated Training and Better Generalization,” arXiv, 5 2019.
dc.relation.references	A. Stanciu, I. Cristescu, E. M. Ciuperca, and C. E. Cîrnu, “Using an ensemble of transformer-based models for automated writing evaluation of essays,” in 14th International Conference on Education and New Learning Technologies, (Palma, Spain), pp. 5276–5282, IATED, 7 2022.
dc.relation.references	H. Zhang, Y. Gong, Y. Shen, W. Li, J. Lv, N. Duan, and W. Chen, “Poolingformer: Long Document Modeling with Pooling Attention,” arXiv, 2021.
dc.relation.references	A. Aziz, M. Akram Hossain, and A. Nowshed Chy, “CSECU-DSG at SemEval-2023 Task 4: Fine-tuning DeBERTa Transformer Model with Cross-fold Training and Multi-sample Dropout for Human Values Identification,” tech. rep., Department of Computer Science and Engineering, University of Chittagong, Chattogram, Bangladesh, 2023.
dc.rights.accessrights	info:eu-repo/semantics/openAccess
dc.subject.proposal	PLN
dc.subject.proposal	Aprendizaje supervisado
dc.subject.proposal	Transformers
dc.subject.proposal	Ensamble de modelos
dc.subject.proposal	SVR
dc.subject.proposal	NLP
dc.subject.proposal	Automatic essay grading
dc.subject.proposal	Supervised learning
dc.subject.proposal	SVR
dc.subject.proposal	Kaggle contest
dc.subject.proposal	Calificación automática de ensayos
dc.subject.proposal	Ensemble of models
dc.subject.unesco	Método de evaluación
dc.subject.unesco	Evaluation methods
dc.subject.unesco	Procesamiento de datos
dc.subject.unesco	Data processing
dc.title.translated	Machine learning-based method for scoring short English essays from a high school student sample
dc.type.coar	http://purl.org/coar/resource_type/c_bdcc
dc.type.coarversion	http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.content	Text
dc.type.redcol	http://purl.org/redcol/resource_type/TM
oaire.accessrights	http://purl.org/coar/access_right/c_abf2
dcterms.audience.professionaldevelopment	Estudiantes
dcterms.audience.professionaldevelopment	Investigadores
dcterms.audience.professionaldevelopment	Maestros
dcterms.audience.professionaldevelopment	Público general
dc.subject.wikidata	aprendizaje supervisado
dc.subject.wikidata	supervised learning



