Uso de información auxiliar en la estimación inicial de la habilidad de una prueba adaptativa computarizada
| dc.contributor.advisor | Torres Jiménez, Camilo José | |
| dc.contributor.author | Rodriguez Rivera, Nelson Andrés | |
| dc.date.accessioned | 2025-09-16T16:03:16Z | |
| dc.date.available | 2025-09-16T16:03:16Z | |
| dc.date.issued | 2025-09 | |
| dc.description | Ilustraciones, gráficos | spa |
| dc.description.abstract | Este trabajo se desarrolla en un escenario hipotético de implementación de pruebas adaptativas computarizadas (CAT, por sus siglas en inglés) en el contexto colombiano. Aunque el Instituto Colombiano para la Evaluación de la Educación (Icfes) no aplica actualmente este tipo de pruebas, ha realizado algunos pilotajes, lo cual motiva el análisis de sus posibles efectos y condiciones de aplicación. El objetivo del estudio es evaluar el uso de información auxiliar en la estimación inicial de la habilidad del evaluado, con el fin de seleccionar de manera más adecuada el primer ítem del examen. Para ello, se emplean modelos predictivos —específicamente regresión lineal, bosques aleatorios y redes neuronales artificiales— que permiten obtener una estimación inicial a partir de información contextual recopilada antes de la aplicación de la prueba. Posteriormente, se realizan simulaciones que comparan el desempeño del algoritmo adaptativo cuando se utiliza una estimación inicial basada en información auxiliar en comparación con escenarios en los que esta información no está disponible o no se emplea. Dichas simulaciones consideran factores clave como el tamaño del banco de ítems, el método de selección del primer ítem y los criterios de parada. Los resultados indican que contar con una estimación inicial cercana al valor real de la habilidad mejora la precisión final y reduce el número de ítems administrados. A partir de estos hallazgos, se presentan recomendaciones prácticas sobre las condiciones en las que el uso de información auxiliar podría incrementar de forma significativa la eficiencia y la precisión de futuras aplicaciones de pruebas adaptativas en el país. (Tomado de la fuente) | spa |
| dc.description.abstract | This study is developed within a hypothetical scenario of implementing computerized adaptive testing (CAT) in the Colombian context. Although the Colombian Institute for the Evaluation of Education (Icfes) does not currently administer this type of test, it has conducted several pilot studies, motivating the analysis of their potential effects and conditions for application. The objective of the study is to evaluate the use of auxiliary information in the initial estimation of examinee ability, with the aim of more appropriately selecting the first test item. To this end, predictive models—specifically linear regression, random forests, and artificial neural networks—are employed to obtain an initial estimate based on contextual information available prior to test administration. Subsequently, simulations are carried out to compare the performance of the adaptive algorithm when using an initial estimate based on auxiliary information against scenarios in which such information is unavailable or not used. These simulations consider key factors such as item bank size, first-item selection method, and stopping criteria. The results indicate that having an initial estimate close to the true ability value improves final precision and reduces the number of administered items. Based on these findings, practical recommendations are proposed regarding the conditions under which the use of auxiliary information could significantly enhance the efficiency and accuracy of future adaptive test implementations in the country. | eng |
| dc.description.curriculararea | Estadística.Sede Bogotá | |
| dc.description.degreelevel | Maestría | |
| dc.description.degreename | Magíster en Ciencias - Estadística | |
| dc.description.technicalinfo | Se usó el software de procesamiento y análisis estadístico R en su versión 4.3.1 (2023-06-16 ucrt) | spa |
| dc.format.extent | 99 páginas | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.instname | Universidad Nacional de Colombia | spa |
| dc.identifier.reponame | Repositorio Institucional Universidad Nacional de Colombia | spa |
| dc.identifier.repourl | https://repositorio.unal.edu.co/ | spa |
| dc.identifier.uri | https://repositorio.unal.edu.co/handle/unal/88804 | |
| dc.language.iso | spa | |
| dc.publisher | Universidad Nacional de Colombia | |
| dc.publisher.branch | Universidad Nacional de Colombia - Sede Bogotá | |
| dc.publisher.faculty | Facultad de Ciencias | |
| dc.publisher.place | Bogotá, Colombia | |
| dc.publisher.program | Bogotá - Ciencias - Maestría en Ciencias - Estadística | |
| dc.relation.indexed | LaReferencia | |
| dc.relation.references | AERA, APA, and NCME (2014). Standards for Educational and Psychological Testing. American Educational Research Association, Washington, DC. | |
| dc.relation.references | Allen, M. J. (2003). Assessing academic programs in higher education, volume 42. John Wiley & Sons. | |
| dc.relation.references | Bartram, D. (2017). Computer-based testing and the internet. The Blackwell handbook of personnel selection, pages 397–418. | |
| dc.relation.references | Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. Statistical theories of mental test scores. | |
| dc.relation.references | Bishop, C. M. and Nasrabadi, N. M. (2006). Pattern recognition and machine learning, volume 4. Springer. | |
| dc.relation.references | Bloom, B. S. et al. (1971). Handbook on formative and summative evaluation of student learning. ERIC. | |
| dc.relation.references | Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2):123–140. | |
| dc.relation.references | Breiman, L. (2001). Random forests. Machine Learning, 45(1):5–32. | |
| dc.relation.references | Casella, G. and Berger, R. L. (2002). Statistical Inference. Duxbury, Pacific Grove, CA, 2nd edition. | |
| dc.relation.references | Chang, H.-H. (2015). Psychometrics behind computerized adaptive testing. Psychometrika, 80:1–20. | |
| dc.relation.references | Chang, H.-H. and Ying, Z. (1996). A global information approach to computerized adaptive testing. Applied Psychological Measurement, 20(3):213–229. | |
| dc.relation.references | Chen, S.-Y., Ankenmann, R. D., and Chang, H.-H. (2000). A comparison of item selection rules at the early stages of computerized adaptive testing. Applied Psychological Measurement, 24(3):241–255. | |
| dc.relation.references | Cronbach, L. J. (1963). Course improvement through evaluation. Teachers college record, 64(8):1–13. | |
| dc.relation.references | de Andrade, D. F., Tavares, H. R., and da Cunha Valle, R. (2000). Teoria da Resposta ao Item: conceitos e aplicações. ABE, Sao Paulo. | |
| dc.relation.references | De Ayala, R. J. (2013). The theory and practice of item response theory. Guilford Publications. | |
| dc.relation.references | Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1):1–22. | |
| dc.relation.references | Efron, B. (1979). Bootstrap methods: another look at the jackknife. The Annals of Statistics, 7:1–26. | |
| dc.relation.references | Elman, J. L. (1990). Finding structure in time. Cognitive science, 14(2):179–211. | |
| dc.relation.references | Embretson, S. E. and Reise, S. P. (2013). Item response theory for psychologists. Psychology Press. | |
| dc.relation.references | Faraway, J. J. (2016). Extending the linear model with R: generalized linear, mixed effects and nonparametric regression models. Chapman and Hall/CRC. | |
| dc.relation.references | Gholamy, A., Kreinovich, V., and Kosheleva, O. (2018). Why 70/30 or 80/20 relation between training and testing sets: A pedagogical explanation. Technical report, The University of Texas at El Paso. | |
| dc.relation.references | Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press. http://www.deeplearningbook.org. | |
| dc.relation.references | Gulliksen, H. (1950). Theory of mental tests. Wiley, New York. | |
| dc.relation.references | Hambleton, R. K. and Jones, R. W. (1993). Comparison of classical test theory and item response theory and their applications to test development. Educational Measurement: Issues and Practice, 12(3):38–47. | |
| dc.relation.references | Hambleton, R. K., Swaminathan, H., and Rogers, H. J. (1991). Fundamentals of item response theory, volume 2. Sage. | |
| dc.relation.references | Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York, 2nd edition. | |
| dc.relation.references | Haykin, S. (2009). Neural networks and learning machines, 3/E. Pearson Education India. | |
| dc.relation.references | He, W. and Reckase, M. D. (2014). Item pool design for an operational variable-length computerized adaptive test. Educational and Psychological Measurement, 74(3):473–494. | |
| dc.relation.references | Ho, T. K. (1995). Random decision forests. In Proceedings of 3rd international conference on document analysis and recognition, volume 1, pages 278–282. IEEE. | |
| dc.relation.references | Huff, K. L. and Sireci, S. G. (2001). Validity issues in computer-based testing. Educational measurement: Issues and practice, 20(3):16–25. | |
| dc.relation.references | Icfes (2019a). Boletín Saber al Detalle (edición 4) - ¿Cómo se construye el Índice de Nivel Socioeconómico (INSE) en el contexto de las pruebas Saber? https://www.icfes.gov.co/wp-content/uploads/2024/11/Edicion-4-boletin-saber-al-detalle-.pdf. Consultado el 12 de diciembre de 2024. | |
| dc.relation.references | Icfes (2019b). Boletín Saber al Detalle (edición 6) - ¿En qué consiste la aplicación de Pre Saber 11° en versión adaptativa (CAT)? https://www.icfes.gov.co/wp-content/uploads/2025/02/6-Edicion-boletin-saber-al-detalle.pdf. Consultado el 12 de diciembre de 2024. | |
| dc.relation.references | Icfes (2024). Guía de orientación examen Saber 11°. | |
| dc.relation.references | Ioffe, S. and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning, pages 448–456. PMLR. | |
| dc.relation.references | Kane, M. (2006). Validation. In R. L. Brennan (Ed.), Educational Measurement, 4th edition. | |
| dc.relation.references | LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature, 521(7553):436–444. | |
| dc.relation.references | LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324. | |
| dc.relation.references | Lehmann, E. L. and Casella, G. (2006). Theory of point estimation. Springer Science & Business Media. | |
| dc.relation.references | Lord, F. (1952). A theory of test scores. Psychometric monographs. | |
| dc.relation.references | Lord, F. M., Novick, M. R., and Birnbaum, A. (1968). Statistical theories of mental test scores. Addison-Wesley. | |
| dc.relation.references | Lord, F. M. (1986). Maximum likelihood and Bayesian parameter estimation in item response theory. Journal of Educational Measurement, pages 157–162. | |
| dc.relation.references | Lord, F. M. (2012). Applications of item response theory to practical testing problems. Routledge. | |
| dc.relation.references | Magis, D., Yan, D., and Von Davier, A. A. (2017). Computerized adaptive and multistage testing with R: Using packages catR and mstR. Springer. | |
| dc.relation.references | Martín, E. S., Del Pino, G., and De Boeck, P. (2006). IRT models for ability-based guessing. Applied Psychological Measurement, 30(3):183–203. | |
| dc.relation.references | Mendenhall, W. (2003). A Second Course in Statistics: Regression Analysis. Prentice Hall. | |
| dc.relation.references | Mislevy, R. J. (1986). Bayes modal estimation in item response models. Psychometrika, 51(2):177–195. | |
| dc.relation.references | Montgomery, D. C., Peck, E. A., and Vining, G. G. (2021). Introduction to linear regression analysis. John Wiley & Sons. | |
| dc.relation.references | Park, J. Y., de Jong, T., Koning, B. B., and van der Meijden, H. A. T. (2018). An explanatory item response theory method for alleviating the cold-start problem in adaptive learning environments. Behavior Research Methods, 51(2):895–909. | |
| dc.relation.references | Pliakos, K., Papamitsiou, Z., and Economides, A. A. (2019). Integrating machine learning into item response theory for addressing the cold-start problem in adaptive learning systems. Computers & Education, 137:91–106. | |
| dc.relation.references | Popham, W. J. (2003). What every teacher should know about educational assessment. Pearson Education. | |
| dc.relation.references | R Core Team (2023). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. | |
| dc.relation.references | Rasch, G. (1960). Studies in mathematical psychology: I. Probabilistic models for some intelligence and attainment tests. Nielsen & Lydiche. | |
| dc.relation.references | Reckase, M. (2003). Item pool design for computerized adaptive tests. In annual meeting of the National Council on Measurement in Education, Chicago, IL. | |
| dc.relation.references | Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural networks, 61:85–117. | |
| dc.relation.references | Shepard, L. A. (2000). The role of assessment in a learning culture. Educational researcher, 29(7):4–14. | |
| dc.relation.references | Shmueli, G. (2010). To explain or to predict? Statistical science, pages 289–310. | |
| dc.relation.references | Spearman, C. (1904). "General intelligence," objectively determined and measured. The American Journal of Psychology, 15(2):201–292. | |
| dc.relation.references | Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. (2014). Dropout: a simple way to prevent neural networks from overfitting. The journal of machine learning research, 15(1):1929–1958. | |
| dc.relation.references | Suchman, E. (1968). Evaluative Research: Principles and Practice in Public Service and Social Action Progr. Russell Sage Foundation. | |
| dc.relation.references | Thissen, D. and Wainer, H. (1982). Some standard errors in item response theory. Psychometrika, 47(4):397–412. | |
| dc.relation.references | Van Der Linden, W. J. (1999). Empirical initialization of the trait estimator in adaptive testing. Applied Psychological Measurement, 23(1):21–29. | |
| dc.relation.references | Van der Linden, W. J. and Glas, C. A. (2010). Elements of adaptive testing, volume 10. Springer. | |
| dc.relation.references | Van der Linden, W. J., Glas, C. A., et al. (2000). Computerized adaptive testing: Theory and practice, volume 13. Springer. | |
| dc.relation.references | Van der Linden, W. J. and Hambleton, R. K. (2015). Handbook of item response theory. CRC press. | |
| dc.relation.references | von Davier, M. (2009). Is there need for the 3PL model? Guess what? Measurement: Interdisciplinary Research and Perspectives. | |
| dc.relation.references | Wainer, H., Dorans, N. J., Flaugher, R., Green, B. F., and Mislevy, R. J. (2000). Computerized adaptive testing: A primer. Routledge. | |
| dc.relation.references | Wang, T. and Kolen, M. J. (2001). Evaluating comparability in computerized adaptive testing: Issues, criteria and an example. Journal of Educational Measurement, 38(1):19–49. | |
| dc.relation.references | Weiss, D. J. (1982). Improving measurement quality and efficiency with adaptive testing. Applied psychological measurement, 6(4):473–492. | |
| dc.relation.references | Yao, L., Pommerich, M., and Segall, D. O. (2014). Using multidimensional cat to administer a short, yet precise, screening test. Applied Psychological Measurement, 38(8):614–631. | |
| dc.rights.accessrights | info:eu-repo/semantics/openAccess | |
| dc.rights.license | Atribución-NoComercial-CompartirIgual 4.0 Internacional | |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | |
| dc.subject.ddc | 510 - Matemáticas::519 - Probabilidades y matemáticas aplicadas | |
| dc.subject.lemb | Estimación de parámetros | |
| dc.subject.lemb | Estadística matemática | |
| dc.subject.lemb | Análisis de regresión | |
| dc.subject.lemb | Redes neurales (Computadores) | |
| dc.subject.lemb | Mediciones y pruebas educativas | |
| dc.subject.proposal | Teoría de respuesta al ítem | spa |
| dc.subject.proposal | Pruebas adaptativas computarizadas | spa |
| dc.subject.proposal | Psicometría | spa |
| dc.subject.proposal | Modelado estadístico | spa |
| dc.subject.proposal | Item response theory | eng |
| dc.subject.proposal | Computer adaptive testing | eng |
| dc.subject.proposal | Psychometrics | eng |
| dc.subject.proposal | Statistical modeling | eng |
| dc.title | Uso de información auxiliar en la estimación inicial de la habilidad de una prueba adaptativa computarizada | spa |
| dc.title.translated | Use of auxiliary information in the initial ability estimation of a computerized adaptive test | eng |
| dc.type | Trabajo de grado - Maestría | |
| dc.type.coar | http://purl.org/coar/resource_type/c_bdcc | |
| dc.type.coarversion | http://purl.org/coar/version/c_ab4af688f83e57aa | |
| dc.type.content | Text | |
| dc.type.driver | info:eu-repo/semantics/masterThesis | |
| dc.type.redcol | http://purl.org/redcol/resource_type/TM | |
| dc.type.version | info:eu-repo/semantics/acceptedVersion | |
| dcterms.audience.professionaldevelopment | Investigadores | |
| dcterms.audience.professionaldevelopment | Estudiantes | |
| dcterms.audience.professionaldevelopment | Maestros | |
| oaire.accessrights | http://purl.org/coar/access_right/c_abf2 | |

