
dc.rights.license Reconocimiento 4.0 Internacional
dc.contributor.advisor Villa Garzón, Fernán Alonso
dc.contributor.author Márquez Aristizábal, Hugo Alejandro
dc.date.accessioned 2022-08-22T21:32:48Z
dc.date.available 2022-08-22T21:32:48Z
dc.date.issued 2022-06-23
dc.identifier.uri https://repositorio.unal.edu.co/handle/unal/82000
dc.description.abstract La extracción automática de información de documentos de identidad es una tarea fundamental en diferentes procesos digitales como registros, solicitud de productos, validación de identidad, entre otros. La extracción de información consiste en la identificación, ubicación, clasificación y reconocimiento del texto de campos clave presentes en un documento, en este caso un documento de identidad. Tratándose de documentos de identidad, los campos clave son aquellos como: nombres, apellidos, números de documento, fechas, entre otros. El problema de extracción de información se ha solucionado tradicionalmente utilizando algoritmos basados en reglas y motores clásicos de OCR. En los últimos años se han realizado implementaciones de modelos de aprendizaje de máquina, utilizando modelos de NLP (procesamiento de lenguaje natural) y CV (visión por computador) para solucionar el problema de una manera más flexible y eficiente (Subramani et al., 2020). En este trabajo se propuso solucionar el problema de extracción de información con una aproximación de detección de objetos. Se implementó, entrenó y evaluó un modelo de detección de objetos basado en transformadores (Carion et al., 2020). Se logró llegar a una solución que alcanza valores de precisión superiores al 95% en la detección de campos clave en documentos de identidad. (Texto tomado de la fuente)
dc.description.abstract Automatic information extraction from identity documents is a fundamental task in digital processes such as onboarding, product requests, and identity validation, among others. Information extraction consists of identifying, locating, classifying, and recognizing the text of the key fields that an identity document contains: names, last names, document numbers, dates, among others. The problem has traditionally been solved using rule-based algorithms and classic OCR engines. In recent years, machine learning models from NLP (natural language processing) and CV (computer vision) have been applied to solve the problem in a more flexible and efficient way (Subramani et al., 2020). This work proposes to solve the information extraction problem with an object detection approach. An object detection model based on transformers (Carion et al., 2020) was implemented, trained, and evaluated. The resulting solution achieves above 95% accuracy in detecting key fields on identity documents.
dc.format.extent 67 páginas
dc.format.mimetype application/pdf
dc.language.iso spa
dc.publisher Universidad Nacional de Colombia
dc.rights.uri http://creativecommons.org/licenses/by/4.0/
dc.subject.ddc 000 - Ciencias de la computación, información y obras generales::003 - Sistemas
dc.subject.other Identidad digital
dc.subject.other Reconocimiento óptico de caracteres
dc.title Extracción de información de documentos de identidad utilizando técnicas de aprendizaje de máquina
dc.type Trabajo de grado - Maestría
dc.type.driver info:eu-repo/semantics/masterThesis
dc.type.version info:eu-repo/semantics/acceptedVersion
dc.publisher.program Medellín - Minas - Maestría en Ingeniería - Analítica
dc.description.degreelevel Maestría
dc.description.degreename Magíster en Ingeniería - Analítica
dc.identifier.instname Universidad Nacional de Colombia
dc.identifier.reponame Repositorio Institucional Universidad Nacional de Colombia
dc.identifier.repourl https://repositorio.unal.edu.co/
dc.publisher.department Departamento de Ciencias de la Computación y de la Decisión
dc.publisher.faculty Facultad de Minas
dc.publisher.place Medellín
dc.publisher.branch Universidad Nacional de Colombia - Sede Medellín
dc.relation.references Al-Badr, B., & Mahmoud, S. A. (1995). Survey and bibliography of Arabic optical text recognition. Signal Processing, 41(1), 49–77. https://doi.org/10.1016/0165-1684(94)00090-M
dc.relation.references Amin, A., & Shiu, R. (2001). Page Segmentation and Classification utilizing Bottom-up Approach. International Journal of Image and Graphics, 01(02), 345–361. https://doi.org/10.1142/S0219467801000219
dc.relation.references Appalaraju, S., Jasani, B., Kota, B. U., Xie, Y., & Manmatha, R. (2021). DocFormer: End-to-End Transformer for Document Understanding. arXiv:2106.11539. http://arxiv.org/abs/2106.11539
dc.relation.references Arlazarov, V. V., Bulatov, K., Chernov, T., & Arlazarov, V. L. (2019). MIDV-500: A dataset for identity document analysis and recognition on mobile devices in video stream. Computer Optics, 43(5), 818–824. https://doi.org/10.18287/2412-6179-2019-43-5-818-824
dc.relation.references Bahdanau, D., Cho, K., & Bengio, Y. (2016). Neural Machine Translation by Jointly Learning to Align and Translate. arXiv:1409.0473. http://arxiv.org/abs/1409.0473
dc.relation.references Bajaj, R., Dey, L., & Chaudhury, S. (2002). Devnagari numeral recognition by combining decision of multiple connectionist classifiers. Sadhana, 27(1), 59–72. https://doi.org/10.1007/BF02703312
dc.relation.references Bello, I., Zoph, B., Vaswani, A., Shlens, J., & Le, Q. V. (2020). Attention Augmented Convolutional Networks. arXiv:1904.09925. http://arxiv.org/abs/1904.09925
dc.relation.references Bertolotti, M. (2011). Optical Pattern Recognition, edited by F. T. S. Yu and S. Jutamulia.
dc.relation.references Bhavani, S., & Thanushkodi, D. K. (2010). A Survey on Coding Algorithms in Medical Image Compression. 02(05), 7.
dc.relation.references Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., & Zagoruyko, S. (2020). End-to-End Object Detection with Transformers. arXiv:2005.12872. http://arxiv.org/abs/2005.12872
dc.relation.references Castelblanco, A., Solano, J., Lopez, C., Rivera, E., Tengana, L., & Ochoa, M. (2020). Machine Learning Techniques for Identity Document Verification in Uncontrolled Environments: A Case Study. In K. M. Figueroa Mora, J. Anzurez Marín, J. Cerda, J. A. Carrasco-Ochoa, J. F. Martínez-Trinidad, & J. A. Olvera-López (Eds.), Pattern Recognition (Vol. 12088, pp. 271–281). Springer International Publishing. https://doi.org/10.1007/978-3-030-49076-8_26
dc.relation.references Chan, W., Saharia, C., Hinton, G., Norouzi, M., & Jaitly, N. (2020). Imputer: Sequence Modelling via Imputation and Dynamic Programming. arXiv:2002.08926. http://arxiv.org/abs/2002.08926
dc.relation.references Chaudhuri, A., Mandaviya, K., Badelia, P., & K Ghosh, S. (2017). Optical Character Recognition Systems for Different Languages with Soft Computing (Vol. 352). Springer International Publishing. https://doi.org/10.1007/978-3-319-50252-6
dc.relation.references Dalal, N., & Triggs, B. (2005). Histograms of Oriented Gradients for Human Detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 1, 886–893. https://doi.org/10.1109/CVPR.2005.177
dc.relation.references De Brabandere, B., Neven, D., & Van Gool, L. (2017). Semantic Instance Segmentation with a Discriminative Loss Function. arXiv:1708.02551. http://arxiv.org/abs/1708.02551
dc.relation.references Delteil, T., Belval, E., Chen, L., Goncalves, L., & Mahadevan, V. (2022). MATrIX – Modality-Aware Transformer for Information eXtraction. arXiv:2205.08094. http://arxiv.org/abs/2205.08094
dc.relation.references Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805. http://arxiv.org/abs/1810.04805
dc.relation.references Felzenszwalb, P. F., Girshick, R. B., McAllester, D., & Ramanan, D. (2010). Object Detection with Discriminatively Trained Part-Based Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9), 1627–1645. https://doi.org/10.1109/TPAMI.2009.167
dc.relation.references Fu, C.-Y., Liu, W., Ranga, A., Tyagi, A., & Berg, A. C. (2017). DSSD: Deconvolutional Single Shot Detector. arXiv:1701.06659. http://arxiv.org/abs/1701.06659
dc.relation.references Ghazvininejad, M., Levy, O., Liu, Y., & Zettlemoyer, L. (2019). Mask-Predict: Parallel Decoding of Conditional Masked Language Models. arXiv:1904.09324. http://arxiv.org/abs/1904.09324
dc.relation.references Girshick, R. (2015). Fast R-CNN. arXiv:1504.08083. http://arxiv.org/abs/1504.08083
dc.relation.references Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. arXiv:1311.2524. http://arxiv.org/abs/1311.2524
dc.relation.references Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 249–256.
dc.relation.references Graves, A., & Schmidhuber, J. (2007). Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks. In Artificial Neural Networks – ICANN 2007 (Vol. 4668, pp. 549–558). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-74690-4_56
dc.relation.references Gu, J., Bradbury, J., Xiong, C., Li, V. O. K., & Socher, R. (2018). Non-Autoregressive Neural Machine Translation. arXiv:1711.02281. http://arxiv.org/abs/1711.02281
dc.relation.references Gu, J., Kuen, J., Morariu, V. I., Zhao, H., Barmpalios, N., Jain, R., Nenkova, A., & Sun, T. (2022). Unified Pretraining Framework for Document Understanding. arXiv:2204.10939. http://arxiv.org/abs/2204.10939
dc.relation.references He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2018). Mask R-CNN. arXiv:1703.06870. http://arxiv.org/abs/1703.06870
dc.relation.references He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep Residual Learning for Image Recognition. arXiv:1512.03385. http://arxiv.org/abs/1512.03385
dc.relation.references Huang, Y., Lv, T., Cui, L., Lu, Y., & Wei, F. (2022). LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking. arXiv:2204.08387. http://arxiv.org/abs/2204.08387
dc.relation.references Islam, N., Islam, Z., & Noor, N. (2016). A Survey on Optical Character Recognition System. Journal of Information, 10(2), 4.
dc.relation.references Jiao, L., Zhang, F., Liu, F., Yang, S., Li, L., Feng, Z., & Qu, R. (2019). A Survey of Deep Learning-Based Object Detection. IEEE Access, 7, 128837–128868. https://doi.org/10.1109/ACCESS.2019.2939201
dc.relation.references Katti, A. R., Reisswig, C., Guder, C., Brarda, S., Bickel, S., Höhne, J., & Faddoul, J. B. (2018). Chargrid: Towards Understanding 2D Documents. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 4459–4469. https://doi.org/10.18653/v1/D18-1476
dc.relation.references Kim, G., Hong, T., Yim, M., Park, J., Yim, J., Hwang, W., Yun, S., Han, D., & Park, S. (2021). Donut: Document Understanding Transformer without OCR. arXiv:2111.15664. http://arxiv.org/abs/2111.15664
dc.relation.references Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6), 84–90. https://doi.org/10.1145/3065386
dc.relation.references Le, A. D., Pham, D. V., & Nguyen, T. A. (2019). Deep Learning Approach for Receipt Recognition. In T. K. Dang, J. Küng, M. Takizawa, & S. H. Bui (Eds.), Future Data and Security Engineering (Vol. 11814, pp. 705–712). Springer International Publishing. https://doi.org/10.1007/978-3-030-35653-8_50
dc.relation.references Lebourgeois, F., Bublinski, Z., & Emptoz, H. (1992). A fast and efficient method for extracting text paragraphs and graphics from unconstrained documents. Proceedings, 11th IAPR International Conference on Pattern Recognition, Vol. II, Conference B: Pattern Recognition Methodology and Systems, 272–276. https://doi.org/10.1109/ICPR.1992.201771
dc.relation.references LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
dc.relation.references Li, X., Zheng, Y., Hu, Y., Cao, H., Wu, Y., Jiang, D., Liu, Y., & Ren, B. (2022). Relational Representation Learning in Visually-Rich Documents. arXiv:2205.02411. http://arxiv.org/abs/2205.02411
dc.relation.references Li, Y., Qian, Y., Yu, Y., Qin, X., Zhang, C., Liu, Y., Yao, K., Han, J., Liu, J., & Ding, E. (2021). StrucTexT: Structured Text Understanding with Multi-Modal Transformers. arXiv:2108.02923. http://arxiv.org/abs/2108.02923
dc.relation.references Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., & Belongie, S. (2017). Feature Pyramid Networks for Object Detection. arXiv:1612.03144. http://arxiv.org/abs/1612.03144
dc.relation.references Lin, T.-Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2018). Focal Loss for Dense Object Detection. arXiv:1708.02002. http://arxiv.org/abs/1708.02002
dc.relation.references Lin, T.-Y., Maire, M., Belongie, S., Bourdev, L., Girshick, R., Hays, J., Perona, P., Ramanan, D., Zitnick, C. L., & Dollár, P. (2015). Microsoft COCO: Common Objects in Context. arXiv:1405.0312. http://arxiv.org/abs/1405.0312
dc.relation.references Liu, L., Ouyang, W., Wang, X., Fieguth, P., Chen, J., Liu, X., & Pietikäinen, M. (2020). Deep Learning for Generic Object Detection: A Survey. International Journal of Computer Vision, 128(2), 261–318. https://doi.org/10.1007/s11263-019-01247-4
dc.relation.references Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., & Berg, A. C. (2016). SSD: Single Shot MultiBox Detector. Lecture Notes in Computer Science, 9905, 21–37. https://doi.org/10.1007/978-3-319-46448-0_2
dc.relation.references Long, J., Shelhamer, E., & Darrell, T. (2015). Fully Convolutional Networks for Semantic Segmentation. arXiv:1411.4038. http://arxiv.org/abs/1411.4038
dc.relation.references Lowe, D. G. (2004). Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60(2), 91–110. https://doi.org/10.1023/B:VISI.0000029664.99615.94
dc.relation.references Matas, J., Chum, O., Urban, M., & Pajdla, T. (2004). Robust wide-baseline stereo from maximally stable extremal regions. Image and Vision Computing, 22(10), 761–767. https://doi.org/10.1016/j.imavis.2004.02.006
dc.relation.references Mori, S., Nishida, H., & Yamada, H. (1999). Optical Character Recognition.
dc.relation.references Namysl, M., & Konya, I. (2019). Efficient, Lexicon-Free OCR using Deep Learning. arXiv:1906.01969. http://arxiv.org/abs/1906.01969
dc.relation.references Niu, Z., Zhong, G., & Yu, H. (2021). A review on the attention mechanism of deep learning. Neurocomputing, 452, 48–62. https://doi.org/10.1016/j.neucom.2021.03.091
dc.relation.references Oord, A. van den, Li, Y., Babuschkin, I., Simonyan, K., Vinyals, O., Kavukcuoglu, K., Driessche, G. van den, Lockhart, E., Cobo, L. C., Stimberg, F., Casagrande, N., Grewe, D., Noury, S., Dieleman, S., Elsen, E., Kalchbrenner, N., Zen, H., Graves, A., King, H., … Hassabis, D. (2017). Parallel WaveNet: Fast High-Fidelity Speech Synthesis. arXiv:1711.10433. http://arxiv.org/abs/1711.10433
dc.relation.references Padilla, R., Netto, S. L., & da Silva, E. A. B. (2020). A Survey on Performance Metrics for Object-Detection Algorithms. 2020 International Conference on Systems, Signals and Image Processing (IWSSIP), 237–242. https://doi.org/10.1109/IWSSIP48289.2020.9145130
dc.relation.references Parmar, N., Vaswani, A., Uszkoreit, J., Kaiser, Ł., Shazeer, N., Ku, A., & Tran, D. (2018). Image Transformer. arXiv:1802.05751. http://arxiv.org/abs/1802.05751
dc.relation.references Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. arXiv:1802.05365. http://arxiv.org/abs/1802.05365
dc.relation.references Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You Only Look Once: Unified, Real-Time Object Detection. arXiv:1506.02640. http://arxiv.org/abs/1506.02640
dc.relation.references Redmon, J., & Farhadi, A. (2016). YOLO9000: Better, Faster, Stronger. arXiv:1612.08242. http://arxiv.org/abs/1612.08242
dc.relation.references Redmon, J., & Farhadi, A. (2018). YOLOv3: An Incremental Improvement. arXiv:1804.02767. http://arxiv.org/abs/1804.02767
dc.relation.references Ren, M., & Zemel, R. S. (2017). End-to-End Instance Segmentation with Recurrent Attention. arXiv:1605.09410. http://arxiv.org/abs/1605.09410
dc.relation.references Ren, S., He, K., Girshick, R., & Sun, J. (2016). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv:1506.01497. http://arxiv.org/abs/1506.01497
dc.relation.references Rezatofighi, S. H., G, V. K. B., Milan, A., Abbasnejad, E., Dick, A., & Reid, I. (2017). DeepSetNet: Predicting Sets with Deep Neural Networks. arXiv:1611.08998. http://arxiv.org/abs/1611.08998
dc.relation.references Romera-Paredes, B., & Torr, P. H. S. (2016). Recurrent Instance Segmentation. arXiv:1511.08250. http://arxiv.org/abs/1511.08250
dc.relation.references Rothe, R., Guillaumin, M., & Van Gool, L. (2015). Non-maximum Suppression for Object Detection by Passing Messages Between Windows. In D. Cremers, I. Reid, H. Saito, & M.-H. Yang (Eds.), Computer Vision – ACCV 2014 (Vol. 9003, pp. 290–306). Springer International Publishing. https://doi.org/10.1007/978-3-319-16865-4_19
dc.relation.references Sabu, A. M., & Das, A. S. (2018). A Survey on various Optical Character Recognition Techniques. 2018 Conference on Emerging Devices and Smart Systems (ICEDSS), 152–155. https://doi.org/10.1109/ICEDSS.2018.8544323
dc.relation.references Salvador, A., Bellver, M., Campos, V., Baradad, M., Marques, F., Torres, J., & Giro-i-Nieto, X. (2019). Recurrent Neural Networks for Semantic Instance Segmentation. arXiv:1712.00617. http://arxiv.org/abs/1712.00617
dc.relation.references Satti, D. A. (2013). Offline Urdu Nastaliq OCR for Printed Text using Analytical Approach.
dc.relation.references Shen, H., & Coughlan, J. M. (2012). Towards a Real-Time System for Finding and Reading Signs for Visually Impaired Users. In K. Miesenberger, A. Karshmer, P. Penaz, & W. Zagler (Eds.), Computers Helping People with Special Needs (Vol. 7383, pp. 41–47). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-31534-3_7
dc.relation.references Sherstinsky, A. (2020). Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network. Physica D: Nonlinear Phenomena, 404, 132306. https://doi.org/10.1016/j.physd.2019.132306
dc.relation.references Shi, B., Bai, X., & Yao, C. (2015). An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition. arXiv:1507.05717. http://arxiv.org/abs/1507.05717
dc.relation.references Simonyan, K., & Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv:1409.1556. http://arxiv.org/abs/1409.1556
dc.relation.references Smith, R. (2007). An Overview of the Tesseract OCR Engine. Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), Vol. 2, 629–633. https://doi.org/10.1109/ICDAR.2007.4376991
dc.relation.references Stewart, R., & Andriluka, M. (2015). End-to-end people detection in crowded scenes. arXiv:1506.04878. http://arxiv.org/abs/1506.04878
dc.relation.references Subramani, N., Matton, A., Greaves, M., & Lam, A. (2020). A Survey of Deep Learning Approaches for OCR and Document Understanding. arXiv:2011.13534. http://arxiv.org/abs/2011.13534
dc.relation.references Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2014). Going Deeper with Convolutions. arXiv:1409.4842. http://arxiv.org/abs/1409.4842
dc.relation.references Tan, M., & Le, Q. V. (2020). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv:1905.11946. http://arxiv.org/abs/1905.11946
dc.relation.references Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention Is All You Need. arXiv:1706.03762. http://arxiv.org/abs/1706.03762
dc.relation.references Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), 1, I-511–I-518. https://doi.org/10.1109/CVPR.2001.990517
dc.relation.references Xu, Y., Li, M., Cui, L., Huang, S., Wei, F., & Zhou, M. (2020). LayoutLM: Pre-training of Text and Layout for Document Image Understanding. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 1192–1200. https://doi.org/10.1145/3394486.3403172
dc.relation.references Xu, Y., Xu, Y., Lv, T., Cui, L., Wei, F., Wang, G., Lu, Y., Florencio, D., Zhang, C., Che, W., Zhang, M., & Zhou, L. (2022). LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding. arXiv:2012.14740. http://arxiv.org/abs/2012.14740
dc.relation.references Zaidi, S. S. A., Ansari, M. S., Aslam, A., Kanwal, N., Asghar, M., & Lee, B. (2021). A Survey of Modern Deep Learning based Object Detection Models. arXiv:2104.11892. http://arxiv.org/abs/2104.11892
dc.relation.references Zhang, P., Xu, Y., Cheng, Z., Pu, S., Lu, J., Qiao, L., Niu, Y., & Wu, F. (2021). TRIE: End-to-End Text Reading and Information Extraction for Document Understanding. arXiv:2005.13118. http://arxiv.org/abs/2005.13118
dc.relation.references Zhang, S., Wen, L., Bian, X., Lei, Z., & Li, S. Z. (2018). Single-Shot Refinement Neural Network for Object Detection. arXiv:1711.06897. http://arxiv.org/abs/1711.06897
dc.relation.references Zhang, Y., Hare, J., & Prügel-Bennett, A. (2020). Deep Set Prediction Networks. arXiv:1906.06565. http://arxiv.org/abs/1906.06565
dc.relation.references Zhang, Z., Ma, J., Du, J., Wang, L., & Zhang, J. (2022). Multimodal Pre-training Based on Graph Attention Network for Document Understanding. arXiv:2203.13530. http://arxiv.org/abs/2203.13530
dc.relation.references Zhao, Q., Sheng, T., Wang, Y., Tang, Z., Chen, Y., Cai, L., & Ling, H. (2019). M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network. arXiv:1811.04533. http://arxiv.org/abs/1811.04533
dc.relation.references Zhao, X., Niu, E., Wu, Z., & Wang, X. (2019). CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor. arXiv:1903.12363. http://arxiv.org/abs/1903.12363
dc.relation.references Zhu, X., Su, W., Lu, L., Li, B., Wang, X., & Dai, J. (2021). Deformable DETR: Deformable Transformers for End-to-End Object Detection. arXiv:2010.04159. http://arxiv.org/abs/2010.04159
dc.relation.references Zou, Z., Shi, Z., Guo, Y., & Ye, J. (2019). Object Detection in 20 Years: A Survey. arXiv:1905.05055. http://arxiv.org/abs/1905.05055
dc.rights.accessrights info:eu-repo/semantics/openAccess
dc.subject.lemb Redes neuronales (Computadores)
dc.subject.proposal Identidad digital
dc.subject.proposal OCR
dc.subject.proposal Digital identity
dc.subject.proposal Extracción de información
dc.subject.proposal Information extraction
dc.subject.proposal Object detection
dc.subject.proposal Detección de objetos
dc.title.translated Information extraction from identification documents using machine learning techniques
dc.type.coar http://purl.org/coar/resource_type/c_bdcc
dc.type.coarversion http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.content Text
dc.type.redcol http://purl.org/redcol/resource_type/TM
oaire.accessrights http://purl.org/coar/access_right/c_abf2
dcterms.audience.professionaldevelopment Estudiantes
dcterms.audience.professionaldevelopment Investigadores
dc.description.curriculararea Área Curricular de Ingeniería de Sistemas e Informática
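
As a rough illustration of the object-detection formulation described in the abstracts above, the sketch below runs a DETR-style detector (Carion et al., 2020) over a document image and keeps the highest-scoring box per field class. It is a minimal sketch, not the thesis code: it assumes the Hugging Face transformers implementation of DETR, uses the public facebook/detr-resnet-50 checkpoint as a stand-in for a model fine-tuned on identity-document field classes, and any field labels such as first_name or document_number that such fine-tuning would produce are hypothetical here.

    # Minimal sketch: key-field detection with a DETR-style model.
    # Assumptions, not thesis code: the Hugging Face `transformers` library;
    # the public "facebook/detr-resnet-50" checkpoint stands in for one
    # fine-tuned on ID-document field classes (hypothetical labels such as
    # "first_name" or "document_number" instead of COCO classes).
    import torch
    from PIL import Image
    from transformers import DetrForObjectDetection, DetrImageProcessor

    checkpoint = "facebook/detr-resnet-50"  # swap in a fine-tuned model
    processor = DetrImageProcessor.from_pretrained(checkpoint)
    model = DetrForObjectDetection.from_pretrained(checkpoint)
    model.eval()

    image = Image.open("id_card.jpg").convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # DETR predicts a fixed-size set of (class, box) pairs; thresholding
    # the class scores keeps only confident detections, with no NMS step.
    target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
    detections = processor.post_process_object_detection(
        outputs, target_sizes=target_sizes, threshold=0.9
    )[0]

    # Keep the best box per field class; each crop would then be passed
    # to an OCR engine (e.g., Tesseract) to read the field's text.
    best = {}
    for score, label, box in zip(
        detections["scores"], detections["labels"], detections["boxes"]
    ):
        name = model.config.id2label[label.item()]
        if name not in best or score.item() > best[name][0]:
            best[name] = (score.item(), box.tolist())

    for name, (score, box) in sorted(best.items()):
        print(f"{name}: score={score:.2f}, box={[round(v, 1) for v in box]}")

Framing extraction as set prediction is what keeps such a pipeline simple: the detector localizes and classifies every field in one forward pass, bipartite matching during training removes the need for hand-tuned anchors and non-maximum suppression, and OCR only has to read the cropped field regions rather than the whole document.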


