Implementación de modelo computacional para la detección de ingeniería social basado en aprendizaje de máquina y procesamiento de lenguaje natural

López Solano, Juan Camilo

dc.rights.license	Reconocimiento 4.0 Internacional
dc.contributor.advisor	Camargo Mendoza, Jorge Eliecer
dc.contributor.author	López Solano, Juan Camilo
dc.date.accessioned	2022-07-25T20:26:15Z
dc.date.available	2022-07-25T20:26:15Z
dc.date.issued	2022
dc.identifier.uri	https://repositorio.unal.edu.co/handle/unal/81747
dc.description	ilustraciones, gráficas
dc.description.abstract	La seguridad informática o ciberseguridad se encarga de la protección de datos y servicios ante individuos no autorizados y protege las características de la información como la integridad, la confidencialidad y la disponibilidad. Existen múltiples amenazas y ataques que ponen en riesgo la seguridad informática como el ransomware, el malware o programas malignos, los ataques de denegación de servicios, las fallas de inyección, la ingeniería social, entre otros. En muchas ocasiones la parte más vulnerable de los sistemas son los usuarios, por este motivo los ciberdelincuentes usan la ingeniería social para adquirir información de forma ilícita de los usuarios. La ingeniería social consiste en la manipulación de los individuos mediante el engaño para que divulguen información privada o confidencial. Este tipo de ciberataque es muy difícil de detectar ya que puede ser ejecutado por cualquier individuo en cualquier momento y explota aspectos psicológicos de los humanos para engañarlos. En el presente trabajo se presenta la implementación de un modelo computacional basado en técnicas de Procesamiento de Lenguaje Natural para extraer características en textos y alimentar tres algoritmos de Aprendizaje de Máquina (redes neuronales, máquinas de vector de soporte y bosques aleatorios) para detectar posibles ataques de ingeniería social en textos. Los tres algoritmos fueron entrenados y evaluados, mostrando resultados que superan el 80% de exactitud en la detección de ataques de ingeniería social. (Texto tomado de la fuente)
dc.description.abstract	Computer security, or cybersecurity, is concerned with protecting data and services from unauthorized individuals and with preserving properties of information such as integrity, confidentiality, and availability. Multiple threats and attacks put computer security at risk, including ransomware, malware, denial-of-service attacks, injection flaws, and social engineering. In many cases users are the most vulnerable part of a system, which is why cybercriminals use social engineering to illegally obtain information from them. Social engineering consists of manipulating people through deception so that they disclose private or confidential information. This type of cyberattack is very difficult to detect because it can be carried out by any individual at any time and exploits psychological aspects of humans to deceive them. This work presents the implementation of a computational model that uses Natural Language Processing techniques to extract features from texts; these features are used to train three Machine Learning algorithms (neural networks, support vector machines, and random forests) to detect possible social engineering attacks in texts. The three algorithms were trained and evaluated, with results exceeding 80% accuracy in detecting social engineering attacks.
dc.format.extent	xiv, 79 páginas
dc.format.mimetype	application/pdf
dc.language.iso	spa
dc.publisher	Universidad Nacional de Colombia
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/
dc.subject.ddc	000 - Ciencias de la computación, información y obras generales::005 - Programación, programas, datos de computación
dc.subject.other	social engineering
dc.subject.other	ingeniería social
dc.title	Implementación de modelo computacional para la detección de ingeniería social basado en aprendizaje de máquina y procesamiento de lenguaje natural
dc.type	Trabajo de grado - Maestría
dc.type.driver	info:eu-repo/semantics/masterThesis
dc.type.version	info:eu-repo/semantics/acceptedVersion
dc.publisher.program	Bogotá - Ingeniería - Maestría en Ingeniería - Ingeniería de Sistemas y Computación
dc.contributor.researchgroup	Unsecurelab Cybersecurity Research Group
dc.description.degreelevel	Maestría
dc.description.degreename	Magíster en Ingeniería - Ingeniería de Sistemas y Computación
dc.description.researcharea	Sistemas inteligentes
dc.identifier.instname	Universidad Nacional de Colombia
dc.identifier.reponame	Repositorio Institucional Universidad Nacional de Colombia
dc.identifier.repourl	https://repositorio.unal.edu.co/
dc.publisher.department	Departamento de Ingeniería de Sistemas e Industrial
dc.publisher.faculty	Facultad de Ingeniería
dc.publisher.place	Bogotá, Colombia
dc.publisher.branch	Universidad Nacional de Colombia - Sede Bogotá
dc.relation.indexed	RedCol
dc.relation.indexed	LaReferencia
dc.relation.references	Amat, J. (Abril 2017). Máquinas de Vector Soporte (Support Vector Machines, SVMs) https://www.cienciadedatos.net/documentos/34_maquinas_de_vector_soporte_support_vector_machines
dc.relation.references	Balim, C., & Gunal, E. S. (Noviembre 2019). Automatic Detection of Smishing Attacks by Machine Learning Methods. In 2019 1st International Informatics and Software Engineering Conference (UBMYK) (pp. 1-3). IEEE.
dc.relation.references	Bezuidenhout, M., Mouton, F., & Venter, H. S. (2010). Social engineering attack detection model: SEADM. Proceedings of the 2010 Information Security for South Africa Conference, ISSA 2010.
dc.relation.references	Bhakta, R., & Harris, I. G. (2015). Semantic analysis of dialogs to detect social engineering attacks. Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015).
dc.relation.references	Bhardwaj, T., Sharma, T. K., & Pandit, M. R. (2014). Social engineering prevention by detecting malicious URLs using artificial bee colony algorithm. 355-363.
dc.relation.references	Bueno, F. (2019). Redes neuronales: entrenamiento y comportamiento.
dc.relation.references	Cialdini, R. (1993). Influence: Science and Practice.
dc.relation.references	Coulombe, C. (2018). Text data augmentation made simple by leveraging NLP cloud APIs. arXiv preprint arXiv:1812.04718.
dc.relation.references	Craigen, D., Diakun-Thibault, N., & Purse, R. (2014). Defining cybersecurity. Technology Innovation Management Review, 4(10).
dc.relation.references	Dan, A., & Gupta, S. (2019). Social engineering attack detection and data protection model (SEADDPM). In Advances in Intelligent Systems and Computing (Vol. 811, pp. 15-24). https://doi.org/10.1007/978-981-13-1544-2
dc.relation.references	Del Pozo, I. (2018). Social engineering: Application of psychology to information security. 2018 6th International Conference on Future Internet of Things and Cloud Workshops
dc.relation.references	Denning, T., Lerner, A., Shostack, A., & Kohno, T. (2013, November). Control-Alt-Hack: the design and evaluation of a card game for computer security awareness and education. In Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security (pp. 915-928).
dc.relation.references	Feng, S. Y., Gangal, V., Wei, J., Chandar, S., Vosoughi, S., Mitamura, T., & Hovy, E. (2021). A survey of data augmentation approaches for NLP. arXiv preprint arXiv:2105.03075.
dc.relation.references	Proofpoint. (2021). 2021 State of the Phish. An In-Depth Look at User Awareness, Vulnerability and Resilience. https://www.proofpoint.com/sites/default/files/threat-reports/pfpt-us-tr-state-of-the-phish-2021.pdf
dc.relation.references	Gatlan, S. (3 de septiembre de 2020). FBI: Thousands of orgs targeted by RDoS extortion campaign. BleepingComputer. https://www.bleepingcomputer.com/news/security/fbi-thousands-of-orgs-targeted-by-rdos-extortion-campaign/
dc.relation.references	Googletrans (11 de enero 2022). Googletrans 3.0.0 documentation. https://py-googletrans.readthedocs.io/en/latest/
dc.relation.references	Gragg, D. (2003). A multi-level defense against social engineering. SANS Reading Room, 13, 15.
dc.relation.references	Gregar, J. (1994). Research Design (Qualitative, Quantitative and Mixed Methods Approaches). Book published by SAGE Publications, 228.
dc.relation.references	Hadnagy, C. (2010). Social Engineering: The Art of Human Hacking.
dc.relation.references	Hernández-Sampieri, R., & Torres, C. P. M. (2018). Metodología de la investigación (Vol. 4). México D.F.: McGraw-Hill Interamericana.
dc.relation.references	Infoblox. (2020). Cyberthreat Intelligence Report. The Infoblox Q3 2020.
dc.relation.references	Ivaturi, K., & Janczewski, L. (Junio 2011). A taxonomy for social engineering attacks. In International Conference on Information Resources Management (pp. 1-12). Centre for Information Technology, Organizations, and People.
dc.relation.references	Janczewski, L., & Colarik, A. (Eds.). (2007). Cyber warfare and cyber terrorism. IGI Global.
dc.relation.references	Junger, M., Montoya, L., & Overink, F. J. (2017). Priming and warnings are not effective to prevent social engineering attacks. Computers in Human Behavior, 66, 75-87.
dc.relation.references	Kaspersky. (2022). ¿Qué es la ciberseguridad?. Recuperado el 02 de enero de 2022 de https://latam.kaspersky.com/resource-center/definitions/what-is-cyber-security
dc.relation.references	Khonji, M., Iraqi, Y., & Jones, A. (2013). Phishing detection: a literature survey. IEEE Communications Surveys & Tutorials, 15(4), 2091-2121.
dc.relation.references	Khorshed, M. T., Ali, A. S., & Wasimi, S. A. (2014). Combating Cyber Attacks in Cloud Systems Using Machine Learning. In Security, Privacy and Trust in Cloud Systems (pp. 407-431). Springer, Berlin, Heidelberg.
dc.relation.references	Krombholz, K., Hobel, H., Huber, M., & Weippl, E. (2015). Advanced social engineering attacks. Journal of Information Security and Applications, 22, 113-122.
dc.relation.references	Lansley, M., Mouton, F., Kapetanakis, S., & Polatidis, N. (2020). SEADer++: social engineering attack detection in online environments using machine learning. Journal of Information and Telecommunication, 4(3), 346-362.
dc.relation.references	Lansley, M., Polatidis, N., & Kapetanakis, S. (Septiembre 2019). Seader: A social engineering attack detection method based on natural language processing and artificial neural networks. In International Conference on Computational Collective Intelligence (pp. 686-696). Springer, Cham.
dc.relation.references	Lansley, M., Polatidis, N., Kapetanakis, S., Amin, K., Samakovitis, G., & Petridis, M. (2019). Seen the villains: Detecting Social Engineering Attacks using Case-based Reasoning and Deep Learning. In ICCBR Workshops (pp. 39-48).
dc.relation.references	Long, J. (2011). No tech hacking: A guide to social engineering, dumpster diving, and shoulder surfing. Syngress.
dc.relation.references	Malwarefox. (2021). How to Spot Fake Facebook Profile. https://www.malwarefox.com/spot-fake-facebook-profile/
dc.relation.references	Matplotlib. (14 de enero 2022). Matplotlib: Visualization with Python. https://matplotlib.org
dc.relation.references	López, J., & Camargo, J. (Para ser presentada en Marzo 2022). Social Engineering Detection Using Natural Language Processing and Machine Learning. The 5th International Conference on Information and Computer Technologies (ICICT), 2022.
dc.relation.references	Merino, R. F. M., & Chacón, C. I. Ñ. (2017). Bosques aleatorios como extensión de los árboles de clasificación con los programas R y Python. Interfases, (10), 165-189.
dc.relation.references	Mitnick, K. D., & Simon, W. L. (2003). The art of deception: Controlling the human element of security. John Wiley & Sons.
dc.relation.references	Mokhor, V. V, Tsurkan, O. V, Tsurkan, V. V, & Herasymov, R. P. (2017). Information security assessment of computer systems by socio-engineering approach. CEUR Workshop Proceedings, 2067, 92-98.
dc.relation.references	Mouton, F., Leenen, L., & Venter, H. S. (2016). Social Engineering Attack Detection Model: SEADMv2. Proceedings - 2015 International Conference on Cyberworlds, CW 2015, 216-223.
dc.relation.references	Mouton, F., Malan, M. M., Leenen, L., & Venter, H. S. (Agosto 2014). Social engineering attack framework. In 2014 Information Security for South Africa (pp. 1-9). IEEE.
dc.relation.references	Mouton, F., Teixeira, M., & Meyer, T. (Agosto 2017). Benchmarking a mobile implementation of the social engineering prevention training tool. In 2017 Information Security for South Africa (ISSA) (pp. 106-116). IEEE.
dc.relation.references	NLTK. (14 de enero 2022). Documentation - Natural Language Toolkit. https://www.nltk.org
dc.relation.references	Numpy. (14 de enero 2022). Numpy documentation. https://numpy.org/doc/stable/
dc.relation.references	Olabe, X. B. (1998). Redes neuronales artificiales y sus aplicaciones. Publicaciones de la Escuela de Ingenieros.
dc.relation.references	OWASP. (2021). OWASP Top 10 - 2021. https://owasp.org/Top10/
dc.relation.references	Python. (14 de enero 2022). History and License. https://docs.python.org/3/license.html
dc.relation.references	Python. (14 de enero 2022). os — Interfaces misceláneas del sistema operativo. https://docs.python.org/es/3.9/library/os.html?highlight=#module-os
dc.relation.references	Python. (14 de enero 2022). re — Operaciones con expresiones regulares. https://docs.python.org/es/3.9/library/re.html
dc.relation.references	Python. (14 de enero 2022). time — Tiempo de acceso y conversiones. https://docs.python.org/es/3.9/library/time.html?highlight=time#module-time
dc.relation.references	Sahingoz, O. K., Buber, E., Demir, O., & Diri, B. (2019). Machine learning based phishing detection from URLs. Expert Systems with Applications, 117, 345-357.
dc.relation.references	Sawa, Y., Bhakta, R., Harris, I. G., & Hadnagy, C. (2016). Detection of Social Engineering Attacks Through Natural Language Processing of Conversations. Proceedings - 2016 IEEE 10th International Conference on Semantic Computing, ICSC 2016, 262-265. https://doi.org/10.1109/ICSC.2016.95
dc.relation.references	Scikit-Learn (14 de enero 2022). Inicio - scikit-learn - Machine Learning in Python. https://scikit-learn.org/dev/index.html
dc.relation.references	Scikit-Learn (15 de enero 2022). 4.2. Permutation feature importance. https://scikit-learn.org/stable/modules/permutation_importance.html
dc.relation.references	Shorten, C., Khoshgoftaar, T. M., & Furht, B. (2021). Text data augmentation for deep learning. Journal of Big Data, 8(1), 1-34.
dc.relation.references	Simmons, M., & Lee, J. S. (Julio 2020). Catfishing: A Look into Online Dating and Impersonation. In International Conference on Human-Computer Interaction (pp. 349-358). Springer, Cham.
dc.relation.references	SonicWall. (2021). SonicWall 2021 Cyber Threat Report.
dc.relation.references	Spacy. (11 de enero 2022). Repositorio de código en Github de Spacy. https://github.com/explosion/spaCy
dc.relation.references	Srivalli, & Prasanna, L. (2019). Cyber attacks. International Journal of Engineering and Advanced Technology, 8(6 Special Issue 3), 1934-1936. https://doi.org/10.35940/ijeat.F1372.0986S319
dc.relation.references	Stajano, F., & Wilson, P. (2011). Understanding scam victims: Seven principles for systems security. Communications of the ACM, 54(3), 70-75. https://doi.org/10.1145/1897852.1897872
dc.relation.references	The Python Package Index. (14 de enero 2022). Powerful data structures for data analysis, time series, and statistics - Pandas. https://pypi.org/project/pandas/
dc.relation.references	The Python Package Index. (14 de enero 2022). Pure python spell checker based on work by Peter Norvig - Pyspellchecker. https://pypi.org/project/pyspellchecker/
dc.relation.references	The Python Package Index. (14 de enero 2022). Python HTTP for Humans - Requests. https://pypi.org/project/requests/
dc.relation.references	TIOBE. (14 de enero 2022). TIOBE Index for January 2022. https://www.tiobe.com/tiobe-index/
dc.relation.references	Tweepy. (08 de enero 2022). Tweepy. https://www.tweepy.org
dc.relation.references	Wirth, R., & Hipp, J. (Abril 2000). CRISP-DM: Towards a standard process model for data mining. In Proceedings of the 4th international conference on the practical applications of knowledge discovery and data mining (Vol. 1, pp. 29-39). London, UK: Springer-Verlag.
dc.rights.accessrights	info:eu-repo/semantics/openAccess
dc.subject.proposal	Cybersecurity
dc.subject.proposal	Social Engineering
dc.subject.proposal	Natural Language Processing
dc.subject.proposal	Machine Learning
dc.subject.proposal	Ciberseguridad
dc.subject.proposal	Ingeniería Social
dc.subject.proposal	Procesamiento de Lenguaje Natural
dc.subject.proposal	Aprendizaje de máquina
dc.subject.unesco	Modelo de simulación
dc.subject.unesco	Simulation models
dc.title.translated	Implementation of computational model for social engineering detection based on machine learning and natural language processing
dc.type.coar	http://purl.org/coar/resource_type/c_bdcc
dc.type.coarversion	http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.content	Text
dc.type.redcol	http://purl.org/redcol/resource_type/TM
oaire.accessrights	http://purl.org/coar/access_right/c_abf2
dcterms.audience.professionaldevelopment	Estudiantes
dcterms.audience.professionaldevelopment	Investigadores
dcterms.audience.professionaldevelopment	Personal de apoyo escolar
dcterms.audience.professionaldevelopment	Público general

Files in this item

Name: 1020798860.2022.pdf
Size: 1.635Mb
Format: PDF
Description: Master's thesis in Systems and ...

This item appears in the following collection(s)

Maestría en Ingeniería - Sistemas y Computación [311]

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license. This document was deposited by the author(s) under the following deposit statement