Implementación de modelo computacional para la detección de ingeniería social basado en aprendizaje de máquina y procesamiento de lenguaje natural

dc.contributor.advisor: Camargo Mendoza, Jorge Eliecer
dc.contributor.author: López Solano, Juan Camilo
dc.contributor.researchgroup: Unsecurelab Cybersecurity Research Group [spa]
dc.date.accessioned: 2022-07-25T20:26:15Z
dc.date.available: 2022-07-25T20:26:15Z
dc.date.issued: 2022
dc.description: ilustraciones, graficas [spa]
dc.description.abstract: La seguridad informática o ciberseguridad se encarga de la protección de datos y servicios ante individuos no autorizados y protege las características de la información como la integridad, la confidencialidad y la disponibilidad. Existen múltiples amenazas y ataques que ponen en riesgo la seguridad informática como el ransomware, el malware o programas malignos, los ataques de denegación de servicios, las fallas de inyección, la ingeniería social, entre otros. En muchas ocasiones la parte más vulnerable de los sistemas son los usuarios, por este motivo los ciberdelincuentes usan la ingeniería social para adquirir información de forma ilícita de los usuarios. La ingeniería social consiste en la manipulación de los individuos mediante el engaño para que divulguen información privada o confidencial. Este tipo de ciberataque es muy difícil de detectar ya que puede ser ejecutado por cualquier individuo en cualquier momento y explota aspectos psicológicos de los humanos para engañarlos. En el presente trabajo se presenta la implementación de un modelo computacional basado en técnicas de Procesamiento de Lenguaje Natural para extraer características en textos y alimentar tres algoritmos de Aprendizaje de Máquina (redes neuronales, máquinas de vector de soporte y bosques aleatorios) para detectar posibles ataques de ingeniería social en textos. Los tres algoritmos fueron entrenados y evaluados, mostrando resultados que superan el 80% de exactitud en la detección de ataques de ingeniería social. (Texto tomado de la fuente) [spa]
dc.description.abstract: Computer security, or cybersecurity, is concerned with protecting data and services from unauthorized parties and with preserving properties of information such as integrity, confidentiality, and availability. Many threats and attacks put computer security at risk, including ransomware, malware, denial-of-service attacks, injection flaws, and social engineering. Users are often the most vulnerable part of a system, which is why cybercriminals use social engineering to obtain information from them illicitly. Social engineering consists of manipulating people through deception so that they disclose private or confidential information. This type of cyberattack is very difficult to detect because it can be carried out by anyone at any time and exploits psychological aspects of human behavior. This work presents the implementation of a computational model that uses Natural Language Processing techniques to extract features from texts and feed three Machine Learning algorithms (neural networks, support vector machines, and random forests) in order to detect possible social engineering attacks in texts. The three algorithms were trained and evaluated, and their results exceed 80% accuracy in detecting social engineering attacks. [eng]
dc.description.degreelevel: Maestría [spa]
dc.description.degreename: Magíster en Ingeniería - Ingeniería de Sistemas y Computación [spa]
dc.description.researcharea: Sistemas inteligentes [spa]
dc.format.extent: xiv, 79 páginas [spa]
dc.format.mimetype: application/pdf [spa]
dc.identifier.instname: Universidad Nacional de Colombia [spa]
dc.identifier.reponame: Repositorio Institucional Universidad Nacional de Colombia [spa]
dc.identifier.repourl: https://repositorio.unal.edu.co/ [spa]
dc.identifier.uri: https://repositorio.unal.edu.co/handle/unal/81747
dc.language.iso: spa [spa]
dc.publisher: Universidad Nacional de Colombia [spa]
dc.publisher.branch: Universidad Nacional de Colombia - Sede Bogotá [spa]
dc.publisher.department: Departamento de Ingeniería de Sistemas e Industrial [spa]
dc.publisher.faculty: Facultad de Ingeniería [spa]
dc.publisher.place: Bogotá, Colombia [spa]
dc.publisher.program: Bogotá - Ingeniería - Maestría en Ingeniería - Ingeniería de Sistemas y Computación [spa]
dc.relation.indexed: RedCol [spa]
dc.relation.indexed: LaReferencia [spa]
dc.rights.accessrights: info:eu-repo/semantics/openAccess [spa]
dc.rights.license: Reconocimiento 4.0 Internacional [spa]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [spa]
dc.subject.ddc: 000 - Ciencias de la computación, información y obras generales::005 - Programación, programas, datos de computación [spa]
dc.subject.other: social engineering [eng]
dc.subject.other: ingeniería social [spa]
dc.subject.proposal: Cybersecurity [eng]
dc.subject.proposal: Social Engineering [eng]
dc.subject.proposal: Natural Language Processing [eng]
dc.subject.proposal: Machine Learning [eng]
dc.subject.proposal: Ciberseguridad [spa]
dc.subject.proposal: Ingeniería Social [spa]
dc.subject.proposal: Procesamiento de Lenguaje Natural [spa]
dc.subject.proposal: Aprendizaje de máquina [spa]
dc.subject.unesco: Modelo de simulación [spa]
dc.subject.unesco: Simulation models [eng]
dc.title: Implementación de modelo computacional para la detección de ingeniería social basado en aprendizaje de máquina y procesamiento de lenguaje natural [spa]
dc.title.translated: Implementation of computational model for social engineering detection based on machine learning and natural language processing [eng]
dc.type: Trabajo de grado - Maestría [spa]
dc.type.coar: http://purl.org/coar/resource_type/c_bdcc [spa]
dc.type.coarversion: http://purl.org/coar/version/c_ab4af688f83e57aa [spa]
dc.type.content: Text [spa]
dc.type.driver: info:eu-repo/semantics/masterThesis [spa]
dc.type.redcol: http://purl.org/redcol/resource_type/TM [spa]
dc.type.version: info:eu-repo/semantics/acceptedVersion [spa]
dcterms.audience.professionaldevelopment: Estudiantes [spa]
dcterms.audience.professionaldevelopment: Investigadores [spa]
dcterms.audience.professionaldevelopment: Personal de apoyo escolar [spa]
dcterms.audience.professionaldevelopment: Público general [spa]
oaire.accessrights: http://purl.org/coar/access_right/c_abf2 [spa]
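
The abstract describes extracting features from text and feeding them to classifiers whose accuracy is then measured. The sketch below illustrates that general flow only; it is not the thesis implementation. It assumes a plain bag-of-words representation and uses a single-layer perceptron as a simple stand-in for the three algorithms named in the abstract (neural networks, support vector machines, random forests). The toy messages, labels, and all function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy messages (1 = possible social engineering, 0 = benign).
# Illustrative only; NOT taken from the thesis dataset.
TRAIN = [
    ("please send me your password urgently", 1),
    ("verify your account now or it will be suspended", 1),
    ("confirm your bank details to claim the prize", 1),
    ("the meeting is moved to friday afternoon", 0),
    ("thanks for the report, see you at lunch", 0),
    ("the new build passed all unit tests", 0),
]
TEST = [
    ("send your password to verify your account", 1),
    ("see you at the meeting on friday", 0),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(texts):
    """Map every distinct training token to a feature index."""
    words = sorted({word for text in texts for word in tokenize(text)})
    return {word: index for index, word in enumerate(words)}


def vectorize(text, vocab):
    """Bag-of-words feature vector; tokens outside the vocabulary are ignored."""
    vector = [0.0] * len(vocab)
    for word, count in Counter(tokenize(text)).items():
        if word in vocab:
            vector[vocab[word]] = float(count)
    return vector


def predict(weights, bias, features):
    """Return 1 when the weighted sum is positive, otherwise 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train_perceptron(X, y, epochs=20):
    """Classic perceptron rule: adjust weights only on misclassified samples."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            error = label - predict(weights, bias, features)
            if error:
                weights = [w + error * x for w, x in zip(weights, features)]
                bias += error
    return weights, bias


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


vocab = build_vocabulary(text for text, _ in TRAIN)
X_train = [vectorize(text, vocab) for text, _ in TRAIN]
y_train = [label for _, label in TRAIN]
weights, bias = train_perceptron(X_train, y_train)

predictions = [predict(weights, bias, vectorize(text, vocab)) for text, _ in TEST]
test_accuracy = accuracy([label for _, label in TEST], predictions)
print(f"accuracy on the toy test set: {test_accuracy:.2f}")
```

Accuracy here is the share of correctly labeled messages, the same metric the abstract reports as exceeding 80%. A result on a toy set like this says nothing about the thesis results.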

Files

Original bundle

Name: 1020798860.2022.pdf
Size: 1.64 MB
Format: Adobe Portable Document Format
Description: Master's thesis in Systems and Computing

License bundle

Name: license.txt
Size: 3.98 KB
Format: Item-specific license agreed upon to submission
Description: