Métodos para identificar asociaciones entre genotipos y múltiples fenotipos

dc.contributor.advisor: López Kleine, Liliana
dc.contributor.author: Acero Baena, Juan Pablo
dc.date.accessioned: 2023-08-08T14:29:39Z
dc.date.available: 2023-08-08T14:29:39Z
dc.date.issued: 2023
dc.description: ilustraciones, diagramas [spa]
dc.description.abstract: La secuenciación de genomas ha permitido aumentar el conocimiento en varios aspectos de la biología de los organismos. Una de las principales ramas que ha surgido es el estudio de asociación del genoma completo (Genome Wide Association Studies, GWAS), el cual ha permitido, por medio de la asociación entre genotipos y fenotipos, identificar aspectos genotípicos relacionados con enfermedades complejas tales como el Alzheimer, la diabetes y el cáncer, entre otras. Originalmente, la mayor parte de estos estudios se ha realizado para un solo fenotipo. Por esta razón, tomando como base la metodología presentada por Guo y Wu (2018), se evaluaron las asociaciones entre genotipos y fenotipos múltiples aplicando los métodos Principal Component Based Association Test (denotado como ET), Omnibus Test (OT) y Adaptive Test (AT) sobre tres bases de datos reales y un set de datos simulados binarios correlacionados. Así mismo, se evaluaron los desempeños de las metodologías comparándolas entre sí, teniendo en cuenta su capacidad para rechazar la mayor cantidad de hipótesis en pruebas múltiples y la potencia en los datos simulados. La comparación y caracterización de los métodos permitió establecer un flujo de trabajo óptimo e identificar los puntos positivos y negativos de cada una de las metodologías probadas. Igualmente, en la aplicación a bases de datos reales y simuladas se identificaron los aspectos a considerar para tener un método más sensible y específico. Se evaluó la mejora propuesta, que consistió en la inclusión de la frecuencia y proporción de los alelos raros de cada SNP en el método AT. Estos resultados permitieron observar una mejora en la potencia del método AT, demostrando que la inclusión de dicha frecuencia es un insumo importante para detectar mejor la asociación entre un fenotipo y un genotipo. (Texto tomado de la fuente) [spa]
dc.description.abstract: Genome sequencing has increased knowledge of many aspects of the biology of organisms. One of the main branches to emerge is the genome-wide association study (GWAS), which, by associating genotypes with phenotypes, has made it possible to identify genotypic features related to complex diseases such as Alzheimer's disease, diabetes and cancer, among others. Most of these studies have been carried out for a single phenotype. For this reason, taking as a basis the methodology presented by Guo and Wu (2018), associations between genotypes and multiple phenotypes were evaluated by applying the Principal Component Based Association Test (denoted ET), the Omnibus Test (OT) and the Adaptive Test (AT) to three real datasets and one simulated dataset of correlated binary data. The performance of the methods was also evaluated by comparing them with each other, taking into account their ability to reject the largest number of hypotheses under multiple testing and their power on the simulated data. Comparing and characterizing the methods made it possible to establish an optimal workflow and to identify the strengths and weaknesses of each of the tested methods. Likewise, the application to real and simulated datasets identified the aspects to consider in order to obtain a more sensitive and specific method. The proposed improvement, which consisted of including the frequency and proportion of the rare alleles of each SNP in the AT method, was evaluated. The results showed an improvement in the power of the AT method, indicating that including this frequency is an important input for better detecting an association between a phenotype and a genotype. [eng]
dc.description.degreelevel: Maestría [spa]
dc.description.degreename: Magíster en Ciencias - Estadística [spa]
dc.description.researcharea: Estadística Genómica [spa]
dc.format.extent: 66 páginas [spa]
dc.format.mimetype: application/pdf [spa]
dc.identifier.instname: Universidad Nacional de Colombia [spa]
dc.identifier.reponame: Repositorio Institucional Universidad Nacional de Colombia [spa]
dc.identifier.repourl: https://repositorio.unal.edu.co/ [spa]
dc.identifier.uri: https://repositorio.unal.edu.co/handle/unal/84474
dc.language.iso: spa [spa]
dc.publisher: Universidad Nacional de Colombia [spa]
dc.publisher.branch: Universidad Nacional de Colombia - Sede Bogotá [spa]
dc.publisher.faculty: Facultad de Ciencias [spa]
dc.publisher.place: Bogotá, Colombia [spa]
dc.publisher.program: Bogotá - Ciencias - Maestría en Ciencias - Estadística [spa]
dc.relation.references: Agresti, A. (2015). Foundations of linear and generalized linear models. John Wiley & Sons Inc. [spa]
dc.relation.references: Andrews, S. J., Fulton-Howard, B., & Goate, A. (2020). Interpretation of risk loci from genome-wide association studies of Alzheimer’s disease. The Lancet Neurology, 19, 326-335. https://doi.org/10.1016/s1474-4422(19)30435-1 [spa]
dc.relation.references: Benafif, S., Kote-Jarai, Z., & Eeles, R. A. (2018). A Review of Prostate Cancer Genome-Wide Association Studies (GWAS). Cancer Epidemiology, Biomarkers & Prevention, 27, 845-857. https://doi.org/10.1158/1055-9965.epi-16-1046 [spa]
dc.relation.references: Boca, S. M., & Leek, J. T. (2015). A direct approach to estimating false discovery rates conditional on covariates. bioRxiv. https://doi.org/10.1101/035675 [spa]
dc.relation.references: Park, C. G., Park, T., & Shin, D. W. (1996). A Simple Method for Generating Correlated Binary Variates. The American Statistician, 50, 306-310. [spa]
dc.relation.references: Cortés Muñoz, F. (2019). Methodology for estimating association between categorical variables with application to Genome-wide association studies (GWAS) (Tesis doctoral). [spa]
dc.relation.references: Dudoit, S., Gilbert, H. N., & van der Laan, M. J. (2008). Resampling-Based Empirical Bayes Multiple Testing Procedures for Controlling Generalized Tail Probability and Expected Value Error Rates: Focus on the False Discovery Rate and Simulation Study. Biometrical Journal, 50, 716-744. https://doi.org/10.1002/bimj.200710473 [spa]
dc.relation.references: Ehret, G. B. (2010). Genome-Wide Association Studies: Contribution of Genomics to Understanding Blood Pressure and Essential Hypertension. Current Hypertension Reports, 12, 17-25. https://doi.org/10.1007/s11906-009-0086-6 [spa]
dc.relation.references: Emrich, L. J., & Piedmonte, M. R. (1991). A Method for Generating High-Dimensional Multivariate Binary Variates. The American Statistician, 45, 302. https://doi.org/10.2307/2684460 [spa]
dc.relation.references: Fernández-Santiago, R., & Sharma, M. (2022). What have we learned from genome-wide association studies (GWAS) in Parkinson disease? Ageing Research Reviews, 101648. https://doi.org/10.1016/j.arr.2022.101648 [spa]
dc.relation.references: Fang-Mercado, L. C., Urrego-Álvarez, J. R., Merlano-Barón, A. E., Meza-Torres, C., Hernández-Bonfante, L., López-Kleine, L., & Marrugo-Cano, J. (2017). Influencia del estilo de vida, la dieta y la vitamina D en la atopia en niños colombianos afrodescendientes. Revista Alergia México, 64, 277-290. https://doi.org/10.29262/ram.v64i3.275 [spa]
dc.relation.references: Frayling, T. M. (2007). Genome-wide association studies provide new insights into type 2 diabetes aetiology. Nature Reviews Genetics, 8, 657-662. https://doi.org/10.1038/nrg2178 [spa]
dc.relation.references: Gibbons, J. D., & Chakraborti, S. (2021). Nonparametric statistical inference. CRC Press. [spa]
dc.relation.references: Guide, G. G. U. (s.f.). Manhattan Plot. www.jmp.com. Consultado el 19 de junio de 2023, desde https://www.jmp.com/support/downloads/JMPG101_documentation/Content/JMPGUserGuide/GR_G_0022.htm [spa]
dc.relation.references: Guo, B., & Wu, B. (2018). Integrate multiple traits to detect novel trait–gene association using GWAS summary data with an adaptive test approach (R. Schwartz, Ed.). Bioinformatics, 35, 2251-2257. https://doi.org/10.1093/bioinformatics/bty961 [spa]
dc.relation.references: Johnson, R. A., & Wichern, D. W. (2019). Applied multivariate statistical analysis. Pearson. [spa]
dc.relation.references: Liu, Z., & Lin, X. (2017). Multiple phenotype association tests using summary statistics in genome-wide association studies. Biometrics, 74, 165-175. https://doi.org/10.1111/biom.12735 [spa]
dc.relation.references: Loos, R. J. F. (2020). 15 years of genome-wide association studies and no signs of slowing down. Nature Communications, 11. https://doi.org/10.1038/s41467-020-19653-5 [spa]
dc.relation.references: Otto, L.-G., Mondal, P., Brassac, J., Preiss, S., Degenhardt, J., He, S., Reif, J. C., & Sharbel, T. F. (2017). Use of genotyping-by-sequencing to determine the genetic structure in the medicinal plant chamomile, and to identify flowering time and alpha-bisabolol associated SNP-loci by genome-wide association mapping. BMC Genomics, 18. https://doi.org/10.1186/s12864-017-3991-0 [spa]
dc.relation.references: MedlinePlus. (2022). ¿Cuáles son los riesgos y las limitaciones de las pruebas genéticas?: MedlinePlus Genetics. medlineplus.gov. https://medlineplus.gov/spanish/genetica/entender/pruebas/riesgoslimitaciones/ [spa]
dc.relation.references: Parra-Galindo, M.-A., Piñeros-Niño, C., Soto-Sedano, J. C., & Mosquera-Vasquez, T. (2019). Chromosomes I and X Harbor Consistent Genetic Factors Associated with the Anthocyanin Variation in Potato. Agronomy, 9, 366. https://doi.org/10.3390/agronomy9070366 [spa]
dc.relation.references: Ravishanker, N., & Dey, D. K. (2020). A First Course in Linear Model Theory. CRC Press. [spa]
dc.relation.references: Rowan, B. A., Seymour, D. K., Chae, E., Lundberg, D. S., & Weigel, D. (2016). Methods for Genotyping-by-Sequencing. Methods in Molecular Biology, 221-242. https://doi.org/10.1007/978-1-4939-6442-0_16 [spa]
dc.relation.references: Sevilla, S. D. (2023). Metodología de los estudios de asociación genética. Insuficiencia cardíaca, 2, 111-114. Consultado el 6 de junio de 2023, desde http://www.scielo.org.ar/scielo.php?script=sci_arttext&pid=S1852-38622007000300006 [spa]
dc.relation.references: Shaffer, J., Feingold, E., & Marazita, M. (2012). Genome-wide Association Studies. Journal of Dental Research, 91, 637-641. https://doi.org/10.1177/0022034512446968 [spa]
dc.relation.references: Shim, H., Chasman, D. I., Smith, J. D., Mora, S., Ridker, P. M., Nickerson, D. A., Krauss, R. M., & Stephens, M. (2015). A Multivariate Genome-Wide Association Analysis of 10 LDL Subfractions, and Their Response to Statin Treatment, in 1868 Caucasians (P. Aspichueta, Ed.). PLOS ONE, 10, e0120758. https://doi.org/10.1371/journal.pone.0120758 [spa]
dc.relation.references: Stephens, M. (2013). A Unified Framework for Association Analysis with Multiple Related Phenotypes (F. Emmert-Streib, Ed.). PLoS ONE, 8, e65245. https://doi.org/10.1371/journal.pone.0065245 [spa]
dc.relation.references: VanRaden, P. (2008). Efficient Methods to Compute Genomic Predictions. Journal of Dairy Science, 91, 4414-4423. https://doi.org/10.3168/jds.2007-0980 [spa]
dc.relation.references: Wang, K., Zhang, H.-T., Kugathasan, S., Annese, V., Bradfield, J. P., Russell, R. K., Imielinski, M., Glessner, J. T., Hou, C., Wilson, D., Walters, T. D., Kim, C. E., Frackelton, E. C., Lionetti, P., Barabino, A., Limbergen, J. V., Guthery, S. L., Denson, L. A., . . . Hakonarson, H. (2009). Diverse Genomewide Association Studies Associate the IL12/IL23 Pathway with Crohn Disease. The American Journal of Human Genetics, 84, 399-405. https://doi.org/10.1016/j.ajhg.2009.01.026 [spa]
dc.relation.references: Weighill, D., Jones, P., Bleker, C., Ranjan, P., Shah, M., Zhao, N., Martin, M., DiFazio, S., Macaya-Sanz, D., Schmutz, J., Sreedasyam, A., Tschaplinski, T., Tuskan, G., & Jacobson, D. (2019). Multi-Phenotype Association Decomposition: Unraveling Complex Gene-Phenotype Relationships. Frontiers in Genetics, 10. https://doi.org/10.3389/fgene.2019.00417 [spa]
dc.relation.references: Zhu, X., & Stephens, M. (2017). Bayesian large-scale multiple regression with summary statistics from genome-wide association studies. The Annals of Applied Statistics, 11, 1561-1592. https://doi.org/10.1214/17-aoas1046 [spa]
dc.relation.references: Zhu, X., Feng, T., Tayo, B. O., Liang, J., Young, J. H., Franceschini, N., Smith, J. A., Yanek, L. R., Sun, Y. V., Edwards, T. L., Chen, W., Nalls, M., Fox, E., Sale, M., Bottinger, E., Rotimi, C., Liu, Y., McKnight, B., Liu, K., . . . Redline, S. (2015). Meta-analysis of Correlated Traits via Summary Statistics from GWASs with an Application in Hypertension. The American Journal of Human Genetics, 96, 21-36. https://doi.org/10.1016/j.ajhg.2014.11.011 [spa]
dc.rights.accessrights: info:eu-repo/semantics/openAccess [spa]
dc.rights.license: Reconocimiento 4.0 Internacional [spa]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [spa]
dc.subject.ddc: 620 - Ingeniería y operaciones afines::621 - Física aplicada [spa]
dc.subject.lemb: GENOTIPOS [spa]
dc.subject.lemb: Genotypes [eng]
dc.subject.lemb: FENOTIPOS [spa]
dc.subject.lemb: Phenotype [eng]
dc.subject.proposal: Modelos Lineales [spa]
dc.subject.proposal: Pruebas Múltiples [spa]
dc.subject.proposal: Multiple Testing [eng]
dc.subject.proposal: Bioestadística [spa]
dc.subject.proposal: Biostatistics [eng]
dc.subject.proposal: Linear Models [eng]
dc.title: Métodos para identificar asociaciones entre genotipos y múltiples fenotipos [spa]
dc.title.translated: Methods to identify associations between genotypes and multiple phenotypes [eng]
dc.type: Trabajo de grado - Maestría [spa]
dc.type.coar: http://purl.org/coar/resource_type/c_bdcc [spa]
dc.type.coarversion: http://purl.org/coar/version/c_ab4af688f83e57aa [spa]
dc.type.content: Text [spa]
dc.type.driver: info:eu-repo/semantics/masterThesis [spa]
dc.type.redcol: http://purl.org/redcol/resource_type/TM [spa]
dc.type.version: info:eu-repo/semantics/acceptedVersion [spa]
dcterms.audience.professionaldevelopment: Estudiantes [spa]
oaire.accessrights: http://purl.org/coar/access_right/c_abf2 [spa]

Files

Original bundle

Name: TRABAJO_DE_GRADO_VERSION_FINAL_ENTREGA.pdf
Size: 953.2 KB
Format: Adobe Portable Document Format
Description: Final Master's thesis, Maestría en Ciencias - Estadística, Juan Pablo Acero
