dc.rights.license: Attribution-NonCommercial 4.0 International
dc.contributor.advisor: López Kleine, Liliana
dc.contributor.author: Bello Reyes, Nicolás
dc.date.accessioned: 2022-08-02T14:59:09Z
dc.date.available: 2022-08-02T14:59:09Z
dc.date.issued: 2022
dc.identifier.uri: https://repositorio.unal.edu.co/handle/unal/81767
dc.description: illustrations, graphs
dc.description.abstract: Las metodologías de secuenciación de ARN han acelerado en gran medida el entendimiento de los procesos biológicos a nivel molecular en diferentes organismos. Aun así, estas metodologías son costosas, lo que lleva a conjuntos de datos de alta dimensionalidad con tamaños de muestra reducidos. Actualmente DESeq2 es una de las metodologías más usadas para el análisis de expresión diferencial, y a pesar de tener una gran flexibilidad en términos de sus hiper-parámetros, en la mayoría de casos se usa con parámetros predeterminados. En este trabajo se analizan dos elementos importantes de esta metodología: se evalúa el desempeño cuando los conteos siguen una distribución Poisson en vez de Binomial negativa y se muestra cómo la sensibilidad del método aumenta con esta distribución. Adicionalmente se contrasta la corrección por pruebas múltiples de Benjamini y Hochberg con la propuesta de Boca y Leek, y se propone un gráfico para la identificación de la relación funcional con la covariable. (Texto tomado de la fuente)
dc.description.abstract: RNA sequencing methods have dramatically accelerated our understanding of biological processes at the molecular level in different organisms. However, these methods are costly, which leads to high-dimensional datasets with small sample sizes. DESeq2 is currently one of the most widely used methods for differential expression analysis, and despite its great flexibility in terms of hyper-parameters, in most cases it is used with default values. In this work we analyze two important elements of this methodology: we assess its performance when counts follow a Poisson distribution instead of a negative binomial, and we show how the sensitivity of the method increases with this distribution. Additionally, we contrast the multiple-testing correction of Benjamini and Hochberg with the one proposed by Boca and Leek, and we propose a plot for identifying the functional relationship with the informative covariate.
dc.format.extent: v, 39 pages
dc.format.mimetype: application/pdf
dc.language.iso: spa
dc.publisher: Universidad Nacional de Colombia
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/
dc.subject.ddc: 570 - Biology::576 - Genetics and evolution
dc.title: Análisis del desempeño de DESeq2 para detección de genes diferencialmente expresados para datos de secuenciación genómica
dc.type: Master's thesis
dc.type.driver: info:eu-repo/semantics/masterThesis
dc.type.version: info:eu-repo/semantics/acceptedVersion
dc.publisher.program: Bogotá - Ciencias - Maestría en Ciencias - Estadística
dc.description.degreelevel: Master's
dc.description.degreename: Magíster en Ciencias - Estadística
dc.identifier.instname: Universidad Nacional de Colombia
dc.identifier.reponame: Repositorio Institucional Universidad Nacional de Colombia
dc.identifier.repourl: https://repositorio.unal.edu.co/
dc.publisher.department: Departamento de Estadística
dc.publisher.faculty: Facultad de Ciencias
dc.publisher.place: Bogotá, Colombia
dc.publisher.branch: Universidad Nacional de Colombia - Sede Bogotá
dc.relation.indexed: RedCol
dc.relation.indexed: LaReferencia
dc.relation.references: Al Mahi, Naim ; Begum, Munni: A two-step integrated approach to detect differentially expressed genes in RNA-Seq data. In: Journal of Bioinformatics and Computational Biology 14 (2016), No. 06, p. 1650034
dc.relation.references: Anders, Simon ; Huber, Wolfgang: Differential expression analysis for sequence count data. In: Nature Precedings (2010), p. 1-1
dc.relation.references: Auer, Paul L. ; Doerge, Rebecca W.: A two-stage Poisson model for testing RNA-seq data. In: Statistical applications in genetics and molecular biology 10 (2011), No. 1
dc.relation.references: Benjamini, Yoav ; Hochberg, Yosef: Controlling the false discovery rate: a practical and powerful approach to multiple testing. In: Journal of the Royal statistical society: series B (Methodological) 57 (1995), No. 1, p. 289-300
dc.relation.references: Boca, Simina M. ; Leek, Jeffrey T.: A direct approach to estimating false discovery rates conditional on covariates. In: bioRxiv (2018)
dc.relation.references: Cheung, Vivian G. ; Nayak, Renuka R. ; Wang, Isabel X. ; Elwyn, Susannah ; Cousins, Sarah M. ; Morley, Michael ; Spielman, Richard S.: Polymorphic cis-and trans-regulation of human gene expression. In: PLoS Biol 8 (2010), No. 9, p. e1000480
dc.relation.references: Dillies, Marie-Agnès ; Rau, Andrea ; Aubert, Julie ; Hennequet-Antier, Christelle ; Jeanmougin, Marine ; Servant, Nicolas ; Keime, Céline ; Marot, Guillemette ; Castel, David ; Estelle, Jordi et al.: A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. In: Briefings in bioinformatics 14 (2013), No. 6, p. 671-683
dc.relation.references: Gu, Jinghua ; Wang, Xiao ; Halakivi-Clarke, Leena ; Clarke, Robert ; Xuan, Jianhua: BADGE: A novel Bayesian model for accurate abundance quantification and differential analysis of RNA-Seq data. In: BMC bioinformatics Vol. 15, Springer, 2014, p. 1-11
dc.relation.references: Ignatiadis, Nikolaos ; Klaus, Bernd ; Zaugg, Judith B. ; Huber, Wolfgang: Data-driven hypothesis weighting increases detection power in genome-scale multiple testing. In: Nature methods 13 (2016), No. 7, p. 577-580
dc.relation.references: Korthauer, Keegan ; Kimes, Patrick K. ; Duvallet, Claire ; Reyes, Alejandro ; Subramanian, Ayshwarya ; Teng, Mingxiang ; Shukla, Chinmay ; Alm, Eric J. ; Hicks, Stephanie C.: A practical guide to methods controlling false discoveries in computational biology. In: Genome biology 20 (2019), No. 1, p. 1-21
dc.relation.references: Korthauer, Keegan D. ; Chu, Li-Fang ; Newton, Michael A. ; Li, Yuan ; Thomson, James ; Stewart, Ron ; Kendziorski, Christina: A statistical approach for identifying differential distributions in single-cell RNA-seq experiments. In: Genome biology 17 (2016), No. 1, p. 1-15
dc.relation.references: Lonsdale, John ; Thomas, Jeffrey ; Salvatore, Mike ; Phillips, Rebecca ; Lo, Edmund ; Shad, Saboor ; Hasz, Richard ; Walters, Gary ; Garcia, Fernando ; Young, Nancy et al.: The genotype-tissue expression (GTEx) project. In: Nature genetics 45 (2013), No. 6, p. 580-585
dc.relation.references: Love, Michael ; Huber, W ; Anders, S: Assessment of DESeq2 performance through simulation. In: DESeq2 vignette (2014)
dc.relation.references: Love, Michael I. ; Huber, Wolfgang ; Anders, Simon: Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. In: Genome biology 15 (2014), No. 12, p. 550
dc.relation.references: Oshlack, Alicia ; Robinson, Mark D. ; Young, Matthew D.: From RNA-seq reads to differential expression results. In: Genome biology 11 (2010), No. 12, p. 220
dc.relation.references: Pickrell, Joseph K. ; Marioni, John C. ; Pai, Athma A. ; Degner, Jacob F. ; Engelhardt, Barbara E. ; Nkadori, Everlyne ; Veyrieras, Jean-Baptiste ; Stephens, Matthew ; Gilad, Yoav ; Pritchard, Jonathan K.: Understanding mechanisms underlying human gene expression variation with RNA sequencing. In: Nature 464 (2010), No. 7289, p. 768-772
dc.relation.references: Reyes, Alejandro: Count RNA-seq data used for benchmarking FDR control methods. October 2018
dc.relation.references: Reyes, Alejandro ; Huber, Wolfgang: Alternative start and termination sites of transcription drive most transcript isoform differences across human tissues. In: Nucleic acids research 46 (2018), No. 2, p. 582-592
dc.relation.references: Ritchie, Matthew E. ; Phipson, Belinda ; Wu, Di ; Hu, Yifang ; Law, Charity W. ; Shi, Wei ; Smyth, Gordon K.: limma powers differential expression analyses for RNA-sequencing and microarray studies. In: Nucleic acids research 43 (2015), No. 7, p. e47-e47
dc.relation.references: Robinson, Mark D. ; McCarthy, Davis J. ; Smyth, Gordon K.: edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. In: Bioinformatics 26 (2010), No. 1, p. 139-140
dc.relation.references: Schuster, Stephan C.: Next-generation sequencing transforms today's biology. In: Nature methods 5 (2008), No. 1, p. 16-18
dc.relation.references: Scott, James G. ; Kelly, Ryan C. ; Smith, Matthew A. ; Zhou, Pengcheng ; Kass, Robert E.: False discovery rate regression: an application to neural synchrony detection in primary visual cortex. In: Journal of the American Statistical Association 110 (2015), No. 510, p. 459-471
dc.relation.references: Soneson, Charlotte: compcodeR - an R package for benchmarking differential expression methods for RNA-seq data. In: Bioinformatics 30 (2014), No. 17, p. 2517-2518
dc.relation.references: Soneson, Charlotte ; Delorenzi, Mauro: A comparison of methods for differential expression analysis of RNA-seq data. In: BMC bioinformatics 14 (2013), No. 1, p. 1-18
dc.relation.references: Sun, Shiquan ; Hood, Michelle ; Scott, Laura ; Peng, Qinke ; Mukherjee, Sayan ; Tung, Jenny ; Zhou, Xiang: Differential expression analysis for RNAseq using Poisson mixed models. In: Nucleic acids research 45 (2017), No. 11, p. e106-e106
dc.relation.references: Wang, Tianyu ; Li, Boyang ; Nelson, Craig E. ; Nabavi, Sheida: Comparative analysis of differential gene expression analysis tools for single-cell RNA sequencing data. In: BMC bioinformatics 20 (2019), No. 1, p. 1-16
dc.relation.references: Wang, Zhong ; Gerstein, Mark ; Snyder, Michael: RNA-Seq: a revolutionary tool for transcriptomics. In: Nature reviews genetics 10 (2009), No. 1, p. 57-63
dc.rights.accessrights: info:eu-repo/semantics/openAccess
dc.subject.lemb: ARN MENSAJERO
dc.subject.lemb: RNA, messenger
dc.subject.lemb: Databases, nucleic acid
dc.subject.lemb: BASES DE DATOS DE ACIDO NUCLEICO
dc.subject.proposal: RNA-Seq
dc.subject.proposal: Expresión diferencial
dc.subject.proposal: Modelos lineales generalizados
dc.subject.proposal: Pruebas múltiples
dc.subject.proposal: Differential expression
dc.subject.proposal: Generalized Linear Models
dc.subject.proposal: Multiple testing
dc.title.translated: Analysis of the performance of DESeq2 for the detection of differentially expressed genes for genome sequencing data
dc.type.coar: http://purl.org/coar/resource_type/c_bdcc
dc.type.coarversion: http://purl.org/coar/version/c_ab4af688f83e57aa
dc.type.content: Text
dc.type.redcol: http://purl.org/redcol/resource_type/TM
oaire.accessrights: http://purl.org/coar/access_right/c_abf2
dcterms.audience.professionaldevelopment: Students
dcterms.audience.professionaldevelopment: Researchers
dcterms.audience.professionaldevelopment: Teachers


Attribution-NonCommercial 4.0 International. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 license. This document has been deposited by the author(s) under the following certificate of deposit