Semiparametric smoothing spline to joint mean and variance models with responses from the biparametric exponential family: a bayesian perspective

dc.contributor.advisorCepeda Cuervo, Edilbertospa
dc.contributor.authorZárate Solano, Héctor Manuelspa
dc.contributor.researchgroupInferencia Bayesianaspa
dc.date.accessioned2022-02-05T00:31:26Z
dc.date.available2022-02-05T00:31:26Z
dc.date.issued2022-01
dc.descriptionilustraciones, gráficas, tablasspa
dc.description.abstractStatistical applications must address increasing complexity due to new data arising from recent technologies, new phenomena, and diverse sources of uncertainty. The demand for flexible methods that handle non-standard data structures, high-dimensional real-time estimation, and latent-model frameworks has given semiparametric modeling a crucial role in contemporary statistical analysis. We provide flexible Bayesian methods to jointly infer the mean, variance, and skewness functions when the response variable comes either from a two-parameter exponential family or from asymmetric distributions. To this end, we implement Bayesian algorithms based on MCMC sampling techniques and on deterministic variational Bayesian learning. In these settings, each sub-model depends on some covariates parametrically and on others nonparametrically. Understanding how the moments change with the predictors is a central goal of statistics, and it is of intrinsic interest given their role in approximating other quantities. We propose several modeling scenarios that benefit from fusing the graphical-models approach with Bayesian semiparametric regression under the architecture of generalized linear models (GLMs). The significance of our strategy lies in its potential to contribute to a unified computational methodology that provides insight into many complex models that could otherwise be analytically intractable. Combining data models and algorithms therefore helps solve real-world problems with faster computation, which makes it possible not only to explore many candidate models quickly but also to estimate them accurately.eng
dc.description.abstractLas aplicaciones estadísticas deben abordar una complejidad cada vez mayor debido a los nuevos datos que surgen con las tecnologías recientes, los nuevos fenómenos y las diversas fuentes de incertidumbre. La demanda por métodos con estructuras de datos no estándar, estimación en tiempo real de alta dimensión y modelos latentes adecuados ha causado que los modelos semiparamétricos desempeñen un papel crucial en el análisis estadístico reciente. En esta tesis se implementan métodos Bayesianos flexibles para inferir conjuntamente las funciones de media, varianza y asimetría cuando la variable de respuesta proviene de la familia exponencial biparamétrica o de distribuciones asimétricas. La aproximación es obtenida con métodos basados en técnicas de simulación de Monte Carlo con cadenas de markov y en algoritmos de aprendizaje variacional determinístico. En estos escenarios, cada submodelo incluye variables en forma paramétrica y no paramétrica para analizar el efecto de los predictores sobre los momentos. Los escenarios de modelamiento se benefician de la fusión entre los modelos gráficos y la regresión semiparamétrica Bayesiana utilizando la arquitectura de modelos lineales generalizados. La importancia e implicaciones de nuestra estrategia radican en su potencial para contribuir con una metodología computacional unificada que proporciona información sobre una gran variedad de modelos complejos que, de otro modo, podrían resultar analíticamente intratables. Por lo tanto, la combinación de modelos de datos y algoritmos contribuye a resolver problemas del mundo real y disfruta de ventajas cruciales relacionadas con el bajo tiempo de cómputo, lo cual permite no solo explorar rápidamente muchos modelos para los datos, sino también estimarlos con precisión. (Texto tomado de la fuente).spa
dc.description.degreelevelDoctoradospa
dc.description.degreenameDoctor en Ciencias - Estadísticaspa
dc.description.notesIncluye anexosspa
dc.format.extentxvii, 133 páginasspa
dc.format.mimetypeapplication/pdfspa
dc.identifier.instnameUniversidad Nacional de Colombiaspa
dc.identifier.reponameRepositorio Institucional Universidad Nacional de Colombiaspa
dc.identifier.repourlhttps://repositorio.unal.edu.co/spa
dc.identifier.urihttps://repositorio.unal.edu.co/handle/unal/80887
dc.language.isoengspa
dc.publisherUniversidad Nacional de Colombiaspa
dc.publisher.branchUniversidad Nacional de Colombia - Sede Bogotáspa
dc.publisher.departmentDepartamento de Estadísticaspa
dc.publisher.facultyFacultad de Cienciasspa
dc.publisher.placeBogotá, Colombiaspa
dc.publisher.programBogotá - Ciencias - Doctorado en Ciencias - Estadísticaspa
dc.relation.referencesAnderson, D. F. and Livingston, P. S. The zero-divisor graph of a commutative ring. Journal of Algebra, 217(2):434–447, 1999.eng
dc.relation.referencesBerry, S., Carroll, R., and Ruppert, D. Bayesian smoothing and regression splines for measurement error problems. Journal of the American Statistical Association, 97(457):160 – 169, 2002.eng
dc.relation.referencesBishop, C. M. Pattern recognition and Machine Learning, volume 1 of Graduate Texts in Mathematics. New York : Springer, 2006.eng
dc.relation.referencesBlei, D. M., Kucukelbir, A., and McAuliffe, J. D. Variational inference: a review for statisticians. Journal of the American Statistical Association, 112:859–877, 2017.eng
dc.relation.referencesBrooks, S., Gelman, A., Jones, G., and Meng, X. Handbook of Markov Chain Monte Carlo. Handbooks of Modern and Statistical Methods. Chapman Hall/CR, 2011.eng
dc.relation.referencesCepeda, E. and Gamerman, D. Bayesian modeling of variance heterogeneity in normal regression models. J. Prob. Stat, 14:207–221, 2001.eng
dc.relation.referencesCrainiceanu, C., Ruppert, D., and Wand, M. Bayesian analysis for penalized spline regression using winbugs. Journal of Statistical Software, 14:1–24, 2005.eng
dc.relation.referencesCrevar, M. Shared file systems: Determining the best choice for your distributed SAS® foundation applications. SAS Institute Inc., Cary, NC., pages 1–11, 2017.eng
dc.relation.referencesDey, D.K., Gelfand, A.E., and Peng, F. Overdispersed generalized linear models. Journal of Statistical Planning and Inference, 64(64):93–108, 1997.eng
dc.relation.referencesEvans, M. J. and Rosenthal, J. S. Probability and Statistics: The science of Uncertainty. The American Statistician. New York: W.H Freeman and Company, 2004.eng
dc.relation.referencesGilks, W.R., Richardson, S., and Spiegelhalter, D.J. Markov Chain Monte Carlo in practice. Interdisciplinary Statistics. Chapman Hall/CR, 1998.eng
dc.relation.referencesGreen, P.J. and Silverman, B.W. Nonparametric Regression and Generalized Linear Models: A roughness penalty approach. Chapman and Hall, London, 1994.eng
dc.relation.referencesGroll, A., Hambuckers, J., Kneib, T., and Umlauf, N. Lasso-type penalization in the framework of generalized additive models for location, scale and shape. Computational Statistics & Data Analysis, 140:59–73, 2019.eng
dc.relation.referencesHastings, W. Monte carlo sampling methods using markov chains and their applications. Biometrika, (57):97–109, 1970.eng
dc.relation.referencesJang, E. A beginner’s guide to variational methods: Mean-field approximation. http://blog.evjang.com/2016/08/variational-bayes.html, (3):233–240, 2016.eng
dc.relation.referencesKullback, S. Information Theory and Statistics. Gloucester , MASS, 1978.eng
dc.relation.referencesLandau, D. and Binder, K. A Guide to Monte Carlo Simulations in Statistical Physics. Cambridge University Press, 2005.eng
dc.relation.referencesNakajima, S., Watanabe, K., and Sugiyama, M. Variational Bayesian Learning Theory, volume 1 of Graduate Texts in Mathematics. Cambridge University Press, 2019.eng
dc.relation.referencesNosedal-Sanchez, A., Storlie, C., Thomas, C., and Christensen, R. Reproducing kernel Hilbert spaces for penalized regression: A tutorial. The American Statistician, (66):50–60, 2012.eng
dc.relation.referencesOrmerod, J. T. and Wand, M. P. Explaining variational approximations. The American Statistician, 64:140–153, 2010.eng
dc.relation.referencesPierce, N. and Wand, D. Penalized splines and reproducible kernel methods. American Statistical Association, (3):233–240, 2006.eng
dc.relation.referencesRuppert, D., Wand, M., and Carroll, R. Semiparametric regression during 2003-2007. Electronic Journal of Statistics, 3:1193–1256, 2009.eng
dc.relation.referencesStasinopoulos, D. and Rigby, R. Generalized additive models for location scale and shape (gamlss) in r. Journal of Statistical Software, 23:1–46, 2007.eng
dc.relation.referencesTurkman, M., Paulino, C., and Muller, P. Computational Bayesian Statistics . An Introduction. Cambridge, 2019.eng
dc.relation.referencesUmlauf, N., Nadja, K., and Achim, Z. Bamlss: Bayesian additive models for location, scale, and shape (and beyond). Journal of Computational and Graphical Statistics, 27:612–627, 2018.eng
dc.relation.referencesWahba, G. Spline Models for Observational Data. Society for Industrial and Applied Mathematics, 1990.eng
dc.relation.referencesWahba, G. and Wendelberger, J. Some new mathematical methods for variational objective analysis using splines and cross validation. Monthly Weather Review, 108:1122–1143, 1980.eng
dc.relation.referencesCepeda, E. Variability modeling in Generalized Linear models. PhD thesis, Unpublished Ph.D thesis, Mathematics Institute Universidade Federal do Rio de Janeiro, 2001.eng
dc.relation.referencesCepeda, E. and Gamerman, D. Bayesian methodology for modeling parameters in the two parametric exponential family. Estadística, 57:93–105, 2005.eng
dc.relation.referencesCepeda, E., Achcar, J., and Garrido Lopera, L. Bivariate beta regression models: joint modeling of the mean, dispersion and association parameters. Journal of Applied Statistics, 41(3):677–687, March 2014.eng
dc.relation.referencesCrainiceanu, C. Spatially adaptative bayesian penalized splines with heteroscedastic errors. Journal of Computational and Graphical Statistics, (2):265–288, 2007.eng
dc.relation.referencesCurrie, I. and Burban, M. Flexible smoothing with p-splines : a unified approach. Statistical Modelling, (4):333–349, 2002.eng
dc.relation.referencesDavidian, M., Lin, X., Morris, J., and Stefanski, O. The Work of Raymond J. Carroll. The impact and influence of a Statistician. 2014.eng
dc.relation.referencesDenison, D., Mallick, B., and Smith, F. A bayesian cart algorithm. Biometrika, (2):363 – 367, 1998.eng
dc.relation.referencesEilers, P., Marx, B., and Durbán, M. Twenty years of p-splines. SORT (Statistics and Operations Research Transactions), (39), 2014.eng
dc.relation.referencesGamerman, D. Sampling from the posterior distribution in generalized linear mixed models. Instituto de matemática, Universidade Federal do Rio de Janeiro, pages 59 – 68, 1997.eng
dc.relation.referencesGijbels, I. and Prosdocimi, I. Flexible mean and dispersion function estimation in extended generalized additive models. Communications in statistics - Theory and Methods, (41):3259 – 3277, 2012.eng
dc.relation.referencesGu,C. Smoothing Spline ANOVA Models. Springer, West Lafayette,USA, 2002.eng
dc.relation.referencesLittell, R. and Schabenberger, O. SAS for Mixed Models. Number 2. 2006.eng
dc.relation.referencesLoomis, C. MCMC in SAS: From scratch or by proc. Western users of SAS software 2016, 1(1):1 – 19, 2016.eng
dc.relation.referencesMa, Y. and Carroll R.J. Locally efficient estimators for semiparametric models with measurement error. Journal of the American Statistical Association, (101):1465–1474, 2006.eng
dc.relation.referencesMencitas, M. and Wand, M. Variational inference for heteroscedastic semiparametric regression. School of mathematical sciences, University of Technology. Sydney, Australia, 2014.eng
dc.relation.referencesNott, D. Semiparametric estimation of mean and variance functions for non-gaussian data. Computational Statistics, (3-4):603–620, 2006.eng
dc.relation.referencesPinheiro, J. and Bates, D. Mixed-Effects Models in S and S-Plus. Springer Verlag, 2009.eng
dc.relation.referencesRobert, C.P., Elvira,V., Tawn, N., and Wu, C. Accelerating mcmc algorithms. Journal of the American Statistical Association, 2018.eng
dc.relation.referencesRuppert, D., Wand, M.P., Holst,U., and Hössjer, O. Local polynomial variance-function estimation. Technometrics, (39):262–273, 1997.eng
dc.relation.referencesSpiegelhalter, D. and Best, N. Bayesian approaches to multiple sources of evidence in complex cost-effectiveness modelling. Statistics in Medicine, (23):3687 – 3709, 2003.eng
dc.relation.referencesTran, M., Nguyen, N., Nott, D., and Kohn, R. Bayesian deep net glm and glmm. SORT ( arXiv:1805.10157v1 [stat.CO]), 2018.eng
dc.relation.referencesXu,D. and Zhang,Z. A semiparametric bayesian approach to joint mean and variance models. Statistics & Probability Letters, 83(7):1624 – 1631, 2013.eng
dc.relation.referencesAzzalini, A. A class of distribution which includes the normal ones. Scandinavian Journal of Statistics, 12:171–178, 1985.eng
dc.relation.referencesAzzalini, A. Further results on a class of distributions which includes the normal ones. Statistica, 46:199–208, 1986.eng
dc.relation.referencesCharenza, W. and Diaz, C. Choosing the right skew normal distribution: the macroeconomist dilemma. Journal of Forecasting, 19:235–254, 2015.eng
dc.relation.referencesDursun, A. and Ersin, Y. Modified estimators in semiparametric regression models with right-censored data. Journal of Statistical Computation and Simulation, 88:1470–1498, 2018.eng
dc.relation.referencesFerreira, J. and Steel, M. A new class of skewed multivariate distributions with applications to regression analysis. Statistica Sinica, 17:505–529, 2007.eng
dc.relation.referencesFranceschini, C. and Loperfido, N. Testing for normality when the sampled distribution is extended skew-normal. Mathematical and statistical methods for actuarial sciences and finance Springer, 2014.eng
dc.relation.referencesGenton, M. Discussion of the ’skew-normal’. Scandinavian Journal of Statistics, 32:189–198, 2005.eng
dc.relation.referencesGómez, H., Salinas, H., and Bolfarine, H. Generalized skew-normal models: properties and inference. Statistics, 40:495–505, 2006.eng
dc.relation.referencesHuiqiong, L., Liucang, W., and Ting, M. Variable selection in joint location, scale and skewness models of the skew-normal distribution. J Syst Sci Complex, 30:694–709, 2017.eng
dc.relation.referencesKuligowsk, A. and Mendes, L. Working with sparse matrices in sas®. Proceedings of the SAS® Global Forum 2019 Conference, pages 1–9, 2019.eng
dc.relation.referencesMa, Y. and Genton, M. Flexible class of skew-symmetric distributions. Scandinavian Journal of Statistics, 31:459–468, 2004.eng
dc.relation.referencesMa, Y., Genton, M., and Tsiatis, A. Locally efficient semiparametric estimators for generalized skew-elliptical distributions. Journal of American Statistical Association, 100:980–989, 2005.eng
dc.relation.referencesPerez, P., Acosta, R., and Perez, S. A bayesian genomic regression model with skew normal random errors? G3, 8:1771–1785, 2018.eng
dc.relation.referencesPotgieter, C. and Genton, M. Bayesian analysis of two-piece normal regression models. Presented at Joint Statistical meeting, San Francisco Statistics, 2003.eng
dc.relation.referencesPotgieter, C. and Genton, M. Characteristic function-based semiparametric inference for skew-symmetric models. Scandinavian Journal of Statistics, 40:1803–1819, 2013.eng
dc.relation.referencesSatori, N. Bias prevention of maximum likelihood estimates for scalar skew normal and skew t distributions. Journal of Statistical Planning and Inference, 136:4259–4275, 2006.eng
dc.relation.referencesWilhelmsson, Y. Value at risk with time varying variance, skewness and kurtosis - the nig-acd model. Econometrics Journal, 12:82–104, 2009.eng
dc.relation.referencesYu, K., Alhamzawi, A., Becker, F., and Lord, J. Statistical methods for body mass index: a selective review of the literature. arXiv:1412.3653v1 [stat.AP], pages 1–32, 2014.eng
dc.relation.referencesZareifard, H. and Khaledi, M. Non-gaussian modeling of spatial data using scale mixing of a unified skew gaussian process. Journal of Multivariate Analysis, 114:16–28, 2013.eng
dc.relation.referencesAhmedt, D., Ali, M., Denman, S., Fookes, C., and Petersson, L. Graph-based deep learning for medical diagnosis and analysis: Past, present and future. arXIV:2105.13137v1 cs.LG, pages 1–41, 2021.eng
dc.relation.referencesBarro, S and Sala i Martin, X. Economic growth. MIT, 2004.eng
dc.relation.referencesBishop, C. Pattern recognition and machine learning. Springer, 2016.eng
dc.relation.referencesBugbee, B., Bredit, J., and Van der Woerd, M. Laplace variational approximation for semiparametric regression in presence of heteroscedastic errors. Journal of Computational and Graphical Statistics, 25:225–245, 2016.eng
dc.relation.referencesBuxton, R. Introduction to Functional Magnetic Resonance Imaging : Principles and Techniques. Cambridge, 2009.eng
dc.relation.referencesCaffo, B., Bowman, D., Elberly, L., and Bassett, S. Handbook of Markov Chain Monte Carlo: Part II, chapter 14. Chapman Hall / CRC, 2011.eng
dc.relation.referencesFaes, C. and Wand, M.P. Semiparametric mean field variational bayes: General principles and numerical issues. Journal of Machine Learning Research, (17):1–47, 2016.eng
dc.relation.referencesHans, S. MRI made easy (...well almost). Schering, 1990.eng
dc.relation.referencesKoller, D. and Friedman, N. Probabilistic Graphical Models: Principles and Techniques. The MIT Press, 2010.eng
dc.relation.referencesLarsen, W. and McCleary, S. The use of partial residual plots in regression analysis. Technometrics, (14):781–790, 1970.eng
dc.relation.referencesLazaro, M. Bayesian warped gaussian processes. Advances in Neural Information Processing Systems, 26:225–245, 2013.eng
dc.relation.referencesNakajima, S., Watanabe, K., and Sugiyama, M. Variational Bayesian Learning Theory. Cambridge University press, 2019.eng
dc.relation.referencesNott , D., Tran, M., and Kuk, A. Efficient variational inference for generalized linear mixed models with large datasets. arXiv preprint, 2018.eng
dc.relation.referencesRindler, F. Calculus of Variations. Springer-Verlag, 2016.eng
dc.relation.referencesStarke, L. and Ostwald, D. Variational bayesian parameter estimation techniques for the general linear model. Frontiers in Neuroscience, pages 1–22, 2017.eng
dc.relation.referencesZárate, H. and Cepeda, E. Semiparametric smoothing spline in joint mean and dispersion models with responses from the biparametric exponential family: A bayesian perspective. Statistics, Optimization Information Computing, 9(2):351–367, 2021.eng
dc.rights.accessrightsinfo:eu-repo/semantics/openAccessspa
dc.rights.licenseReconocimiento 4.0 Internacionalspa
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/spa
dc.subject.ddc510 - Matemáticas::519 - Probabilidades y matemáticas aplicadasspa
dc.subject.lembSpline theoryeng
dc.subject.lembTeoría Splinespa
dc.subject.lembBayesian statistical decision theoryeng
dc.subject.lembTeoría bayesiana de decisiones estadísticasspa
dc.subject.lembLinear models (Statistics)eng
dc.subject.lembModelos lineales (Estadística)spa
dc.subject.proposalSemiparametric heteroscedastic modelseng
dc.subject.proposalCalculus of variationseng
dc.subject.proposalOptimizationeng
dc.subject.proposalBiparametric exponential modelseng
dc.subject.proposalMarkov chain Monte Carloeng
dc.subject.proposalGeneralized linear modelseng
dc.subject.proposalSmoothing splineeng
dc.subject.proposalAsymmetric distributionseng
dc.subject.proposalModelos semiparamétricosspa
dc.subject.proposalFamilia exponencial biparamétricaspa
dc.subject.proposalCadenas de markov Monte Carlospa
dc.subject.proposalModelos lineales generalizadosspa
dc.subject.proposalSuavizamiento splinespa
dc.subject.proposalDistribuciones asimétricasspa
dc.subject.proposalVariational bayesian learningeng
dc.subject.proposalAprendizaje bayesiano variacionalspa
dc.subject.unescoAnálisis numéricospa
dc.subject.unescoNumerical analysiseng
dc.titleSemiparametric smoothing spline to joint mean and variance models with responses from the biparametric exponential family: a bayesian perspectiveeng
dc.title.translatedSuavizamiento spline semiparamétrico para modelar simultaneamente las funciones media y varianza con respuestas de la familia exponencial biparamétrica: una perspectiva bayesianaspa
dc.typeTrabajo de grado - Doctoradospa
dc.type.coarhttp://purl.org/coar/resource_type/c_db06spa
dc.type.coarversionhttp://purl.org/coar/version/c_ab4af688f83e57aaspa
dc.type.contentTextspa
dc.type.driverinfo:eu-repo/semantics/doctoralThesisspa
dc.type.redcolhttp://purl.org/redcol/resource_type/TDspa
dc.type.versioninfo:eu-repo/semantics/acceptedVersionspa
dcterms.audience.professionaldevelopmentPúblico generalspa
oaire.accessrightshttp://purl.org/coar/access_right/c_abf2spa

Files

Original bundle

Name: 7219498.2021.pdf
Size: 4.54 MB
Format: Adobe Portable Document Format
Description: Doctoral thesis in Sciences - Statistics
