R package for estimating parameters of some regression models with or without covariates using TensorFlow
dc.rights.license | Atribución-NoComercial 4.0 Internacional |
dc.contributor.advisor | Hernández Barajas, Freddy |
dc.contributor.author | Garcés Céspedes, Sara |
dc.date.accessioned | 2021-11-11T14:49:45Z |
dc.date.available | 2021-11-11T14:49:45Z |
dc.date.issued | 2021-11-10 |
dc.identifier.uri | https://repositorio.unal.edu.co/handle/unal/80677 |
dc.description | illustrations, diagrams, tables |
dc.description.abstract | La tarea de estimar parámetros es muy importante tanto en aplicaciones científicas como de industria. El lenguaje de programación R provee una amplia variedad de funciones creadas para encontrar los estimadores de máxima verosimilitud de parámetros de distribuciones y de modelos de regresión. En este trabajo se presenta el paquete estimtf junto con sus principales funciones mle_tf y mlereg_tf. Este paquete fue diseñado con el objetivo de encontrar los estimadores de máxima verosimilitud de parámetros distribucionales y de regresión usando TensorFlow, una librería de código abierto para computación numérica creada por Google. Para alcanzar este objetivo se diseñó un proceso de estimación iterativo en el cual se utilizan los optimizadores incluidos en esta librería para maximizar la función de verosimilitud. Para ilustrar el uso del paquete estimtf y evaluar el desempeño del proceso de estimación, se llevó a cabo un estudio de simulación y se presentaron algunas aplicaciones usando bases de datos reales. A partir del estudio de simulación se observó que el tamaño de muestra, el optimizador seleccionado y el valor inicial de la tasa de aprendizaje afectan las estimaciones obtenidas con las funciones mle_tf y mlereg_tf. Adicionalmente, las estimaciones obtenidas con ambas funciones resultaron muy cercanas a los verdaderos valores de los parámetros y muy similares a las estimaciones obtenidas con otras funciones de R, las cuales son muy populares y comúnmente usadas para la estimación de parámetros. (Texto tomado de la fuente) |
dc.description.abstract | The task of estimating parameters is very important in both scientific and industrial applications. The R programming language provides a wide variety of functions for finding the maximum likelihood estimates of parameters of distributions and regression models. This work presents the estimtf package and its main functions, mle_tf and mlereg_tf. The package was designed to find the maximum likelihood estimates of distributional and regression parameters using TensorFlow, an open-source library for numerical computation created by Google. To achieve this goal, an iterative estimation process was designed in which the TensorFlow optimizers are used to maximize the likelihood function. To illustrate the use of the estimtf package and evaluate the performance of the estimation process, a simulation study was performed and some applications using real datasets were presented. The simulation study showed that the sample size, the selected optimizer, and the initial value of the learning rate affect the estimates obtained with the mle_tf and mlereg_tf functions. Additionally, the estimates obtained with both functions were very close to the true values of the parameters and very similar to the estimates obtained with other popular, widely used R functions for parameter estimation. |
dc.format.extent | xv, 106 pages |
dc.format.mimetype | application/pdf |
dc.language.iso | eng |
dc.publisher | Universidad Nacional de Colombia |
dc.rights.uri | http://creativecommons.org/licenses/by-nc/4.0/ |
dc.subject.ddc | 510 - Mathematics::519 - Probabilities and applied mathematics |
dc.title | R package for estimating parameters of some regression models with or without covariates using TensorFlow |
dc.type | Master's thesis |
dc.type.driver | info:eu-repo/semantics/masterThesis |
dc.type.version | info:eu-repo/semantics/acceptedVersion |
dc.publisher.program | Medellín - Ciencias - Maestría en Ciencias - Estadística |
dc.description.degreelevel | Maestría |
dc.description.degreename | Magíster en Ciencias - Estadística |
dc.identifier.instname | Universidad Nacional de Colombia |
dc.identifier.reponame | Repositorio Institucional Universidad Nacional de Colombia |
dc.identifier.repourl | https://repositorio.unal.edu.co/ |
dc.publisher.department | Escuela de estadística |
dc.publisher.faculty | Facultad de Ciencias |
dc.publisher.place | Medellín, Colombia |
dc.publisher.branch | Universidad Nacional de Colombia - Sede Medellín |
dc.relation.references | Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., . . . Zheng, X. (2016). Tensorflow: A system for large-scale machine learning. Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation. |
dc.relation.references | Adamidis, K., Dimitrakopoulou, T., & Loukas, S. (2005). On an extension of the exponential-geometric distribution. Statistics & Probability Letters, 73 , 259-269. |
dc.relation.references | Agresti, A. (2015). Foundations of linear and generalized linear models. Wiley |
dc.relation.references | Allaire, J., & Tang, Y. (2021). tensorflow: R interface to “tensorflow” [Computer software manual]. Retrieved from https://github.com/rstudio/tensorflow (R package version 2.2.0.9000) |
dc.relation.references | Bebbington, M., Lai, C.-D., & Zitikis, R. (2007). A flexible weibull extension. Reliability Engineering & System Safety, 92 (6), 719-726. |
dc.relation.references | Bengio, Y. (2012). Practical recommendations for gradient-based training of deep architectures. |
dc.relation.references | Bolker, B., & R Development Core Team. (2020). bbmle: Tools for general maximum likelihood estimation [Computer software manual]. Retrieved from https://CRAN.R-project.org/package=bbmle (R package version 1.0.23.1) |
dc.relation.references | Bottou, L. (2010). Large-scale machine learning with stochastic gradient descent. Proc. of COMPSTAT. |
dc.relation.references | Boyd, S., & Vandenberghe, L. (2004). Convex optimization. Cambridge University Press. |
dc.relation.references | Byrd, R., Lu, P., Nocedal, J., & Zhu, C. (1995). A limited memory algorithm for bound constrained optimization. SIAM Journal of Scientific Computing, 16 , 1190–1208. |
dc.relation.references | Commenges, D., Jacqmin-Gadda, H., Proust-Lima, C., & Guedj, J. (2006). A newton-like algorithm for likelihood maximization: The robust-variance scoring algorithm. Arxiv math/0610402 . |
dc.relation.references | Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the em algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39 (1), 1–38 |
dc.relation.references | Do, Q., Son, T., & Chaudri, J. (2017). Classification of asthma severity and medication using tensorflow and multilevel databases. Procedia Computer Science, 113 , 344-351. |
dc.relation.references | Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12 , 2121-2159. |
dc.relation.references | Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London, A, 222 , 309–368. |
dc.relation.references | Fox, P. A., Hall, A. P., & Schryer, N. L. (1978). The port mathematical subroutine library. ACM Trans. Math. Softw., 4 (2), 104–126. |
dc.relation.references | Galeone, P. (2019). Hands-on neural networks with tensorflow 2.0: understand tensorflow, from static graph to eager execution, and design neural networks (1st ed.). Packt Publishing. |
dc.relation.references | Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press. (http://www.deeplearningbook.org) |
dc.relation.references | Henningsen, A., & Toomet, O. (2011). maxlik: A package for maximum likelihood estimation in R. Computational Statistics, 26 (3), 443-458. |
dc.relation.references | Garcés, S., & Hernández, F. (2021). estimtf: Estimation of distributional and regression parameters using tensorflow [Computer software manual]. Retrieved from https://github.com/SaraGarcesCespedes/estimtf (R package version 0.1.0) |
dc.relation.references | Hernandez, F., Usuga, O., Patino, C., Mosquera, J., & Urrea, A. (2021). Reldists: Estimation for some reliability distributions within gamlss framework [Computer software manual] |
dc.relation.references | Hernández, F., & Usuga, O. (2019). Manual de R [Computer software manual]. Retrieved from https://fhernanb.github.io/Manual-de-R/ |
dc.relation.references | Ihaka, R., & Gentleman, R. (1996). R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics, 5 (3), 299–314. |
dc.relation.references | Karlis, D., & Xekalaki, E. (2003). Choosing initial values for the em algorithm for finite mixtures. In Comput. stat. data anal. |
dc.relation.references | Keydana, S. (2020). tfprobability: Interface to “tensorflow probability” [Computer software manual]. Retrieved from https://CRAN.R-project.org/package=tfprobability (R package version 0.11.0.0) |
dc.relation.references | Kingma, D., & Ba, J. (2014). Adam: A method for stochastic optimization. International Conference on Learning Representations. |
dc.relation.references | Kissell, R., & Poserina, J. (2017). Chapter 4 - advanced math and statistics. In R. Kissell & J. Poserina (Eds.), Optimal sports math, statistics, and fantasy (p. 103-135). Academic Press. Retrieved from https://www.sciencedirect.com/science/article/pii/B9780128051634000049 |
dc.relation.references | Devore, J. (2016). Probability and statistics for engineering and the sciences. Cengage Learning. Retrieved from https://books.google.com.co/books?id=UouECwAAQBAJ |
dc.relation.references | Bakouch, H., Dey, S., Ramos, P., & Louzada, F. (2017). Binomial-exponential 2 Distribution: Different Estimation Methods with Weather Applications. TEMA (Sao Carlos), 18 , 233 - 251. |
dc.relation.references | Bélisle, C. J. (1992). Convergence theorems for a class of simulated annealing algorithms on Rd. Journal of Applied Probability, 885–895. |
dc.relation.references | Legendre, A. M. (1805). Nouvelles méthodes pour la détermination des orbites des comètes. Paris: F. Didot. |
dc.relation.references | Ling, M. (2018). A comparison of estimation methods for generalized gamma distribution with one-shot device testing data. |
dc.relation.references | Little, T. (2014). The oxford handbook of quantitative methods (No. v. 1). Oxford University Press. |
dc.relation.references | Louzada, F., Ramos, P. L., & Perdoná, G. (2016). Different estimation procedures for the parameters of the extended exponential geometric distribution for medical data. Computational and Mathematical Methods in Medicine, 2016 . |
dc.relation.references | Mai Anh, T., Bastin, F., & Frejinger, E. (2014). On optimization algorithms for maximum likelihood estimation |
dc.relation.references | Merovci, F. (2013). Transmuted rayleigh distribution. Austrian Journal of Statistics, 42 (1), 21-31. Retrieved from https://www.ajs.or.at/index.php/ajs/article/view/vol42%2C%20no1-2 |
dc.relation.references | Millar, R. (2011). Maximum likelihood estimation and inference: With examples in R, SAS and ADMB. Wiley. |
dc.relation.references | Mosquera, J., & Hernandez, F. (2019). Estimationtools: Maximum likelihood estimation for probability functions from data sets [Computer software manual]. Retrieved from https://CRAN.R-project.org/package=EstimationTools (R package version 1.2.1) |
dc.relation.references | Mullen, K., Ardia, D., Gil, D., Windover, D., & Cline, J. (2011). DEoptim: An R package for global optimization by differential evolution. Journal of Statistical Software, 40 (6), 1–26. Retrieved from http://www.jstatsoft.org/v40/i06/ |
dc.relation.references | Muralidharan, K., & Khabia, A. (2014). Some statistical inferences on inlier(s) models. International Journal of System Assurance Engineering and Management, 8 . |
dc.relation.references | Nash, J. C. (2014). Nonlinear parameter optimization using R tools. |
dc.relation.references | Nelder, J., & Mead, R. (1965). A simplex method for function minimization. Comput. J., 7 , 308-313. |
dc.relation.references | Nelder, J. A., & Wedderburn, R. W. M. (1972). Generalized linear models. Journal of the Royal Statistical Society. Series A (General), 135 (3), 370–384 |
dc.relation.references | Nesterov, Y. (2014). Introductory lectures on convex optimization: A basic course (1st ed.). Springer Publishing Company, Incorporated. |
dc.relation.references | Nocedal, J., & Wright, S. (2006). Numerical optimization. Springer New York. Retrieved from https://books.google.at/books?id=VbHYoSyelFcC |
dc.relation.references | Pawitan, Y. (2013). In all likelihood: Statistical modelling and inference using likelihood. OUP Oxford. |
dc.relation.references | Pearson, K. (1936). Method of moments and method of maximum likelihood. Biometrika, 28 (1/2), 34–59. |
dc.relation.references | Qian, N. (1999). On the momentum term in gradient descent learning algorithms. Neural Networks, 12 (1), 145 - 151. |
dc.relation.references | Ramos, P., & Louzada, F. (2019). A distribution for instantaneous failures. Stats, 2 , 247-258. |
dc.relation.references | R Core Team. (2021). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria. Retrieved from https://www.R-project.org/ |
dc.relation.references | Rigby, R. A., & Stasinopoulos, D. M. (2005). Generalized additive models for location, scale and shape. Journal of the Royal Statistical Society. Series C (Applied Statistics), 54 (3), 507–554. Retrieved from http://www.jstor.org/stable/3592732 |
dc.relation.references | Rizzo, M. (2007). Statistical computing with R. Chapman & Hall/CRC. |
dc.relation.references | Ross, S. M. (2006). Simulation, fourth edition. USA: Academic Press, Inc. |
dc.relation.references | RStudio. (2020). Tensorflow for R. Retrieved 24-06-2020, from https://tensorflow.rstudio.com/ |
dc.relation.references | Ruder, S. (2016). An overview of gradient descent optimization algorithms. |
dc.relation.references | Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323 , 533-536. |
dc.relation.references | Sawant, A., Bhandari, M., Yadav, R., Yele, R., & Bendale, S. (2018). Brain cancer detection from mri: a machine learning approach (tensorflow). International Research Journal of Engineering and Technology (IRJET), 5 (4). |
dc.relation.references | Schnabel, R. B., Koonatz, J. E., & Weiss, B. E. (1985). A modular system of algorithms for unconstrained minimization. ACM Trans. Math. Softw., 11 (4), 419–440. |
dc.relation.references | Stasinopoulos, D., Rigby, R., Heller, G., Voudouris, V., & De Bastiani, F. (2017). Flexible regression and smoothing: Using gamlss in R. |
dc.relation.references | Stigler, S. M. (1981). Gauss and the invention of least squares. The Annals of Statistics, 9 (3), 465–474. |
dc.relation.references | Storvik, G. (2011). Numerical optimization of likelihoods : Additional literature for stk 2120. |
dc.relation.references | Sweeting, T. J. (1980). Uniform asymptotic normality of the maximum likelihood estimator. The Annals of Statistics, 8 (6), 1375–1381. |
dc.relation.references | TensorFlow. (2020). Tensorflow core v2.2.0. Retrieved 11-06-2020, from https://www.tensorflow.org/ |
dc.relation.references | Variani, E., Bagby, T., McDermott, E., & Bacchiani, M. (2017). End-to-end training of acoustic models for large vocabulary continuous speech recognition with tensorflow. In Interspeech. |
dc.relation.references | Wickham, H. (2015). R packages (1st ed.). O'Reilly Media, Inc. |
dc.relation.references | Wilks, D. S. (2019). Chapter 4 - parametric probability distributions. In D. S. Wilks (Ed.), Statistical methods in the atmospheric sciences (fourth edition) (Fourth Edition ed., p. 77-141). Elsevier. |
dc.relation.references | Yang, X.-S. (2021). Chapter 1 - introduction to algorithms. In X.-S. Yang (Ed.), Nature-inspired optimization algorithms (second edition) (Second Edition ed., p. 1-22). Academic Press. Retrieved from https://www.sciencedirect.com/science/article/pii/B9780128219867000081 |
dc.relation.references | Zakerzadeh, H., & Dolati, A. (2009). Generalized lindley distribution. Journal of Mathematical Extension, 3 , 1-17. |
dc.relation.references | Zeiler, M. (2012). Adadelta: An adaptive learning rate method. , 1212 |
dc.relation.references | Dey, S., Raheem, E., & Mukherjee, S. (2017). Statistical properties and different methods of estimation of transmuted rayleigh distribution. Revista Colombiana de Estadística, 40 , 165 - 203. Retrieved from http://www.scielo.org.co/scielo.php?script=sci_arttext&pid=S0120-17512017000100008&nrm=iso |
dc.rights.accessrights | info:eu-repo/semantics/openAccess |
dc.subject.lemb | Estimación de parámetros |
dc.subject.lemb | Parameter estimation |
dc.subject.proposal | TensorFlow |
dc.subject.proposal | Estimation of parameters |
dc.subject.proposal | Maximum likelihood |
dc.subject.proposal | Optimization algorithms |
dc.subject.proposal | Estimación de parámetros |
dc.subject.proposal | Máxima verosimilitud |
dc.subject.proposal | Algoritmos de optimización |
dc.title.translated | Propuesta de un paquete en R para la estimación de parámetros de algunos modelos de regresión con y sin covariables usando TensorFlow |
dc.type.coar | http://purl.org/coar/resource_type/c_bdcc |
dc.type.coarversion | http://purl.org/coar/version/c_ab4af688f83e57aa |
dc.type.content | Text |
dc.type.redcol | http://purl.org/redcol/resource_type/TM |
oaire.accessrights | http://purl.org/coar/access_right/c_abf2 |
dcterms.audience.professionaldevelopment | Researchers |
dc.description.curriculararea | Área Curricular Estadística |