Doctorado en Ciencias - Estadística (Doctorate in Sciences - Statistics)

Permanent URI for this collection: https://repositorio.unal.edu.co/handle/unal/82409

Showing 1 - 20 of 25

Análisis del número reproductivo básico en modelos epidemiológicos con componente estocástico [Analysis of the basic reproduction number in epidemiological models with a stochastic component]
(Universidad Nacional de Colombia, 2025-11) Ríos Gutiérrez, Andrés Sebastián; Arunachalam, Viswanathan; Torres Díaz, Soledad; Procesos Estocásticos
This thesis studies the basic reproduction number as a key measure for assessing the spread of infectious diseases. First, deterministic compartmental models are presented, such as the Susceptible-Infected-Recovered (SIR) and Susceptible-Exposed-Infected-Recovered (SEIR) models, which describe epidemic dynamics and allow the basic reproduction number to be computed under different scenarios. Next, stochastic models are introduced that incorporate variability in the epidemiological parameters through stochastic differential equations, which makes it possible to obtain both the expected value and the variance of the basic reproduction number. A data-updating method is then developed that improves the forecasting of unobservable populations and optimizes the computation of the basic reproduction number, ensuring epidemiologically consistent results. Finally, these models are extended to address more complex diseases such as COVID-19, including new populations such as the vaccinated, and the mean and variance of the basic reproduction number are determined in that setting. These advances allow a more precise characterization of epidemic risk and a more effective evaluation of control strategies. (Text taken from the source.)
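As a rough illustration of the deterministic starting point mentioned in this abstract, the following minimal sketch integrates the classical SIR equations and computes the basic reproduction number as the ratio of the transmission rate to the recovery rate. The function names, the forward Euler scheme, and the parameter values are assumptions made for illustration; they are not taken from the thesis, which goes on to stochastic extensions not shown here.

```python
def basic_reproduction_number(beta, gamma):
    """R0 for the classical SIR model: transmission rate over recovery rate."""
    return beta / gamma


def simulate_sir(beta, gamma, s0, i0, rec0, days, dt=0.1):
    """Integrate the deterministic SIR equations with forward Euler steps.

    State variables are fractions of a closed population (s + i + r = 1).
    Returns a list of (time, s, i, r) tuples.
    """
    s, i, r = s0, i0, rec0
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((step * dt, s, i, r))
    return trajectory


# Hypothetical rates per day, chosen only for illustration.
beta, gamma = 0.3, 0.1
print("R0 =", basic_reproduction_number(beta, gamma))
t_end, s_end, i_end, r_end = simulate_sir(beta, gamma, 0.99, 0.01, 0.0, days=160)[-1]
print(f"t={t_end:.0f}: S={s_end:.3f}, I={i_end:.3f}, R={r_end:.3f}")
```

With a ratio above one the infected fraction initially grows; with a ratio below one it declines from the start, which is the threshold behaviour that makes the basic reproduction number useful as a risk measure.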
Analysis of crossover designs with repeated measurements using generalized estimating equations
(Universidad Nacional de Colombia, 2023-07-25) Cruz Gutiérrez, Nelson Alirio; Melo Martínez, Oscar Orlando; Martínez Niño, Carlos Alberto; Cruz, Nelson Alirio [0001562620]; Cruz Gutierrez, N.A. [N.A. Cruz]; Cruz, N.A. [0000000273705111]; Estadística Aplicada en Investigación Experimental, Industria y Biotecnología
Crossover experimental designs are widely used in medicine, agriculture, and other areas of the biological sciences. Because of the characteristics of the crossover design, each experimental unit has longitudinal observations, and carry-over effects are present in the response variable. Furthermore, in many scenarios it is not possible to have a washout period between applications of different treatments, which creates problems in estimating treatment effects without a proper model specification. As a solution, this thesis deals with crossover designs without a washout period and with repeated measures. First, a methodology is developed for the analysis of crossover designs when the response variable is a Poisson count. For estimation, generalized estimating equations are used, assuming that there is no washout period and that each experimental unit was observed once per period. This methodology extends easily to any response variable that belongs to the exponential family. The methodology is then extended to crossover designs with repeated measures within each period, that is, when an experimental unit is observed more than once in each period. For this model, a family of correlation structures is built that takes into account the particularities of the design, namely the correlation between and within periods. Finally, an extension of the generalized estimating equations is developed. It includes a parametric component to model treatment effects and a nonparametric component to model time effects and carry-over effects. The nonparametric component is estimated with splines inserted into the generalized estimating equations. In addition, R code for applying the methodology to any crossover design is provided. The advantages of the proposed methodology are shown through simulation exercises and, theoretically, by exploring the asymptotic properties of the estimators obtained. The performance of the methodology is also compared with that of the usual methodologies on real data from crossover designs. The methodology built in this thesis allows the analysis of any crossover design as long as the observed response variable belongs to the exponential family, regardless of whether there is a washout period. It also allows modeling repeated measurements within each period and broadens the correlation structures used in generalized estimating equations.
Semiparametric smoothing spline to joint mean and variance models with responses from the biparametric exponential family: a bayesian perspective
(Universidad Nacional de Colombia, 2022-01) Zárate Solano, Héctor Manuel; Cepeda Cuervo, Edilberto; Inferencia Bayesiana
Statistical applications must address increasing complexity due to new data arising from recent technologies, new phenomena, and diverse sources of uncertainty. The demand for flexible methods that handle non-standard data structures, high-dimensional real-time estimation, and latent model frameworks has given semiparametric modeling a crucial role in contemporary statistical analysis. We provide flexible Bayesian methods to jointly infer the mean, variance, and skewness functions when the response variable comes from either a two-parameter exponential family or an asymmetric distribution. To this end, we implemented Bayesian algorithms based on MCMC sampling techniques and deterministic variational Bayesian learning theory. In these settings, each sub-model depends on some covariates parametrically and on others nonparametrically. Understanding how the moments change with predictors is a goal of statistics, and it is of intrinsic interest given their role in approximating other quantities. We propose several modeling scenarios that benefit from fusing the graphical models approach with Bayesian semiparametric regression under the architecture of GLMs. The significance of our strategy lies in its potential to contribute to a unified computational methodology that provides insight into many complex models that could otherwise be analytically intractable. Combining data models and algorithms thus contributes to solving real-world problems, with crucial advantages in computation time that allow many models to be explored quickly and estimated accurately.
Análisis de factores comunes dinámicos en presencia de procesos de ruido autocorrelacionados [Dynamic common factor analysis in the presence of autocorrelated noise processes]
(2020-07-25) Bolívar Atuesta, Stevenson; Nieto Sánchez, Fabio Humberto; Peña Sánchez de Rivera, Daniel; Series de Tiempo
This thesis presents a procedure to build a dynamic factor model in the presence of orthogonal stationary noise processes. The procedure is based on the Peña-Box model (Peña & Box, 1987), in which the number of observed time series is fixed, and on the extension proposed by Peña & Poncela (2006) to non-stationary common factors, in which the common factors may be integrated processes. As a first result, an alternative for detecting the number of common factors is proposed by extending the statistical test of Peña & Poncela (2006), which was proposed for the Peña-Box model with a white noise process. Furthermore, in the same context, a statistical test is proposed to identify the number of non-stationary common factors. These proposals are illustrated by simulation and by an application to real data, in which some empirical findings related to seasonal factors are also presented. The model is estimated by maximum likelihood via a state-space model.
Estimación de áreas pequeñas utilizando imputación múltiple en modelos logísticos de tres parámetros [Small area estimation using multiple imputation in three-parameter logistic models]
(2020-10-30) Tellez Piñerez, Cristian Fernando; Trujillo Oyola, Leonardo
Generating high-quality, low-cost data is a necessity for decision makers. In the education sector, such data are needed to decide on the creation of public policies, the continuity of existing programs, and the year-by-year allocation of resources. This thesis proposes a methodology that combines item response theory with small area estimation in the presence of missing data. An unbiased estimator is proposed for the average ability of students, along with a Bayesian estimator based on the beta distribution for the proportion of students with a particular characteristic. These estimators are compared via simulation with the estimators most used in practice: the Horvitz-Thompson, calibration, and composite estimators for the average, and the ratio estimator and the Bayesian estimator based on the normal distribution for the proportion. The comparison shows that the proposed estimators have smaller relative standard errors and are unbiased for the average and approximately unbiased for the proportion. In addition, the methodology is applied twice: first to the results of the PISA mathematics test administered in 2015, and second to the results of the Saber 3°, 5° and 9° tests administered by Icfes in Colombia. In the first application, the published results are compared in terms of precision with those obtained using this methodology, and predictions are made for some countries, with small relative biases observed for these predictions. In the second, the controlled sample is used, which implies greater security in test administration and against possible copying among participants, and results are predicted at the level of certified territorial entities (ETC). It is concluded that this methodology is a good alternative for producing official statistics in the education sector.
Monitoreo de perfiles para respuesta discreta agregada [Profile monitoring for aggregated discrete response]
(2020-08-21) Morales Ospina, Victor Hugo; Vargas Navas, José Alberto
The temporal aggregation of data is a procedure that occurs frequently in various areas and for different reasons, among them better handling of high-frequency data and simplicity in monitoring processes. Many aggregation procedures are applied to discrete data, in particular to count data based on exposed populations that do not change over time; in such cases, the process can be monitored properly through its mean. The effect that aggregating this type of observation has on process monitoring has been studied by some authors; however, the effect of aggregating count data when the sizes of the exposed populations vary over time has not been studied so far. Such situations are very common, for example, in health surveillance, where populations exposed to a certain adverse event generally vary over time. The records obtained in these situations correspond mostly to univariate count data. However, other applications generate multivariate count data that depend on a covariate, such as when the number of cancer deaths in a certain time interval is counted by patient age. In this case, the records obtained are vectors of counts that depend on the age covariate. In this situation, the covariate is a factor whose values do not change from one observation interval to another. In some cases, however, this assumption does not hold, which generates data with a special characteristic that must be taken into account. This dissertation studies the effect of aggregating count data (univariate and multivariate) when the size of the exposed populations changes over time. In addition, a methodology is introduced to aggregate and monitor processes that generate this type of data. This methodology keeps the rate of adverse events constant, regardless of the level of aggregation used. Through simulation and the use of real data, we determine the effect that aggregation has on the monitoring of this type of process. The conclusions of our study, as well as ideas for future research, are presented at the end of the document.
Modelos de Poisson no homogéneos en el estudio de contaminantes en la ciudad de Bogotá [Non-homogeneous Poisson models in the study of pollutants in the city of Bogotá]
(2020) Suárez Sierra, Biviana Marcela; Rodrigues, Eliane Regina; Blanco Castañeda, Liliana
Non-homogeneous Poisson models have been of great importance in problems where counting exceedances of air pollutants is relevant to formulating a solution. In this work, univariate models are formulated first, in which each pollutant is studied independently over a given time interval, arriving at the model that accurately fits the observations in terms of the exceedances accumulated up to a given time. Because these first models fail to account for the dependence generated by the interaction of different pollutants over the same observation interval in the same region, a bivariate model is then proposed to study that situation. Accordingly, this work establishes a function relating the exceedances of two pollutants, each with respect to its own threshold, as well as their cumulative bivariate mean function for air pollution data from Bogotá, based on copula functions. The latter is established within the framework of bivariate non-homogeneous Poisson processes.
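To make the univariate building block concrete, here is a minimal sketch of simulating event times (for instance, threshold exceedances) from a non-homogeneous Poisson process by thinning. The seasonal intensity function, its bound, and all names are hypothetical choices for illustration; the abstract does not state which intensity functions the thesis fits, and the bivariate copula-based model is not shown.

```python
import math
import random


def simulate_nhpp(rate, rate_max, horizon, rng):
    """Simulate event times on (0, horizon] by thinning.

    rate: intensity function lambda(t), assumed to satisfy
          0 <= lambda(t) <= rate_max on the whole interval.
    """
    times = []
    t = 0.0
    while True:
        # Candidate event from a homogeneous process with rate rate_max.
        t += rng.expovariate(rate_max)
        if t > horizon:
            return times
        # Keep the candidate with probability lambda(t) / rate_max.
        if rng.random() < rate(t) / rate_max:
            times.append(t)


def seasonal_rate(t):
    """Hypothetical intensity (events per day) with a yearly cycle."""
    return 0.05 * (1.0 + math.sin(2.0 * math.pi * t / 365.0))


rng = random.Random(2020)
exceedance_days = simulate_nhpp(seasonal_rate, rate_max=0.1, horizon=730.0, rng=rng)
print(len(exceedance_days), "simulated exceedances in two years")
```

The cumulative mean function of the process is the integral of the intensity, so the count of simulated events up to time t fluctuates around that integral.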
Methodology for estimating association between categorical variables with application to Genome-wide association studies (GWAS)
(2019) Cortés Muñoz, Fabián
Several genomic data analysis contexts involve a large number of statistical hypotheses that are tested simultaneously. When the association between categorical phenotypes (e.g., healthy and not healthy) and single nucleotide polymorphisms (SNPs) is assessed by applying statistical tests, two key challenges arise: which method is best for multiple testing, and how to increase statistical power after adjusting for multiple testing. In these association studies, a solid criterion for declaring significance with high statistical power, without necessarily increasing the sample size, is crucial. Numerous methods have been developed to address these limitations; they have improved type I and type II error rates. The proposed methods are mainly based on changing the type for establishing the association and extending it to continuous traits. Some of these statistical methods are very complex and difficult to use, especially for non-statisticians, who usually obtain such data. Moreover, very few methods have focused on developing a new statistical test for categorical data, which is the most common form of measuring phenotypic traits in humans and other organisms. Using the maximum values of the chi-square distribution as the test statistic, this study proposes a new statistical test called Quotient C that allows testing associations between thousands of SNPs and a categorical trait. In real datasets, Quotient C is observed to be a less stringent criterion that allows a larger number of associations between SNPs and dichotomous outcomes to be declared than the classical methods used for multiple-testing correction, while keeping the probability of incorrectly rejecting a true null hypothesis (type I error) equal to or less than the nominal type I error level. The proposed method has a lower type II error rate and better statistical power than the following methods: Bonferroni, Holm, Hochberg, and Benjamini and Hochberg.
Modelamiento de procesos autorregresivos de umbrales estacionales [Modeling of seasonal threshold autoregressive processes]
(2019-06-28) González Borja, Joaquín
Seasonal fluctuations are frequently found in time series. In addition, nonlinearity and relationships with other time series are prominent features of many such series. In this work, we consider the modeling of multiplicative seasonal threshold autoregressive processes with exogenous input (TSARX), which explicitly and simultaneously incorporate multiplicative seasonality and threshold nonlinearity. Seasonality is modeled as stochastic and regime-dependent. The proposed model is a special case of a threshold autoregressive process with exogenous input (TARX). We develop a procedure based on Bayesian methods to identify the model, estimate parameters, validate the model, and compute forecasts. In the model identification stage, we present a statistical test for multiplicative seasonality by regime. The proposed methodology is illustrated with a simulated example and applied to empirical economic data.
Un modelo de interacción espacial para el flujo de pasajeros entre terminales aéreas de Colombia en los años 2004 a 2015 [A spatial interaction model for passenger flow between air terminals in Colombia from 2004 to 2015]
(2020-02-25) Santana Alfonso, Adrían Alberto; Giraldo Henao, Ramón
Passenger flow data for Colombia from 2004 to 2015 are analyzed using different variants of the gravity model. The data are also used to evaluate the goodness of fit of other statistical strategies that, although well known, have not been applied in this context, such as mixed models. Nonparametric regression techniques are used to identify possible nonlinearities and to incorporate spatio-temporal information that helps describe and predict passenger flow. Finally, functional modeling techniques are evaluated with the aim of incorporating spatio-temporal information on the flows into the gravity model. All models considered are fitted in the R software. The document is organized in two parts. The first presents a theoretical framework that includes spatial interaction models (gravity and spatial dependence models), mixed models, nonparametric models, and functional models. The second, based on these methods, analyzes data on air passenger flow in Colombia for the period 2004-2015. Finally, specific conclusions are given regarding the data studied and the modeling alternatives that could be considered in the future.
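For readers unfamiliar with the gravity model named in this abstract, the sketch below shows its textbook form: flow between two places grows with their "masses" and decays with distance, and taking logarithms turns it into a linear model. The exponents, the constant, and the use of arbitrary masses are assumptions for illustration only; the abstract does not say which variant or covariates the thesis uses.

```python
import math


def gravity_flow(mass_i, mass_j, distance, k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Textbook gravity model: k * mass_i^alpha * mass_j^beta / distance^gamma."""
    return k * mass_i ** alpha * mass_j ** beta / distance ** gamma


def log_gravity_flow(mass_i, mass_j, distance, k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Log-linear form, the version usually fitted by regression."""
    return (math.log(k)
            + alpha * math.log(mass_i)
            + beta * math.log(mass_j)
            - gamma * math.log(distance))


# Hypothetical masses (e.g. populations in millions) and distance in km.
print(gravity_flow(8.0, 2.5, 240.0))
print(math.exp(log_gravity_flow(8.0, 2.5, 240.0)))
```

The log-linear form is what makes extensions such as mixed or nonparametric terms natural: they replace or augment the linear terms in the logarithms of mass and distance.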
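Modelos para estimar cambios brutos en encuestas rotativas con ausencia de respuesta en diseños de muestreo complejos [Models for estimating gross flows in rotating surveys with nonresponse under complex sampling designs]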
(2014) Gutiérrez Rojas, Hugo Andrés
Rotating panel surveys are used to compute estimates of gross flows (aggregate flows of individuals) between the classification states of interest for two consecutive measurement periods. This thesis considers a general procedure for estimating gross flows when the rotating survey has been generated with a complex, non-ignorable sampling design and exhibits different nonresponse patterns that may depend on the individuals' classifications. To obtain unbiased estimates of the parameters of interest, a pseudo-likelihood approach induced by the complex sampling design is applied to a general two-stage model for the allocation of individuals to the survey categories and for the modeling of nonresponse. After the corresponding simulation studies, it is concluded that the proposed methodology is adequate for estimating gross flows and that ignoring the probability measure induced by the sampling design leads to biased estimators of the superpopulation model parameters as well as of the gross flows. Finally, the methodology is applied to a labor force survey (Pesquisa Mensal de Emprego), for which the fitted models prove adequate for estimating gross flows between employment states over a given observation period.
TAR modeling with missing data when the white noise process is not Gaussian
(2014) Zhang, Hanwen
In this research, we propose three families of TAR models: (1) TAR models with t-distributed noise, (2) TAR models for the logarithm of positive series, and (3) TAR models in which the noise process follows a standardized Gamma distribution. For each of these models, we propose a three-stage procedure consisting of: (1) identification of the number of regimes and the corresponding thresholds, (2) identification of the autoregressive orders in the regimes, and (3) estimation of the non-structural parameters, namely the autoregressive coefficients, the type II conditional variances, and any other parameters that each particular model may have.
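The first model family in this abstract can be pictured with a small simulation. The sketch below generates a two-regime threshold autoregressive series of order one whose noise is Student t rather than Gaussian, with the regime selected by a separate threshold variable. The threshold process, the coefficients, the degrees of freedom, and every function name are hypothetical; the identification and estimation stages described in the abstract are not implemented here.

```python
import math
import random


def student_t(df, rng):
    """Draw from a Student t distribution as Z / sqrt(chi2_df / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi2 / df)


def simulate_tar(n, threshold, regimes, df, rng):
    """Simulate a two-regime TAR(1) series driven by a threshold variable z.

    regimes maps a regime label to (intercept, ar_coef, scale); regime 0
    applies when z[t] <= threshold, regime 1 otherwise.
    """
    x, z, labels = [0.0], [0.0], [0]
    for _ in range(1, n):
        # Hypothetical AR(1) threshold process with Gaussian innovations.
        z_t = 0.5 * z[-1] + rng.gauss(0.0, 1.0)
        j = 0 if z_t <= threshold else 1
        intercept, ar_coef, scale = regimes[j]
        x.append(intercept + ar_coef * x[-1] + scale * student_t(df, rng))
        z.append(z_t)
        labels.append(j)
    return x, z, labels


rng = random.Random(2014)
regimes = {0: (1.0, 0.5, 1.0), 1: (-1.0, -0.3, 2.0)}
x, z, labels = simulate_tar(300, threshold=0.0, regimes=regimes, df=5, rng=rng)
print("share of observations in regime 1:", sum(labels) / len(labels))
```

In this sketch the per-regime scale plays the role of a regime-specific noise weight, so each regime has its own conditional variance as well as its own autoregressive dynamics.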
Modelos Bayesianos para datos longitudinales: extensiones teóricas y metodológicas [Bayesian models for longitudinal data: theoretical and methodological extensions]
(2017-12-21) Corrales Bossio, Martha Lucia
This thesis explores the fitting of antedependence models to continuous longitudinal data, with regression structures placed on the model parameters. To fit the proposed models, the Bayesian method proposed by Cepeda and Gamerman is extended. Finally, simulation studies and several applications are presented to assess the performance of the method.
Modelo Factorial Dinámico TAR [TAR Dynamic Factor Model]
(2007-12) Correal Nuñez, María Elsa; Martínez Collantes, Jorge (Thesis advisor)
This work presents a procedure for estimating common factors in time series that exhibit nonlinear behavior of the threshold type. Within time series analysis, multivariate processes and nonlinearity have seen methodological developments of special interest. In vector VARMA models, multiple structures with similar characteristics exist, and there is no simple solution for identifying the parameters. In addition, the proliferation of parameters can be so great as to make estimation intractable in practice. A dynamic factor model not only reduces the dimension of the system but also uncovers components common to the set of variables that explain the dynamic interrelationships among them.
Monitoring regression models for lifetimes
(2016) Panza Ospino, Carlos Arturo
The current study addresses the monitoring of regression models whose response variable follows a lifetime distribution. Several aspects of this research are relevant. First, in most of the existing literature, monitoring regression models is treated as a special case of profile monitoring. However, especially in some industrial and healthcare applications, regression models can adequately represent process quality but cannot always be regarded as profiles. This is the case for regression models for lifetimes, since a lifetime can be measured at most once in the same experimental unit. Consequently, the responses in regression monitoring are not necessarily multivariate. Nevertheless, the main goal of monitoring regression models for lifetimes is to check the stability of the distributions of n response variables Y_i, i = 1, ..., n. As all these distributions are linked by the same parameter vector, the stability of the former depends on that of the latter. Thus, profile monitoring and regression monitoring share the same purpose, and techniques from profile monitoring can also be used to monitor regression models for lifetimes successfully. Methodologies for monitoring Weibull regression models for lifetimes with a common shape parameter in phase II processes are addressed, depending on the composition of the available regression data structures. Monitoring the parameter vector that characterizes the Weibull regression model allows conclusions to be drawn about the mean value of the response variable. It is shown that regression models for lifetimes can be monitored by redesigning existing methods for monitoring continuous quality variables and for profile monitoring. With uncensored lifetimes, it was found that conventional control charts for single observations can be adapted to monitor the common shape parameter. Control techniques and methodologies from profile monitoring can also be adapted to monitor the entire parameter vector characterizing the basic model. In both cases, chart design depends on the asymptotic normality of the maximum likelihood estimator of the parameter vector. It is therefore necessary to apply some existing corrections to the monitoring statistics so that existing control charts work acceptably well when the available data sets are not large enough. When a type I right-censoring mechanism operates on the lifetimes, monitoring can be carried out with one-sided likelihood-ratio-based cumulative sum control charts. These procedures can be used to monitor one or more of the parameters in the parameter vector and have practically no restrictions on the size of the dataset needed for monitoring. Simulations suggest that this chart is more effective than the multivariate exponentially weighted moving average method at detecting process deterioration.
Optimal sampling design for functional and spatio-temporal random fields
(2015) Bohorquez Castañeda, Martha Patricia; Mateu, Jorge (Thesis advisor)
This thesis extends optimal sampling designs to univariate and multivariate spatial prediction of functional data. In both cases, unbiased predictors are presented with their respective variances. In the univariate case, simple cokriging is proposed on the scalar random field formed by the scores associated with the representation of the functional data in terms of their empirical functional principal components. In the multivariate case, spatial prediction of a functional variable at unsampled sites is developed using functional covariates; that is, functional cokriging is presented. It is shown that, by representing each function in terms of its empirical functional principal components, functional cokriging depends only on the auto-covariance and cross-covariance of the associated score vectors, which are scalar random fields. Design criteria are proposed for all predictors developed in this thesis. In addition, a methodology is built for dynamic spatial sampling designs that yield the optimal estimate of the spatial mean and the optimal spatial prediction at a future time, based on the temporal variation of the spatial dependence structure. The methodologies are applied to the air quality networks of Bogotá and Mexico.
Joint modeling of continuous proportions and overdispersed counts
(2015-12-16) Davila Sanabria, Eduardo
In the last few years, biotechnology has offered organic substances, known as elicitors, which activate plant defenses to ward off attack by pathogens and herbivores. Applying this technology to crops induces a probabilistic defense mechanism that produces non-identically distributed bivariate data vectors, comprising both continuous proportions and overdispersed counts, for which the independence assumption cannot hold. Hence, the goal of this work is the joint modeling of continuous proportions and overdispersed counts in the scientific context of induced resistance in plant protection. The theoretical framework is built on the one-parameter Clayton (CRM) and two-parameter Joe and Hu (BB1) copulas, with Simplex and Generalized Poisson marginal distributions. Parameters were estimated with a Gauss-Newton type algorithm, variances with the jackknife, and models were selected by a cross-validation criterion. The theory was validated with experimental data from a rose crop exposed to an epidemiological plant:pathogen:herbivore complex, as well as with computer-simulated data. It is concluded that, unlike the classical analysis based on Box-Cox adjusted normality and the negative binomial, the constructed families of distributions capture the functional shape of the bivariate relationship, the degree of dependence, and the marginal asymmetry, with an easy-to-interpret parametrization and efficient estimators, according to the present study.
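The one-parameter Clayton copula named in this abstract has a simple closed form, which the following minimal sketch evaluates and samples from by conditional inversion. Only the copula itself is shown, with uniform margins; the Simplex and Generalized Poisson margins, the BB1 copula, and the estimation procedure from the thesis are not reproduced, and the function names and parameter value are assumptions for illustration.

```python
import random


def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta), theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def sample_clayton(theta, rng):
    """Draw one pair (u, v) from the Clayton copula by conditional inversion."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids u == 0
    w = 1.0 - rng.random()
    v = (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u, v


def kendall_tau(theta):
    """Kendall's tau implied by the Clayton parameter: theta / (theta + 2)."""
    return theta / (theta + 2.0)


rng = random.Random(2015)
theta = 2.0  # hypothetical dependence parameter
pairs = [sample_clayton(theta, rng) for _ in range(5)]
print("Kendall's tau for theta = 2:", kendall_tau(theta))
print(pairs)
```

To obtain dependent proportions and counts, each uniform coordinate would be passed through the quantile function of its marginal distribution; that step depends on the chosen margins and is left out here.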
Una prueba de rachas para identificar sucesiones markovianas homogéneas de dos estados [A runs test for identifying two-state homogeneous Markov sequences]
(2015-12-11) Vergara Morales, Myrian Elena
A data-driven test is proposed to identify positive first-order Markov dependence in a Bernoulli sequence, based on a combination of two runs tests: the conditional test of Barton and David (1958) and an unconditional modification of it. For the proposed modification, analytical expressions are obtained for the exact distribution of the test statistic and for its power; an algorithm is also constructed to compute the power of both tests explicitly. To compare the power of the tests, both were computed for some values of the proportion of ones and the success probability. It is shown that there are intervals of the success probability in which the proposed unconditional test exceeds the power of the original Barton and David test, and that the data-driven test improves on the power of the two runs tests considered separately.
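The intuition behind runs-based tests for this kind of dependence can be seen in a short simulation: a two-state chain that tends to stay in its current state produces fewer runs than an independent sequence of the same length. The sketch below only counts runs and simulates such chains; the function names and the symmetric "stay" probability are hypothetical, and neither the Barton and David test nor the proposed modification is implemented.

```python
import random


def count_runs(bits):
    """Number of maximal blocks of equal consecutive symbols."""
    if not bits:
        return 0
    return 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)


def simulate_markov_chain(n, p_stay, rng):
    """Two-state homogeneous chain that keeps its current state with probability p_stay.

    p_stay = 0.5 gives independent fair Bernoulli trials; p_stay > 0.5 gives
    positive first-order dependence.
    """
    state = rng.randint(0, 1)
    chain = [state]
    for _ in range(n - 1):
        if rng.random() >= p_stay:
            state = 1 - state
        chain.append(state)
    return chain


rng = random.Random(1958)
independent = simulate_markov_chain(2000, 0.5, rng)
dependent = simulate_markov_chain(2000, 0.9, rng)
print("runs, independent sequence:", count_runs(independent))
print("runs, positively dependent sequence:", count_runs(dependent))
```

An unusually small number of runs is therefore evidence of positive dependence, which is why the number of runs is a natural test statistic in this setting.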
Estimation and Inference for 2^(k-p) Experiments with Beta Response
(2015) Grajales Hernández, Luis Fernando; Melo Martínez, Oscar Orlando (Thesis advisor)
Fractional factorial experiments are widely used in industry and engineering. The most common goal in these experiments is to identify a subset of the factors with the greatest effect on the response. For data analysis in these experiments, the most used methods include linear regression, transformations, and the Generalized Linear Model (GLM). This thesis focuses on experiments whose response is measured continuously in the (0,1) interval (if y ∈ (a,b), then (y-a)/(b-a) ∈ (0,1)). Analyses of factorial experiments in (0,1) are rarely found in the literature. In this work, the advantages and drawbacks of the three methods mentioned for analyzing data from experiments in (0,1) are described. Because the beta distribution takes values in (0,1), the beta regression model (BRM) is proposed for analyzing these kinds of experiments. More specifically, the need to consider variable dispersion (VD) and to use linear restrictions on the parameters is justified for data from 2^k and 2^(k-p) experiments. Thus, the first result of this thesis is to propose, develop, and apply a restricted VDBRM. The restricted VDBRM is developed from a frequentist perspective: a penalized likelihood (by means of Lagrange multipliers), restricted maximum likelihood estimators with their respective Fisher information matrix, hypothesis tests, and a diagnostic measure. Applying the restricted VDBRM gave good results for simulated data, and it is shown that the hypotheses related to 2^k and 2^(k-p) experiments are a special case of the restricted model. The second result of this thesis is to explore an integrated Bayesian/likelihood proposal for analyzing data from factorial experiments using the simple (Bayesian and frequentist) BRMs. This was done by employing flat prior distributions in the Bayesian BRM. Thus, comparisons between confidence intervals (frequentist case) and credibility intervals (Bayesian case) for the mean response are made, with good and promising results in real experiments. This work also explores a technique for choosing the best model among several candidates, which combines the half-normal plots (given by the BRM) with the inferential results. Starting from the active factors chosen from each plot, the respective regression models are fitted and, finally, the best model is chosen by means of information criteria. This technique was explored with the following models for real 2^k and 2^(k-p) experiments: normal, transformation, generalized linear, and simple beta regression. In most of the examples considered, results for the Bayesian and frequentist BRMs were very similar (using flat prior distributions). Moreover, four link functions for the mean response in the BRM are compared; the results highlight the importance of studying each problem at hand.
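Two ingredients mentioned in this abstract lend themselves to a short sketch: rescaling a response from (a, b) to (0, 1), and evaluating a beta log-density written in terms of a mean and a precision, the parametrization commonly used in beta regression. The mean-precision parametrization and the logit link shown here are assumptions made for illustration; the abstract does not specify which parametrization or which four link functions the thesis uses, and the restricted estimation is not implemented.

```python
import math


def rescale_to_unit(y, a, b):
    """Map y in (a, b) to (0, 1) via (y - a) / (b - a)."""
    if not a < y < b:
        raise ValueError("y must lie strictly inside (a, b)")
    return (y - a) / (b - a)


def beta_logpdf(y, mu, phi):
    """Log-density of a beta distribution with mean mu and precision phi.

    The usual shape parameters are p = mu * phi and q = (1 - mu) * phi.
    """
    p, q = mu * phi, (1.0 - mu) * phi
    log_norm = math.lgamma(p + q) - math.lgamma(p) - math.lgamma(q)
    return log_norm + (p - 1.0) * math.log(y) + (q - 1.0) * math.log(1.0 - y)


def logit_inverse(eta):
    """Inverse logit link: maps a linear predictor to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical example: a yield of 72 on a 50-100 scale, and a linear
# predictor 0.4 + 0.8 * x for a factor coded x = +1.
y = rescale_to_unit(72.0, 50.0, 100.0)
mu = logit_inverse(0.4 + 0.8 * 1.0)
print(y, mu, beta_logpdf(y, mu, phi=20.0))
```

In a variable-dispersion model the precision would also depend on the factors through its own link function, rather than being a single constant as in this sketch.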
Bayesian Analysis of Multivariate Threshold Autoregressive Models with Missing Data
(2014-10) Calderón Villanueva, Sergio Alejandro
In some fields, we are forced to work with missing data in multivariate time series; unfortunately, the analysis in this context cannot be done as in the case of complete data. A Bayesian analysis of multivariate threshold autoregressive (MTAR) models with exogenous inputs and missing data is carried out. MCMC methods are used to obtain samples from the marginal posterior distributions, including those of the threshold values and the missing data. To identify the autoregressive orders, we adapt the Bayesian variable selection method to MTAR models. The number of regimes is estimated using marginal likelihood and product space strategies. Forecasting of the output vector is implemented by finding its predictive distributions. Simulation experiments and real data examples are presented.
