Previous seminars 2022

Statistics seminars autumn 2022

2022-12-14 (PhD review seminar): Objective Causal Inferences based on Real World Data: A comparative effectiveness evaluation of abiraterone acetate against enzalutamide

Speaker: Paulina Joneus, Department of Statistics, Uppsala University. Time and place: 2022-12-14 at 10:15–12:00, Ekonomikum room H317.

Abstract

Regulatory authorities are recognizing the need for real-world evidence (RWE) as a complement to randomized controlled trials (RCTs) in the approval of drugs. However, RWE needs to be fit for regulatory purposes. There is an ongoing discussion about whether pre-publishing a protocol in an appropriate repository, e.g., ClinicalTrials.gov, would increase the quality and objectivity of RWE, as is the case for RCTs. This paper illustrates that an observational study based on a pre-published protocol can entail the same level of detail as a protocol for a randomized experiment.

The strategy is exemplified by designing a comparative effectiveness evaluation of abiraterone acetate (AA) against enzalutamide (ENZ) in clinical practice. These two cancer drugs are prescribed to patients with advanced prostate cancer. Two complementary designs, including pre-analysis plans, were published before data on outcomes and proxy-outcomes were obtained. The underlying assumptions are assessed using the proxy-outcomes, and both analyses show an increased mortality risk from being prescribed AA compared to ENZ.

2022-11-30 (PhD review seminar): Selection bias: An R package for bounding the selection bias

Speaker: Stina Zetterström, Department of Statistics, Uppsala University. Time and place: 2022-11-30 at 10:15–12:00, Ekonomikum room H317.

Abstract

Selection bias is a systematic error that can occur when subjects are included in or excluded from the analysis based on some selection criteria for the study population. This bias can threaten the validity of the study, so methods for estimating the effect of selection bias are needed. One way of estimating the effect of selection bias is through sensitivity analysis, and one type of sensitivity analysis is bounding the bias.

In this work, we present an R package that can be used to calculate two such previously proposed bounds for selection bias. One bound is based on assumed values of sensitivity parameters and is referred to as the SV bound. The other bound is based solely on the observed data and is therefore referred to as the assumption-free (AF) bound. Furthermore, we derive feasible regions for the sensitivity parameters as well as conditions for the SV bound to be sharp, where sharp means that the bias can be as large as the bound. We illustrate both the R package and the sharpness of the bound with a simulated dataset that emulates a study investigating the effect of Zika virus on microcephaly in Brazil.

2022-11-23: Modelling consensus emergence with nonlinear dynamics

Speaker: Yvette Baurne, Department of Statistics, Lund University. Time and place: 2022-11-23 at 10:15–11:30, Ekonomikum room H317.

Abstract

The study of emergent, bottom-up processes has long been of interest within organisational and group research. Emergent processes refer to how dynamic interactions among lower-level units (e.g. individuals) over time form a new, shared construct or phenomenon at a higher level (e.g. a work group). To properly study the emergence of shared constructs, one needs models and data that take into account both variability across individuals and groups (multilevel) and variability over time (longitudinal).

We make three contributions to the modelling of the emergent process of consensus. First, we propose a formal definition of consensus emergence. Second, we identify two separate patterns of consensus emergence and introduce two models to account for these patterns: the Homogeneous Consensus Emergence Model (HomCEM) and the Heterogeneous Consensus Emergence Model (HetCEM). Third, we show how Gaussian processes can be used to further extend the consensus emergence models, allowing them to capture nonlinear dynamics in emergent processes at both the individual and group levels.

2022-11-09: Identification of (seasonal) ARMA models revisited

Speaker: Johan Lyhagen, Department of Statistics, Uppsala University. Time and place: 2022-11-09 at 10:15–11:30, Ekonomikum room H317.

Abstract

The standard approach nowadays when identifying a stationary ARMA model is to analyse the autocorrelation function (ACF) and the partial autocorrelation function (PACF), then estimate the model and check the residuals for remaining autocorrelation, significance of parameters, etc. If the model doesn’t pass the check, it is revised. One problem with this approach is that the ACF/PACF pattern in the residuals does not trivially carry over to the ACF/PACF implied by the model. In this paper, a graphical tool is derived to aid the identification of ARMA models.
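As a rough illustration of the ACF/PACF step described above (not the graphical tool presented in the talk), the sketch below computes the sample ACF and PACF with NumPy for a simulated AR(1) series. The function names, the AR coefficient 0.6, the seed, and the sample size are assumptions chosen for the example.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

def sample_pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    rho = sample_acf(x, max_lag)
    pacf = np.zeros(max_lag + 1)
    pacf[0] = 1.0
    phi_prev = np.zeros(0)  # coefficients of the AR(k-1) fit
    for k in range(1, max_lag + 1):
        num = rho[k] - np.dot(phi_prev, rho[k - 1:0:-1])
        den = 1.0 - np.dot(phi_prev, rho[1:k])
        phi_kk = num / den
        # Update the AR(k) coefficients from the AR(k-1) coefficients.
        phi_prev = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf[k] = phi_kk
    return pacf

# Simulate an AR(1) series x_t = 0.6 * x_{t-1} + e_t (illustrative values).
rng = np.random.default_rng(0)
n = 5000
e = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.6 * x[t - 1] + e[t]

acf = sample_acf(x, 5)
pacf = sample_pacf(x, 5)
print("ACF :", np.round(acf, 3))
print("PACF:", np.round(pacf, 3))
```

For an AR(1) process one would expect the ACF to decay geometrically and the PACF to be close to zero beyond lag 1, which is the kind of pattern the identification step relies on.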

2022-10-19: Variational inference for max-stable processes

Speaker: Alexander Engberg, Department of Statistics, Uppsala University. Time and place: 2022-10-19 at 10:15–11:30, Ekonomikum room H317.

Abstract

Max-stable processes provide natural models for spatial extreme values observed at a set of sites. Full likelihood inference for max-stable data is, however, complicated by the form of the likelihood function, as it contains a sum over all partitions of the sites. As such, the number of terms to sum over grows rapidly with the number of sites and quickly becomes prohibitively burdensome to compute.
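To give a sense of how quickly this sum grows: the number of partitions of a set of D sites is the Bell number B_D. The sketch below computes Bell numbers with the Bell triangle; the function name is ours and is used only for illustration.

```python
def bell_number(n):
    """Number of partitions of a set with n elements (Bell number B_n)."""
    row = [1]  # row 0 of the Bell triangle, so B_0 = 1
    for _ in range(n):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left to its left neighbour.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]

# Number of terms in the partition sum for a few site counts.
for d in (5, 10, 20):
    print(d, bell_number(d))
```

Already at 10 sites the sum has 115975 terms, which is why approaches that avoid enumerating the partitions are attractive.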

We propose a variational inference approach to full likelihood inference that circumvents the problematic sum. To achieve this, we first posit a parametric family of partition distributions from which partitions can be sampled. Second, we optimise the parameters of that family in conjunction with the max-stable model to find the partition distribution best supported by data, and to estimate the max-stable model parameters. In a simulation study we show that our method enables full likelihood inference in higher dimensions than previous methods, and is readily applicable to data sets with a large number of observations. Furthermore, our method can easily be used in a Bayesian framework.

2022-10-05: A Bayesian semi-parametric approach for inference on the population partly conditional mean from longitudinal data with dropout

Speaker: Maria Josefsson, Department of Statistics, Uppsala University. Time and place: 2022-10-05 at 10:15–11:30, Ekonomikum room H317.

Abstract

Studies of memory trajectories using longitudinal data often result in highly non-representative samples due to selective study enrolment and attrition. An additional bias comes from practice effects that result in improved or maintained performance due to familiarity with test content or context. These challenges may bias study findings and severely distort the ability to generalise to the target population.

In this study we propose an approach for estimating the finite population mean of a longitudinal outcome conditioning on being alive at a specific time point. We develop a flexible Bayesian semi-parametric predictive estimator for population inference when longitudinal auxiliary information is known for the target population. We evaluate sensitivity of the results to untestable assumptions and further compare our approach to other methods used for population inference in a simulation study. The proposed approach is motivated by 15-year longitudinal data from the Betula longitudinal cohort study. We apply our approach to estimate lifespan trajectories in episodic memory, with the aim of generalising findings to a target population.

2022-09-28: Diffusion Index forecast models with smooth transitions

Speaker: Ingrid Mattsson, Department of Statistics, Uppsala University. Time and place: 2022-09-28 at 10:15–11:30, Ekonomikum room H317.

Abstract

In this paper we extend the Diffusion Index (DI) forecast model, introduced by Stock & Watson (2002), by allowing for smooth transition type nonlinearity (DIST). This is achieved by incorporating a logistic transition function in a factor augmented forecast model, where the factors are estimated using principal components.
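The abstract does not state the model equations, so the following is only a minimal sketch of the two ingredients it names: factors estimated by principal components and a logistic transition function that blends two linear regimes. The two-regime form, the function names, and all parameter values are assumptions for illustration, not the authors' specification.

```python
import numpy as np

def logistic_transition(s, gamma, c):
    """Logistic transition G(s) in (0, 1); gamma sets smoothness, c the location."""
    s = np.asarray(s, dtype=float)
    return 1.0 / (1.0 + np.exp(-gamma * (s - c)))

def pca_factors(X, r):
    """First r principal-component factors of the (T x N) predictor panel X."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :r] * S[:r]

def dist_forecast(F, s, beta_low, beta_high, gamma, c):
    """Blend two linear factor regressions using the transition weight G(s)."""
    G = logistic_transition(s, gamma, c)
    return (F @ beta_low) * (1.0 - G) + (F @ beta_high) * G

# Illustrative data: 200 periods, 10 predictors, 2 factors (all hypothetical).
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
F = pca_factors(X, 2)
s = F[:, 0]  # assumed transition variable: the first factor
y_hat = dist_forecast(F, s,
                      beta_low=np.array([0.5, -0.2]),
                      beta_high=np.array([1.5, 0.3]),
                      gamma=4.0, c=0.0)
print(y_hat[:5])
```

When gamma is close to zero the weight G is nearly constant and the model is effectively linear; as gamma grows the transition approaches an abrupt switch between the two regimes.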

Our main contribution is to theoretically justify bootstrap tests for linearity and parameter constancy, based on the wild bootstrap algorithm for linear factor augmented models, developed by Gonçalves & Perron (2014). A Monte Carlo experiment is performed, and it is shown that the wild bootstrap test has desirable small sample properties even in the most general case, where the test based on the regular OLS estimator has considerable size distortions. An empirical example is further included to demonstrate how the DIST model can outperform its linear counterpart in a forecasting situation.

2022-09-14: Flexible Latent Variable Model Framework for Latent DIF Detection

Speaker: Gabriel Wallin, Umeå University and London School of Economics. Time and place: 2022-09-14 at 10:15–12:00, Ekonomikum room H317.

Abstract

In psychometrics, a field concerned with theory and techniques for psychological and educational measurement, it is standard procedure to assess the presence of differential item functioning (DIF). DIF means that questionnaire/test items function differently for different groups of respondents, after controlling for the latent construct that is intended to be measured. In educational testing, for example, it occurs when groups defined by, e.g., gender or ethnicity have different probabilities of answering a given item correctly, after controlling for the latent ability that the exam is intended to measure. As such, it relates to fairness in educational testing.

When DIF detection is not based on known groups such as gender or ethnicity but on unknown, homogeneous subgroups, the problem is typically referred to as latent DIF detection, which will be the focus of this talk. To that end, I will present a flexible modelling framework that combines a general latent factor model with a latent class model to capture both normal response behaviour for non-DIF items and deviant behaviour for DIF items. In the proposed model, a sparse DIF effect parameter is introduced that is allowed to vary between the latent classes identified by the model.

Our main contributions are twofold. First, unlike previous research on DIF detection, no prior knowledge of DIF-free items is required. Instead, they are identified through an L1 penalty on the DIF effect parameter in the marginal likelihood function of the model. Second, the proposed model considers a multiple latent group setting, whereas current DIF detection methods typically accommodate only two groups (a so-called manifest and a focal group). We propose an EM algorithm for model estimation, where the maximization step is carried out using a quasi-Newton proximal algorithm. Results based on both simulated and empirical data, together with theoretical results, will be presented.
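To illustrate why an L1 penalty can single out DIF-free items, the sketch below shows soft-thresholding, the proximal operator of the L1 norm, which sets small parameter values exactly to zero. This is only the basic proximal step, not the quasi-Newton proximal algorithm from the talk; the function name and the numbers are hypothetical.

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of lam * ||x||_1: shrink towards zero by lam."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

# Hypothetical unpenalised DIF effect estimates for five items.
dif_effects = np.array([3.0, -0.5, 0.2, -2.0, 0.9])
shrunk = soft_threshold(dif_effects, lam=1.0)
print(shrunk)                   # entries with |value| <= 1 become exactly zero
print(np.flatnonzero(shrunk))   # indices of items still flagged as DIF
```

Items whose shrunk effect is exactly zero would be treated as DIF-free, so the penalty performs the item selection as part of estimation.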