Computational Statistics, Institute of Statistics and Mathematical Methods in Economics, Vienna University of Technology, Vienna, Austria
Title: Robust linear and logistic regression for high-dimensional compositional data
Abstract: Compositional data analysis (CoDa) is based on analyzing log-ratios between the variables. It is particularly useful when the measured values themselves are not meaningful, so that the variables must be compared relative to each other. One example is microbiome data, where the relevant information is contained in the relative rather than the absolute abundances of bacterial taxa. Such data are typically high-dimensional, and usually only a small subset of bacteria is related to an external property. Another frequent problem with real data is the presence of outliers, or of observations that are inconsistent for some reason. Such observations can distort the parameter estimates, with the consequence that the model is no longer appropriate for either the regular observations or the outliers. We address both problems and introduce an elastic-net penalized estimator for linear as well as logistic regression with compositional data. The proposed methods are based on the so-called log-contrast model, and robustness is achieved by trimming the objective function. We show the advantages of the resulting methods for simulated and for real microbiome data. R code has been made available at https://github.com/giannamonti/RobZS.
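The log-contrast model and the trimmed, elastic-net penalized objective mentioned in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (their R code is at the URL above). The function names, the 1/(2h) scaling of the loss, and the parameterization of the penalty by `lam` and `alpha` are assumptions made for illustration only. The sketch shows two ideas: coefficients constrained to sum to zero make the predictor invariant to rescaling of the composition, and trimming keeps only the h smallest squared residuals so that gross outliers do not enter the loss.

```python
import numpy as np

def log_contrast_predict(X, beta0, beta):
    """Linear predictor beta0 + log(X) @ beta with the zero-sum constraint on beta."""
    beta = np.asarray(beta, dtype=float)
    if not np.isclose(beta.sum(), 0.0):
        raise ValueError("log-contrast coefficients must sum to zero")
    return beta0 + np.log(np.asarray(X, dtype=float)) @ beta

def trimmed_en_objective(X, y, beta0, beta, h, lam, alpha):
    """Trimmed squared-error loss plus elastic-net penalty (illustrative scaling)."""
    beta = np.asarray(beta, dtype=float)
    resid2 = (np.asarray(y, dtype=float) - log_contrast_predict(X, beta0, beta)) ** 2
    kept = np.sort(resid2)[:h]  # keep only the h smallest squared residuals
    penalty = lam * (alpha * np.abs(beta).sum()
                     + 0.5 * (1.0 - alpha) * (beta ** 2).sum())
    return kept.sum() / (2.0 * h) + penalty
```

Because the coefficients sum to zero, multiplying every part of a composition by the same constant leaves `log_contrast_predict` unchanged, which is why the model depends only on relative abundances.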
Institute of Medical Biometry, Informatics and Epidemiology, Faculty of Medicine, University of Bonn, Bonn, Germany
Title: Competing risks methods for discrete time-to-event data
Abstract: This talk presents an overview of statistical methods for the analysis of discrete failure times with competing events. We describe the most commonly used modeling approaches for this type of data, including discrete versions of the cause-specific hazards model and the subdistribution hazard model. In addition to discussing the characteristics of these methods, we present approaches to nonparametric estimation and model validation. Our literature review suggests that discrete competing-risks analysis has gained substantial interest in the research community and is used regularly in econometrics, biostatistics, and educational research.
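As an illustration of the nonparametric estimation mentioned in this abstract, the following Python sketch computes empirical discrete cause-specific hazards and the resulting cumulative incidence functions. It is a hedged example, not the speakers' method: the function name `discrete_cif` is hypothetical, times are assumed to be integers 1, 2, ..., event code 0 is assumed to denote censoring, and censored subjects are assumed to remain in the risk set at their censoring time.

```python
import numpy as np

def discrete_cif(times, events, n_causes):
    """Empirical cause-specific hazards, overall survival, and cumulative incidence.

    times  : integer event/censoring times in 1..t_max
    events : 0 for censored, j in 1..n_causes for an event of cause j
    """
    times = np.asarray(times)
    events = np.asarray(events)
    t_max = int(times.max())
    hazards = np.zeros((t_max, n_causes))
    for t in range(1, t_max + 1):
        at_risk = np.sum(times >= t)  # subjects still under observation at t
        for j in range(1, n_causes + 1):
            d_tj = np.sum((times == t) & (events == j))
            hazards[t - 1, j - 1] = d_tj / at_risk
    # Overall survival: product over time of (1 - sum of cause-specific hazards)
    surv = np.cumprod(1.0 - hazards.sum(axis=1))
    surv_prev = np.concatenate(([1.0], surv[:-1]))  # S(t-1), with S(0) = 1
    # CIF_j(t) = sum over s <= t of hazard_j(s) * S(s-1)
    cif = np.cumsum(hazards * surv_prev[:, None], axis=0)
    return hazards, surv, cif
```

By construction, the overall survival and the cumulative incidences of all causes add up to one at every time point, which gives a quick sanity check on any estimate.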
Donald B. Rubin
Yau Mathematical Sciences Center, Tsinghua University, Beijing, China
Department of Statistical Science, Fox School of Business, Temple University, Philadelphia, Pennsylvania, USA
Department of Statistics, Harvard University, Cambridge, Massachusetts, USA
Title: Applications of multiple imputation, from calibrating changing occupation/industry coding systems in the US Census, to evaluating the impact of nonresponse in the Slovenian Plebiscite, to understanding placebo effects in pharmaceutical experiments
Abstract: Multiple Imputation (MI) for missing data, first proposed in Rubin (1978), has generated a relatively large corpus of theoretical justification and investigation, with some of the earlier work summarized in Rubin (1986, 2006). Despite this rich collection of theoretical work, the main focus of MI has always been on its utility for applications. In this presentation we review two important past applications of MI and preview an important future one, which uses MI to help disentangle “placebo effects” from “real effects” in double-blind randomized trials of drugs.