Causal inference under directed acyclic graphs

Wang, Yuan (2015) Causal inference under directed acyclic graphs. Masters thesis, Memorial University of Newfoundland.

Available under License - The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.



A directed acyclic graph (DAG) describes the relationships among variables in a causal structure under a priori assumptions. This study focuses on the application of DAGs to causal inference in genetics. In genetic association studies, an observed effect of a genetic marker on a target phenotype can arise from a direct genetic link and from an indirect, non-genetic link through an intermediate phenotype that is influenced by the same marker. We consider methods to estimate and test the direct effect of the genetic marker on a continuous target phenotype that is either completely observed or subject to censoring. Standard regression methods may yield biased estimates of the direct genetic effect. Therefore, Vansteelandt et al. [2009] proposed a two-stage estimation method using the principle of sequential G-estimation for direct effects in linear models (Goetgeluk, Vansteelandt and Goetghebeur, 2009). In the first stage, the effect of the intermediate phenotype is estimated, and an adjusted target phenotype is obtained by removing that effect. In the second stage, the direct effect of the genetic marker on the target phenotype is estimated by regressing the adjusted target phenotype on the genetic marker. The two-stage estimation method works well when outcomes are completely observed. In this study, we show that the extension of the two-stage estimation method proposed by Lipman et al. [2011] for the analysis of a target time-to-event phenotype that is subject to censoring does not work, and we propose a novel three-stage estimation method to estimate and test the direct genetic effect for censored outcomes under the accelerated failure time model.
To address the problem that censored survival outcomes cause in the adjustment procedure, in the first stage we estimate the true values of the underlying observations and adjust the target phenotype for censoring. We then follow the two-stage estimation method proposed by Vansteelandt et al. [2009] to estimate the direct genetic effect. The test statistic proposed by Vansteelandt et al. [2009] cannot be used directly because of the adjustment for censoring made in the first stage; therefore, we propose a Wald-type test statistic to test for the absence of a direct effect of the genetic marker on the target time-to-event phenotype. To account for the variability introduced by estimation in the earlier stages, we propose a nonparametric bootstrap procedure to estimate the standard error of the three-stage estimate of the direct effect. We show that the new three-stage estimation method and the Wald-type test statistic can be used effectively to make inferences about the direct genetic effect for both uncensored and censored outcomes. Finally, we address the practical situation in which the causal association among phenotypes is not consistent with investigators’ assumptions, so that the models used to make inferences about the direct genetic effect are misspecified. We show that, when the causal relationship among phenotypes is unknown in a genetic association study, adopting a model without sufficient evidence that it is correct leads to wrong conclusions if that model is wrong.
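
As an illustration only, the two-stage idea described in the abstract can be sketched for a fully observed (uncensored) outcome, together with a nonparametric bootstrap standard error and a Wald-type statistic. This is not the thesis code: the function names, the simulated data, the measured confounder `l`, and all coefficient values are assumptions made for the example, and the censoring-adjustment stage of the three-stage method is omitted because the abstract does not specify how it is carried out.

```python
import math
import numpy as np


def two_stage_direct_effect(x, k, l, y):
    """Two-stage estimate of the effect of x on y that does not pass through k.

    x: genetic marker, k: intermediate phenotype,
    l: measured confounder of k and y (hypothetical), y: target phenotype.
    """
    n = len(y)
    # Stage 1: regress y on (1, x, k, l) and keep the coefficient of k.
    design1 = np.column_stack([np.ones(n), x, k, l])
    coef1, *_ = np.linalg.lstsq(design1, y, rcond=None)
    gamma_hat = coef1[2]
    # Adjusted target phenotype: remove the estimated effect of k.
    y_adj = y - gamma_hat * k
    # Stage 2: regress the adjusted phenotype on (1, x); the slope is the estimate.
    design2 = np.column_stack([np.ones(n), x])
    coef2, *_ = np.linalg.lstsq(design2, y_adj, rcond=None)
    return coef2[1]


def bootstrap_se(x, k, l, y, n_boot=500, seed=0):
    """Nonparametric bootstrap: resample subjects and repeat both stages."""
    rng = np.random.default_rng(seed)
    n = len(y)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        estimates[b] = two_stage_direct_effect(x[idx], k[idx], l[idx], y[idx])
    return estimates.std(ddof=1)


def wald_test(estimate, se):
    """Wald-type z statistic and two-sided p-value under a normal reference."""
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Simulated data (all values are assumptions chosen for the illustration).
rng = np.random.default_rng(2015)
n = 2000
x = rng.binomial(2, 0.3, size=n).astype(float)        # marker coded 0/1/2
l = rng.normal(size=n)                                 # measured confounder
k = 0.5 * x + 0.8 * l + rng.normal(size=n)             # intermediate phenotype
y = 0.3 * x + 0.7 * k + 0.6 * l + rng.normal(size=n)   # true direct effect 0.3

beta_hat = two_stage_direct_effect(x, k, l, y)
se_hat = bootstrap_se(x, k, l, y)
z_stat, p_value = wald_test(beta_hat, se_hat)
print(f"direct effect: {beta_hat:.3f}, bootstrap SE: {se_hat:.3f}, "
      f"z: {z_stat:.2f}, p: {p_value:.3g}")
```

Resampling whole subjects and re-running both stages in each bootstrap replicate lets the standard error reflect the uncertainty in the first-stage estimate as well as the second, which is the motivation the abstract gives for using the bootstrap.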

Item Type: Thesis (Masters)
Item ID: 11604
Additional Information: Includes bibliographical references (pages 120-123).
Keywords: Causal Inference, Directed Acyclic Graph, Survival Analysis, Model Misspecification, Nonparametric Bootstrap
Department(s): Science, Faculty of > Mathematics and Statistics
Date: September 2015
Date Type: Submission
Library of Congress Subject Heading: Acyclic models; Genetic markers--Mathematical models; Regression analysis; Estimation theory; Phenotype--Mathematical models
