Abstract
Genetical genomics data present promising opportunities for integrating gene expression and genotype information. Lin et al. (2015) proposed an instrumental variables (IV) regression framework for selecting important genes from high-dimensional genetical genomics data. IV regression addresses the endogeneity caused by potential correlations between gene expressions and error terms, thereby improving gene selection performance. Because genes function in networks to carry out joint tasks, incorporating network structure into the regression model can further enhance gene selection. In this presentation, I will introduce a graph-constrained penalized IV regression framework for high-dimensional genetical genomics data that improves gene selection by incorporating gene network structures. We propose a two-step estimation procedure based on a network-constrained regularization method and establish its selection consistency. Furthermore, because gene expressions are time-dependent, we extend the framework to a varying-coefficient IV regression in which the effects of gene expressions are allowed to vary over time. We demonstrate the utility of our method through simulations and real data analysis.
This is joint work with Bin Gao, Jialin Qu, Xu Liu, and Hongzhe Li.
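For readers unfamiliar with the setup, the two-step procedure mentioned in the abstract can be sketched in generic form. The notation below ($y$, $X$, $Z$, $\Gamma$, $L$, $\lambda_1$, $\lambda_2$, $P_\mu$) and the choice of an $\ell_1$ penalty combined with a graph-Laplacian penalty are assumptions made for illustration, following the usual two-stage IV and network-constrained regularization setups; the estimators and penalties presented in the talk may differ.

```latex
% Assumed notation (not taken from the abstract):
%   y : n-vector of phenotypes
%   X : n x p matrix of gene expressions (endogenous covariates)
%   Z : n x q matrix of genotypes (instruments)
%   L : p x p Laplacian of the gene network graph
%   P_mu : a generic sparsity penalty with tuning parameter mu
\begin{align*}
  \text{Model:}\quad
    & y = X\beta + \varepsilon, \qquad X = Z\Gamma + E,
      \qquad \varepsilon \text{ possibly correlated with } E, \\
  \text{Step 1:}\quad
    & \hat{\Gamma} = \arg\min_{\Gamma}\;
      \frac{1}{2n}\,\lVert X - Z\Gamma \rVert_F^2 + P_{\mu}(\Gamma),
      \qquad \hat{X} = Z\hat{\Gamma}, \\
  \text{Step 2:}\quad
    & \hat{\beta} = \arg\min_{\beta}\;
      \frac{1}{2n}\,\lVert y - \hat{X}\beta \rVert_2^2
      + \lambda_1 \lVert \beta \rVert_1
      + \frac{\lambda_2}{2}\,\beta^{\top} L \beta .
\end{align*}
```

In this sketch, replacing $X$ by its genotype-predicted version $\hat{X}$ removes the part of the expression that is correlated with the error term, while the quadratic term $\beta^{\top} L \beta$ encourages genes linked in the network to have similar coefficients.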
Yuehua Cui 崔跃华
Michigan State University
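Yuehua Cui is a professor in the Department of Statistics and Probability at Michigan State University, where he also serves as Director of Graduate Studies. He is a Fellow of the American Statistical Association (ASA) and an Elected Member of the International Statistical Institute (ISI). He serves as a grant reviewer for the US National Science Foundation and the National Natural Science Foundation of China, and as an associate editor or editorial board member for several international journals, including BMC Genomic Data, Statistics and Probability Letters, and Computational and Structural Biotechnology Journal. His research focuses on statistical methodology for genetics and genomics. He has published more than 100 papers, and his research has been funded by the US NSF and NIH.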