Title

Variable Selection in the Chlamydia Pneumoniae Lung Infection Study

DOI

10.6339/JDS.2013.11(2).1073

Authors

Yuan Kang; Nedret Billor

Keywords

LASSO; multicollinearity; partial least squares regression; stepwise regression; variable selection

Journal

Journal of Data Science

Volume and Issue / Publication Date

Vol. 11, No. 2 (2013/04/01)

Pages

371 - 387

Language

English

English Abstract

This study analyzes data obtained by a nucleic acid amplification technique (polymerase chain reaction) on 23 transcript variables, collected to investigate the genetic mechanisms regulating chlamydial infection disease through two outcomes of murine C. pneumoniae lung infection: disease, expressed as lung weight increase, and C. pneumoniae load in the lung. A model with a reduced set of transcript variables of interest at the early infection stage is obtained using traditional methods (stepwise regression and partial least squares (PLS) regression) and modern variable selection methods (the least absolute shrinkage and selection operator (LASSO), forward stagewise regression, and least angle regression (LARS)). These methods select the variables of interest for investigating the genetic mechanisms that determine the outcomes of chlamydial lung infection. The transcript variables Tim3, GATA3, Lacf, and Arg2 (X4, X5, X8, and X13) are identified as the main variables of interest for studying the C. pneumoniae disease (lung weight increase) or C. pneumoniae lung load outcomes. Models that include these key variables may provide possible answers to the problem of the molecular mechanisms of chlamydial pathogenesis.

Subject Classification

Basic and Applied Sciences > Information Science
Basic and Applied Sciences > Statistics