Title |
Variable Selection in the Chlamydia Pneumoniae Lung Infection Study |
DOI |
10.6339/JDS.2013.11(2).1073 |
Authors |
Yuan Kang;Nedret Billor |
Keywords |
LASSO ; multicollinearity ; partial least squares regression ; stepwise regression ; variable selection |
Journal |
Journal of Data Science |
Volume and Issue / Publication Date |
Vol. 11, No. 2 (2013/04/01) |
Pages |
371 - 387 |
Language |
English |
English Abstract |
This study analyzes data obtained with nucleic acid amplification techniques (polymerase chain reaction) comprising 23 transcript variables, collected to investigate the genetic mechanisms regulating chlamydial infection through two outcomes of murine C. pneumoniae lung infection: disease, expressed as lung weight increase, and C. pneumoniae load in the lung. A reduced model containing fewer transcript variables of interest at the early infection stage is obtained using traditional variable selection methods (stepwise regression and partial least squares (PLS) regression) and modern ones (the least absolute shrinkage and selection operator (LASSO), forward stagewise regression, and least angle regression (LARS)). These methods select the variables of interest for investigating the genetic mechanisms that determine the outcomes of chlamydial lung infection. The transcript variables Tim3, GATA3, Lacf, and Arg2 (X4, X5, X8, and X13) are identified as the main variables of interest for studying the C. pneumoniae disease (lung weight increase) or C. pneumoniae lung load outcomes. Models that include these key variables may help explain the molecular mechanisms of chlamydial pathogenesis. |
Subject Classification |
Basic and Applied Sciences > Information Science
Basic and Applied Sciences > Statistics |