Title

modelSampler: An R Tool for Variable Selection and Model Exploration in Linear Regression

DOI

10.6339/JDS.2013.11(2).1133

Author

Tanujit Dey

Keywords

FPE analysis ; model exploration ; model uncertainty ; rescaled spike and slab model ; variable selection

Journal

Journal of Data Science

Volume and Issue / Publication Date

Vol. 11, No. 2 (2013/04/01)

Pages

343 - 370

Language

English

Abstract

We have developed a tool for model space exploration and variable selection in linear regression models based on a simple spike and slab model (Dey, 2012). The selected model is the one with the minimum final prediction error (FPE) among all candidate models. This is implemented in the R package modelSampler. However, model selection based on the FPE criterion is questionable, because the criterion can be sensitive to perturbations in the data. The package can be used to assess the stability of the FPE criterion empirically. Stable model selection is achieved with a bootstrap wrapper that calls the primary function of the package several times on bootstrapped data. At the heart of the method is model averaging, which is used both for stable variable selection and to study the behavior of variables over the entire model space, a concept invaluable in high-dimensional situations.
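
The idea of checking how stable an FPE-based selection is under resampling can be illustrated with a minimal Python sketch. It is not the modelSampler package and does not use its rescaled spike and slab sampler: as a simplifying assumption, it replaces that search with an exhaustive scan over all subsets of a few predictors, scores each subset with Akaike's FPE, (RSS/n)(n + k)/(n - k), and repeats the selection on bootstrap resamples to estimate how often each variable is chosen. The function names (`fpe`, `best_subset`, `bootstrap_inclusion`) and the simulated data are hypothetical and exist only for this illustration.

```python
import itertools
import numpy as np


def fpe(X, y, subset):
    """Akaike's final prediction error for an OLS fit on `subset` plus an intercept."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    k = design.shape[1]  # number of fitted coefficients, intercept included
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    return (rss / n) * (n + k) / (n - k)


def best_subset(X, y):
    """Return the tuple of column indices with the smallest FPE (exhaustive search)."""
    p = X.shape[1]
    candidates = (
        s for r in range(p + 1) for s in itertools.combinations(range(p), r)
    )
    return min(candidates, key=lambda s: fpe(X, y, s))


def bootstrap_inclusion(X, y, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each variable is in the FPE-best model."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        for j in best_subset(X[idx], y[idx]):
            counts[j] += 1
    return counts / n_boot


# Simulated data (an assumption for illustration): only the first two
# of four predictors actually enter the true model.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=100)

freq = bootstrap_inclusion(X, y)
print(freq)
```

Variables whose inclusion frequency stays near 1 across resamples are selected stably, whereas frequencies well below 1 indicate that the single FPE-minimizing model depends on the particular sample. Exhaustive search is feasible only for a handful of predictors, which is one reason a stochastic search over the model space is needed in high-dimensional settings.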

Subject Classification Basic and Applied Sciences > Information Science
Basic and Applied Sciences > Statistics