csetraynor/projpred

projpred

An R package for projection predictive variable selection in generalized linear models. It is compatible with rstanarm and brms, but other reference models can also be used.

The method is described in detail in Piironen et al. (2018) and evaluated in comparison to many other methods in Piironen and Vehtari (2017).

The currently supported models (family objects in R) are the Gaussian, binomial and Poisson families. See the quickstart vignette for examples.
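For the Gaussian family, the projection step behind the method can be illustrated in base R. The following is only a minimal sketch, assuming the draw-by-draw projection described for the Gaussian case in Piironen et al. (2018): each posterior draw of the reference model is projected onto a submodel by a least-squares fit to that draw's fitted values, and the projected noise variance grows by the mean squared discrepancy. The function `project_gaussian` and its arguments are hypothetical names chosen for illustration; they are not part of the projpred API, and the sketch ignores intercepts, weights and offsets.

```r
# Sketch of a draw-by-draw Gaussian projection onto a submodel (base R only).
# NOTE: project_gaussian, beta_draws and sigma_draws are hypothetical names
# used for illustration; this is not how projpred is implemented internally.
project_gaussian <- function(X, beta_draws, sigma_draws, sub_idx) {
  # X: n x p design matrix; beta_draws: S x p matrix of posterior draws;
  # sigma_draws: length-S vector of noise sd draws; sub_idx: columns to keep
  X_sub <- X[, sub_idx, drop = FALSE]
  # Reference-model fitted values, one column per posterior draw (n x S)
  mu_ref <- X %*% t(beta_draws)
  # Least-squares fit of the submodel to each draw's fitted values
  beta_proj <- qr.solve(X_sub, mu_ref)
  resid <- mu_ref - X_sub %*% beta_proj
  # The projected variance absorbs what the submodel cannot reproduce
  sigma_proj <- sqrt(sigma_draws^2 + colMeans(resid^2))
  list(beta = t(beta_proj), sigma = sigma_proj)
}

# Toy usage with simulated stand-ins for posterior draws
set.seed(1)
n <- 50; p <- 4; S <- 20
X <- matrix(rnorm(n * p), n, p)
beta_draws <- matrix(rnorm(S * p), S, p)
sigma_draws <- rep(1, S)
proj <- project_gaussian(X, beta_draws, sigma_draws, sub_idx = c(1, 2))
full <- project_gaussian(X, beta_draws, sigma_draws, sub_idx = 1:p)
```

Projecting onto all columns returns the original draws unchanged, while dropping columns can only increase the projected noise sd. In practice, `varsel` and `project` handle the search over submodels and the non-Gaussian families.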

Installation

  • Install the latest release from CRAN:
install.packages('projpred')
  • Install the latest development version from GitHub (requires the devtools package):
# install devtools only if it is missing; it does not need to be attached
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github('stan-dev/projpred', build_vignettes = TRUE)

Example

rm(list=ls())
library(projpred)
library(rstanarm)
options(mc.cores = parallel::detectCores())
set.seed(1)

# Gaussian and Binomial examples from the glmnet-package
data('df_gaussian', package = 'projpred')
#data('df_binom', package = 'projpred')

# fit the full model with a sparsifying prior
fit <- stan_glm(y ~ x, family = gaussian(), data = df_gaussian,
                prior = hs(df = 1, global_scale=0.01), iter = 500, seed = 1)
#fit <- stan_glm(y ~ x, family = binomial(), data = df_binom,
#                prior = hs(df = 1, global_scale=0.01), iter = 500, seed = 1)


# perform the variable selection
vs <- varsel(fit)

# print the results
varsel_stats(vs)

# project the parameters onto the submodels with nv = 3 and nv = 5 variables
projs <- project(vs, nv = c(3, 5))

# predict using only the 5 most relevant variables
pred <- proj_linpred(vs, xnew = df_gaussian$x, nv = 5, integrated = TRUE)

# perform cross-validation for the variable selection
cvs <- cv_varsel(fit, cv_method='LOO')

# plot the validation results 
varsel_plot(cvs)

References

Dupuis, J. A. and Robert, C. P. (2003). Variable selection in qualitative models via an entropic explanatory power. Journal of Statistical Planning and Inference, 111(1-2):77–94.

Goutis, C. and Robert, C. P. (1998). Model choice in generalised linear models: a Bayesian approach via Kullback–Leibler projections. Biometrika, 85(1):29–37.

Piironen, J. and Vehtari, A. (2017). Comparison of Bayesian predictive methods for model selection. Statistics and Computing, 27(3):711–735. doi:10.1007/s11222-016-9649-y.

Piironen, J., Paasiniemi, M. and Vehtari, A. (2018). Projective inference in high-dimensional problems: prediction and feature selection. Preprint.
