variable_select
`variable_select` selects observation variables based on the specified
variable selection method.
variable_select(population, variables, sample = NULL, operation = "variance_threshold", ...)
Argument | Description |
---|---|
population | tbl with grouping (metadata) and observation variables. |
variables | character vector specifying observation variables. |
sample | tbl containing the sample that is used by some variable selection methods. Defaults to NULL. |
operation | optional character string specifying the method for variable selection. Defaults to "variance_threshold"; the example uses "correlation_threshold". |
... | arguments passed to the selection operation. |
Variable-selected data of the same class as population.
    # In this example, we use `correlation_threshold` as the operation for
    # variable selection.
    suppressMessages(suppressWarnings(library(magrittr)))

    population <- tibble::tibble(
      x = rnorm(100),
      y = rnorm(100) / 1000
    )

    population %<>% dplyr::mutate(z = x + rnorm(100) / 10)

    sample <- population %>% dplyr::slice(1:30)
    variables <- c("x", "y", "z")
    operation <- "correlation_threshold"

    cor(sample)
    #>             x           y           z
    #> x  1.00000000 -0.08022343  0.99463331
    #> y -0.08022343  1.00000000 -0.06732153
    #> z  0.99463331 -0.06732153  1.00000000

    #> # A tibble: 6 x 3
    #>        x         y      z
    #>    <dbl>     <dbl>  <dbl>
    #> 1  0.380  0.00105   0.328
    #> 2 -0.502 -0.00105  -0.551
    #> 3 -0.333 -0.00126  -0.328
    #> 4 -1.02   0.00324  -0.889
    #> 5 -1.07  -0.000417 -0.842
    #> 6  0.304  0.000298  0.458

    #> NULL

    #> # A tibble: 6 x 2
    #>           y      z
    #>       <dbl>  <dbl>
    #> 1  0.00105   0.328
    #> 2 -0.00105  -0.551
    #> 3 -0.00126  -0.328
    #> 4  0.00324  -0.889
    #> 5 -0.000417 -0.842
    #> 6  0.000298  0.458
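
In the example output, `x` and `z` are almost perfectly correlated in the sample (about 0.99) and only `y` and `z` remain after selection. The sketch below illustrates the general idea of correlation-based selection in base R. It is not the package's implementation: `drop_correlated()` and its `cutoff` argument are hypothetical names introduced here, and the tie-breaking rule (drop the member of the pair with the larger mean correlation to the other variables) is an assumption, so it may keep a different variable than `variable_select` does.

```r
# Hypothetical helper, not part of the package: greedily drop variables whose
# absolute pairwise correlation in `sample` exceeds `cutoff`.
drop_correlated <- function(population, variables, sample, cutoff = 0.9) {
  # Absolute correlations are estimated on the sample, not the population.
  corr <- abs(stats::cor(as.matrix(sample[, variables, drop = FALSE])))
  diag(corr) <- 0
  dropped <- character(0)

  repeat {
    kept <- setdiff(variables, dropped)
    sub <- corr[kept, kept, drop = FALSE]
    if (length(kept) < 2 || max(sub) <= cutoff) break

    # Locate the most strongly correlated pair among the remaining variables.
    idx <- which(sub == max(sub), arr.ind = TRUE)[1, ]
    pair <- kept[idx]

    # Assumed rule: drop the member with the larger mean correlation overall.
    mean_corr <- rowMeans(sub[pair, , drop = FALSE])
    dropped <- c(dropped, pair[which.max(mean_corr)])
  }

  # Keep every column (metadata included) except the dropped variables.
  population[, setdiff(colnames(population), dropped), drop = FALSE]
}

set.seed(1)
x <- rnorm(100)
population <- data.frame(
  x = x,
  y = rnorm(100) / 1000,
  z = x + rnorm(100) / 10
)
sample <- population[1:30, ]

selected <- drop_correlated(population, c("x", "y", "z"), sample)
colnames(selected)
```

As in the example above, the correlations are computed on `sample` while the columns are removed from `population`, so the result has the same rows and class as the input.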