stratify
stratify applies an operation separately within each stratum of a population.
stratify(population, sample, reducer, strata, ...)
Argument | Description |
---|---|
population | tbl with grouping (metadata) and observation variables. |
sample | tbl with the same structure as population. |
reducer | operation to be applied in a stratified manner. |
strata | optional character vector specifying grouping variables for stratification. |
... | arguments passed to the operation. |
Returns population with potentially extra columns.
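To make "applied in a stratified manner" concrete, here is a minimal base-R sketch of the general split-apply-combine idea. It is not cytominer's implementation: `stratify_sketch` and the reducer `center_on_sample` are hypothetical names introduced only for illustration, and the sketch assumes a reducer that accepts `population` and `sample` arguments and returns a data frame.

```r
# Hypothetical helper, NOT part of cytominer: a base-R sketch of applying a
# reducer separately within each stratum.
stratify_sketch <- function(population, sample, reducer, strata = NULL, ...) {
  # Without strata, apply the reducer once to the whole population.
  if (is.null(strata)) {
    return(reducer(population = population, sample = sample, ...))
  }

  # Build one key per row from the strata columns.
  key_of <- function(df) do.call(paste, c(df[strata], sep = "|"))
  population_key <- key_of(population)
  sample_key <- key_of(sample)

  # Apply the reducer stratum by stratum, pairing each population subset
  # with the sample rows from the same stratum.
  pieces <- lapply(unique(population_key), function(k) {
    reducer(
      population = population[population_key == k, , drop = FALSE],
      sample = sample[sample_key == k, , drop = FALSE],
      ...
    )
  })

  # Stack the per-stratum results back into one table.
  do.call(rbind, pieces)
}

# Hypothetical reducer: center a variable on the mean of the sample.
center_on_sample <- function(population, sample, variable) {
  population[[paste0(variable, "_centered")]] <-
    population[[variable]] - mean(sample[[variable]])
  population
}

population <- data.frame(
  Metadata_group = c("a", "a", "b", "b"),
  Metadata_type = c("control", "trt", "control", "trt"),
  AreaShape_Area = c(1, 3, 10, 14)
)
sample <- population[population$Metadata_type == "control", ]

result <- stratify_sketch(
  population = population,
  sample = sample,
  reducer = center_on_sample,
  strata = "Metadata_group",
  variable = "AreaShape_Area"
)
print(result)
```

In this sketch each group is centered on its own control mean, so the reducer never sees rows from another stratum. A stratum with no matching sample rows would yield `NaN` here; how the real function handles that case is not covered by this sketch.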
    suppressMessages(suppressWarnings(library(magrittr)))

    population <- tibble::tibble(
      Metadata_group = sample(c("a", "b"), 100, replace = TRUE),
      Metadata_type = sample(c("control", "trt"), 100, replace = TRUE),
      AreaShape_Area = c(rnorm(98), 20, 30),
      AreaShape_Eccentricity = rnorm(100)
    )

    variables <- c("AreaShape_Area", "AreaShape_Eccentricity")
    strata <- c("Metadata_group")

    sample <-
      population %>%
      dplyr::filter(Metadata_type == "control")

    population_marked <-
      cytominer::stratify(
        reducer = cytominer::mark_outlier_rows,
        method = "svd+iqr",
        population = population,
        variables = variables,
        sample = sample,
        strata = strata
      )

    population_marked %>%
      dplyr::group_by(is_outlier) %>%
      dplyr::sample_n(3)
    #> # A tibble: 6 x 5
    #> # Groups:   is_outlier [2]
    #>   Metadata_group Metadata_type AreaShape_Area AreaShape_Eccentricity is_outlier
    #>   <chr>          <chr>                  <dbl>                  <dbl> <lgl>
    #> 1 a              control                0.244                 2.04   FALSE
    #> 2 b              control               -0.476                -1.05   FALSE
    #> 3 b              control                2.20                  0.0210 FALSE
    #> 4 b              control               20                     0.651  TRUE
    #> 5 b              trt                    0.601                 2.00   TRUE
    #> 6 a              trt                    1.96                  1.34   TRUE