mark_outlier_rows marks outlier rows: it adds a logical column that flags outliers and leaves all rows in place.

mark_outlier_rows(
  population,
  variables,
  sample,
  method = "svd+iqr",
  outlier_col = "is_outlier",
  ...
)

Arguments

population

tbl with grouping (metadata) and observation variables.

variables

character vector specifying observation variables.

sample

tbl containing the sample used by outlier removal methods to estimate parameters. sample has the same structure as population. Typically, sample corresponds to the controls in the experiment.

method

optional character string specifying the method for outlier removal. Currently the only option is "svd+iqr" (the default).

outlier_col

optional character string specifying the name for the column that will indicate outliers (in the output). Default "is_outlier".

...

arguments passed to outlier removal method.

Value

population with an extra logical column, named by outlier_col (default "is_outlier"), that flags outlier rows.

Examples

suppressMessages(suppressWarnings(library(magrittr)))
population <- tibble::tibble(
  Metadata_group = sample(c("a", "b"), 100, replace = TRUE),
  Metadata_type = sample(c("control", "trt"), 100, replace = TRUE),
  AreaShape_Area = c(rnorm(98), 20, 30),
  AreaShape_Eccentricity = rnorm(100)
)
variables <- c("AreaShape_Area", "AreaShape_Eccentricity")
sample <- population %>% dplyr::filter(Metadata_type == "control")
population_marked <- cytominer::mark_outlier_rows(
  population, variables, sample,
  method = "svd+iqr"
)
population_marked %>%
  dplyr::group_by(is_outlier) %>%
  dplyr::sample_n(3)
#> # A tibble: 6 x 5
#> # Groups:   is_outlier [2]
#>   Metadata_group Metadata_type AreaShape_Area AreaShape_Eccentricity is_outlier
#>   <chr>          <chr>                  <dbl>                  <dbl> <lgl>
#> 1 a              trt                   0.390                -0.00636 FALSE
#> 2 b              control               0.0767               -0.0596  FALSE
#> 3 a              trt                   1.52                  0.695   FALSE
#> 4 a              control              -2.33                 -1.91    TRUE
#> 5 b              control              -1.56                  1.58    TRUE
#> 6 a              trt                   2.30                  1.31    TRUE
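
Because the function only adds a flag column, removing the flagged rows is a separate step. The following is a minimal sketch of that follow-up using dplyr. The small tibble is a hand-built, hypothetical stand-in for the function's output (it is not produced by cytominer), and the default column name is_outlier is assumed.

```r
library(dplyr)

# Hypothetical stand-in for the output of mark_outlier_rows():
# the input columns plus a logical `is_outlier` column (values invented).
population_marked <- tibble::tibble(
  Metadata_group = c("a", "b", "a", "b"),
  AreaShape_Area = c(0.39, 0.08, 20, 30),
  is_outlier     = c(FALSE, FALSE, TRUE, TRUE)
)

# Count flagged and unflagged rows.
outlier_counts <- population_marked %>%
  count(is_outlier)

# Keep only the rows that were not flagged, then drop the flag column.
population_clean <- population_marked %>%
  filter(!is_outlier) %>%
  select(-is_outlier)
```

If a different name was passed as outlier_col, that name would replace is_outlier in the count(), filter() and select() calls.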