sim_filter_all_same_keep_some filters a melted similarity matrix to keep pairs with the same values in specific columns, then keeps only those pairs whose right index matches a given filter.

sim_filter_all_same_keep_some(
  sim_df,
  row_metadata,
  all_same_cols,
  filter_keep_right,
  annotation_cols = NULL,
  drop_reference = TRUE,
  sim_cols = c("id1", "id2", "sim")
)

Arguments

sim_df

data.frame with melted similarity matrix.

row_metadata

data.frame with row metadata.

all_same_cols

character vector specifying the metadata columns in which both rows of a pair must have the same values.

filter_keep_right

data.frame of metadata specifying which rows to keep on the right index.

annotation_cols

optional character vector specifying which columns from row_metadata to annotate the left index of the filtered sim_df with.

drop_reference

optional boolean specifying whether to also drop pairs whose left index matches filter_keep_right.

sim_cols

optional character vector specifying the minimal set of columns for a similarity matrix.
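
To make the shape of a "melted similarity matrix" concrete, here is a minimal base R sketch that turns a small square similarity matrix into the three default sim_cols columns. The toy matrix and the variable names m, s and melted are assumptions for illustration; this does not show how matric itself builds sim_df, and whether matric stores diagonal entries or uses this row order is not stated here.

```r
# Toy data (hypothetical): 3 rows (observations) by 4 features
m <- matrix(c(1, 2, 3, 4,
              2, 1, 4, 3,
              4, 3, 2, 1), nrow = 3, byrow = TRUE)

# Square similarity matrix: Pearson correlation between rows
s <- cor(t(m))

# Melt to long form: one line per (id1, id2) pair with its similarity
melted <- data.frame(
  id1 = as.vector(row(s)),  # row index of each matrix cell
  id2 = as.vector(col(s)),  # column index of each matrix cell
  sim = as.vector(s)        # similarity stored in that cell
)
```

Each row of melted describes one pair, so a 3 x 3 matrix yields 9 rows here, including the self-pairs on the diagonal.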

Value

Filtered sim_df as a data.frame, where only pairs with the same values in all_same_cols columns are kept, with further filtering using filter_keep_right. Rows are annotated based on the first index, if specified.
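
The filtering described above can be sketched in base R on toy data. This is one reading of the documented behaviour, not the package's implementation: the helper names filter_pairs_sketch and key, the toy values, and the assumption that row_metadata carries an id column matching id1 and id2 are all hypothetical.

```r
# Build one string key per row from the chosen columns (hypothetical helper)
key <- function(df, cols) {
  do.call(paste, c(unname(as.list(df[cols])), sep = "\r"))
}

# Sketch of the documented filter, assuming row_metadata has an `id` column
filter_pairs_sketch <- function(sim_df, row_metadata, all_same_cols,
                                filter_keep_right, drop_reference = TRUE) {
  # Metadata for the left (id1) and right (id2) index of every pair
  left <- row_metadata[match(sim_df$id1, row_metadata$id), ]
  right <- row_metadata[match(sim_df$id2, row_metadata$id), ]

  # Pairs that agree in every column of all_same_cols
  same <- key(left, all_same_cols) == key(right, all_same_cols)

  # Pairs whose right / left index matches a row of filter_keep_right
  keep_cols <- names(filter_keep_right)
  keep_keys <- key(filter_keep_right, keep_cols)
  right_ok <- key(right, keep_cols) %in% keep_keys
  left_is_ref <- key(left, keep_cols) %in% keep_keys

  # Optionally drop pairs whose left index also matches the filter
  sim_df[same & right_ok & !(drop_reference & left_is_ref), ]
}

# Toy inputs (hypothetical values)
row_metadata <- data.frame(
  id = 1:4,
  Metadata_group = c("a", "a", "a", "b"),
  Metadata_type = c("x", "y", "x", "x"),
  stringsAsFactors = FALSE
)
sim_df <- subset(expand.grid(id1 = 1:4, id2 = 1:4), id1 != id2)
sim_df$sim <- 0.5  # placeholder similarity
filter_keep_right <- data.frame(
  Metadata_group = "a", Metadata_type = "x", stringsAsFactors = FALSE
)

kept_all <- filter_pairs_sketch(sim_df, row_metadata, "Metadata_group",
                                filter_keep_right, drop_reference = FALSE)
kept_no_ref <- filter_pairs_sketch(sim_df, row_metadata, "Metadata_group",
                                   filter_keep_right, drop_reference = TRUE)
```

With these toy inputs, kept_all retains same-group pairs whose right index is row 1 or row 3, and kept_no_ref additionally removes the pairs whose left index is one of those two rows.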

Examples

suppressMessages(suppressWarnings(library(magrittr)))
n <- 20
population <- tibble::tibble(
  Metadata_group = sample(c("a", "b"), n, replace = TRUE),
  Metadata_type = sample(c("x", "y"), n, replace = TRUE),
  x = rnorm(n),
  y = x + rnorm(n) / 100,
  z = y + rnorm(n) / 1000
)
annotation_cols <- c("Metadata_group", "Metadata_type")
sim_df <- matric::sim_calculate(population, method = "pearson")
row_metadata <- attr(sim_df, "row_metadata")
sim_df <- matric::sim_annotate(sim_df, row_metadata, annotation_cols)
all_same_cols <- c("Metadata_group")
filter_keep_right <-
  tibble::tibble(Metadata_group = "a", Metadata_type = "x")
drop_reference <- FALSE
matric::sim_filter_all_same_keep_some(
  sim_df,
  row_metadata,
  all_same_cols,
  filter_keep_right,
  annotation_cols,
  drop_reference
)
#>    id1 id2        sim Metadata_group Metadata_type
#> 1    3   1  0.9583369              a             y
#> 2    4   1 -0.9889111              a             x
#> 3    6   1  0.9362880              a             y
#> 4    7   1  0.9666876              a             y
#> 5    8   1  0.9685588              a             y
#> 6   18   1 -0.9342974              a             x
#> 7   20   1  0.9852392              a             y
#> 8    1   4 -0.9889111              a             x
#> 9    3   4 -0.9052901              a             y
#> 10   6   4 -0.8737445              a             y
#> 11   7   4 -0.9179561              a             y
#> 12   8   4 -0.9208719              a             y
#> 13  18   4  0.8709946              a             x
#> 14  20   4 -0.9997362              a             y
#> 15   1  18 -0.9342974              a             x
#> 16   3  18 -0.9972009              a             y
#> 17   4  18  0.8709946              a             x
#> 18   6  18 -0.9999842              a             y
#> 19   7  18 -0.9944217              a             y
#> 20   8  18 -0.9936124              a             y
#> 21  20  18 -0.8594804              a             y