sim_filter_all_same filters a melted similarity matrix to keep only the pairs whose two rows have the same values in the specified columns.

sim_filter_all_same(
  sim_df,
  row_metadata,
  all_same_cols,
  annotation_cols = NULL,
  include_group_tag = FALSE,
  drop_lower = FALSE,
  sim_cols = c("id1", "id2", "sim")
)

Arguments

sim_df

data.frame with melted similarity matrix.

row_metadata

data.frame with row metadata.

all_same_cols

character vector specifying the metadata columns whose values must be the same for a pair to be kept.

annotation_cols

optional character vector specifying which columns of row_metadata are used to annotate the left (first) index of the filtered sim_df.

include_group_tag

optional boolean specifying whether to include an identifier for the pairs using the values in the all_same_cols columns.

drop_lower

optional boolean specifying whether to drop the pairs where the first index is smaller than the second index. This is equivalent to dropping the lower triangle of sim_df.

sim_cols

optional character vector specifying the minimal set of columns for a similarity matrix.

Value

Filtered sim_df as a data.frame, where only pairs with the same values in all_same_cols columns are kept. If annotation_cols is specified, each row is annotated with the metadata of its first index.
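
The filtering described above can be sketched in base R on a toy input. This is only an illustration of the idea, not matric's implementation: the helper filter_all_same_sketch is hypothetical, and the sketch assumes that row_metadata carries an id column matching the indices in id1 and id2.

```r
# Toy melted similarity matrix: one row per ordered pair (id1, id2).
sim_df <- data.frame(
  id1 = c(1, 1, 2, 2, 3, 3),
  id2 = c(2, 3, 1, 3, 1, 2),
  sim = c(0.9, 0.1, 0.9, 0.2, 0.1, 0.2)
)

# Toy row metadata. Assumption: `id` matches the indices in id1 and id2.
row_metadata <- data.frame(
  id = c(1, 2, 3),
  Metadata_group = c("a", "a", "b"),
  stringsAsFactors = FALSE
)

all_same_cols <- c("Metadata_group")

# Hypothetical helper (not part of matric): keep the pairs whose two rows
# agree on every column in `cols`, then annotate them by the first index.
filter_all_same_sketch <- function(sim_df, row_metadata, cols) {
  # Look up the metadata of the first and second index of every pair.
  left <- row_metadata[match(sim_df$id1, row_metadata$id), cols, drop = FALSE]
  right <- row_metadata[match(sim_df$id2, row_metadata$id), cols, drop = FALSE]

  # A pair is kept only if it matches on all requested columns.
  keep <- rep(TRUE, nrow(sim_df))
  for (col in cols) {
    keep <- keep & (left[[col]] == right[[col]])
  }

  kept_pairs <- sim_df[keep, , drop = FALSE]
  annotation <- left[keep, , drop = FALSE]
  rownames(kept_pairs) <- NULL
  rownames(annotation) <- NULL
  cbind(kept_pairs, annotation)
}

filtered <- filter_all_same_sketch(sim_df, row_metadata, all_same_cols)
print(filtered)
```

In this toy input rows 1 and 2 share Metadata_group "a" while row 3 is alone in "b", so only the pairs (1, 2) and (2, 1) remain.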

Examples

suppressMessages(suppressWarnings(library(magrittr)))
n <- 5
population <- tibble::tibble(
  Metadata_group = sample(c("a", "b"), n, replace = TRUE),
  Metadata_type = sample(c("x", "y"), n, replace = TRUE),
  x = rnorm(n),
  y = x + rnorm(n) / 100,
  z = y + rnorm(n) / 1000
)
annotation_cols <- c("Metadata_group", "Metadata_type")
sim_df <- matric::sim_calculate(population, method = "pearson")
row_metadata <- attr(sim_df, "row_metadata")
sim_df <- matric::sim_annotate(sim_df, row_metadata, annotation_cols)
all_same_cols <- c("Metadata_group")
include_group_tag <- TRUE
drop_lower <- FALSE
matric::sim_filter_all_same(
  sim_df,
  row_metadata,
  all_same_cols,
  annotation_cols,
  include_group_tag,
  drop_lower
)
#>    id1 id2        sim Metadata_group Metadata_type
#> 1    3   1  0.9993494              b             y
#> 2    4   1 -0.9235077              b             y
#> 3    5   1 -0.7185683              b             x
#> 4    1   3  0.9993494              b             x
#> 5    4   3 -0.9367415              b             y
#> 6    5   3 -0.6930175              b             x
#> 7    1   4 -0.9235077              b             x
#> 8    3   4 -0.9367415              b             y
#> 9    5   4  0.3968401              b             x
#> 10   1   5 -0.7185683              b             x
#> 11   3   5 -0.6930175              b             y
#> 12   4   5  0.3968401              b             y