sim_calculate_ij calculates similarities between the pairs of rows of population specified in index.

sim_calculate_ij(
  population,
  index,
  method = NULL,
  annotation_prefix = "Metadata_",
  ...
)

Arguments

population

data.frame with annotations (a.k.a. metadata) and observation variables.

index

data.frame with at least two columns id1 and id2 specifying rows of population, and an optional attribute metric_metadata$method, which is a character string specifying the similarity method. Preserve the diagonal entries when constructing index.

method

optional character string specifying the method used to calculate similarity. method, if specified, overrides attr(index, "metric_metadata")$method.

annotation_prefix

optional character string specifying prefix for annotation columns.

...

arguments passed downstream for parallel processing.
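
The precedence between the method argument and the metric_metadata attribute of index can be pictured with a small sketch. The helper resolve_method below is hypothetical and is not part of matric; it only illustrates the documented rule that an explicit method wins over the attribute.

```r
# Hypothetical helper (not a matric function): pick the similarity method.
resolve_method <- function(index, method = NULL) {
  # An explicitly supplied method takes precedence.
  if (!is.null(method)) {
    return(method)
  }
  # Otherwise fall back to the attribute stored on index (NULL if absent).
  attr(index, "metric_metadata")$method
}

index <- expand.grid(id1 = 1:2, id2 = 1:2, KEEP.OUT.ATTRS = FALSE)
attr(index, "metric_metadata") <- list(method = "cosine")

from_attribute <- resolve_method(index)
from_argument <- resolve_method(index, method = "pearson")
```

Here from_attribute comes from the attribute and from_argument from the explicit argument; the string "pearson" is only a placeholder value for illustration.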

Value

data.frame which is the same as index, but with a new column sim containing similarities, and with the diagonals filtered out.

Examples

suppressMessages(suppressWarnings(library(magrittr)))
population <- tibble::tribble(
  ~Metadata_group, ~x, ~y, ~z,
  1, -1, 5, -5,
  2, 0, 6, -4,
  3, 7, -4, 3,
  4, 14, -8, 6
)

n <- nrow(population)

index <-
  expand.grid(id1 = seq(n), id2 = seq(n), KEEP.OUT.ATTRS = FALSE)

matric::sim_calculate_ij(population, index, method = "cosine")
#>    id1 id2        sim
#> 1    2   1  0.9709195
#> 2    3   1 -0.6836729
#> 3    4   1 -0.6836729
#> 4    1   2  0.9709195
#> 5    3   2 -0.5803433
#> 6    4   2 -0.5803433
#> 7    1   3 -0.6836729
#> 8    2   3 -0.5803433
#> 9    4   3  1.0000000
#> 10   1   4 -0.6836729
#> 11   2   4 -0.5803433
#> 12   3   4  1.0000000

attr(index, "metric_metadata") <- list(method = "cosine")

matric::sim_calculate_ij(population, index)
#>    id1 id2        sim
#> 1    2   1  0.9709195
#> 2    3   1 -0.6836729
#> 3    4   1 -0.6836729
#> 4    1   2  0.9709195
#> 5    3   2 -0.5803433
#> 6    4   2 -0.5803433
#> 7    1   3 -0.6836729
#> 8    2   3 -0.5803433
#> 9    4   3  1.0000000
#> 10   1   4 -0.6836729
#> 11   2   4 -0.5803433
#> 12   3   4  1.0000000
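
As a cross-check, the cosine values shown in the examples can be reproduced in base R. This is a minimal sketch under the assumption that columns starting with annotation_prefix are excluded and that rows paired with themselves are dropped; cosine_ij is a hypothetical helper, not the implementation used by matric.

```r
# Hypothetical helper (not a matric function): cosine similarity for row pairs.
cosine_ij <- function(population, index, annotation_prefix = "Metadata_") {
  # Keep only observation variables (columns without the annotation prefix).
  is_annotation <- startsWith(colnames(population), annotation_prefix)
  x <- as.matrix(population[, !is_annotation, drop = FALSE])
  # Scale each row to unit length so a dot product equals the cosine.
  x <- x / sqrt(rowSums(x^2))
  # Drop the diagonal, i.e. pairs where a row is compared with itself.
  index <- index[index$id1 != index$id2, , drop = FALSE]
  # Row-wise dot products between the two sides of each pair.
  index$sim <- rowSums(
    x[index$id1, , drop = FALSE] * x[index$id2, , drop = FALSE]
  )
  rownames(index) <- NULL
  index
}

population <- data.frame(
  Metadata_group = 1:4,
  x = c(-1, 0, 7, 14),
  y = c(5, 6, -4, -8),
  z = c(-5, -4, 3, 6)
)
index <- expand.grid(id1 = 1:4, id2 = 1:4, KEEP.OUT.ATTRS = FALSE)

result <- cosine_ij(population, index)
```

Rows 3 and 4 of population are scalar multiples of each other, which is why their similarity is exactly 1.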