
[Stable] Computes the sharing between each reference group of interest, at each time point, and a selection of groups of interest. This makes it possible to observe the percentage of integration sites (IS) shared between the reference and each selected group, and to identify the time point at which a given IS was first observed.
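To make the idea of "first observed" concrete, here is a minimal base R sketch on made-up data. It is not how `iss_source()` is implemented; the toy table `obs` and all of its values are assumptions for illustration only.

```r
# Toy observations: each row is one integration site (IS) seen at a time point.
# All values are invented for illustration.
obs <- data.frame(
    chr = c("1", "1", "2", "2", "2"),
    integration_locus = c(1000, 1000, 5000, 5000, 7000),
    strand = c("+", "+", "-", "-", "+"),
    TimePoint = c(180, 30, 60, 360, 90)
)

# For each IS (chr, locus, strand), keep the earliest time point it appears in.
first_seen <- aggregate(
    TimePoint ~ chr + integration_locus + strand,
    data = obs,
    FUN = min
)
first_seen
```

Each IS is identified here by the `chr`, `integration_locus` and `strand` columns, matching the columns visible in the example output of this page.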


iss_source(
  reference,
  selection,
  ref_group_key = c("SubjectID", "CellMarker", "Tissue", "TimePoint"),
  selection_group_key = c("SubjectID", "CellMarker", "Tissue", "TimePoint"),
  timepoint_column = "TimePoint",
  by_subject = TRUE,
  subject_column = "SubjectID"
)



reference
A data frame containing one or more reference groups. Groups are identified by ref_group_key.

selection
A data frame containing one or more groups of interest to compare against the reference. Groups are identified by selection_group_key.

ref_group_key
Character vector of column names that identify a unique group in the reference data frame.

selection_group_key
Character vector of column names that identify a unique group in the selection data frame.

timepoint_column
Name of the column holding time point information.

by_subject
Should calculations be performed for each subject separately?

subject_column
Name of the column holding subject information. Relevant only if by_subject = TRUE.
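As a rough illustration of how a group key partitions a table, the base R sketch below counts the distinct groups that a key defines. The toy data frame `df` and its values are invented, and the sketch says nothing about which key combinations `iss_source()` itself accepts.

```r
# Toy aggregated table; values are invented for illustration.
df <- data.frame(
    SubjectID = c("PT001", "PT001", "PT001", "PT002"),
    CellMarker = c("MNC", "CD34", "MNC", "MNC"),
    Tissue = c("BM", "BM", "BM", "BM"),
    TimePoint = c(30, 30, 180, 30),
    chr = c("1", "2", "1", "1")
)

# The default key from the usage section.
key <- c("SubjectID", "CellMarker", "Tissue", "TimePoint")

# Each distinct combination of the key columns is one group.
groups <- unique(df[, key])
nrow(groups)

# A coarser key (here without CellMarker) merges groups that differ
# only in the dropped column.
coarse_key <- c("SubjectID", "Tissue", "TimePoint")
coarse_groups <- unique(df[, coarse_key])
nrow(coarse_groups)
```

With the full key, the two PT001 rows at time point 30 fall into separate groups because their cell markers differ. With the coarser key they collapse into one.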


A list of data frames (one per subject, as in the example output below, where by_subject = TRUE) or a single data frame.
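Because the return value can take two shapes, downstream code may want to normalise it. The sketch below is one possible way to do that in base R. The helper `to_single_df()` is hypothetical (it is not part of ISAnalytics), and `res` is an invented stand-in for a per-subject result.

```r
# Stand-in for a per-subject result; shapes and values are invented.
res <- list(
    PT001 = data.frame(g2_TimePoint = c(30, 180), sharing_perc = c(10, 25)),
    PT002 = data.frame(g2_TimePoint = 30, sharing_perc = 40)
)

# Hypothetical helper: return a single data frame whichever shape is given.
to_single_df <- function(x) {
    if (is.data.frame(x)) {
        return(x)
    }
    # Prepend the list name (the subject) as a column, then stack the rows.
    rows <- Map(function(df, id) cbind(subject = id, df), x, names(x))
    out <- do.call(rbind, unname(rows))
    rownames(out) <- NULL
    out
}

combined <- to_single_df(res)
combined
```

A single table with an explicit subject column is often easier to filter or facet on when plotting several subjects together.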


data("integration_matrices", package = "ISAnalytics")
data("association_file", package = "ISAnalytics")
aggreg <- aggregate_values_by_key(
    x = integration_matrices,
    association_file = association_file,
    value_cols = c("seqCount", "fragmentEstimate")
)
df1 <- aggreg |>
    dplyr::filter(.data$Tissue == "BM")
df2 <- aggreg |>
    dplyr::filter(.data$Tissue == "PB")
source <- iss_source(df1, df2)
source
#> $PT001
#> # A tibble: 161 × 14
#>    g1       g1_SubjectID g1_CellMarker g1_Tissue g1_TimePoint g2    g2_SubjectID
#>    <chr>    <chr>        <chr>         <chr>            <int> <chr> <chr>       
#>  1 PT001_M… PT001        MNC           BM                 180 PT00… PT001       
#>  2 PT001_M… PT001        MNC           BM                 180 PT00… PT001       
#>  3 PT001_M… PT001        MNC           BM                 180 PT00… PT001       
#>  4 PT001_M… PT001        MNC           BM                 180 PT00… PT001       
#>  5 PT001_M… PT001        MNC           BM                 180 PT00… PT001       
#>  6 PT001_M… PT001        MNC           BM                 180 PT00… PT001       
#>  7 PT001_M… PT001        MNC           BM                 180 PT00… PT001       
#>  8 PT001_M… PT001        MNC           BM                 180 PT00… PT001       
#>  9 PT001_M… PT001        MNC           BM                 180 PT00… PT001       
#> 10 PT001_M… PT001        MNC           BM                 180 PT00… PT001       
#> # ℹ 151 more rows
#> # ℹ 7 more variables: g2_CellMarker <chr>, g2_Tissue <chr>, g2_TimePoint <int>,
#> #   chr <chr>, integration_locus <dbl>, strand <chr>, sharing_perc <dbl>
#> $PT002
#> # A tibble: 77 × 14
#>    g1       g1_SubjectID g1_CellMarker g1_Tissue g1_TimePoint g2    g2_SubjectID
#>    <chr>    <chr>        <chr>         <chr>            <int> <chr> <chr>       
#>  1 PT002_M… PT002        MNC           BM                 180 PT00… PT002       
#>  2 PT002_M… PT002        MNC           BM                 180 PT00… PT002       
#>  3 PT002_M… PT002        MNC           BM                 180 PT00… PT002       
#>  4 PT002_M… PT002        MNC           BM                 180 PT00… PT002       
#>  5 PT002_M… PT002        MNC           BM                 180 PT00… PT002       
#>  6 PT002_M… PT002        MNC           BM                 180 PT00… PT002       
#>  7 PT002_M… PT002        MNC           BM                 180 PT00… PT002       
#>  8 PT002_M… PT002        MNC           BM                 180 PT00… PT002       
#>  9 PT002_M… PT002        MNC           BM                 180 PT00… PT002       
#> 10 PT002_M… PT002        MNC           BM                 180 PT00… PT002       
#> # ℹ 67 more rows
#> # ℹ 7 more variables: g2_CellMarker <chr>, g2_Tissue <chr>, g2_TimePoint <int>,
#> #   chr <chr>, integration_locus <dbl>, strand <chr>, sharing_perc <dbl>
ggplot2::ggplot(source$PT001, ggplot2::aes(
    x = as.factor(g2_TimePoint),
    y = sharing_perc, fill = g1
)) +
    ggplot2::geom_col() +
    ggplot2::labs(
        x = "Time point", y = "Shared IS % with MNC BM",
        title = "Source of is MNC BM vs MNC PB"
    )