Import a single integration matrix from file

Reads and imports an integration matrix (ideally produced by VISPA2) and converts it to a tidy format.

Usage

import_single_Vispa2Matrix(
  path,
  separator = "\t",
  additional_cols = NULL,
  transformations = NULL,
  sample_names_to = pcr_id_column(),
  values_to = "Value",
  to_exclude = lifecycle::deprecated(),
  keep_excluded = lifecycle::deprecated()
)

Arguments

path: The path to the file on disk
separator: The column delimiter used in the file. Defaults to "\t" (tab)
additional_cols: Either NULL, a named character vector or a named list. See details.
transformations: Either NULL or a named list of purrr-style lambdas where names are column names the function should be applied to.
sample_names_to: Name of the output column holding the sample identifier. Defaults to pcr_id_column()
values_to: Name of the output column holding the quantification values. Defaults to "Value"
to_exclude: Deprecated. Use additional_cols instead
keep_excluded: Deprecated. Use additional_cols instead

Value

A data frame object in tidy format

Details

Additional columns

Additional columns are annotation columns present in the integration matrix to import that are not:

part of the mandatory IS vars (see mandatory_IS_vars())
part of the annotation IS vars (see annotation_IS_vars())
the sample identifier column
the quantification column

When specified, they tell the function how to treat those columns during import. Provide a named character vector (or a named list) whose names are the additional column names and whose values are one of the following:

"char" for character (strings)
"int" for integers
"logi" for logical values (TRUE / FALSE)
"numeric" for numeric values
"factor" for factors
"date" for generic date format - note that functions that need to read and parse files will try to guess the format and parsing may fail
One of the date/datetime formats accepted by lubridate; use ISAnalytics::date_formats() to view the accepted formats
"_" to drop the column

For more details see the "How to use import functions" vignette: vignette("workflow_start", package = "ISAnalytics")
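
As a minimal sketch of the specification described above, the snippet below writes a small toy matrix to a temporary file and imports it with two additional columns. The file contents, the column names note and batch, and the sample names SAMPLE_1 and SAMPLE_2 are hypothetical and exist only for illustration. The sketch also assumes the default mandatory and annotation IS vars (chr, integration_locus, strand, GeneName, GeneStrand).

```r
library(ISAnalytics)

# Hypothetical toy matrix: the default IS columns, two extra annotation
# columns ("note" and "batch") and two sample columns.
toy <- data.frame(
  chr = c("1", "2"),
  integration_locus = c(1000L, 2000L),
  strand = c("+", "-"),
  GeneName = c("GENE1", "GENE2"),
  GeneStrand = c("+", "-"),
  note = c("a", "b"),
  batch = c("x", "y"),
  SAMPLE_1 = c(10L, NA),
  SAMPLE_2 = c(NA, 5L)
)
toy_path <- tempfile(fileext = ".tsv")
write.table(
  toy, toy_path,
  sep = "\t", quote = FALSE, row.names = FALSE, na = ""
)

# Names are the additional column names, values are the desired types:
# "note" is read as character, "batch" is dropped.
extra_cols <- c(note = "char", batch = "_")

toy_matrix <- import_single_Vispa2Matrix(
  toy_path,
  additional_cols = extra_cols
)
```

Under these assumptions, the result should keep note as a character column and should not contain batch.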

Transformations

Lambdas passed through the transformations argument must be true transformations, that is, functions that take a vector as input and return a vector of the same length.

If the transformation list contains column names that are not present in the data frame, they are simply ignored.
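
A short sketch of how a transformation might be supplied, reusing the example matrix shipped with the package. The lambda itself (prefixing the chr column with "chr") is only an illustrative choice, not a recommended default. The entry named not_a_column is hypothetical and is included to show that names absent from the data are ignored.

```r
library(ISAnalytics)

fs_path <- generate_default_folder_structure(type = "correct")
matrix_path <- fs::path(
    fs_path$root, "PJ01", "quantification",
    "POOL01-1", "PJ01_POOL01-1_seqCount_matrix.no0.annotated.tsv.gz"
)

# Each lambda receives a whole column (.x) and must return a vector
# of the same length.
matrix_chr <- import_single_Vispa2Matrix(
    matrix_path,
    transformations = list(
        chr = ~ paste0("chr", .x),
        not_a_column = ~ toupper(.x) # hypothetical name, not in the data
    )
)
head(matrix_chr$chr)
```

A lambda that changes the length of its input, for example ~ unique(.x), is not a valid transformation in this sense.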

Required tags

The function will explicitly check for the presence of these tags:

All columns declared in mandatory_IS_vars()

Examples

fs_path <- generate_default_folder_structure(type = "correct")
matrix_path <- fs::path(
    fs_path$root, "PJ01", "quantification",
    "POOL01-1", "PJ01_POOL01-1_seqCount_matrix.no0.annotated.tsv.gz"
)
matrix <- import_single_Vispa2Matrix(matrix_path)
#> Reading file...
#> ℹ Mode: fread
#> Reshaping...
#> *** File info *** 
#> • --- Annotated: TRUE
#> • --- Dimensions: 261 x 29
#> • --- Read mode: fread
#> • --- Sample count: 24
head(matrix)
#> # A tibble: 6 × 7
#>   chr   integration_locus strand GeneName     GeneStrand CompleteAmplificationID
#>   <chr>             <int> <chr>  <chr>        <chr>      <chr>                  
#> 1 16             68164148 +      NFATC3       +          PJ01_POOL01_LTR75LC38_…
#> 2 4             129390130 +      LOC100507487 +          PJ01_POOL01_LTR75LC38_…
#> 3 5              84009671 -      EDIL3        -          PJ01_POOL01_LTR75LC38_…
#> 4 12             54635693 -      CBX5         -          PJ01_POOL01_LTR75LC38_…
#> 5 2             181930711 +      UBE2E3       +          PJ01_POOL01_LTR75LC38_…
#> 6 20             35920986 +      MANBAL       +          PJ01_POOL01_LTR75LC38_…
#> # ℹ 1 more variable: Value <int>