Z-Score Transformation — zscore

Calculate normalization parameters for the data via z-score transformation.

zscore_transform(
  e_data,
  edata_id,
  subset_fn,
  feature_subset,
  backtransform = FALSE,
  apply_norm = FALSE,
  check.names = NULL
)

Arguments

e_data: a $p \times (n + 1)$ data.frame, where $p$ is the number of peptides, proteins, lipids, or metabolites and $n$ is the number of samples. Each row corresponds to data for a peptide, protein, lipid, or metabolite, with one column giving the biomolecule identifier name.
edata_id: character string indicating the name of the peptide, protein, lipid, or metabolite identifier. Usually obtained by calling attr(omicsData, "cnames")$edata_cname.
subset_fn: character string indicating the subset function to use for normalization
feature_subset: character vector containing the feature names in the subset to be used for normalization
backtransform: logical argument. If TRUE, the data will be back transformed after normalization so that the values are on a scale similar to their raw values. See details for more information. Defaults to FALSE.
apply_norm: logical argument. If TRUE, the normalization will be applied to the data. Defaults to FALSE.
check.names: deprecated

Value

A list containing two elements. norm_params is a list with two elements:

scale	numeric vector of length n giving the standard deviation for each sample

location	numeric vector of length n giving the mean for each sample

backtransform_params is a list with two elements:

scale	numeric value giving the pooled standard deviation across all samples

location	numeric value giving the global mean across all samples

If backtransform is set to FALSE then each list item under backtransform_params will be NULL.

If apply_norm is TRUE, the transformed data is returned as a third list item.

Details

Within each sample, every feature value is normalized by subtracting the mean of the feature subset specified for normalization and then dividing the result by the standard deviation (SD) of that same subset. The location estimates are the subset means for each sample. The scale estimates are the subset SDs for each sample. If backtransform is TRUE, the normalized values are multiplied by a pooled standard deviation (estimated across all samples), and a global mean of the subset data (across all samples) is then added back. Means are taken ignoring any NA values.
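
The computation described above can be sketched in base R. This is an illustrative sketch only, not the package's implementation: the toy data, the column name Peptide, the sample names s1 to s3, and the feature names are all hypothetical. The pooled SD formula used here (per-sample variances weighted by their degrees of freedom) is an assumption, since the text does not state how the pooled SD is estimated; the sketch also assumes that SDs, like means, ignore NA values.

```r
# Toy e_data: 4 features (rows), 3 samples, plus one identifier column.
# All names below are hypothetical and chosen for illustration only.
e_data <- data.frame(
  Peptide = c("pep_a", "pep_b", "pep_c", "pep_d"),
  s1 = c(10, 12, 14, NA),
  s2 = c(20, 22, 24, 26),
  s3 = c(5, 7, NA, 9)
)
edata_id <- "Peptide"
feature_subset <- c("pep_a", "pep_b", "pep_c")

# Numeric matrix of abundances, and the subset rows used for estimation
vals <- as.matrix(e_data[, names(e_data) != edata_id])
sub <- vals[e_data[[edata_id]] %in% feature_subset, , drop = FALSE]

# Per-sample location (mean) and scale (SD) of the subset, ignoring NAs
location <- colMeans(sub, na.rm = TRUE)
scale_sd <- apply(sub, 2, sd, na.rm = TRUE)

# Normalize every feature in each sample: (x - subset mean) / subset SD
normed <- sweep(sweep(vals, 2, location, "-"), 2, scale_sd, "/")

# Backtransform parameters: global mean of the subset data and a pooled SD.
# ASSUMPTION: pooled SD weights each sample variance by (n_obs - 1).
glob_mean <- mean(sub, na.rm = TRUE)
n_obs <- colSums(!is.na(sub))
pooled_sd <- sqrt(sum((n_obs - 1) * scale_sd^2) / sum(n_obs - 1))

# Put the normalized values back on a scale similar to the raw values
backtransformed <- normed * pooled_sd + glob_mean
```

In this sketch, location and scale_sd play the role of norm_params, while pooled_sd and glob_mean play the role of backtransform_params. Features outside the subset (pep_d here) are still normalized, but they do not contribute to the estimates.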

Author

Lisa Bramer, Kelly Stratton, Bryan Stanfill