The method assigns each sample to a group, for use in future analyses, based on the variable(s) specified as main effects.

group_designation(
  omicsData,
  main_effects = NULL,
  covariates = NULL,
  cov_type = NULL,
  pair_id = NULL,
  pair_group = NULL,
  pair_denom = NULL,
  batch_id = NULL
)

Arguments

omicsData

an object of the class 'lipidData', 'metabData', 'pepData', 'proData', 'isobaricpepData', 'nmrData', or 'seqData', usually created by as.lipidData, as.metabData, as.pepData, as.proData, as.isobaricpepData, as.nmrData, or as.seqData, respectively.

main_effects

a character vector of no more than two variable names that should be used as main effects to determine group membership of samples. Each variable name must match a column name from f_data.

covariates

a character vector of no more than two variable names that should be used as covariates in downstream analyses. Covariates are typically variables that a user wants to account for in the analysis, but whose effects are not themselves of interest to quantify or examine.

cov_type

An optional character vector (must be the same length as covariates if used) indicating the class or type of each covariate. For example, "numeric", "character", or "factor". Partial matching ("num" for "numeric") is NOT used; the entire class/type must be typed out. If the class of a covariate does not match the input to cov_type, the covariate will be coerced to that type. For example, if the covariate is a numeric vector of 0s and 1s (indicating two categories) and the input to cov_type is a class other than numeric, this vector will be coerced to a character vector. The default value is NULL, in which case the class of the covariates is neither checked nor altered.

pair_id

A character string indicating the column in f_data that contains the IDs for each pair. This string must match the column name exactly.

pair_group

A character string specifying the column in f_data that indicates which group each pair belongs to. This variable must contain just two levels or values (e.g., "before" and "after"). Numeric values can be used (e.g., 0 and 1). However, they will be converted to character strings.

pair_denom

A character string specifying which pair group is the "control". When taking the difference, the value for the control group will be subtracted from the non-control group value.

batch_id

an optional character string naming the single variable that should be used as batch information for downstream analyses. Batch ID is similar to a covariate, but it is used specifically to account for batch effects.
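
The roles of pair_id, pair_group, and pair_denom can be illustrated with a minimal base R sketch. It does not call the package and is not its implementation: the f_data columns (PairID, Treatment), the sample names, and the abundance values are all hypothetical, made up only to show that the value for the pair_denom ("control") group is subtracted from the value of the other group within each pair.

```r
# Hypothetical sample metadata standing in for f_data (assumed columns).
f_data <- data.frame(
  SampleID  = c("S1", "S2", "S3", "S4"),
  PairID    = c(1, 1, 2, 2),
  Treatment = c("before", "after", "before", "after"),
  stringsAsFactors = FALSE
)

# Hypothetical abundance values for one biomolecule, named by sample.
abundance <- c(S1 = 10, S2 = 13, S3 = 8, S4 = 7.5)

# The level of the pair group that acts as the "control".
pair_denom <- "before"

# For each pair: non-control value minus control value.
pair_diff <- vapply(
  split(seq_len(nrow(f_data)), f_data$PairID),
  function(rows) {
    grp <- f_data$Treatment[rows]
    val <- abundance[f_data$SampleID[rows]]
    unname(val[grp != pair_denom] - val[grp == pair_denom])
  },
  numeric(1)
)

pair_diff
```

In this sketch the result has one difference per pair ID, so choosing "after" as pair_denom instead would flip the sign of every difference.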

Value

An object of the same class as the input omicsData object. Any samples for which group designation produced an NA are filtered out of the returned object. An attribute 'group_DF', a data.frame with columns for sample ID and group, is added to the object. If two main effects are provided, the original main effect levels for each sample are returned as the third and fourth columns of the data.frame. Additionally, the covariates provided are stored as attributes of this data.frame.

Details

Groups are formed based on the levels of the main effect variables. One or two main effect variables are allowed. In the case of two main effect variables, groups are formed based on unique combinations of the levels of the two main effect variables. Any samples with level NA for a main effect variable will be removed from the data and will not be included in the final group designation results. Groups with a single sample are allowed, as is a single group.
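
The grouping rule described above can be sketched in base R. This is an illustration under stated assumptions, not the package's internal code: the f_data columns (Virus, Time), the sample names, and the "_" separator used to label combined groups are all hypothetical choices made for the example.

```r
# Hypothetical sample metadata standing in for f_data (assumed columns).
f_data <- data.frame(
  SampleID = paste0("S", 1:6),
  Virus    = c("A", "A", "B", "B", NA, "A"),
  Time     = c("early", "late", "early", "late", "early", "early"),
  stringsAsFactors = FALSE
)
main_effects <- c("Virus", "Time")

# Remove samples with NA in any main effect variable.
keep <- stats::complete.cases(f_data[, main_effects, drop = FALSE])
kept <- f_data[keep, , drop = FALSE]

# One group per unique combination of main effect levels.
# The "_" separator is an assumption made for this sketch.
group_df <- data.frame(
  SampleID = kept$SampleID,
  Group    = do.call(paste, c(kept[, main_effects, drop = FALSE], sep = "_")),
  kept[, main_effects, drop = FALSE],
  stringsAsFactors = FALSE
)

group_df
table(group_df$Group)
```

Here sample S5 is dropped because its Virus level is NA, and the groups "A_late", "B_early", and "B_late" each contain a single sample, which the description above says is allowed.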

Author

Lisa Bramer, Kelly Stratton

Examples

library(pmartRdata)
mylipid <- group_designation(
  omicsData = lipid_pos_object,
  main_effects = "Virus"
)
attr(mylipid, "group_DF")