Selects biomolecules for normalization via the method of rank-invariant biomolecules (RIP)

rip(e_data, edata_id, fdata_id, groupDF, alpha = 0.2)

Arguments

e_data

a \(p \times (n + 1)\) data.frame, where \(p\) is the number of peptides, proteins, lipids, or metabolites and \(n\) is the number of samples. Each row corresponds to data for one peptide, protein, lipid, or metabolite; one column holds the biomolecule identifier and the remaining \(n\) columns hold the sample values.

edata_id

character string indicating the name of the peptide, protein, lipid, or metabolite identifier. Usually obtained by calling attr(omicsData, "cnames")$edata_cname.

fdata_id

character string indicating the name of the column in f_data that holds the sample identifiers.

groupDF

data.frame created by group_designation, with columns for sample.id and group. If two main effects are provided, the original main effect levels for each sample are returned as the third and fourth columns of the data.frame.

alpha

numeric p-value threshold; biomolecules with a p-value above this threshold are retained as rank invariant (default value 0.2)

Value

Character vector containing the biomolecules belonging to the RIP subset.

Details

Biomolecules with complete data (no missing values in any sample) are subjected to a Kruskal-Wallis test (non-parametric one-way ANOVA) on group membership. Biomolecules with a p-value greater than the threshold alpha (common values include 0.1 or 0.25) are retained as rank-invariant biomolecules.
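
The selection rule described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name rip_sketch, the toy data, and the identifier column name Mass_Tag_ID are all assumptions made for the example, and groupDF is assumed to carry columns named sample.id and group as described under Arguments.

```r
# Hypothetical helper illustrating the selection rule; not the package source.
rip_sketch <- function(e_data, edata_id, groupDF, alpha = 0.2) {
  # Sample columns are every column except the identifier column.
  sample_cols <- setdiff(names(e_data), edata_id)

  # Keep only biomolecules with complete data across all samples.
  complete <- stats::complete.cases(e_data[, sample_cols, drop = FALSE])
  dat <- e_data[complete, , drop = FALSE]

  # Align group labels with the order of the sample columns
  # (assumes groupDF has columns named sample.id and group).
  grp <- factor(groupDF$group[match(sample_cols, groupDF$sample.id)])

  # One Kruskal-Wallis p-value per biomolecule (per row).
  pvals <- apply(dat[, sample_cols, drop = FALSE], 1, function(x) {
    stats::kruskal.test(as.numeric(x), grp)$p.value
  })

  # Retain biomolecules whose p-value exceeds alpha; which() drops NA results.
  as.character(dat[[edata_id]][which(pvals > alpha)])
}

# Toy data: pep1 differs strongly between groups, pep2 does not,
# and pep3 has a missing value, so it is never tested.
e_data <- data.frame(
  Mass_Tag_ID = c("pep1", "pep2", "pep3"),
  S1 = c(1, 10.0, 5),
  S2 = c(2, 10.3, NA),
  S3 = c(3, 9.9, 5),
  S4 = c(10, 10.1, 6),
  S5 = c(11, 10.2, 6),
  S6 = c(12, 9.8, 7)
)
groupDF <- data.frame(
  sample.id = paste0("S", 1:6),
  group = rep(c("A", "B"), each = 3)
)

rip_sketch(e_data, "Mass_Tag_ID", groupDF)
# [1] "pep2"
```

In this toy example pep1 has a Kruskal-Wallis p-value of roughly 0.05, which falls below the 0.2 threshold, so only pep2 is kept. Lowering alpha (for example to 0.01) would retain pep1 as well, which shows that a smaller threshold makes the rank-invariant subset larger and less strict.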

Author

Kelly Stratton