R/subset_funcs.R
ppp_rip.Rd
Selects biomolecules for normalization via the method of proportion of biomolecules present and rank invariant biomolecules (ppp_rip)
ppp_rip(e_data, edata_id, fdata_id, groupDF, alpha = 0.2, proportion = 0.5)
e_data: a \(p \times (n + 1)\) data.frame, where \(p\) is the number of peptides, proteins, lipids, or metabolites and \(n\) is the number of samples. Each row holds the data for one peptide, protein, lipid, or metabolite, and one column gives the biomolecule identifier.
edata_id: character string indicating the name of the peptide, protein, lipid, or metabolite identifier. Usually obtained by calling attr(omicsData, "cnames")$edata_cname.
fdata_id: character string indicating the name of the sample identifier column in f_data.
groupDF: data.frame created by group_designation with columns for sample.id and group. If two main effects are provided, the original main effect levels for each sample are returned as the third and fourth columns of the data.frame.
alpha: numeric p-value threshold, above which the biomolecules are retained as rank invariant (default value 0.2)
proportion: numeric value between 0 and 1 giving the minimum proportion of samples in which a biomolecule must be present in order to be retained (default value 0.5)
Character vector containing the biomolecules belonging to the ppp_rip subset.
Biomolecules present in at least proportion of the samples are subjected to a Kruskal-Wallis test (a non-parametric one-way ANOVA; NAs are ignored) on group membership. Those biomolecules with a p-value greater than the threshold alpha (common values include 0.1 or 0.25) are retained as rank-invariant biomolecules.
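The two-stage selection described above can be sketched in base R. This is an illustrative approximation, not the package's implementation: the helper name ppp_rip_sketch, the toy data, and the plain groups vector (standing in for groupDF and fdata_id, and assumed to be in the same order as the sample columns) are all hypothetical. Only stats::kruskal.test is a real, existing function here.

```r
# Hypothetical helper (NOT the pmartR implementation): a base-R sketch of the
# proportion-present filter followed by a Kruskal-Wallis rank-invariance filter.
# Assumption: `groups` is a vector of group labels, one per sample column,
# in the same order as the sample columns of `e_data`.
ppp_rip_sketch <- function(e_data, edata_id, groups,
                           alpha = 0.2, proportion = 0.5) {
  ids <- e_data[[edata_id]]
  # Abundance matrix: rows are biomolecules, columns are samples
  x <- as.matrix(e_data[, setdiff(names(e_data), edata_id), drop = FALSE])

  # Stage 1: keep biomolecules observed in at least `proportion` of samples
  keep_idx <- which(rowMeans(!is.na(x)) >= proportion)

  # Stage 2: Kruskal-Wallis test on group membership, ignoring NAs
  pvals <- vapply(keep_idx, function(i) {
    ok <- !is.na(x[i, ])
    g <- factor(groups[ok])
    # The test needs at least two groups with observed values
    if (nlevels(g) < 2) return(NA_real_)
    kruskal.test(x[i, ok], g)$p.value
  }, numeric(1))

  # Retain biomolecules whose p-value exceeds alpha
  as.character(ids[keep_idx][!is.na(pvals) & pvals > alpha])
}

# Toy data (made up for illustration): three peptides, six samples
toy <- data.frame(
  Peptide = c("pep_a", "pep_b", "pep_c"),
  s1 = c(1.0,  1, NA),
  s2 = c(4.0,  2, NA),
  s3 = c(2.0,  3, NA),
  s4 = c(3.0, 10,  5),
  s5 = c(1.5, 11, NA),
  s6 = c(3.5, 12,  6),
  stringsAsFactors = FALSE
)
groups <- c("A", "A", "A", "B", "B", "B")

retained <- ppp_rip_sketch(toy, "Peptide", groups)
print(retained)
```

In this toy data, pep_c is observed in only two of six samples and is dropped by the proportion filter, pep_b separates the two groups completely and falls below alpha, and pep_a shows no group difference and is the only biomolecule retained.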