For data types other than seqData, this function calculates principal components using projection pursuit estimation, which implements an expectation-maximization (EM) algorithm when data are missing. For seqData counts, it computes a generalized version of principal components analysis for non-normally distributed data, assuming a negative binomial distribution with global dispersion.
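To make the idea of EM-style estimation with missing values concrete, here is a minimal base R sketch. It is not the code this function runs: it swaps in plain iterative low-rank (SVD) imputation as a stand-in. The helper name em_pca_sketch, its arguments, and the samples-by-features orientation of the input are all assumptions made for illustration.

```r
# Illustrative only: iterative SVD imputation, NOT the pmartR/pcaMethods code.
# Assumes `x` is a numeric matrix with samples in rows and features in columns.
em_pca_sketch <- function(x, k = 2, max_iter = 200, tol = 1e-8) {
  x <- as.matrix(x)
  miss <- is.na(x)

  # Initialise each missing cell with the mean of its column (feature)
  col_means <- colMeans(x, na.rm = TRUE)
  filled <- x
  filled[miss] <- col_means[col(x)[miss]]

  for (iter in seq_len(max_iter)) {
    # "M-step" analogue: fit a rank-k PCA to the currently completed data
    mu <- colMeans(filled)
    centered <- sweep(filled, 2, mu)
    sv <- svd(centered, nu = k, nv = k)
    scores <- sv$u %*% diag(sv$d[seq_len(k)], nrow = k)
    recon <- sweep(scores %*% t(sv$v), 2, mu, "+")

    # "E-step" analogue: replace missing cells with their reconstruction
    delta <- if (any(miss)) max(abs(filled[miss] - recon[miss])) else 0
    filled[miss] <- recon[miss]
    if (delta < tol) break
  }

  colnames(scores) <- paste0("PC", seq_len(k))
  rownames(scores) <- rownames(x)
  scores
}

# Simulated data: 10 samples, 6 features, 5 values knocked out at random
set.seed(1)
x <- matrix(rnorm(60), nrow = 10,
            dimnames = list(paste0("S", 1:10), paste0("F", 1:6)))
x[sample(length(x), 5)] <- NA
scores <- em_pca_sketch(x, k = 2)
dim(scores)
```

With no missing values the loop stops after one pass and the scores agree with prcomp() up to sign, which is a quick sanity check on the sketch.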

dim_reduction(omicsData, k = 2)

Arguments

omicsData

an object of the class 'pepData', 'proData', 'metabData', 'lipidData', 'nmrData', or 'seqData', created by as.pepData, as.proData, as.metabData, as.lipidData, as.nmrData, or as.seqData, respectively.

k

integer number of principal components to return. Defaults to 2.

Value

a data.frame containing the first k principal component scores, the sample identifiers, and the group membership of each sample (if group_designation was previously run on the data). The object is of class dimRes (dimension reduction result).

Details

Any biomolecules observed in only one sample, or with a variance of less than 1E-6 across all samples, are excluded from the PCA calculations. This function leverages code from pca and glmpca.
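The filtering rule above can be sketched in base R as follows. This is a hypothetical helper, not part of the package. It assumes that biomolecules are in rows and samples in columns, that "seen" means a non-missing value, and that the variance is computed over the observed values only.

```r
# Hypothetical helper for illustration; not a pmartR function.
# Rows are biomolecules, columns are samples (assumed orientation).
filter_for_pca <- function(abund, min_var = 1e-6) {
  abund <- as.matrix(abund)

  # Number of samples in which each biomolecule has a non-missing value
  n_seen <- rowSums(!is.na(abund))

  # Per-biomolecule variance over observed values (NA when fewer than two)
  row_var <- apply(abund, 1, var, na.rm = TRUE)

  # Keep rows seen in more than one sample with variance of at least min_var
  keep <- n_seen > 1 & !is.na(row_var) & row_var >= min_var
  abund[keep, , drop = FALSE]
}

abund <- rbind(
  ok     = c(1, 2, 3, 4),      # kept
  single = c(NA, NA, 5, NA),   # dropped: observed in only one sample
  flat   = c(2, 2, 2, 2),      # dropped: zero variance
  sparse = c(1, NA, 3, NA)     # kept: two observations, variance 2
)
kept <- filter_for_pca(abund)
rownames(kept)
# "ok" "sparse"
```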

References

Redestig H, Stacklies W, Scholz M, Selbig J, & Walther D (2007). pcaMethods - a bioconductor package providing PCA methods for incomplete data. Bioinformatics. 23(9): 1164-7.

Townes FW, Hicks SC, Aryee MJ, Irizarry RA (2019). Feature selection and dimension reduction for single-cell RNA-seq based on a multinomial model. Genome Biol. 20, 1–16.

Huang H, Wang Y, Rudin C, Browne EP (2022). Towards a comprehensive evaluation of dimension reduction methods for transcriptomic data visualization. Communications Biology 5, 719.

Examples

library(pmartRdata)

mylipid <- edata_transform(omicsData = lipid_neg_object, data_scale = "log2")
mylipid <- group_designation(omicsData = mylipid, main_effects = "Virus")
pca_lipids <- dim_reduction(omicsData = mylipid)

# seqData (RNA-seq counts) example
myseq <- group_designation(omicsData = rnaseq_object, main_effects = "Virus")
pca_seq <- dim_reduction(omicsData = myseq)