For data types other than seqData, this function calculates principal components using projection pursuit estimation, which applies an expectation-maximization (EM) algorithm when data are missing. For seqData counts, it instead computes a generalized principal components analysis for non-normally distributed data, assuming a negative binomial distribution with a global dispersion parameter.
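To give a feel for how an EM-style approach copes with missing values, here is a simplified base R sketch of iterative-imputation PCA. It is not the projection pursuit or pcaMethods algorithm that the function uses; the function name em_pca and all of its arguments are hypothetical, and the input is assumed to be a samples-by-features numeric matrix.

```r
# Simplified EM-style PCA for a samples-by-features matrix containing NAs.
# Illustration only: em_pca() is a hypothetical helper, not part of pmartR.
em_pca <- function(x, k = 2, max_iter = 200, tol = 1e-8) {
  missing <- is.na(x)
  # Initialise each missing cell with the mean of its column (feature).
  col_means <- colMeans(x, na.rm = TRUE)
  x_hat <- x
  x_hat[missing] <- col_means[col(x)[missing]]

  for (i in seq_len(max_iter)) {
    # Fit a rank-k PCA to the currently completed matrix.
    center <- colMeans(x_hat)
    sv <- svd(sweep(x_hat, 2, center))
    recon <- sv$u[, 1:k, drop = FALSE] %*% diag(sv$d[1:k], k) %*%
      t(sv$v[, 1:k, drop = FALSE])
    recon <- sweep(recon, 2, center, "+")
    # Replace only the missing cells with their rank-k reconstruction.
    change <- sum((x_hat[missing] - recon[missing])^2)
    x_hat[missing] <- recon[missing]
    if (change < tol) break
  }

  # Final scores from the completed matrix: one row per sample.
  center <- colMeans(x_hat)
  sv <- svd(sweep(x_hat, 2, center))
  scores <- sv$u[, 1:k, drop = FALSE] %*% diag(sv$d[1:k], k)
  colnames(scores) <- paste0("PC", seq_len(k))
  list(scores = scores, completed = x_hat)
}

# Toy data: 10 samples, 6 features, three values removed.
set.seed(1)
x <- matrix(rnorm(60), nrow = 10, ncol = 6)
x[cbind(c(1, 4, 7), c(2, 3, 5))] <- NA
res <- em_pca(x, k = 2)
```

Observed values are never altered; only the missing cells are updated between iterations, which is the part that mirrors the E-step of an EM procedure.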
dim_reduction(omicsData, k = 2)
an object of the class 'pepData', 'proData', 'metabData', 'lipidData', 'nmrData', or 'seqData', created by as.pepData, as.proData, as.metabData, as.lipidData, as.nmrData, or as.seqData, respectively.
integer number of principal components to return. Defaults to 2.
a data.frame with the first k principal component scores, sample identifiers, and group membership for each sample (if group designation was previously run on the data). The object is of class dimRes (dimension reduction Result).
Any biomolecules seen in only one sample or with a variance less than 1E-6 across all samples are not included in the PCA calculations. This function leverages code from pca and glmpca.
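The filtering rule described above can be sketched in base R. This is a minimal illustration on a toy matrix, not the package's internal code: the object names (edata, n_observed, row_var, keep) are hypothetical, and reading "seen in only one sample" as "fewer than two non-missing values" is an assumption.

```r
# Toy log2 abundance matrix: rows are biomolecules, columns are samples.
edata <- rbind(
  mol_a = c(10.2, 11.5, 9.8, 10.9),  # kept: observed everywhere and varies
  mol_b = c(NA, NA, 12.1, NA),       # dropped: seen in only one sample
  mol_c = c(8.0, 8.0, 8.0, 8.0),     # dropped: variance below 1e-6
  mol_d = c(7.1, NA, 7.9, 6.5)       # kept: three observations that vary
)
colnames(edata) <- paste0("sample_", 1:4)

# Number of samples in which each biomolecule was observed.
n_observed <- rowSums(!is.na(edata))

# Per-biomolecule variance over the observed values
# (NA when there are fewer than two observations).
row_var <- apply(edata, 1, var, na.rm = TRUE)

# Keep rows seen in more than one sample with variance of at least 1e-6.
keep <- n_observed > 1 & !is.na(row_var) & row_var >= 1e-6
filtered <- edata[keep, , drop = FALSE]
```

Here only mol_a and mol_d remain in filtered, so a PCA run on it would ignore the single-sample and near-constant rows.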
Redestig H, Stacklies W, Scholz M, Selbig J, & Walther D (2007). pcaMethods - a bioconductor package providing PCA methods for incomplete data. Bioinformatics. 23(9): 1164-7.
Townes FW, Hicks SC, Aryee MJ, Irizarry RA (2019). Feature selection and dimension reduction for single-cell RNA-seq based on a multinomial model. Genome Biol. 20, 1–16.
Huang H, Wang Y, Rudin C, Browne EP (2022). Towards a comprehensive evaluation of dimension reduction methods for transcriptomic data visualization. Communications Biology 5, 719.
library(pmartRdata)
# Lipid data: log2-transform, assign groups, then compute the default 2 PCs
mylipid <- edata_transform(omicsData = lipid_neg_object, data_scale = "log2")
mylipid <- group_designation(omicsData = mylipid, main_effects = "Virus")
pca_lipids <- dim_reduction(omicsData = mylipid)
# RNA-seq counts: assign groups, then run the reduction without a transformation
myseq <- group_designation(omicsData = rnaseq_object, main_effects = "Virus")
pca_seq <- dim_reduction(omicsData = myseq)