Create pmartR Object of Class proData

Converts several data frames of protein data to an object of the class 'proData'. Objects of the class 'proData' are lists with two obligatory components, e_data and f_data. An optional list component, e_meta, is used if analysis or visualization at other levels (e.g. gene) is also desired.

as.proData(
  e_data,
  f_data,
  e_meta = NULL,
  edata_cname,
  fdata_cname,
  emeta_cname = NULL,
  techrep_cname = NULL,
  ...
)

Arguments

e_data: a \(p \times (n + 1)\) data frame of expression data, where \(p\) is the number of proteins observed and \(n\) is the number of samples. Each row corresponds to data for one protein. One column specifying a unique identifier for each protein (row) must be present; the remaining \(n\) columns hold the measurements for each sample.
f_data: a data frame with \(n\) rows. Each row corresponds to a sample with one column giving the unique sample identifiers found in e_data column names and other columns providing qualitative and/or quantitative traits of each sample.
e_meta: an optional data frame with \(p\) rows. Each row corresponds to a protein, with one column giving the protein identifiers (this column must have the same name as the identifier column in e_data) and other columns giving biomolecule meta information (e.g. mappings of proteins to genes).
edata_cname: character string specifying the name of the column containing the protein identifiers in e_data and e_meta (if applicable).
fdata_cname: character string specifying the name of the column containing the sample identifiers in f_data.
emeta_cname: character string specifying the name of the column containing the gene identifiers (or other mapping variable) in e_meta (if applicable). Defaults to NULL. Can be the same as edata_cname, if desired. If e_meta is NULL, then either do not specify emeta_cname or specify it as NULL.
techrep_cname: character string specifying the name of the column in f_data that specifies which samples are technical replicates. This column is used to collapse the data when combine_techreps is called on this object. Defaults to NULL (no technical replicates).
...: further arguments, such as the attributes described in Details (e.g. data_scale).
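
The shape requirements above can be illustrated with a small base R sketch. The data frames, column names (`Protein`, `SampleID`, `Condition`, `Gene`) and values are hypothetical toy data invented for illustration, and the checks are only a rough approximation of the constraints described in the argument list, not the validation that `as.proData` itself performs.

```r
# Hypothetical toy data: p = 3 proteins measured in n = 4 samples.
e_data <- data.frame(
  Protein = c("P1", "P2", "P3"),   # unique protein identifiers
  S1 = c(10.2, NA, 8.1),
  S2 = c(11.0, 9.4, 7.9),
  S3 = c(NA, 9.9, 8.4),
  S4 = c(10.7, 9.1, NA)
)

# One row per sample; SampleID values match the e_data column names.
f_data <- data.frame(
  SampleID = c("S1", "S2", "S3", "S4"),
  Condition = c("control", "control", "treated", "treated")
)

# Optional protein-level metadata; the identifier column is named as in e_data.
e_meta <- data.frame(
  Protein = c("P1", "P2", "P3"),
  Gene = c("G1", "G1", "G2")
)

edata_cname <- "Protein"
fdata_cname <- "SampleID"
emeta_cname <- "Gene"

# Informal consistency checks mirroring the argument descriptions.
sample_cols <- setdiff(names(e_data), edata_cname)
dims_ok     <- ncol(e_data) == nrow(f_data) + 1                 # p x (n + 1)
samples_ok  <- setequal(sample_cols, f_data[[fdata_cname]])     # same sample IDs
ids_unique  <- !anyDuplicated(e_data[[edata_cname]])            # unique proteins
meta_ok     <- all(e_data[[edata_cname]] %in% e_meta[[edata_cname]])
```

Running checks like these before building the object can make mismatched sample names or duplicated identifiers easier to spot.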

Value

Object of class proData

Details

Objects of class 'proData' contain some attributes that are referenced by downstream functions. These attributes can be changed from their default values by manual specification. The attributes and their default values are as follows:

data_scale	Scale of the data provided in `e_data`. Acceptable values are 'log2', 'log10', 'log', and 'abundance', which indicate that the data are log base 2, log base 10, natural log, or raw abundance, respectively. Default value is 'abundance'.

is_normalized	A logical argument, specifying whether the data has been normalized or not. Default value is FALSE.

norm_info	Default value is an empty list, which will be populated with a single named element `is_normalized = is_normalized`. When a normalization is applied to the data, this becomes populated with a list containing the normalization function, normalization subset and subset parameters, the location and scale parameters used to normalize the data, and the location and scale parameters used to backtransform the data (if applicable).

data_types	Character string describing the type of data, most commonly used for lipidomic data (lipidData objects) or NMR data (nmrData objects) but available for other data classes as well. Default value is NULL.
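
As a hedged sketch of overriding one of these defaults, the snippet below builds a small object and sets `data_scale` to 'log2'. It assumes the attribute can be supplied as a named argument to `as.proData`, and it reads the result back with `attr(obj, "data_info")`; the toy data and column names are hypothetical. The call is guarded so the snippet still runs when pmartR is not installed.

```r
# Hypothetical toy data, assumed to be already on the log2 scale.
e_data <- data.frame(
  Protein = c("P1", "P2", "P3"),
  S1 = c(10.2, NA, 8.1),
  S2 = c(11.0, 9.4, 7.9),
  S3 = c(NA, 9.9, 8.4),
  S4 = c(10.7, 9.1, NA)
)
f_data <- data.frame(
  SampleID = c("S1", "S2", "S3", "S4"),
  Condition = c("control", "control", "treated", "treated")
)

pro_obj <- NULL
if (requireNamespace("pmartR", quietly = TRUE)) {
  pro_obj <- pmartR::as.proData(
    e_data = e_data,
    f_data = f_data,
    edata_cname = "Protein",
    fdata_cname = "SampleID",
    data_scale = "log2"   # assumption: attribute passed as a named argument
  )
  # Inspect the stored attributes, including the computed summary values.
  print(attr(pro_obj, "data_info"))
}
```

Declaring the scale at creation time matters because downstream functions consult this attribute rather than inspecting the values themselves.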

Computed values included in the data_info attribute are as follows:

num_edata	The number of unique `edata_cname` entries.

num_miss_obs	The number of missing observations.

num_emeta	The number of unique `emeta_cname` entries.

prop_missing	The proportion of `e_data` values that are NA.

num_samps	The number of samples that make up the columns of `e_data`.

meta_info	A logical value indicating whether `e_meta` was provided.
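
To make these definitions concrete, the following base R sketch computes the same quantities by hand for a hypothetical toy dataset. It follows the descriptions above and is not the package's internal code; all data frames and column names are invented for illustration.

```r
# Hypothetical toy data: 3 proteins, 4 samples, 3 missing values.
e_data <- data.frame(
  Protein = c("P1", "P2", "P3"),
  S1 = c(10.2, NA, 8.1),
  S2 = c(11.0, 9.4, 7.9),
  S3 = c(NA, 9.9, 8.4),
  S4 = c(10.7, 9.1, NA)
)
e_meta <- data.frame(
  Protein = c("P1", "P2", "P3"),
  Gene = c("G1", "G1", "G2")
)
edata_cname <- "Protein"
emeta_cname <- "Gene"

# Expression values only (drop the identifier column).
expr <- as.matrix(e_data[, setdiff(names(e_data), edata_cname), drop = FALSE])

num_edata    <- length(unique(e_data[[edata_cname]]))  # unique protein IDs
num_samps    <- ncol(expr)                             # sample columns
num_miss_obs <- sum(is.na(expr))                       # count of NA values
prop_missing <- mean(is.na(expr))                      # NA count / (p * n)
num_emeta    <- length(unique(e_meta[[emeta_cname]]))  # unique mapping values
meta_info    <- !is.null(e_meta)                       # was e_meta supplied?
```

For this toy data, 3 of the 12 expression values are missing, so `prop_missing` is 0.25.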

Author

Kelly Stratton, Lisa Bramer

Examples