select_variable
This function provides an advanced option to select metabolite variables from external dataset(s). The selected variables (as a list) can be further passed to argument selectVar_external in function run_TIGER for a customised data correction.
Arguments
- train_num
a numeric data.frame including only the metabolite values of training samples (can be quality control samples). Information such as injection order or well position needs to be excluded. Row: sample. Column: metabolite variable. See Examples.
- test_num
an optional numeric data.frame including the metabolite values of test samples (can be subject samples). If provided, the column names of test_num should correspond to the column names of train_num. Row: sample. Column: metabolite variable. If NULL, the variables will be selected based on train_num only. See Examples.
- train_batchID
NULL or a vector corresponding to train_num to specify the batch of each sample. Ignored if selectVar_batchWise = FALSE. See Examples.
- test_batchID
NULL or a vector corresponding to test_num to specify the batch of each sample. Ignored if selectVar_batchWise = FALSE. See Examples.
- selectVar_corType
a character string indicating whether correlation ("cor", default) or partial correlation ("pcor") is to be used. Can be abbreviated. See Details. Note: computing partial correlations of a large dataset can be very time-consuming.
- selectVar_corMethod
a character string indicating which correlation coefficient is to be computed. One of "spearman" (default) or "pearson". Can be abbreviated. See Details.
- selectVar_minNum
an integer specifying the minimum number of selected variables. If NULL, no limit is applied, but at least 1 variable is selected. See Details. Default: 5.
- selectVar_maxNum
an integer specifying the maximum number of selected variables. If NULL, no limit is applied, but at most ncol(train_num) - 1 variables are selected. See Details. Default: 10.
- selectVar_batchWise
(advanced) logical. Specify whether the variable selection should be performed separately for each batch. Default: FALSE. Note: if TRUE, the batch ID of each sample is required. Batch-wise variable selection is supported for data requiring special processing (for example, data with strong batch effects), but in most cases it is not necessary. Setting it to TRUE might make the algorithm less robust. See Details.
- coerce_numeric
logical. If TRUE, values in train_num and test_num will be coerced to numeric before the computation. Columns that cannot be coerced will be removed (with warnings). See Examples. Default: FALSE.
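The input requirements above can be illustrated with a short base R sketch. It does not call the package: the table `raw`, its column names, and the helper `rank_candidates` are hypothetical, and the top-k ranking rule is an assumption made for illustration rather than the selection rule the function actually applies (see Details for that). The sketch shows how one might build train_num and train_batchID from a mixed table, what a manual analogue of coerce_numeric could look like, and how a correlation-based shortlist respects the ncol(train_num) - 1 upper bound.

```r
# Hypothetical raw table: sample metadata plus three metabolite columns.
set.seed(1)
raw <- data.frame(
  injection_order = 1:6,
  batch = rep(c("A", "B"), each = 3),
  met1 = rnorm(6),
  met2 = rnorm(6),
  met3 = c("1.2", "0.8", "1.5", "0.9", "1.1", "1.3"),  # numbers stored as text
  stringsAsFactors = FALSE
)

# Keep only metabolite columns; injection order and batch labels are excluded.
train_num <- raw[, c("met1", "met2", "met3")]
train_batchID <- raw$batch  # one batch label per row of train_num

# Manual analogue of coerce_numeric = TRUE (a simplification):
# convert every column, then drop columns that contain NA after conversion.
coerced <- lapply(train_num, function(x) suppressWarnings(as.numeric(x)))
keep <- vapply(coerced, function(x) !anyNA(x), logical(1))
train_num <- as.data.frame(coerced[keep])

# Rank candidate variables for one target by absolute Spearman correlation.
# ASSUMPTION: this top-k rule is illustrative only, not the package's method.
rank_candidates <- function(dat, target, max_num = 10) {
  cors <- cor(dat, method = "spearman")[target, ]
  cors <- cors[names(cors) != target]           # a variable cannot select itself
  ranked <- names(sort(abs(cors), decreasing = TRUE))
  head(ranked, min(max_num, ncol(dat) - 1))     # never more than ncol - 1
}

selected <- rank_candidates(train_num, "met1", max_num = 2)
```

With the real function, the same train_num and train_batchID objects would be supplied through the arguments documented above, and test_num would need column names matching those of train_num.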