Perform multi-omics analysis using WebGestaltR
Usage
WebGestaltRMultiOmics(
analyteLists = NULL,
analyteListFiles = NULL,
analyteTypes = NULL,
enrichMethod = "ORA",
organism = "hsapiens",
enrichDatabase = NULL,
enrichDatabaseFile = NULL,
enrichDatabaseType = NULL,
enrichDatabaseDescriptionFile = NULL,
collapseMethod = "mean",
minNum = 10,
maxNum = 500,
fdrMethod = "BH",
sigMethod = "fdr",
fdrThr = 0.05,
topThr = 10,
reportNum = 100,
setCoverNum = 10,
perNum = 1000,
gseaP = 1,
isOutput = TRUE,
outputDirectory = getwd(),
projectName = NULL,
dagColor = "binary",
saveRawGseaResult = FALSE,
gseaPlotFormat = "png",
nThreads = 1,
cache = NULL,
hostName = "https://www.webgestalt.org/",
useWeightedSetCover = TRUE,
useAffinityPropagation = FALSE,
usekMedoid = FALSE,
kMedoid_k = 25,
isMetaAnalysis = TRUE,
mergeMethod = "mean",
normalizationMethod = "rank",
referenceLists = NULL,
referenceListFiles = NULL,
referenceTypes = NULL,
listNames = NULL
)
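As a rough illustration of how the arguments fit together, the sketch below assembles a GSEA call for two omics layers. The file names, the ID types (genesymbol, hmdb), the database name (pathway_KEGG) and the list names are placeholders chosen for this example, not values confirmed by this page; check listIdType and listGeneSet for what the selected organism supports. The call is guarded so that it only runs once the flag is switched on, the package is installed and the input files exist.

```r
# Hypothetical inputs: two ranked lists, one per omics layer (placeholder names).
args <- list(
  analyteListFiles = c("transcriptomics.rnk", "metabolomics.rnk"),
  analyteTypes     = c("genesymbol", "hmdb"),   # assumed ID types
  enrichMethod     = "GSEA",
  organism         = "hsapiens",
  enrichDatabase   = "pathway_KEGG",            # assumed database name
  listNames        = c("Transcriptomics", "Metabolomics"),
  isOutput         = FALSE
)

# One ID type and one name per input file.
stopifnot(
  length(args$analyteTypes) == length(args$analyteListFiles),
  length(args$listNames) == length(args$analyteListFiles)
)

run_analysis <- FALSE  # set to TRUE once real input files are in place
if (run_analysis &&
    requireNamespace("WebGestaltR", quietly = TRUE) &&
    all(file.exists(args$analyteListFiles))) {
  result <- do.call(WebGestaltR::WebGestaltRMultiOmics, args)
}
```

Collecting the arguments in a list first makes it easy to check that the per-list vectors have matching lengths before any request is sent.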
Arguments
- analyteLists
vector of the ID type of the corresponding interesting analyte list. The supported ID types of WebGestaltR for the selected organism can be found by the function listIdType. If the organism is others, users do not need to set this parameter. The length of analyteLists should be the same as the length of analyteListFiles or analyteLists.
- analyteListFiles
If enrichMethod is ORA, the extension of the analyteListFiles should be txt and each file can only contain one column: the interesting analyte list. If enrichMethod is GSEA, the extension of the analyteListFiles should be rnk and the files should contain two columns separated by tab: the analyte list and the corresponding scores.
- analyteTypes
A vector containing the ID types of the analyte lists.
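The two input layouts described for analyteListFiles can be produced with base R. The following is a minimal sketch that writes to a temporary directory; the gene symbols, scores and file names are made up, and the absence of a header row in the rnk file is an assumption rather than something stated on this page.

```r
# Toy data (hypothetical identifiers and scores, for illustration only).
genes  <- c("TP53", "BRCA1", "EGFR", "MYC")
scores <- c(2.31, -1.05, 0.47, -0.82)

out_dir <- tempdir()

# ORA input: a .txt file with a single column of analyte IDs.
ora_file <- file.path(out_dir, "interesting_genes.txt")
writeLines(genes, ora_file)

# GSEA input: a .rnk file with two tab-separated columns (ID, score),
# written without quotes and (by assumption) without a header row.
rnk_file <- file.path(out_dir, "ranked_genes.rnk")
write.table(
  data.frame(id = genes, score = scores),
  file = rnk_file, sep = "\t",
  quote = FALSE, row.names = FALSE, col.names = FALSE
)

# Read the files back to confirm the layout.
ora_check <- readLines(ora_file)
rnk_check <- read.delim(rnk_file, header = FALSE,
                        col.names = c("id", "score"),
                        stringsAsFactors = FALSE)
```

The resulting paths could then be passed in analyteListFiles, with one matching entry per file in analyteTypes.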
- enrichMethod
Enrichment methods: ORA or GSEA.
- organism
Currently, WebGestaltR supports 12 organisms. Users can use the function listOrganism to check available organisms. Users can also input others to perform the enrichment analysis for other organisms not supported by WebGestaltR. For other organisms, users need to provide the functional categories, interesting list and reference list (for ORA method). Because WebGestaltR does not perform the ID mapping for the other organisms, the above data should have the same ID type.
- enrichDatabase
The functional categories for the enrichment analysis. Users can use the function listGeneSet to check the available functional databases for the selected organism. Multiple databases in a vector are supported for ORA and GSEA.
- enrichDatabaseFile
Users can provide one or more GMT files as the functional categories for enrichment analysis. The extension of each file should be gmt. The first column of the file is the category ID and the second is the external link for the category. Genes annotated to the category start from the third column. All columns are separated by tabs. The GMT files will be combined with enrichDatabase.
- enrichDatabaseType
The ID type of the genes in the enrichDatabaseFile. If users set organism as others, users do not need to set this ID type because WebGestaltR will not perform ID mapping for other organisms. The supported ID types of WebGestaltR for the selected organism can be found by the function listIdType.
- enrichDatabaseDescriptionFile
Users can also provide description files for the custom enrichDatabaseFile. The extension of the description file should be des. The description file contains two columns: the first column is the category ID that should be exactly the same as the category ID in the custom enrichDatabaseFile and the second column is the description of the category. All columns are separated by tabs.
- collapseMethod
The method to collapse duplicate IDs with scores. mean, median, min and max represent the mean, median, minimum and maximum of scores for the duplicate IDs.
- minNum
WebGestaltR will exclude categories with fewer than minNum annotated genes from the enrichment analysis. The default is 10.
- maxNum
WebGestaltR will exclude categories with more than maxNum annotated genes from the enrichment analysis. The default is 500.
- fdrMethod
For the ORA method, WebGestaltR supports six FDR methods: holm, hochberg, hommel, bonferroni, BH and BY. The default is BH.
- sigMethod
Two methods of significance are available in WebGestaltR: fdr and top. fdr means the enriched categories are identified based on the FDR, and top means all categories are ranked by FDR and the top-ranked categories are selected as the enriched categories. The default is fdr.
- fdrThr
The significance threshold for the fdr method. The default is 0.05.
- topThr
The threshold for the top method. The default is 10.
- reportNum
The number of enriched categories visualized in the final report. The default is 100, as given in Usage. A larger reportNum may be slow to render in the report.
- setCoverNum
The number of expected gene sets after set cover to reduce redundancy. Fewer sets may be returned if the coverage reaches 100%. The default is 10.
- perNum
The number of permutations for the GSEA method. The default is 1000.
- gseaP
The exponential scaling factor of the phenotype score. The default is 1. When gseaP is 0, the enrichment score (ES) reduces to the standard Kolmogorov-Smirnov statistic (see the original paper for more details).
- isOutput
If isOutput is TRUE, WebGestaltR will create a folder named after the projectName and save the results in the folder. Otherwise, WebGestaltR will only return an R data.frame object containing the enrichment results. If hundreds of gene lists need to be analyzed simultaneously, it is better to set isOutput to FALSE. The default is TRUE.
- outputDirectory
The output directory for the results.
- projectName
The name of the project. If projectName is NULL, WebGestaltR will use a time stamp as the project name.
- dagColor
If dagColor is binary, the significant terms in the DAG structure will be colored steel blue for the ORA method, or steel blue (positively related) and dark orange (negatively related) for the GSEA method. If dagColor is continuous, the significant terms in the DAG structure will be colored by a color gradient based on the corresponding FDRs.
- saveRawGseaResult
Whether the raw result from GSEA is saved as an RDS file, which can be used for plotting. Defaults to FALSE. The list includes:
- Enrichment_Results
A data frame of GSEA results with statistics
- Running_Sums
A matrix of running sum of scores for each gene set
- Items_in_Set
A list with ranks of genes for each gene set
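To make the shape of that list concrete, the sketch below builds a toy object with the three named elements, saves it with saveRDS and reads it back. The dimensions, column names, values and file name are invented for illustration; the orientation of Running_Sums (here one column per gene set) and the columns of Enrichment_Results are assumptions, so it is worth inspecting a real file with str before relying on them.

```r
# Toy stand-in for a saved raw GSEA result (all values are made up).
raw <- list(
  Enrichment_Results = data.frame(
    geneSet = c("setA", "setB"),
    enrichmentScore = c(0.62, -0.41),
    stringsAsFactors = FALSE
  ),
  # Assumed layout: rows are ranked genes, columns are gene sets.
  Running_Sums = matrix(
    c(0.2, 0.5, 0.62, 0.3, 0.0,
      -0.1, -0.41, -0.2, -0.1, 0.0),
    ncol = 2, dimnames = list(NULL, c("setA", "setB"))
  ),
  Items_in_Set = list(setA = c(1L, 2L, 3L), setB = c(2L, 4L))
)

rds_file <- file.path(tempdir(), "toy_gsea_raw.rds")  # hypothetical file name
saveRDS(raw, rds_file)
loaded <- readRDS(rds_file)

# Row index at which each gene set's running sum is most extreme.
peak_pos <- apply(abs(loaded$Running_Sums), 2, which.max)
```

With a real result, the same pattern of readRDS followed by indexing into Running_Sums could feed a custom enrichment plot.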
- gseaPlotFormat
The graphic format of GSEA enrichment plots. Either svg, png, or c("png", "svg"). The default in Usage is png.
- nThreads
The number of cores to use for GSEA and set cover, and in batch function.
- cache
A directory to save data cache for reuse. Defaults to NULL, which disables caching.
- hostName
The server URL for accessing data. Mostly for development purposes.
- useWeightedSetCover
Use weighted set cover for ORA. Defaults to TRUE.
- useAffinityPropagation
Use affinity propagation for ORA. Defaults to FALSE.
- usekMedoid
Use k-medoid for ORA. Defaults to FALSE, as given in Usage.
- kMedoid_k
The number of clusters for k-medoid. Defaults to 25.
- isMetaAnalysis
Whether to perform meta-analysis. Defaults to TRUE.
- mergeMethod
The method to merge the results from multiple omics (options: mean, max). Only used if isMetaAnalysis = FALSE. Defaults to mean.
- normalizationMethod
The method to normalize the results from multiple omics (options: rank, median, mean). Only used if isMetaAnalysis = FALSE.
- referenceLists
For the ORA method, users can also use an R object as the reference gene list. referenceLists should be an R vector object containing the reference gene list.
- referenceListFiles
For the ORA method, users need to provide the reference gene list. The extension of each file in referenceListFiles should be txt and each file can only contain one column: the reference gene list.
- referenceTypes
Vector of the ID types of the reference lists. The supported ID types of WebGestaltR for the selected organism can be found by the function listIdType. If the organism is others, users do not need to set this parameter.
- listNames
The names of the analyte lists.
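
The interplay of fdrMethod, sigMethod, fdrThr and topThr described above can be illustrated with base R. This is a sketch of the general idea using stats::p.adjust on made-up p-values, not WebGestaltR's internal code; the category names are hypothetical, the strict "less than" comparison is an assumption, and the small topThr is chosen only to keep the example short.

```r
# Made-up raw p-values for six hypothetical categories.
p_raw <- c(catA = 0.0004, catB = 0.003, catC = 0.012,
           catD = 0.04, catE = 0.2, catF = 0.6)

# p.adjust accepts the six method names listed for fdrMethod.
methods <- c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY")
adjusted <- sapply(methods, function(m) p.adjust(p_raw, method = m))

# Use the BH-adjusted values, matching the documented default.
fdr <- adjusted[, "BH"]

# sigMethod = "fdr": keep categories whose adjusted value is below fdrThr.
fdr_thr <- 0.05
enriched_fdr <- names(fdr)[fdr < fdr_thr]

# sigMethod = "top": rank by adjusted value and keep the first topThr categories.
top_thr <- 2
enriched_top <- names(sort(fdr))[seq_len(min(top_thr, length(fdr)))]
```

The fdr rule returns a variable number of categories depending on the data, while the top rule always returns up to topThr categories regardless of how large their adjusted values are.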