Perform multi-omics analysis using WebGestaltR
Usage
WebGestaltRMultiOmics(
analyteLists = NULL,
analyteListFiles = NULL,
analyteTypes = NULL,
enrichMethod = "ORA",
organism = "hsapiens",
enrichDatabase = NULL,
enrichDatabaseFile = NULL,
enrichDatabaseType = NULL,
enrichDatabaseDescriptionFile = NULL,
collapseMethod = "mean",
minNum = 10,
maxNum = 500,
fdrMethod = "BH",
sigMethod = "fdr",
fdrThr = 0.05,
topThr = 10,
reportNum = 100,
setCoverNum = 10,
perNum = 1000,
gseaP = 1,
isOutput = TRUE,
outputDirectory = getwd(),
projectName = NULL,
dagColor = "binary",
saveRawGseaResult = FALSE,
gseaPlotFormat = "png",
nThreads = 1,
cache = NULL,
hostName = "https://www.webgestalt.org/",
useWeightedSetCover = TRUE,
useAffinityPropagation = FALSE,
usekMedoid = FALSE,
kMedoid_k = 25,
isMetaAnalysis = TRUE,
mergeMethod = "mean",
normalizationMethod = "rank",
referenceLists = NULL,
referenceListFiles = NULL,
referenceTypes = NULL,
listNames = NULL
)
Arguments
- analyteLists
vector of the ID type of the corresponding interesting analyte list. The supported ID types of WebGestaltR for the selected organism can be found by the function listIdType. If the organism is others, users do not need to set this parameter. The length of analyteLists should be the same as the length of analyteListFiles or analyteLists.
- analyteListFiles
If enrichMethod is ORA, the extension of the analyteListFiles should be txt and each file can only contain one column: the interesting analyte list. If enrichMethod is GSEA, the extension of the analyteListFiles should be rnk and the files should contain two columns separated by tab: the analyte list and the corresponding scores.
- analyteTypes
a vector containing the ID types of the analyte lists.
- enrichMethod
Enrichment methods: ORA or GSEA.
- organism
Currently, WebGestaltR supports 12 organisms. Users can use the function listOrganism to check available organisms. Users can also input others to perform the enrichment analysis for other organisms not supported by WebGestaltR. For other organisms, users need to provide the functional categories, interesting list and reference list (for ORA method). Because WebGestaltR does not perform the ID mapping for the other organisms, the above data should have the same ID type.
- enrichDatabase
The functional categories for the enrichment analysis. Users can use the function listGeneSet to check the available functional databases for the selected organism. Multiple databases in a vector are supported for ORA and GSEA.
- enrichDatabaseFile
Users can provide one or more GMT files as the functional categories for enrichment analysis. The extension of each file should be gmt. The first column of the file is the category ID and the second column is the external link for the category. Genes annotated to the category start from the third column. All columns are separated by tabs. The GMT files will be combined with enrichDatabase.
- enrichDatabaseType
The ID type of the genes in the enrichDatabaseFile. If users set organism as others, users do not need to set this ID type because WebGestaltR will not perform ID mapping for other organisms. The supported ID types of WebGestaltR for the selected organism can be found by the function listIdType.
- enrichDatabaseDescriptionFile
Users can also provide description files for the custom enrichDatabaseFile. The extension of the description file should be des. The description file contains two columns: the first column is the category ID, which should be exactly the same as the category ID in the custom enrichDatabaseFile, and the second column is the description of the category. All columns are separated by tabs.
- collapseMethod
The method to collapse duplicate IDs with scores. mean, median, min and max represent the mean, median, minimum and maximum of scores for the duplicate IDs.
- minNum
WebGestaltR will exclude the categories with the number of annotated genes less than minNum for enrichment analysis. The default is 10.
- maxNum
WebGestaltR will exclude the categories with the number of annotated genes larger than maxNum for enrichment analysis. The default is 500.
- fdrMethod
For the ORA method, WebGestaltR supports six FDR methods: holm, hochberg, hommel, bonferroni, BH and BY. The default is BH.
- sigMethod
Two methods of significance are available in WebGestaltR: fdr and top. fdr means the enriched categories are identified based on the FDR, and top means all categories are ranked by FDR and the top-ranked categories are selected as the enriched categories. The default is fdr.
- fdrThr
The significance threshold for the fdr method. The default is 0.05.
- topThr
The threshold for the top method. The default is 10.
- reportNum
The number of enriched categories visualized in the final report. The default is 100. A larger reportNum may be slow to render in the report.
- setCoverNum
The number of expected gene sets after set cover to reduce redundancy. Fewer sets may be returned if the coverage reaches 100%. The default is 10.
- perNum
The number of permutations for the GSEA method. The default is 1000.
- gseaP
The exponential scaling factor of the phenotype score. The default is 1. When p=0, ES reduces to standard K-S statistics (see original paper for more details).
- isOutput
If isOutput is TRUE, WebGestaltR will create a folder named after the projectName and save the results in the folder. Otherwise, WebGestaltR will only return an R data.frame object containing the enrichment results. If hundreds of gene lists need to be analyzed simultaneously, it is better to set isOutput to FALSE. The default is TRUE.
- outputDirectory
The output directory for the results.
- projectName
The name of the project. If projectName is NULL, WebGestaltR will use a timestamp as the project name.
- dagColor
If dagColor is binary, the significant terms in the DAG structure will be colored steel blue for the ORA method, or steel blue (positively related) and dark orange (negatively related) for the GSEA method. If dagColor is continuous, the significant terms in the DAG structure will be colored by a color gradient based on the corresponding FDRs.
- saveRawGseaResult
Whether the raw result from GSEA is saved as an RDS file, which can be used for plotting. Defaults to FALSE. The list includes:
  - Enrichment_Results
    A data frame of GSEA results with statistics
  - Running_Sums
    A matrix of running sum of scores for each gene set
  - Items_in_Set
    A list with ranks of genes for each gene set
- gseaPlotFormat
The graphic format of GSEA enrichment plots. Either svg, png (default), or c("png", "svg").
- nThreads
The number of cores to use for GSEA and set cover, and in batch function.
- cache
A directory to save data cache for reuse. Defaults to NULL, which disables caching.
- hostName
The server URL for accessing data. Mostly for development purposes.
- useWeightedSetCover
Use weighted set cover for ORA. Defaults to TRUE.
- useAffinityPropagation
Use affinity propagation for ORA. Defaults to FALSE.
- usekMedoid
Use k-medoid for ORA. Defaults to FALSE.
- kMedoid_k
The number of clusters for k-medoid. Defaults to 25.
- isMetaAnalysis
Whether to perform meta-analysis. Defaults to TRUE.
- mergeMethod
The method to merge the results from multiple omics (options: mean, max). Only used if isMetaAnalysis = FALSE. Defaults to mean.
- normalizationMethod
The method to normalize the results from multiple omics (options: rank, median, mean). Only used if isMetaAnalysis = FALSE.
- referenceLists
For the ORA method, users can also use an R object as the reference gene list. referenceLists should be an R vector object containing the reference gene list.
- referenceListFiles
For the ORA method, users need to provide the reference gene list. The extension of each file in referenceListFiles should be txt and each file can only contain one column: the reference gene list.
- referenceTypes
Vector of the ID types of the reference lists. The supported ID types of WebGestaltR for the selected organism can be found by the function listIdType. If the organism is others, users do not need to set this parameter.
- listNames
The names of the analyte lists.
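
Examples
The file formats described above can be illustrated with a minimal sketch in base R. Everything here is a toy assumption made for illustration: the gene symbols, scores, set names (SET_1, SET_2), example.org links, file names, and the run_webgestalt switch are hypothetical and not part of WebGestaltR. The call at the end uses only arguments from the Usage section, with organism set to others so that no ID mapping is involved. It is left disabled by default and has not been verified against a live installation.

```r
# Work in a temporary directory so nothing is written to the user's project.
work_dir <- file.path(tempdir(), "multiomics_sketch")
dir.create(work_dir, showWarnings = FALSE, recursive = TRUE)

# Hypothetical toy analyte lists, all in the same ID space (gene symbols).
list_a <- c("TP53", "BRCA1", "EGFR", "MYC")
list_b <- c("TP53", "PTEN", "KRAS", "MYC")
reference <- unique(c(list_a, list_b, "GAPDH", "ACTB", "ALB", "INS"))

# ORA inputs: one-column files with the extension txt.
analyte_files <- file.path(work_dir, c("omics_a.txt", "omics_b.txt"))
writeLines(list_a, analyte_files[1])
writeLines(list_b, analyte_files[2])

# One reference file per analyte list (same toy reference reused here).
reference_files <- file.path(work_dir, c("ref_a.txt", "ref_b.txt"))
for (f in reference_files) writeLines(reference, f)

# GSEA input: two tab-separated columns (ID, score) with the extension rnk.
rnk_file <- file.path(work_dir, "omics_a.rnk")
write.table(
  data.frame(id = list_a, score = c(2.1, -0.4, 1.3, 0.7)),
  rnk_file, sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE
)

# Custom GMT file: category ID, external link, then the annotated genes.
gmt_file <- file.path(work_dir, "toy_sets.gmt")
writeLines(c(
  paste(c("SET_1", "http://example.org/SET_1", "TP53", "MYC", "GAPDH"),
        collapse = "\t"),
  paste(c("SET_2", "http://example.org/SET_2", "PTEN", "KRAS", "ALB"),
        collapse = "\t")
), gmt_file)

# Description file: category ID and description, extension des.
des_file <- file.path(work_dir, "toy_sets.des")
write.table(
  data.frame(id = c("SET_1", "SET_2"),
             description = c("Toy set one", "Toy set two")),
  des_file, sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE
)

# Arguments for an ORA run; minNum is lowered because the toy sets are tiny.
args <- list(
  analyteListFiles = analyte_files,
  referenceListFiles = reference_files,
  enrichMethod = "ORA",
  organism = "others",
  enrichDatabaseFile = gmt_file,
  enrichDatabaseDescriptionFile = des_file,
  minNum = 2,
  isOutput = FALSE,
  listNames = c("omics_a", "omics_b")
)

# Hypothetical switch: set to TRUE only where WebGestaltR is installed.
run_webgestalt <- FALSE
if (run_webgestalt && requireNamespace("WebGestaltR", quietly = TRUE)) {
  result <- do.call(WebGestaltR::WebGestaltRMultiOmics, args)
}
```

For a GSEA run, the same sketch would pass rnk files in analyteListFiles and set enrichMethod to GSEA. Reference lists are only described for the ORA method.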