User Manual for Ontology Enrichment Analysis

OAT User Manual

Overview
Input Data
Statistical Analysis
Dynamic Analysis
- Disambiguation Input Genes
- Planteome Query API
Analysis Results
- Analysis Results Table
- Visualizations

Overview

The Ontology Enrichment Analysis Tool (OAT) supports two ways to conduct the ontology enrichment analysis: Statisticsl analysis and Dynamic analysis. The statistic analysis allow users to find enriched ontology terms based on user's input annotations. The dynamic analysis will find the enriched ontology terms from the database of Planteome. OAT supplies three main statistic methods to calculate the significant p-values for each ontology terms associated with the input gene list. After finding the significant enriched ontolog terms from supported ontology categories, such as GO:CC, GO:MF, GO:BP, PO, TO, EO, the OAT provides three types of visualizations to help users to analyze the enriched terms.

Input Data

For Statistical analysis, the users need to input a list of annotations in form of

Gene_ID Ontology_Term

which are two strings separated by spaces or tabs to be the Reference Background. Then a list of "Gene_ID" should be input in the Gene List.

For example, the string "fakeGeneID_0001 GO:0000001" can be a valid input annotation. It is suggested that different annotations should be separated by line breaks.

For Dynamic analysis, the users need to input a list of interesting genes. The genes can be specified by the name of the gene product (e.g. AT1G65620 )or by the unique database ID of the gene product (e.g. TAIR:locus:2034163).

The user can switch between different analysis method by selecting from the Query Type.

SwitchAnalysisMethod

Besides switching between querying from different databases, from the Setting panel, users can also set parameters for the statistic analysis such as statistic analysis method, the species(taxon) of the input genes, the cut-off significant p-value, and the specific ontology database.

Statistical Analysis

The statistical analysis allow users to find enriched ontology terms based on user's input annotations. Based on user's input data, OAT can find the ontology terms which are significant enriched by the input gene list.

staticAnalysis

Dynamic Analysis

The dynamic analysis will find the enriched ontology terms from the database of Planteome. The ontology curators keep updating the annotation database to ensure it contains the most comprehensive ontology information.

After inputitng the interested gene list, the users can process the analysis by following buttons.

buttons

Submit will submit the input gene list to disambiguation and create the final input gene list.

Analyze will submit the disambiguated input gene list to OAT. OAT utilizes the Planteome Query APIs to conduct the significant analysis with considering the user's input parameters. A table of the analysis results will be shown to

Visualize will create a new page in which all the enriched terms will be used to create visualzation results to help users to study the structure of the enriched ontology terms.

Disambiguation Input Genes

Since different genes can have the same name, to conduct accurate enrichment analysis, OAT requires users to specify the intended input genes from all the ambiguous input gene names. To avoid this trivial procedure, unique gene IDs are encouraged to be used.

In the process of disambiguation, all the genes not appear in the database will be listed as following:

unre

Recognized genes with ambiguities will be listed to allow users to select the prefered ones.

disam

Planteome Query API

Planteome provides a series of APIs to query the ontology data, annotation data, and gene product data.

Analysis Results

Analysis Results Table

All the enriched ontology terms are listed in a table for users to study details of each ontology terms. OAT provides convenient functionalities such as downloading the table, sorting the table, linking ontology terms to Planteome database, filtering the data based on ontology categories, and quick search among analysis results.

table

Visualizations

Three types of visualizations are provided in OAT, they are

Network Visualization
Hierarchical Visualization
Matrix Visualization

The selected nodes in the network visualization and hierarchical visualization are shown as following.

unre

Users can click the the shown parent nodes and children nodes to locate them in these two visualizations quickly.

Color Scheme

Each ontology node is assigned a color based on its significant p-value. Higher significance (less p-value) results in more red, less significant (higher p-value) ones will be colored in yellow.

Search and Locate Node

A quick search functionality is provided so that the users can search among all the enriched nodes.

The locating is requires the input of a valid Ontology ID. After typing some strings in the search input box, the OAT will list all the ontology terms whose name contains the input string.

unre

Aftering selcting from all the listed ontology terms. The ontology ID will be automatically shown in the input box. By clicking the button Go, the node in the input box will be centered in the canvas.

Network Visualization

All the nodes of the enriched ontology terms are located in the canvas based on force-directed method. The initial layout put all the nodes in a cirle, after clicking the button Resume Animation, the nodes will be relocated based on their adjacency infomation. This procedure will be animated. Note that this animation can be slow when there are significant number of enriched terms.

unre

Hierarchical Visualization

The nodes in one ontology category is clustered together and distributed hierarchically.

unre

Matrix Visualization

An annotation matrix is displayed to show which genes are associated with which enriched ontology terms. OAT allows sorting the rows and columns of the matrix based on different schemes.

unre

OAT User Manual

Contents

Overview

Input Data

Statistical Analysis

Dynamic Analysis

Disambiguation Input Genes

Planteome Query API

Analysis Results

Analysis Results Table

Visualizations

Color Scheme

Search and Locate Node

Network Visualization

Hierarchical Visualization

Matrix Visualization