In the new age of comparative plant biology, researchers face datasets from numerous inter- and intra-specific comparative analyses spanning transcriptomics, proteomics, phenomics, and genome annotation projects. These experiments may describe, for example, a set of genes from one or more plant species that are differentially expressed in response to a treatment. These genes may be associated with phenotypes and molecular functions, in addition to various gene and protein features. For a researcher, the value of these data comes from analyzing them. Unfortunately, the data are scattered across many online repositories and are annotated with different vocabularies and keywords, so descriptions often do not match between resources. The problem can be solved in two ways: (1) keep the data in different locations, but annotate them with common reference vocabularies (ontologies) that can be queried in real time using common query terms, and/or (2) keep the data in a centralized location and resolve the conflicting descriptions by adopting a single standard. Considering the limited resources and the enormous amount of data distributed across many sites, the cROP project takes an integrated approach, adopting common annotation standards and a set of reference ontologies for plants.
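As a rough illustration of approach (1), the sketch below keeps two repositories separate, maps each repository's local keywords to shared ontology term IDs, and answers one query across both. Everything in it is hypothetical: the repository names, gene names, keywords, and the `EX:` term IDs are placeholders invented for the example, not real accessions or any actual cROP interface.

```python
from dataclasses import dataclass

# Hypothetical shared ontology term IDs (placeholders, not real accessions).
LEAF = "EX:0000001"
ROOT = "EX:0000002"


@dataclass(frozen=True)
class Record:
    gene: str
    keyword: str  # free-text label used by the local repository


# Each repository keeps its own data and its own local vocabulary.
REPOS = {
    "repo_a": [Record("geneA1", "leaf"), Record("geneA2", "root")],
    "repo_b": [Record("geneB1", "foliar tissue"), Record("geneB2", "roots")],
}

# Per-repository mapping from local keywords to the shared term IDs.
MAPPINGS = {
    "repo_a": {"leaf": LEAF, "root": ROOT},
    "repo_b": {"foliar tissue": LEAF, "roots": ROOT},
}


def query(term_id):
    """Return (repository, gene) pairs annotated with the shared term."""
    hits = []
    for name, records in REPOS.items():
        mapping = MAPPINGS[name]
        for rec in records:
            # Unmapped keywords yield None and never match a term ID.
            if mapping.get(rec.keyword) == term_id:
                hits.append((name, rec.gene))
    return hits


if __name__ == "__main__":
    print(query(LEAF))
```

The data never leave their home repositories; only the keyword-to-term mappings need to agree, which is why shared reference ontologies make this approach workable.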

Coordination with Outside Groups

The cROP project is heavily collaborative and depends on coordination with several national and international projects. Existing and new collaborations are of the following four types:

  • Development and enrichment of new and existing reference ontologies for plants.
  • Ontology use. Annotating gene expression profiles and phenotypes in omics and Tree of Life projects and sharing the annotations with the cROP data warehouse.
  • Ontology cross-references. Project collaborators will continue to use the ontologies developed as part of the cROP initiatives as reference vocabularies for mapping terms from similar vocabularies described or already in use by various plant databases and individual research projects, thereby creating the semantic web of ontologies required for plant biology.
  • Development of data annotation standards. We will collaborate closely with the various plant genome sequencing and annotation projects, as well as with comparative genomics databases and related resources such as Gramene, InterPro, UniProt, EBI-ArrayExpress/ATLAS, PlantEnsembl, PLEXdb, DOE-KnowledgeBase, NSF-Phenotype-RCN, iPlant, etc.