GO Tools: Text Mining

agriGO

Platform
Online tool
Developer
Bioinformatics Center, China Agricultural University, Beijing, China.
Contact
Zhen Su
Publications
PMID:20435677
License
Free for academic use
GO data updates
weekly (or more frequently)
GO data used
  • terms
  • definitions and comments
  • synonyms
  • cross-references
  • relationships
  • subsets or GO slims
  • gene product data
  • taxon
  • evidence codes
  • references
  • qualifiers

Gene Ontology (GO), the de facto standard in gene functionality description, is used widely in functional annotation and enrichment analysis. Here, we introduce agriGO, an integrated web-based GO analysis toolkit for the agricultural community, using the advantages of our previous GO enrichment tool (EasyGO), to meet analysis demands from new technologies and research objectives. EasyGO is valuable for its proficiency, and has proved useful in uncovering biological knowledge in massive data sets from high-throughput experiments. For agriGO, the system architecture and website interface were redesigned to improve performance and accessibility. The supported organisms and gene identifiers were substantially expanded (including 38 agricultural species composed of 274 data types). The requirement on user input is more flexible, in that user-defined reference and annotation are accepted. Moreover, a new analysis approach using Gene Set Enrichment Analysis strategy and customizable features is provided. Four tools, SEA (Singular enrichment analysis), PAGE (Parametric Analysis of Gene set Enrichment), BLAST4ID (Transfer IDs by BLAST) and SEACOMPARE (Cross comparison of SEA), are integrated as a toolkit to meet different demands. We also provide a cross-comparison service so that different data sets can be compared and explored in a visualized way. Lastly, agriGO functions as a GO data repository with search and download functions; agriGO is publicly accessible at http://bioinfo.cau.edu.cn/agriGO/.

  • ontology or annotation browser
  • ontology or annotation visualization
  • database or data warehouse
  • statistical analysis
  • term enrichment
  • text mining

Tool listing last updated 02 June 2010

Bioconductor

Platform
Windows, Mac OS X, Linux, Unix
Developer
BioConductor.
Contact
Bioconductor webmaster
License
Free for academic use
GO data updates
every three months (or more frequently)
GO data used
  • terms
  • definitions and comments
  • synonyms
  • cross-references
  • relationships
  • subsets or GO slims
  • gene product data
  • taxon
  • evidence codes
  • references
  • qualifiers
Open Source (OSI)

Bioconductor provides tools for the analysis and comprehension of high-throughput (microarray, sequence, flow, etc.) genomic data. Bioconductor uses the R statistical programming language, and is open source and open development. There are more than 400 core and user-contributed packages. Bioconductor packages the GO ontology into its semi-annual release, with software tools to: query; join with diverse additional gene, microarray, and sequence annotations; incorporate GO into annotation, differential expression, and gene set enrichment workflows; and visualize.

  • ontology or annotation browser
  • ontology or annotation search engine
  • ontology or annotation visualization
  • database or data warehouse
  • software library
  • statistical analysis
  • term enrichment
  • text mining
  • other analysis
  • Flexible integration of GO into statistical analysis and comprehension of high-throughput genetic data.

Tool listing last updated 04 February 2011

Database for Annotation, Visualization and Integrated Discovery (DAVID)

Platform
Online tool
Developer
National Institute of Allergy and Infectious Diseases, Bethesda, Maryland, USA.
Contact
DAVID bioinformatics team
Publications
PMID:12734009
PMID:19131956
License
Free for academic use
GO data updates
no fixed update schedule
GO data used
  • terms
  • definitions and comments
  • synonyms
  • cross-references
  • relationships
  • subsets or GO slims
  • gene product data
  • taxon
  • evidence codes
  • references
  • qualifiers

Database for Annotation, Visualization and Integrated Discovery (DAVID) provides a comprehensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes, which are usually derived from high-throughput experiments such as microarray and proteomic studies. By 2010, DAVID tools had been cited in over 2,000 publications.

  • term enrichment
  • text mining

Tool listing last updated 11 January 2011

Gene Ontology For Functional Analysis (GOFFA)

Platform
Windows, Mac OS X, Linux, Unix
Developer
National Center for Toxicological Research (NCTR), Food and Drug Administration, Jefferson, Arkansas, USA.
Contact
Don Ding
Publications
PMID:17118145
License
Free for academic use
GO data updates
monthly (or more frequently)
GO data used
  • terms
  • definitions and comments
  • synonyms
  • cross-references
  • relationships
  • subsets or GO slims
  • gene product data
  • taxon
  • evidence codes
  • references
  • qualifiers

Gene Ontology For Functional Analysis (GOFFA) is a tool developed for ArrayTrack that takes a list of genes and identifies terms in Gene Ontology associated with those genes. GOFFA provides tools to view/access the following:
  • GO term hierarchy
  • Full listing of GO terms, each annotated with the genes associated with that term
  • Fisher's exact test p-value, giving the probability of identifying at least that many genes for a given term by chance alone
  • Relative enrichment factor (E-value), giving the enrichment of a GO term for genes in the submitted list relative to the frequency of genes assigned to that term in the full set of GOFFA-annotated genes for a particular species
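
The two statistics listed above can be illustrated with a minimal Python sketch. It is not GOFFA's code: the function names, the one-sided (over-representation) form of Fisher's exact test, and the exact enrichment formula (fraction of submitted genes carrying the term divided by the fraction of all annotated genes carrying it) are assumptions made for illustration. Here k is the number of submitted genes annotated with the term, n the size of the submitted list, K the number of annotated genes carrying the term, and N the total number of annotated genes for the species.

```python
from math import comb


def fisher_right_tail(k, n, K, N):
    """One-sided Fisher's exact test p-value: P(X >= k).

    X counts how many of n genes drawn without replacement from N
    carry a term that K of the N genes are annotated with.
    """
    total = comb(N, n)
    upper = min(n, K)
    # Sum the hypergeometric probabilities of k, k+1, ..., upper hits.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrichment_factor(k, n, K, N):
    """Assumed E-value form: term frequency in the list over background frequency."""
    return (k / n) / (K / N)


if __name__ == "__main__":
    # Hypothetical numbers: 5 of 10 submitted genes carry a term
    # that 50 of 1000 annotated genes carry.
    print(fisher_right_tail(5, 10, 50, 1000))
    print(enrichment_factor(5, 10, 50, 1000))
```

An enrichment factor above 1 means the term is more frequent in the submitted list than in the background; the p-value indicates how likely an overlap at least that large would be by chance.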

  • ontology or annotation browser
  • ontology or annotation search engine
  • ontology or annotation visualization
  • database or data warehouse
  • term enrichment
  • text mining

Tool listing last updated 26 September 2011

GeneMerge

Platform
Online tool, Windows, Mac OS X, Linux, Unix
Developer
Castillo-Davis Laboratory, University of Maryland, Maryland, USA.
Contact
Dr. Cristian Castillo-Davis
Publications
PMID:12724301
License
Free for academic use
GO data updates
every three months (or more frequently)
GO data used
  • terms
  • definitions and comments
  • synonyms
  • cross-references
  • relationships
  • subsets or GO slims
  • gene product data
  • taxon
  • evidence codes
  • references
  • qualifiers
Open Source (OSI)

GeneMerge is a web-based and standalone application that returns a wide range of functional genomic data for a given set of study genes and provides rank scores for over-representation of particular functions or categories in the data. GeneMerge uses the hypergeometric test, which returns statistically correct results for samples of all sizes, and is the second-fastest GO tool available according to Khatri and Draghici (2005). GeneMerge can be used with any discrete, locus-based annotation data, including literature references, genetic interactions, and mutant phenotypes, as well as traditional Gene Ontology queries.
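
As a rough illustration of ranking terms by over-representation, the Python sketch below scores each annotation term with a hypergeometric tail probability and applies a Bonferroni correction. It is not GeneMerge's implementation: the function names, the input layout (a dictionary mapping each population gene to its set of terms), and the choice to correct by the number of terms found in the study set are all assumptions made for this example.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def rank_terms(study_genes, annotations):
    """Return (term, raw_p, bonferroni_p) tuples sorted by raw p-value.

    annotations: dict mapping every population gene to a set of terms.
    study_genes: iterable of genes; those absent from annotations are ignored.
    """
    study = set(study_genes) & set(annotations)
    N, n = len(annotations), len(study)

    # Count term occurrences in the population and in the study set.
    pop_counts, study_counts = {}, {}
    for gene, terms in annotations.items():
        for term in terms:
            pop_counts[term] = pop_counts.get(term, 0) + 1
            if gene in study:
                study_counts[term] = study_counts.get(term, 0) + 1

    n_tests = len(study_counts)  # only terms seen in the study set are tested
    results = []
    for term, k in study_counts.items():
        p = hypergeom_tail(k, n, pop_counts[term], N)
        results.append((term, p, min(1.0, p * n_tests)))
    return sorted(results, key=lambda row: row[1])


if __name__ == "__main__":
    # Hypothetical toy data: six genes, two terms.
    annotations = {
        "g1": {"termA", "termB"}, "g2": {"termA"}, "g3": {"termB"},
        "g4": {"termB"}, "g5": {"termB"}, "g6": set(),
    }
    for row in rank_terms(["g1", "g2"], annotations):
        print(row)
```

Because the input is just a gene-to-label mapping, the same scoring applies to any discrete, locus-based annotation, not only GO terms.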

  • statistical analysis
  • slimmer-type tool
  • term enrichment
  • text mining
  • false discovery rate and Bonferroni correction

Tool listing last updated 14 January 2011

GoAnnotator

Platform
Online tool
Developer
Faculty of Sciences, University of Lisbon, Lisbon, Portugal.
Contact
Francisco M Couto
Publications
PMID:17181854
License
Free for academic use
GO data updates
every three months (or more frequently)
GO data used
  • terms
  • definitions and comments
  • synonyms
  • cross-references
  • relationships
  • subsets or GO slims
  • gene product data
  • taxon
  • evidence codes
  • references
  • qualifiers

GoAnnotator is a tool for assisting the GO annotation of UniProt entries by linking the GO terms present in the uncurated annotations with evidence text automatically extracted from the documents linked to UniProt entries.

  • text mining

Tool listing last updated 04 June 2010

TXTGate

Platform
Online tool
Developer
Bioinformatics group, ESAT / K. U. Leuven, Belgium.
Publications
PMID:15186494
License
Free for academic use

TXTGate is a web service that combines literature indices of selected public biological resources in a flexible text-mining system designed for the analysis of groups of genes. Selected textual fields and MEDLINE abstracts from LocusLink and SGD are indexed using tailored vocabularies. Subclustering and links to external resources allow for an in-depth analysis of the resulting term profiles.

  • text mining

Tool listing submitted before 2009; tool may be unsupported or inactive.