GO Annotation Tools
The following tools make use of the GO ontologies or the gene associations provided by Consortium members. Being listed on this page does not represent an endorsement by the GO Consortium, nor has the Consortium tested the tool or found that it uses the Consortium information accurately. This page is provided to promote an exchange of information between users and software developers.
Unless stated otherwise, tools are free for academic use.
Blast2GO (B2G) combines similarity-search-based GO annotation and functional analysis in a single application. B2G supports direct statistical analysis of gene function information and visualization of relevant functional features on a highlighted GO directed acyclic graph (DAG). B2G also includes various statistical charts summarizing the results obtained at the BLAST, GO-mapping, annotation and enrichment analysis (Fisher's Exact Test) steps. All analysis steps are configurable, and data import and export are supported at any stage. The application also accepts pre-existing BLAST or annotation files and passes them on to subsequent steps. The tool is well suited to high-throughput functional genomics research in non-model species. B2G is a species-independent, interactive desktop application that lets users monitor and understand the whole annotation and analysis process, supported by additional features such as GO Slim integration, evidence code (EC) consideration, a Batch-Mode and GO-Multilevel-Pies.
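As an illustration of the kind of enrichment analysis mentioned above, the following is a minimal sketch of a one-sided Fisher's Exact Test for over-representation of a single GO term. It is not Blast2GO's code; the function name, its parameters and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(study_hits, study_size, population_hits, population_size):
    """One-sided Fisher's exact test (upper tail of the hypergeometric distribution).

    Returns the probability of seeing at least `study_hits` annotated genes
    in a study set of `study_size` genes drawn from a population of
    `population_size` genes, of which `population_hits` carry the GO term.
    All names here are illustrative assumptions.
    """
    max_hits = min(study_size, population_hits)
    total = comb(population_size, study_size)
    tail = sum(
        comb(population_hits, k)
        * comb(population_size - population_hits, study_size - k)
        for k in range(study_hits, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 3 of 10 genes carry the term, and all 3 are in the study set.
p_all_three = enrichment_p_value(3, 3, 3, 10)
```

A small p-value suggests that the term is over-represented in the study set. In practice one test is run per GO term, so the resulting p-values need a multiple testing correction.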
g:Profiler is a public web server for characterising and manipulating gene lists resulting from mining high-throughput genomic data. g:Profiler has a simple, user-friendly web interface with powerful visualisation for capturing Gene Ontology, pathway, or transcription factor binding site enrichments down to individual gene levels. Besides standard multiple testing corrections, a new improved method for estimating the true effect of multiple testing over complex structures like GO has been introduced. Interpreting ranked gene lists is supported from the same interface with very efficient algorithms. Such ordered lists may arise when studying the most significantly affected genes from high-throughput data or genes co-expressed with the query gene. Other important aspects of practical data analysis are supported by modules tightly integrated with g:Profiler. These are: g:Convert for converting between different database identifiers; g:Orth for finding orthologous genes from other species; and g:Sorter for searching a large body of public gene expression data for co-expression. g:Profiler supports 31 different species, and underlying data is updated regularly from sources like the Ensembl database. Bioinformatics communities wishing to integrate with g:Profiler can use alternative simple textual outputs.
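For readers unfamiliar with the standard multiple testing corrections referred to above, here is a minimal sketch of one widely used correction, the Benjamini-Hochberg procedure. It is not g:Profiler's own improved method, which is not described here; the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, one per tested GO term.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Terms whose adjusted p-value falls below the chosen threshold would be reported as enriched.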
GeneTools is a collection of web-based tools that brings together information from a broad range of resources, and provides this in a manner particularly useful for genome-wide analyses. Today, the two main tools connected to this database are the NMC Annotation Database V2.0 and eGOn V2.0 (explore Gene Ontology). The NMC Annotation Database V2.0 provides information from UniGene, EntrezGene, SwissProt and Gene Ontology (GO).
Major features are:
- Single search/Batch search, extraction of data for single or batches of genes.
- Manage reporter lists: organize lists in folders and share selected lists with other users.
- Manual GO Annotation: add your own Gene Ontology (GO) annotations to genes of interest.
- Export: to Excel, text or XML format.
GOanna is used to find annotations for proteins using a similarity search. The input can be a list of IDs or a list of sequences in FASTA format. GOanna will retrieve the sequences if necessary and conduct the specified BLAST search against a user-specified database. The resulting file contains GO annotations of the top BLAST hits. The sequence alignments are also provided so the user can use them to assess the quality of each match.
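The idea of transferring GO annotations from top BLAST hits can be sketched as follows. This is an assumption-laden illustration, not GOanna's implementation: the function name, the tuple layout of the hits, the E-value cutoff and the sequence identifiers are all hypothetical.

```python
def transfer_annotations(blast_hits, subject_go, max_evalue=1e-10):
    """Collect GO terms from sufficiently good BLAST hits for each query.

    blast_hits: iterable of (query_id, subject_id, evalue) tuples (assumed layout).
    subject_go: dict mapping subject_id to a set of GO term IDs.
    max_evalue: hypothetical cutoff; hits with a larger E-value are ignored.
    """
    transferred = {}
    for query_id, subject_id, evalue in blast_hits:
        if evalue > max_evalue:
            continue  # hit is too weak to justify transferring annotations
        terms = subject_go.get(subject_id, set())
        transferred.setdefault(query_id, set()).update(terms)
    return transferred


# Toy input with made-up sequence identifiers.
hits = [("q1", "s1", 1e-50), ("q1", "s2", 1e-3), ("q2", "s2", 1e-20)]
subject_go = {"s1": {"GO:0005524"}, "s2": {"GO:0004672"}}
annotations = transfer_annotations(hits, subject_go)
```

Annotations transferred this way are only as reliable as the underlying match, which is why inspecting the alignments matters.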
GoAnnotator is a tool for verification of electronic protein annotations using GO terms automatically extracted from literature.
The Gene Ontology Categorizer (GOCat) is an automatic text categorizer. The tool classifies any input text (a few words, an abstract, a set of PubMed Identifiers...) into Gene Ontology categories. The system, originally developed for the first BioCreative evaluation campaign, aims to facilitate functional annotation of genes and gene products using text mining methods. For every predicted category, a confidence score and a short text passage, extracted from the input text, are provided. The interface can also be used to navigate the Gene Ontology through direct QuickGO links.
Institute for Molecular Bioscience, The University of Queensland, Australia
[Publication abstracts 1, 2]
Gene Ontology for Motifs (GOMO) is an alignment- and threshold-free comparative genomics approach for assigning functional roles to DNA regulatory motifs from DNA sequence. The algorithm detects associations between a user-specified DNA regulatory motif (expressed as a position weight matrix; PWM) and Gene Ontology terms.
The original method for predicting the roles of transcription factors (TFs) starts with a PWM motif describing the DNA-binding affinity of the TF. GOMO uses the PWM to score the promoter region of each gene in the genome for its likelihood to be bound by the TF. The resulting ‘affinity’ scores are then used to test each term in the Gene Ontology for association with high-scoring genes. The algorithm was subsequently extended to leverage conserved signals using multiple, related species in a comparative approach, which greatly improves the resulting annotations.
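To make the PWM scoring step more concrete, here is a minimal sketch that scores a promoter sequence with a log-odds matrix and keeps the best-scoring window. GOMO's actual affinity score is not specified here, so the best-window score is a stand-in; the function names, the uniform background and the toy motif are all assumptions.

```python
from math import log2


def log_odds_matrix(pwm, background=0.25):
    """Convert per-position base probabilities into log-odds scores.

    pwm: list of dicts, one per motif position, mapping A/C/G/T to a
    non-zero probability. A uniform background is assumed.
    """
    return [{base: log2(p / background) for base, p in column.items()} for column in pwm]


def best_window_score(sequence, log_odds):
    """Return the highest log-odds score over all motif-width windows."""
    width = len(log_odds)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif")
    return max(
        sum(log_odds[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    )


# Toy two-position motif that prefers "AT" (hypothetical values).
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
score = best_window_score("GGATCC", log_odds_matrix(toy_pwm))
```

Scoring every promoter in a genome this way yields a ranking of genes, which could then be tested for association with each GO term.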
GoPubMed is a web server which allows users to explore PubMed search results with the Gene Ontology. GoPubMed submits a user's keywords to PubMed, retrieves the abstracts, detects Gene Ontology terms in the abstracts, displays the subset of Gene Ontology relevant to the original query, and allows the user to browse through the ontology displaying associated papers and their GO annotation.
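The step of detecting Gene Ontology terms in abstracts can be illustrated with a deliberately naive sketch based on case-insensitive phrase matching. GoPubMed's real term detection is not described here and is presumably more sophisticated; the function name and the example abstract are hypothetical.

```python
import re


def detect_go_terms(abstract, term_names):
    """Return the sorted GO IDs whose term name occurs in the abstract.

    term_names: dict mapping GO ID to term name. Matching is a plain
    case-insensitive whole-phrase search, which is an assumption made
    for illustration only.
    """
    found = []
    for go_id, name in term_names.items():
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, abstract, flags=re.IGNORECASE):
            found.append(go_id)
    return sorted(found)


term_names = {"GO:0006281": "DNA repair", "GO:0006915": "apoptotic process"}
abstract = "We show that DNA repair is impaired in the mutant strain."
matches = detect_go_terms(abstract, term_names)
```

Grouping the retrieved abstracts by the detected GO IDs gives the ontology-based view of search results described above.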
GOtcha provides a prediction of a set of GO terms that can be associated with a given query sequence. Each term is scored independently, and the scores are calibrated against reference searches to give an accurate percentage likelihood of correctness. These results can be displayed graphically.
The tool is currently web-based; contact David Martin for details of the standalone version.
InGOt is a module for Inpharmatica's new modular system for protein annotation, a proprietary system for applying Gene Ontology to all proteins. It offers an unparalleled resource to elucidate protein function: its graphical user interface enables you to investigate links from summary data through to hierarchical context and literature/evidence links. InGOt has more sequences than any public resource and assignments harvested from the broadest possible GO-linked resources. InGOt can integrate with other modules such as Domain Professor for detailed protein domain annotation and Blu-Chip for instant and accurate probe to protein assignments.
Note that InGOt is proprietary software, but web-based licensed software is offered at discounted prices to academic users.
Databases of protein domains and functional sites have become vital resources for the prediction of protein functions. During the last decade, several signature-recognition methods have evolved to address different sequence analysis problems, resulting in rather different and, for the most part, independent databases. Diagnostically, these resources have different areas of optimum application owing to the different strengths and weaknesses of their underlying analysis methods. Thus, for best results, search strategies should ideally combine all of them. InterProScan is a Perl-based program which combines these different protein signature recognition methods into one resource.
The J. Craig Venter Institute
Manatee is a web-based gene evaluation and genome annotation tool; Manatee can store and view annotation for prokaryotic and eukaryotic genomes. The Manatee interface allows biologists to quickly identify genes and make high quality functional assignments, such as GO classifications, using search data, paralogous families, and annotation suggestions generated from automated analysis. Manatee can be downloaded and installed to run under the CGI area of a web server, such as Apache.
The Arabidopsis Information Resource
PubSearch is a web-based literature curation tool that allows curators to search articles and annotate genes with keywords from them. It has a simple MySQL database backend and uses a set of Java Servlets and JSPs for querying, modifying, and adding gene, gene-annotation, and literature information. PubSearch can be downloaded from GMOD.