Current Annotations
Annotation Details and Downloads
The gene association files submitted by GO Consortium members are shown in the tables below. Files are in the GO annotation file format and are compressed using the UNIX gzip utility. Please see the appropriate README file for further details on the annotation set. Any errors or omissions in annotations should be reported by writing to the GO helpdesk.
Ontology and annotation data are integrated in the MySQL and XML files. See the GO database guide for more information.
These files can also be downloaded via FTP; we recommend this method for the larger files, such as the UniProt dataset, as the web-based download may not work correctly.
Filtered Files
These files are taxon-specific and reflect the work of specific projects, primarily the model organism database groups, to provide comprehensive, non-redundant annotation files for their organism. All the files in this table have been filtered using the annotation file QC checks script. A major component of the filtering is the requirement that particular taxon IDs can only be included within the association files provided by specific projects; please see the list of the authoritative groups for the major model organisms.
Statistics as of June 16, 2013
| Species, Database | Gene Products Annotated | Annotations | Submission date (MM/DD/YYYY) | Download filtered files |
|---|---|---|---|---|
| Agrobacterium tumefaciens str. C58 PAMGO | 82 | 248 (248 non-IEA) | 9/24/2012 | |
| Arabidopsis thaliana TAIR | 30329 | 180388 (180285 non-IEA) | 6/14/2013 | |
| Aspergillus nidulans AspGD | 47607 | 206593 (88514 non-IEA) | 6/16/2013 | |
| Comprehensive Microbial Resource [multispecies] JCVI | 61682 | 154257 (154257 non-IEA) | 6/14/2013 | |
| Bos taurus GO Annotations @ EBI | 19926 | 121385 (13637 non-IEA) | 6/14/2013 | |
| Caenorhabditis elegans WormBase | 16715 | 103843 (60738 non-IEA) | 6/6/2013 | |
| Candida albicans CGD | 22780 | 156503 (27195 non-IEA) | 6/16/2013 | |
| Canis lupus familiaris GO Annotations @ EBI | 19013 | 81967 (4224 non-IEA) | 6/14/2013 | |
| Danio rerio ZFIN | 18436 | 135672 (31312 non-IEA) | 6/14/2013 | |
| Dickeya dadantii PAMGO | 124 | 296 (296 non-IEA) | 9/24/2012 | |
| Dictyostelium discoideum dictyBase | 8226 | 57799 (19955 non-IEA) | 6/2/2013 | |
| Drosophila melanogaster FlyBase | 13748 | 91728 (78187 non-IEA) | 6/14/2013 | |
| Escherichia coli EcoCyc & EcoliHub | 3745 | 34982 (10709 non-IEA) | 6/14/2013 | |
| Gallus gallus GO Annotations @ EBI | 11570 | 62658 (6335 non-IEA) | 6/14/2013 | |
| Homo sapiens GO Annotations @ EBI | 45240 | 365386 (194177 non-IEA) | 6/14/2013 | |
| Leishmania major Sanger GeneDB | 352 | 892 (892 non-IEA) | 6/14/2013 | |
| Magnaporthe grisea PAMGO | 11274 | 27618 (27618 non-IEA) | 2/15/2013 | |
| Mus musculus MGI | 25486 | 294676 (198929 non-IEA) | 6/14/2013 | |
| Oomycetes PAMGO | 30 | 126 (126 non-IEA) | 9/24/2012 | |
| Oryza sativa Gramene | 41142 | 49296 (49296 non-IEA) | 10/12/2012 | |
| Protein Data Bank [multispecies] GO Annotations @ EBI | 120985 | 1210159 (52497 non-IEA) | 6/14/2013 | |
| Plasmodium falciparum Sanger GeneDB | 2182 | 4546 (4546 non-IEA) | 2/15/2013 | |
| Pseudomonas aeruginosa PAO1 PseudoCAP | 1519 | 6884 (6884 non-IEA) | 6/14/2013 | |
| Rattus norvegicus RGD | 22642 | 271638 (167175 non-IEA) | 6/15/2013 | |
| Reactome [multispecies] CSHL & EBI | 402 | 2148 (2148 non-IEA) | 4/8/2013 | |
| Saccharomyces cerevisiae SGD Stanford University | 6381 | 91470 (46787 non-IEA) | 6/15/2013 | |
| Schizosaccharomyces pombe PomBase University of Cambridge, UK | 5456 | 39587 (34242 non-IEA) | 6/6/2013 | |
| Solanaceae SGN | 309 | 562 (562 non-IEA) | 9/24/2012 | |
| Sus scrofa GO Annotations @ EBI | 19385 | 99303 (4504 non-IEA) | 6/14/2013 | |
| Trypanosoma brucei Sanger GeneDB | 2096 | 3511 (3511 non-IEA) | 6/14/2013 | |
| UniProt [multispecies] (IEA annotations have been removed) GO Annotations @ EBI | 67149 | 216675 (216675 non-IEA) | 5/30/2013 | |
| gonuts.gz | 194 | 285 (285 non-IEA) | 2/1/2013 | |
Unfiltered Files
These files have not been filtered with the annotation file QC checks script. The most important difference between these files and the filtered files above is that gene products from certain taxa are not stripped out of the file; they may also contain annotations to obsolete terms or outdated IEA annotations. Please see the annotation file QC script documentation for full details of the checks performed.
Please note that if you use unfiltered files in conjunction with filtered files, there may be duplicated annotations.
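One way to avoid such duplicates when combining a filtered and an unfiltered file is to skip any annotation line whose identifying columns have already been seen. The sketch below is illustrative only: the function name `merge_without_duplicates` and the choice of key columns (DB, object ID, qualifier, GO ID, reference, evidence code, with/from) are assumptions made for this example, not a definition of duplication used by the GO Consortium.

```python
# Columns (0-based) treated as identifying an annotation in a
# tab-delimited gene association line. This choice is an assumption:
# DB, DB object ID, qualifier, GO ID, reference, evidence code, with/from.
KEY_COLUMNS = (0, 1, 3, 4, 5, 6, 7)


def merge_without_duplicates(*sources):
    """Yield annotation lines from several sources, skipping repeats.

    Each source is an iterable of text lines, for example a file opened
    with gzip.open(path, "rt"). Comment lines (starting with "!") and
    lines with too few columns are dropped.
    """
    seen = set()
    for source in sources:
        for line in source:
            if not line.strip() or line.startswith("!"):
                continue
            cols = line.rstrip("\n").split("\t")
            if len(cols) <= max(KEY_COLUMNS):
                continue
            key = tuple(cols[i] for i in KEY_COLUMNS)
            if key in seen:
                continue
            seen.add(key)
            yield line
```

Sources listed first take precedence, so passing the filtered file before the unfiltered one keeps the filtered copy of any shared annotation. The `seen` set is held in memory, which is likely impractical for the largest files such as the unfiltered UniProt set.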
Statistics as of June 16, 2013
| Species, Database | Gene Products Annotated | Annotations | Submission date (MM/DD/YYYY) | Download unfiltered files |
|---|---|---|---|---|
| Protein Data Bank [multispecies] GO Annotations @ EBI | 190579 | 2893664 (821753 non-IEA) | 5/30/2013 | |
| Reactome [multispecies] CSHL & EBI | 13789 | 98638 (98638 non-IEA) | 4/8/2013 | |
| UniProt [multispecies] GO Annotations @ EBI | 23367340 | 155335562 (1289308 non-IEA) | 5/30/2013 | |
| Arabidopsis thaliana GO Annotations @ EBI | 27681 | 152893 (101135 non-IEA) | 5/30/2013 | |
| Danio rerio GO Annotations @ EBI | 28777 | 88688 (20068 non-IEA) | 5/30/2013 | |
| Mus musculus GO Annotations @ EBI | 40988 | 355794 (245612 non-IEA) | 5/30/2013 | |
| Rattus norvegicus GO Annotations @ EBI | 22075 | 162590 (64826 non-IEA) | 5/30/2013 | |
| Canis lupus familiaris GO Annotations @ EBI | 19013 | 81997 (4231 non-IEA) | 5/30/2013 | |
| Sus scrofa GO Annotations @ EBI | 19385 | 99345 (4508 non-IEA) | 5/30/2013 | |
In the tables above, gene association counts are given twice: once for all evidence codes, and once for all codes except IEA (Inferred from Electronic Annotation). The IEA code means there has been no human involvement in the assignment of the association; see the GO evidence code documentation for more details.
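Counts of this kind can be approximated directly from a downloaded file. The following is a minimal sketch, assuming the tab-delimited GO annotation file layout in which column 1 is the database, column 2 the object ID, and column 7 the evidence code, with comment lines starting with `!`. The function names and the example file name are hypothetical, and the official statistics may be computed differently (for instance, in how gene products are counted).

```python
import gzip


def count_annotations(lines):
    """Return (gene products, annotations, non-IEA annotations).

    `lines` is any iterable of text lines from a gene association file.
    A gene product is identified here by its (DB, object ID) pair,
    which is an assumption of this sketch.
    """
    gene_products = set()
    total = 0
    non_iea = 0
    for line in lines:
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 7:
            continue  # malformed line: no evidence code column
        gene_products.add((cols[0], cols[1]))
        total += 1
        if cols[6] != "IEA":
            non_iea += 1
    return len(gene_products), total, non_iea


def count_annotations_in_file(path):
    """Apply count_annotations to a gzip-compressed association file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return count_annotations(handle)
```

For example, `count_annotations_in_file("gene_association.example.gz")` (a placeholder file name) would return the three counts as a tuple, reading the compressed file line by line without unpacking it to disk.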
gp2protein files
The gp2protein directory contains files that map between model organism database object IDs and UniProt accessions.
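As a rough illustration of how such a mapping might be loaded, the sketch below assumes a two-column, tab-delimited layout: the model organism database object ID first, then one or more protein accessions separated by semicolons. That layout, the function name `parse_gp2protein`, and the IDs in the usage note are all assumptions made for this example; check the files themselves before relying on it.

```python
def parse_gp2protein(lines):
    """Build a dict from object ID to a list of protein accessions.

    Assumed layout (not confirmed by this page): two tab-separated
    columns, with multiple accessions in the second column separated
    by ";". Lines starting with "!" are treated as comments.
    """
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("!"):
            continue
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # object with no mapped accession
        object_id = parts[0].strip()
        accessions = [a.strip() for a in parts[1].split(";") if a.strip()]
        if accessions:
            mapping.setdefault(object_id, []).extend(accessions)
    return mapping
```

With a made-up input line such as `"MGI:MGI:1\tUniProtKB:P00001;UniProtKB:P00002"`, the result maps `"MGI:MGI:1"` to a list of the two accessions.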