Current Annotations
Annotation Details and Downloads
The gene association files submitted by GO Consortium members are shown in the tables below. Files are in the GO annotation file format and are compressed using the UNIX gzip utility. Please see the appropriate README file for further details on the annotation set. Any errors or omissions in annotations should be reported by writing to the GO helpdesk.
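As a rough illustration of working with one of these gzip-compressed files, the sketch below reads annotation lines in Python. It assumes the file is tab-delimited text in which lines beginning with `!` are header or comment lines; the function name `read_annotation_rows` and the in-memory sample are hypothetical and stand in for a real download.

```python
import gzip
import io


def read_annotation_rows(source):
    """Yield each annotation line of a gzip-compressed file as a list of fields.

    `source` may be a file path or a binary file object.
    """
    with gzip.open(source, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            # Skip blank lines and "!" header/comment lines.
            if not line or line.startswith("!"):
                continue
            yield line.split("\t")


# Made-up sample standing in for a downloaded file (not real annotation data).
sample = gzip.compress(
    b"!example header line\n"
    b"ExampleDB\tGP0001\tgeneA\n"
    b"ExampleDB\tGP0002\tgeneB\n"
)
rows = list(read_annotation_rows(io.BytesIO(sample)))
print(len(rows), "annotation lines read")
```

Passing a file path instead of the in-memory object reads a downloaded file in the same way.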
Ontology and annotation data are integrated in the MySQL and XML files. See the GO database guide for more information.
These files can also be downloaded via FTP; we recommend this method for the larger files, such as the UniProt dataset, as the web-based download may not work correctly.
Filtered Files
These files are taxon-specific and reflect the work of specific projects, primarily the model organism database groups, to provide comprehensive, non-redundant annotation files for their organism. All the files in this table have been filtered using the annotation file QC checks script. A major component of the filtering is the requirement that annotations for a particular taxon ID may appear only in the association files provided by the project responsible for that taxon; please see the list of the authoritative groups for the major model organisms.
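The taxon rule described above can be sketched in Python as follows. This is not the QC checks script itself, only a minimal illustration of the idea. It assumes the taxon ID sits in column 13 of each annotation line and that a row is dropped when its taxon is claimed by a group other than the one submitting the file. The group names, taxon IDs, and helper names are all made up for the example.

```python
TAXON_COLUMN = 12  # 0-based index; assumed to be column 13 of the annotation file


def filter_by_taxon_authority(rows, submitting_group, authority):
    """Keep rows whose taxon is unclaimed or claimed by the submitting group."""
    kept = []
    for fields in rows:
        # Some rows may carry two taxon IDs joined by "|"; use the first one.
        taxon = fields[TAXON_COLUMN].split("|")[0]
        owner = authority.get(taxon)
        if owner is None or owner == submitting_group:
            kept.append(fields)
    return kept


def make_row(object_id, taxon):
    """Build a made-up 15-field row with only the fields used here filled in."""
    fields = [""] * 15
    fields[1] = object_id
    fields[TAXON_COLUMN] = taxon
    return fields


# Illustrative authority table, not the real list of authoritative groups.
authority = {"taxon:11111": "GroupA"}
rows = [
    make_row("GP1", "taxon:11111"),  # taxon claimed by GroupA
    make_row("GP2", "taxon:22222"),  # taxon claimed by nobody
]
kept_for_a = filter_by_taxon_authority(rows, "GroupA", authority)
kept_for_b = filter_by_taxon_authority(rows, "GroupB", authority)
print(len(kept_for_a), len(kept_for_b))
```

With this sample, a file submitted by GroupA keeps both rows, while a file submitted by GroupB loses the row for the taxon that GroupA is responsible for.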
Statistics as of March 10, 2010
| Species, Database | Gene Products Annotated | Annotations | Submission date (MM/DD/YYYY) | Download filtered files |
|---|---|---|---|---|
| Anaplasma phagocytophilum HZ JCVI | 1289 | 3473 (3473 non-IEA) | 2/8/2010 | |
| Agrobacterium tumefaciens str. C58 PAMGO | 83 | 250 (250 non-IEA) | 9/12/2008 | |
| Arabidopsis thaliana TAIR/JCVI | 52785 | 151796 (112531 non-IEA) | 3/9/2010 | |
| Aspergillus nidulans AspGD | 3453 | 15893 (5544 non-IEA) | 3/8/2010 | |
| Bacillus anthracis Ames JCVI | 5280 | 13088 (13088 non-IEA) | 2/8/2010 | |
| Bos taurus GO Annotations @ EBI | 23879 | 106813 (5223 non-IEA) | 3/4/2010 | |
| Carboxydothermus hydrogenoformans Z-2901 JCVI | 2611 | 6391 (6391 non-IEA) | 2/8/2010 | |
| Caenorhabditis elegans WormBase | 17530 | 119679 (65039 non-IEA) | 2/26/2010 | |
| Campylobacter jejuni RM1221 JCVI | 1829 | 4625 (4625 non-IEA) | 2/8/2010 | |
| Candida albicans CGD | 4068 | 20580 (7398 non-IEA) | 3/8/2010 | |
| Clostridium perfringens ATCC13124 JCVI | 2892 | 7446 (7446 non-IEA) | 2/8/2010 | |
| Colwellia psychrerythraea 34H JCVI | 4752 | 12088 (12088 non-IEA) | 2/8/2010 | |
| Coxiella burnetii RSA 493 JCVI | 2033 | 5162 (5162 non-IEA) | 2/8/2010 | |
| Danio rerio ZFIN | 15371 | 132808 (23045 non-IEA) | 2/22/2010 | |
| Dehalococcoides ethenogenes 195 JCVI | 1584 | 3945 (3945 non-IEA) | 2/8/2010 | |
| Dickeya dadantii PAMGO | 126 | 313 (313 non-IEA) | 3/3/2010 | |
| Dictyostelium discoideum dictyBase | 7519 | 31399 (20699 non-IEA) | 3/7/2010 | |
| Drosophila melanogaster FlyBase | 12613 | 73444 (58298 non-IEA) | 3/1/2010 | |
| Escherichia coli EcoCyc & EcoliHub | 3799 | 42709 (8906 non-IEA) | 3/4/2010 | |
| Ehrlichia chaffeensis Arkansas JCVI | 1091 | 2845 (2845 non-IEA) | 2/8/2010 | |
| Gallus gallus GO Annotations @ EBI | 16343 | 70824 (2956 non-IEA) | 3/4/2010 | |
| Geobacter sulfurreducens PCA JCVI | 3410 | 8185 (8185 non-IEA) | 2/8/2010 | |
| Homo sapiens GO Annotations @ EBI | 18596 | 186092 (88384 non-IEA) | 3/4/2010 | |
| Hyphomonas neptunium ATCC 15444 JCVI | 3116 | 8036 (7976 non-IEA) | 12/19/2009 | |
| Leishmania major Sanger GeneDB | 10 | 28 (28 non-IEA) | 2/8/2010 | |
| Listeria monocytogenes 4b F2365 JCVI | 2822 | 7215 (7215 non-IEA) | 1/30/2010 | |
| Magnaporthe grisea PAMGO | 11274 | 27649 (27649 non-IEA) | 11/7/2009 | |
| Methylococcus capsulatus Bath JCVI | 2928 | 7322 (7305 non-IEA) | 12/19/2009 | |
| Mus musculus MGI | 35163 | 251666 (149826 non-IEA) | 3/5/2010 | |
| Neorickettsia sennetsu Miyayama JCVI | 929 | 2421 (2421 non-IEA) | 2/8/2010 | |
| Oomycetes PAMGO | 30 | 126 (126 non-IEA) | 2/13/2008 | |
| Oryza sativa Gramene | 41146 | 50047 (50042 non-IEA) | 2/13/2010 | |
| Protein Data Bank [multispecies] GO Annotations @ EBI | 21836 | 191654 (0 non-IEA) | 3/4/2010 | |
| Plasmodium falciparum Sanger GeneDB | 2205 | 4631 (4631 non-IEA) | 1/30/2010 | |
| Pseudomonas aeruginosa PAO1 PseudoCAP | 1519 | 7316 (7316 non-IEA) | 2/8/2010 | |
| Pseudomonas fluorescens Pf-5 JCVI | 4822 | 12582 (12241 non-IEA) | 2/8/2010 | |
| Pseudomonas syringae DC3000 JCVI | 4012 | 10332 (10332 non-IEA) | 2/8/2010 | |
| Pseudomonas syringae pv. phaseolicola 1448A JCVI | 3571 | 9290 (9272 non-IEA) | 12/19/2009 | |
| Rattus norvegicus RGD | 20077 | 197071 (111891 non-IEA) | 3/6/2010 | |
| Reactome [multispecies] CSHL & EBI | 276 | 7942 (7942 non-IEA) | 2/8/2010 | |
| Saccharomyces cerevisiae SGD | 6355 | 90056 (47858 non-IEA) | 3/6/2010 | |
| Schizosaccharomyces pombe Sanger GeneDB | 5230 | 36086 (32235 non-IEA) | 2/13/2010 | |
| Shewanella oneidensis MR-1 JCVI | 4843 | 12178 (12178 non-IEA) | 2/8/2010 | |
| Silicibacter pomeroyi DSS-3 JCVI | 4252 | 10785 (10785 non-IEA) | 2/8/2010 | |
| Solanaceae SGN | 155 | 305 (305 non-IEA) | 10/30/2009 | |
| Trypanosoma brucei Sanger GeneDB | 2977 | 10459 (10459 non-IEA) | 2/8/2010 | |
| UniProt [multispecies] IEA annotations removed GO Annotations @ EBI | 8837 | 36030 (36030 non-IEA) | 3/4/2010 | |
| Vibrio cholerae JCVI | 3858 | 9421 (9421 non-IEA) | 2/8/2010 | |
Unfiltered Files
These files have not been filtered with the annotation file QC checks script. The most important difference between these files and the filtered files above is that gene products from certain taxa are not stripped out of the file; they may also contain annotations to obsolete terms or outdated IEA annotations. Please see the annotation file QC script documentation for full details of the checks performed.
Please note that if you use unfiltered files in conjunction with filtered files, there may be duplicated annotations.
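One possible way to guard against such duplicates when combining files is sketched below in Python. What counts as "the same annotation" is an assumption here: two rows are treated as duplicates when they agree on database, object ID, qualifier, GO ID, reference, evidence code, and with/from fields (columns 1, 2, 4, 5, 6, 7, and 8). The function name and the sample rows are made up for illustration.

```python
# Assumed identity of an annotation: database, object ID, qualifier, GO ID,
# reference, evidence code, and with/from (0-based column indexes).
KEY_COLUMNS = (0, 1, 3, 4, 5, 6, 7)


def merge_without_duplicates(*row_sets):
    """Concatenate several lists of rows, keeping the first copy of each annotation."""
    seen = set()
    merged = []
    for rows in row_sets:
        for fields in rows:
            key = tuple(fields[i] for i in KEY_COLUMNS)
            if key not in seen:
                seen.add(key)
                merged.append(fields)
    return merged


# Made-up rows with placeholder identifiers (not real annotation data).
filtered = [
    ["ExampleDB", "GP1", "geneA", "", "GO:0000001", "REF:1", "IDA", ""],
]
unfiltered = [
    ["ExampleDB", "GP1", "geneA", "", "GO:0000001", "REF:1", "IDA", ""],  # repeat
    ["ExampleDB", "GP1", "geneA", "", "GO:0000002", "REF:2", "IEA", ""],
]
merged = merge_without_duplicates(filtered, unfiltered)
print(len(merged), "unique annotations")
```

This only catches rows that are identical in the chosen columns. The same gene product listed under different database IDs in two files would not be detected without first mapping the IDs onto a common identifier.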
Statistics as of March 10, 2010
| Species, Database | Gene Products Annotated | Annotations | Submission date (MM/DD/YYYY) | Download unfiltered files |
|---|---|---|---|---|
| Arabidopsis thaliana GO Annotations @ EBI | 23011 | 110770 (35590 non-IEA) | 3/4/2010 | |
| Mus musculus GO Annotations @ EBI | 35541 | 248028 (97880 non-IEA) | 3/4/2010 | |
| Rattus norvegicus GO Annotations @ EBI | 27480 | 157460 (44175 non-IEA) | 3/4/2010 | |
| Danio rerio GO Annotations @ EBI | 26937 | 95440 (7345 non-IEA) | 3/4/2010 | |
| Protein Data Bank [multispecies] GO Annotations @ EBI | 36357 | 313530 (0 non-IEA) | 3/4/2010 | |
| Reactome [multispecies] CSHL & EBI | 4351 | 36312 (36312 non-IEA) | 12/16/2009 | |
| UniProt [multispecies] GO Annotations @ EBI | 7169403 | 58001747 (585091 non-IEA) | 3/4/2010 | |
In the tables above, annotation counts are given for all evidence codes and, in parentheses, for all evidence codes except IEA (Inferred from Electronic Annotation). The IEA code means there has been no human involvement in the assignment of the association; see the GO evidence code documentation for more details.
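Counts of this kind could be reproduced from an annotation file along the following lines. This Python sketch assumes the evidence code is in column 7 and that a gene product is identified by the database and object ID in columns 1 and 2. How the published figures were actually computed is not stated here, so treat the counting rule, the function name, and the sample rows as assumptions.

```python
OBJECT_COLUMNS = (0, 1)  # database and object ID (0-based indexes)
EVIDENCE_COLUMN = 6      # evidence code, assumed to be column 7


def summarize(rows):
    """Return (gene products annotated, annotations, non-IEA annotations)."""
    gene_products = set()
    total = 0
    non_iea = 0
    for fields in rows:
        gene_products.add(tuple(fields[i] for i in OBJECT_COLUMNS))
        total += 1
        if fields[EVIDENCE_COLUMN] != "IEA":
            non_iea += 1
    return len(gene_products), total, non_iea


# Made-up rows with placeholder identifiers (not real annotation data).
rows = [
    ["ExampleDB", "GP1", "geneA", "", "GO:0000001", "REF:1", "IDA"],
    ["ExampleDB", "GP1", "geneA", "", "GO:0000002", "REF:2", "IEA"],
    ["ExampleDB", "GP2", "geneB", "", "GO:0000001", "REF:3", "IEA"],
]
products, annotations, non_iea = summarize(rows)
print(f"{products} | {annotations} ({non_iea} non-IEA)")
```

The printed line follows the layout of the table columns: gene products annotated, then total annotations with the non-IEA count in parentheses.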
gp2protein files
The gp2protein directory contains files that map between model organism database object IDs and UniProt accessions.
