Current Annotations

Annotation Details and Downloads

The gene association files submitted by GO Consortium members are shown in the tables below. Files are in the GO annotation file format and are compressed using the UNIX gzip utility. Please see the appropriate README file for further details on the annotation set. Any errors or omissions in annotations should be reported by writing to the GO helpdesk.
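As a rough illustration of reading one of these files, the sketch below opens a gzip-compressed annotation file and yields one record per annotation line. It assumes the tab-delimited GAF layout, in which lines beginning with `!` are comments and the first nine columns are DB, DB Object ID, DB Object Symbol, Qualifier, GO ID, DB:Reference, Evidence Code, With/From, and Aspect. The function name `read_gaf` and the dictionary keys are hypothetical, chosen only for this example.

```python
import gzip
from typing import Dict, Iterator

# Names for the first nine GAF columns (assumed layout; check the format
# documentation for the version of the file you download).
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "reference", "evidence_code", "with_from", "aspect",
]


def read_gaf(path: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per annotation line of a gzip-compressed GAF file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            # Skip comment/header lines and blank lines.
            if line.startswith("!") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            # zip() keeps only the first nine columns named above.
            yield dict(zip(GAF_COLUMNS, fields))
```

Because the function is a generator, a large file can be processed one line at a time without being decompressed to disk first.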

Ontology and annotation data are integrated in the MySQL and XML files. See the GO database guide for more information.

These files can also be downloaded via FTP; we recommend this method for the larger files, such as the UniProt dataset, as the web-based download may not work correctly.
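For the larger files, an FTP transfer can be scripted so that the data is streamed straight to disk. The following is a minimal sketch using Python's standard `ftplib`; the helper name `fetch_file` is hypothetical, and the host and remote path are deliberately left as placeholders because they are not given here.

```python
from ftplib import FTP


def fetch_file(ftp: FTP, remote_path: str, local_path: str) -> int:
    """Stream remote_path to local_path in binary mode; return bytes written."""
    written = 0
    with open(local_path, "wb") as out:
        def save_chunk(chunk: bytes) -> None:
            nonlocal written
            out.write(chunk)
            written += len(chunk)

        # RETR delivers the file in blocks, so a large file is never held
        # in memory all at once.
        ftp.retrbinary(f"RETR {remote_path}", save_chunk)
    return written


# Hypothetical usage; HOST and the remote path are placeholders:
# with FTP(HOST) as ftp:
#     ftp.login()  # anonymous login
#     fetch_file(ftp, "path/to/annotation_file.gz", "annotation_file.gz")
```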

Filtered Files

These files are taxon-specific and reflect the work of specific projects, primarily the model organism database groups, to provide comprehensive, non-redundant annotation files for their organism. All the files in this table have been filtered using the annotation file QC checks script. A major component of the filtering is the requirement that annotations for particular taxon IDs may be included only in the association files provided by specific projects; please see the list of the authoritative groups for the major model organisms.
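The taxon-authority rule can be pictured with a small sketch. This is not the QC checks script itself, only an illustration of the idea under stated assumptions: rows are GAF lines already split on tabs, the taxon sits in column 13 in the form `taxon:10090`, and the `AUTHORITY` mapping is a two-entry illustrative stand-in for the real list of authoritative groups.

```python
from typing import Dict, Iterable, List

# Illustrative mapping of NCBI taxon ID -> authoritative project. This is an
# assumption for the example, not the consortium's actual list.
AUTHORITY: Dict[str, str] = {
    "10090": "MGI",      # Mus musculus
    "7227": "FlyBase",   # Drosophila melanogaster
}

TAXON_COLUMN = 12  # 0-based index of the GAF taxon column (column 13)


def keep_authoritative(rows: Iterable[List[str]], project: str) -> List[List[str]]:
    """Drop rows whose primary taxon is reserved for a different project."""
    kept = []
    for row in rows:
        # The taxon field looks like "taxon:10090", optionally "taxon:A|taxon:B";
        # only the first (primary) taxon is checked here.
        primary = row[TAXON_COLUMN].split("|")[0].replace("taxon:", "", 1)
        owner = AUTHORITY.get(primary)
        if owner is None or owner == project:
            kept.append(row)
    return kept
```

With this mapping, a file submitted by a project other than MGI would lose its mouse rows, while rows for taxa that no project claims pass through unchanged.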

Statistics as of April 20, 2014

Filtered Annotation File Downloads
| Species or dataset | Database | Gene products annotated | Annotations | Non-IEA annotations | Submission date (MM/DD/YYYY) |
|---|---|---|---|---|---|
| Agrobacterium tumefaciens str. C58 | PAMGO | 82 | 248 | 248 | 9/24/2012 |
| Arabidopsis thaliana | TAIR | 30321 | 180085 | 180085 | 3/7/2014 |
| Aspergillus nidulans | AspGD | 141342 | 579116 | 83909 | 4/20/2014 |
| Comprehensive Microbial Resource [multispecies] | JCVI | 61680 | 154223 | 154223 | 11/1/2013 |
| Bos taurus | GO Annotations @ EBI | 20028 | 128514 | 18789 | 4/15/2014 |
| Caenorhabditis elegans | WormBase | 19959 | 132900 | 65770 | 3/31/2014 |
| Candida albicans | CGD | 47785 | 270383 | 27203 | 4/20/2014 |
| Canis lupus familiaris | GO Annotations @ EBI | 19846 | 116329 | 6790 | 4/15/2014 |
| Danio rerio | ZFIN | 18641 | 143505 | 35718 | 4/14/2014 |
| Dickeya dadantii | PAMGO | 124 | 296 | 296 | 9/24/2012 |
| Dictyostelium discoideum | dictyBase | 8247 | 60309 | 21302 | 4/6/2014 |
| Drosophila melanogaster | FlyBase | 13773 | 95846 | 82952 | 3/25/2014 |
| Escherichia coli | PortEco | 3756 | 47793 | 13696 | 3/7/2014 |
| Gallus gallus | GO Annotations @ EBI | 14233 | 93697 | 7989 | 4/15/2014 |
| Gene Ontology Normal Usage Tracking System (GONUTS) | | 194 | 285 | 285 | 2/1/2013 |
| Homo sapiens | GO Annotations @ EBI | 44900 | 392213 | 217031 | 4/15/2014 |
| Leishmania major | Sanger GeneDB | 352 | 890 | 890 | 1/17/2014 |
| Magnaporthe grisea | PAMGO | 11274 | 27618 | 27618 | 2/15/2013 |
| Mus musculus | MGI | 25416 | 314688 | 220323 | 4/17/2014 |
| Oomycetes | PAMGO | 30 | 126 | 126 | 9/24/2012 |
| Oryza sativa | Gramene | 41141 | 49292 | 49292 | 11/1/2013 |
| Protein Data Bank [multispecies] | GO Annotations @ EBI | 135824 | 1337769 | 46593 | 4/15/2014 |
| Plasmodium falciparum | Sanger GeneDB | 2181 | 4543 | 4543 | 10/25/2013 |
| Pseudomonas aeruginosa PAO1 | PseudoCAP | 1043 | 1978 | 1978 | 2/5/2014 |
| Rattus norvegicus | RGD | 38682 | 421726 | 217706 | 4/19/2014 |
| Saccharomyces cerevisiae | SGD, Stanford University | 6381 | 92915 | 48042 | 4/19/2014 |
| Schizosaccharomyces pombe | PomBase, University of Cambridge, UK | 5385 | 39492 | 34045 | 3/20/2014 |
| Solanaceae | SGN | 309 | 561 | 561 | 12/6/2013 |
| Sus scrofa | GO Annotations @ EBI | 19724 | 106958 | 6352 | 4/15/2014 |
| Trypanosoma brucei | Sanger GeneDB | 2145 | 3679 | 3679 | 1/17/2014 |
| UniProt [multispecies] (IEA annotations have been removed) | GO Annotations @ EBI | 109417 | 317774 | 317774 | 4/15/2014 |
| goa_ref_chicken.gz | | 11421 | 54061 | 6762 | 1/6/2014 |
| goa_ref_cow.gz | | 16744 | 97205 | 15723 | 1/6/2014 |
| goa_ref_dog.gz | | 15167 | 90388 | 4700 | 1/6/2014 |
| goa_ref_human.gz | | 18315 | 269682 | 195233 | 1/6/2014 |
| goa_ref_pig.gz | | 16279 | 88652 | 4690 | 1/6/2014 |

Unfiltered Files

These files have not been filtered with the annotation file QC checks script. The most important difference between these files and the filtered files above is that gene products from certain taxa are not stripped out of the file; they may also contain annotations to obsolete terms or outdated IEA annotations. Please see the annotation file QC script documentation for full details of the checks performed.

Please note that if you use unfiltered files in conjunction with filtered files, there may be duplicated annotations.
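One way to guard against such duplicates when combining files is to key each annotation on the columns that identify it and keep only the first occurrence. The sketch below assumes rows are GAF lines already split on tabs; the choice of key columns is an assumption for illustration, and a stricter or looser key may suit a particular analysis.

```python
from typing import Iterable, List

# 0-based GAF column indexes used to decide whether two annotations are the
# same: DB, DB Object ID, Qualifier, GO ID, DB:Reference, Evidence Code,
# With/From. This key is an assumption made for the example.
KEY_COLUMNS = (0, 1, 3, 4, 5, 6, 7)


def merge_without_duplicates(*sources: Iterable[List[str]]) -> List[List[str]]:
    """Concatenate annotation rows from several files, dropping repeats."""
    seen = set()
    merged = []
    for rows in sources:
        for row in rows:
            key = tuple(row[i] for i in KEY_COLUMNS)
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged
```

Passing the filtered rows first means that, where an annotation appears in both sets, the copy from the filtered file is the one retained.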

Statistics as of April 20, 2014

Unfiltered Annotation File Downloads
| Species or dataset | Database | Gene products annotated | Annotations | Non-IEA annotations | Submission date (MM/DD/YYYY) |
|---|---|---|---|---|---|
| Protein Data Bank [multispecies] | GO Annotations @ EBI | 214050 | 3342092 | 1015134 | 4/15/2014 |
| Reactome [multispecies] | CSHL & EBI | 15212 | 109408 | 109408 | 2/5/2014 |
| UniProt [multispecies] | GO Annotations @ EBI | 36244927 | 258891175 | 1495109 | 4/15/2014 |
| Canis lupus familiaris | GO Annotations @ EBI | 19846 | 116329 | 6790 | 4/15/2014 |
| Sus scrofa | GO Annotations @ EBI | 19724 | 106958 | 6352 | 4/15/2014 |

In the tables above, annotation counts are given for all evidence codes and, separately, for all evidence codes except IEA (Inferred from Electronic Annotation). The IEA code means that there has been no human involvement in the assignment of the association; see the GO evidence code documentation for more details.
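The two counts can be reproduced from an annotation file with a few lines of code. This is a sketch rather than the method used to generate the statistics above; it assumes GAF-style tab-delimited lines with the evidence code in column 7 and `!` marking comment lines, and the function name is hypothetical.

```python
from typing import Iterable, Tuple

EVIDENCE_COLUMN = 6  # 0-based index of the GAF evidence code column (column 7)


def count_annotations(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (all annotations, non-IEA annotations) for GAF text lines."""
    total = 0
    non_iea = 0
    for line in lines:
        # Comment lines and blank lines are not annotations.
        if line.startswith("!") or not line.strip():
            continue
        total += 1
        if line.rstrip("\n").split("\t")[EVIDENCE_COLUMN] != "IEA":
            non_iea += 1
    return total, non_iea
```

The function accepts any iterable of text lines, so a compressed file can be passed directly as `gzip.open(path, "rt")`.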


gp2protein files

The gp2protein directory contains files that map between model organism database object IDs and UniProt accessions.
