the Gene Ontology

Current Annotations

  • Annotation Details and Downloads
  • Filtered files
  • Unfiltered files
  • gp2protein files

Annotation Details and Downloads

The gene association files submitted by GO Consortium members are shown in the tables below. Files are in the GO annotation file format and are compressed using the UNIX gzip utility. Please see the appropriate README file for further details on the annotation set. Any errors or omissions in annotations should be reported by writing to the GO helpdesk.
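Because the files are gzip-compressed, they can be read directly without unpacking them first. The following is a minimal Python sketch, not an official GO tool. It assumes the layout described in the GO annotation file format documentation: tab-separated fields, with header and comment lines starting with "!". The function name `read_annotations` is illustrative only.

```python
import gzip


def read_annotations(path):
    """Yield each annotation row of a gzipped file as a list of fields.

    Assumption: rows are tab-separated and lines starting with "!"
    are header or comment lines, as in the GO annotation file format.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            # Skip comment/header lines and blank lines.
            if line.startswith("!") or not line.strip():
                continue
            yield line.rstrip("\n").split("\t")
```

Calling `list(read_annotations(path))` on a downloaded file loads every row into memory; for the larger files, iterating over the generator row by row is likely the safer choice.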

Ontology and annotation data are integrated in the MySQL and XML files. See the GO database guide for more information.

These files can also be downloaded via FTP; we recommend this method for the larger files, such as the UniProt dataset, as the web-based download may not work correctly.

Filtered Files

These files are taxon-specific and reflect the work of specific projects, primarily the model organism database groups, to provide comprehensive, non-redundant annotation files for their organism. All the files in this table have been filtered using the annotation file QC checks script. A major component of the filtering is the requirement that particular taxon IDs can only be included within the association files provided by specific projects; please see the list of the authoritative groups for the major model organisms.

Statistics as of February 8, 2010

Filtered Annotation File Downloads
Species | Database | Gene products annotated | Annotations | Non-IEA annotations | Submission date (MM/DD/YYYY) | Download filtered files
Anaplasma phagocytophilum HZ | JCVI | 1289 | 3473 | 3473 | 2/8/2010 | annotations [39.7 kb], README
Agrobacterium tumefaciens str. C58 | PAMGO | 83 | 250 | 250 | 9/12/2008 | annotations [3.4 kb], README
Arabidopsis thaliana | TAIR/JCVI | 52710 | 151292 | 111837 | 2/8/2010 | annotations [3.6 mb], README
Aspergillus nidulans | AspGD | 3451 | 15844 | 5493 | 2/8/2010 | annotations [171.6 kb], README
Bacillus anthracis Ames | JCVI | 5280 | 13088 | 13088 | 2/8/2010 | annotations [152.9 kb], README
Bos taurus | GO Annotations @ EBI | 23892 | 107643 | 4631 | 2/8/2010 | annotations [1.2 mb], README
Carboxydothermus hydrogenoformans Z-2901 | JCVI | 2611 | 6391 | 6391 | 2/8/2010 | annotations [82.0 kb], README
Caenorhabditis elegans | WormBase | 17537 | 119765 | 65125 | 2/8/2010 | annotations [1.1 mb], README
Campylobacter jejuni RM1221 | JCVI | 1829 | 4625 | 4625 | 2/8/2010 | annotations [61.2 kb], README
Candida albicans | CGD | 4054 | 20460 | 7276 | 2/8/2010 | annotations [367.9 kb], README
Clostridium perfringens ATCC13124 | JCVI | 2892 | 7446 | 7446 | 2/8/2010 | annotations [94.7 kb], README
Colwellia psychrerythraea 34H | JCVI | 4752 | 12088 | 12088 | 2/8/2010 | annotations [144.6 kb], README
Coxiella burnetii RSA 493 | JCVI | 2033 | 5162 | 5162 | 2/8/2010 | annotations [59.6 kb], README
Danio rerio | ZFIN | 15307 | 107439 | 23052 | 2/2/2010 | annotations [1.8 mb], README
Dehalococcoides ethenogenes 195 | JCVI | 1584 | 3945 | 3945 | 2/8/2010 | annotations [48.6 kb], README
Dictyostelium discoideum | dictyBase | 7511 | 31261 | 20551 | 2/8/2010 | annotations [416.7 kb], README
Drosophila melanogaster | FlyBase | 12611 | 73266 | 58134 | 2/8/2010 | annotations [1.1 mb], README
Escherichia coli | EcoCyc & EcoliHub | 1870 | 7014 | 7000 | 1/5/2010 | annotations [143.7 kb], README
Ehrlichia chaffeensis Arkansas | JCVI | 1091 | 2845 | 2845 | 2/8/2010 | annotations [35.7 kb], README
Gallus gallus | GO Annotations @ EBI | 16338 | 71242 | 2522 | 2/8/2010 | annotations [785.5 kb], README
Geobacter sulfurreducens PCA | JCVI | 3410 | 8185 | 8185 | 2/8/2010 | annotations [102.6 kb], README
Homo sapiens | GO Annotations @ EBI | 18615 | 172280 | 72835 | 2/8/2010 | annotations [7.2 mb], README
Hyphomonas neptunium ATCC 15444 | JCVI | 3116 | 8036 | 7976 | 12/19/2009 | annotations [111.9 kb], README
Leishmania major | Sanger GeneDB | 10 | 28 | 28 | 2/8/2010 | annotations [1.3 kb], README
Listeria monocytogenes 4b F2365 | JCVI | 2822 | 7215 | 7215 | 1/30/2010 | annotations [91.3 kb], README
Magnaporthe grisea | PAMGO | 11274 | 27649 | 27649 | 11/7/2009 | annotations [338.2 kb], README
Methylococcus capsulatus Bath | JCVI | 2928 | 7322 | 7305 | 12/19/2009 | annotations [97.4 kb], README
Mus musculus | MGI | 33220 | 220660 | 118464 | 2/8/2010 | annotations [2.3 mb], README
Neorickettsia sennetsu Miyayama | JCVI | 929 | 2421 | 2421 | 2/8/2010 | annotations [31.0 kb], README
Oomycetes | PAMGO | 30 | 126 | 126 | 2/13/2008 | annotations [2.3 kb], README
Oryza sativa | Gramene | 41321 | 50463 | 50042 | 2/8/2010 | annotations [831.5 kb], README
Plasmodium falciparum | Sanger GeneDB | 2205 | 4631 | 4631 | 1/30/2010 | annotations [78.8 kb], README
Pseudomonas aeruginosa PAO1 | PseudoCAP | 1519 | 7316 | 7316 | 2/8/2010 | annotations [132.5 kb], README
Pseudomonas fluorescens Pf-5 | JCVI | 4822 | 12582 | 12241 | 2/8/2010 | annotations [162.5 kb], README
Pseudomonas syringae DC3000 | JCVI | 4012 | 10332 | 10332 | 2/8/2010 | annotations [119.8 kb], README
Pseudomonas syringae pv. phaseolicola 1448A | JCVI | 3571 | 9290 | 9272 | 12/19/2009 | annotations [121.7 kb], README
Rattus norvegicus | RGD | 20067 | 196067 | 110900 | 2/8/2010 | annotations [4.2 mb], README
Reactome [multispecies] | CSHL & EBI | 276 | 7942 | 7942 | 2/8/2010 | annotations [43.5 kb], README
Saccharomyces cerevisiae | SGD | 6355 | 89962 | 47764 | 2/7/2010 | annotations [1.2 mb], README
Schizosaccharomyces pombe | Sanger GeneDB | 5230 | 36088 | 32237 | 1/20/2010 | annotations [630.9 kb], README
Shewanella oneidensis MR-1 | JCVI | 4843 | 12178 | 12178 | 2/8/2010 | annotations [141.2 kb], README
Silicibacter pomeroyi DSS-3 | JCVI | 4252 | 10785 | 10785 | 2/8/2010 | annotations [139.1 kb], README
Solanaceae | SGN | 155 | 305 | 305 | 10/30/2009 | annotations [7.5 kb], README
Trypanosoma brucei | Sanger GeneDB | 2977 | 10459 | 10459 | 2/8/2010 | annotations [176.1 kb], README
UniProt [multispecies] IEA annotations removed | GO Annotations @ EBI | 8199 | 34427 | 34427 | 1/22/2010 | annotations [534.4 kb], README
Vibrio cholerae | JCVI | 3858 | 9421 | 9421 | 2/8/2010 | annotations [98.9 kb], README

Unfiltered Files

These files have not been filtered with the annotation file QC checks script. The most important difference between these files and the filtered files above is that gene products from certain taxa are not stripped out of the file; they may also contain annotations to obsolete terms or outdated IEA annotations. Please see the annotation file QC script documentation for full details of the checks performed.

Please note that if you use unfiltered files in conjunction with filtered files, there may be duplicated annotations.
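One way to guard against such duplicates when combining files is to keep only the first occurrence of each annotation. The sketch below is an illustration under stated assumptions rather than the procedure used by the QC checks script. It treats two rows as duplicates when they agree on database, object ID, qualifier, GO ID, reference, evidence code, and with/from (columns 1, 2, 4, 5, 6, 7 and 8 of the GO annotation file format). Both that choice of key and the name `merge_unique` are hypothetical.

```python
# Zero-based indexes of the fields used to decide whether two rows are
# the same annotation. This key is an assumption for illustration:
# DB, DB object ID, qualifier, GO ID, reference, evidence code, with/from.
KEY_COLUMNS = (0, 1, 3, 4, 5, 6, 7)


def merge_unique(*row_sources):
    """Yield rows from several sources, skipping rows whose key was seen.

    Each source is an iterable of rows, and each row is a list of the
    tab-separated fields of one annotation line.
    """
    seen = set()
    for rows in row_sources:
        for fields in rows:
            key = tuple(fields[i] for i in KEY_COLUMNS)
            if key not in seen:
                seen.add(key)
                yield fields
```

Listing the filtered source first means that, where a duplicate exists, the filtered copy is the one kept.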

Statistics as of February 8, 2010

Unfiltered Annotation File Downloads
Species | Database | Gene products annotated | Annotations | Non-IEA annotations | Submission date (MM/DD/YYYY) | Download unfiltered files
Arabidopsis thaliana | GO Annotations @ EBI | 22923 | 108024 | 32289 | 1/21/2010 | annotations [1.6 mb], README
Mus musculus | GO Annotations @ EBI | 35442 | 235344 | 82847 | 1/21/2010 | annotations [3.1 mb], README
Rattus norvegicus | GO Annotations @ EBI | 23261 | 140616 | 38599 | 1/21/2010 | annotations [1.7 mb], README
Danio rerio | GO Annotations @ EBI | 26707 | 95085 | 6521 | 1/21/2010 | annotations [1.1 mb], README
Protein Data Bank [multispecies] | GO Annotations @ EBI | 0 | 0 | 0 | 1/21/2010 | annotations [176 b], README
Reactome [multispecies] | CSHL & EBI | 4351 | 36312 | 36312 | 12/16/2009 | annotations [306.5 kb], README
UniProt [multispecies] | GO Annotations @ EBI | 6936729 | 56131276 | 573612 | 2/4/2010 | annotations [440.8 mb], README

In the tables above, gene association counts are provided for all evidence codes and, separately, for all evidence codes except IEA (Inferred from Electronic Annotation). The IEA code means there has been no human involvement in the assignment of the association; see the GO evidence code documentation for more details.
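Comparable counts can be derived from an uncompressed annotation file. The following Python sketch is illustrative only: how the statistics on this page were actually computed is not stated here, so the figures it produces may differ from the tables. It assumes the GO annotation file format layout, in which column 1 is the database, column 2 the object ID and column 7 the evidence code, with "!" marking comment lines. The name `count_annotations` is hypothetical.

```python
def count_annotations(lines):
    """Return (gene products, annotations, non-IEA annotations).

    `lines` is any iterable of text lines from an annotation file.
    Assumed columns: 1 = database, 2 = object ID, 7 = evidence code.
    """
    gene_products = set()
    total = 0
    non_iea = 0
    for line in lines:
        # Skip comment/header lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        total += 1
        # A gene product is identified here by its database and object ID.
        gene_products.add((fields[0], fields[1]))
        if fields[6] != "IEA":
            non_iea += 1
    return len(gene_products), total, non_iea
```

The function accepts an open text file as well as a list of strings, so it can be applied to a file opened with `gzip.open(path, "rt")`.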

gp2protein files

The gp2protein directory contains files that map between model organism database object IDs and UniProt accessions.
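A mapping file of this kind can be loaded into a dictionary for lookups. The layout of the gp2protein files is not described on this page, so the sketch below rests on an assumption: each line holds a database object ID, a tab, and one or more protein accessions separated by semicolons, with "!" marking comment lines. Please check a real file before relying on it. The name `parse_gp2protein` and the IDs in the usage note are made up for illustration.

```python
def parse_gp2protein(lines):
    """Map each database object ID to a list of protein accessions.

    Assumed layout (not confirmed by this page): two tab-separated
    columns, the second holding semicolon-separated accessions.
    """
    mapping = {}
    for line in lines:
        # Skip comment/header lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            # No accession column on this line; nothing to map.
            continue
        object_id, targets = parts[0], parts[1]
        mapping[object_id] = [t for t in targets.split(";") if t]
    return mapping
```

For example, a line such as `EXAMPLE:0001<TAB>UniProtKB:P00001;UniProtKB:P00002` would map the object ID to a list of two accessions.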


Last modified Thursday, 09-Jul-2009 17:26:43 PDT
Copyright © 1999-2010 the Gene Ontology