GO Annotation File Format 2.0

Annotation data is submitted to the GO Consortium in the form of Gene Association Format (GAF) files. This guide lays out the format specifications for GAF 2.0; for the older GAF 1.0 file syntax, please see the GAF 1.0 file format guide.

Please see the information on the changes in GAF 2.0.

General information about annotation can be found in the GO annotation guide.

The Reference Genome Annotation Project

The GO Consortium coordinated an effort to maximize and optimize GO annotations for a large and representative set of key genomes, known as 'reference genomes'. The Reference Genome Annotation Project aimed to completely annotate twelve reference genomes, producing a resource that may effectively seed automatic annotation efforts of other genomes.

IC: Inferred by Curator

Updated September 22, 2011 

The IC evidence code is used where an annotation is not supported by any direct evidence but can be reasonably inferred by a curator from other GO annotations for which evidence is available.

IPI: Inferred from Physical Interaction

Updated October 10, 2014

  • 2-hybrid interactions
  • Co-purification
  • Co-immunoprecipitation
  • Ion/protein binding experiments

Covers physical interactions between the entity of interest and another molecule (such as a protein, ion or complex). IPI can be thought of as a type of IDA, where the actual binding partner or target can be specified, using "with" in the with/from field.
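
As a rough illustration of the point about the with/from field, the sketch below flags IPI annotations that do not name a binding partner. It assumes rows already split into GAF 2.0 columns, with the evidence code in column 7 and with/from in column 8; those positions come from the GAF 2.0 column layout rather than from this page. The function name, the helper, and the sample identifiers are hypothetical.

```python
# 0-based indexes into a GAF 2.0 row (assumed layout: column 7 is the
# evidence code, column 8 is with/from).
EVIDENCE_COL = 6
WITH_FROM_COL = 7


def ipi_missing_partner(rows):
    """Return the IPI rows whose with/from field is empty."""
    missing = []
    for row in rows:
        # Skip rows too short to contain a with/from column.
        if len(row) <= WITH_FROM_COL:
            continue
        if row[EVIDENCE_COL] == "IPI" and not row[WITH_FROM_COL].strip():
            missing.append(row)
    return missing


def make_row(evidence, with_from):
    """Build a placeholder 17-column row for the demonstration only."""
    row = [""] * 17
    row[EVIDENCE_COL] = evidence
    row[WITH_FROM_COL] = with_from
    return row


rows = [
    make_row("IPI", "UniProtKB:P00001"),  # partner given (made-up ID)
    make_row("IPI", ""),                  # partner missing, gets flagged
    make_row("IDA", ""),                  # not IPI, ignored by this check
]
flagged = ipi_missing_partner(rows)
```

Whether an empty with/from field is an error or only a warning is a curation policy question; the sketch only reports such rows.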

Guide to GO Evidence Codes

A GO annotation consists of a GO term associated with a specific reference that describes the work or analysis upon which the association between a specific GO term and gene product is based. Each annotation must also include an evidence code to indicate how the annotation to a particular term is supported. Although evidence codes do reflect the type of work or analysis described in the cited reference which supports the GO term to gene product association, they are not necessarily a classification of types of experiments/analyses.

ID Mapping Files

This page documents the file formats used to store the mapping from database object IDs to the corresponding sequence IDs in UniProtKB or NCBI.
  • gp2protein file
  • gp2rna file
  • gp_unlocalized

gp2protein file

A gp2protein file is a tab-delimited file that provides a mapping between database object IDs and protein sequence IDs. gp2protein files contributed by annotation groups are available for download.
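
A minimal sketch of reading such a file might look like the following. It assumes two tab-separated columns (database object ID, then sequence ID), that lines starting with "!" are comments, and that several sequence IDs for one object are separated by ";". None of these details is stated on this page, and the function name and sample identifiers are made up for illustration.

```python
import csv
import io


def read_gp2protein(handle):
    """Map each database object ID to a list of sequence IDs."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (assumed to start with "!").
        if not row or row[0].startswith("!"):
            continue
        # Skip rows that lack the second column.
        if len(row) < 2:
            continue
        object_id, sequence_ids = row[0], row[1]
        # Assumption: multiple sequence IDs are separated by ";".
        mapping[object_id] = [s for s in sequence_ids.split(";") if s]
    return mapping


# Made-up sample content; real files use each group's own ID prefixes.
sample = (
    "!hypothetical header line\n"
    "DB:OBJ0001\tUniProtKB:P00001\n"
    "DB:OBJ0002\tNCBI_NP:NP_000001\n"
)
mapping = read_gp2protein(io.StringIO(sample))
```

Passing an open file handle instead of `io.StringIO` would work the same way, since the function only iterates over lines.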

Need for gp2protein file

    Current Annotations

    • Annotation Details and Downloads
    • Filtered files
    • Unfiltered files
    • gp2protein files

    Annotation Details and Downloads

    The gene association files submitted by GO Consortium members are shown in the tables below. Files are in the GO annotation file format and are compressed using the UNIX gzip utility. Please see the appropriate README file for further details on the annotation set. Any errors or omissions in annotations should be reported by writing to the GO helpdesk.
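
Because the files are gzip-compressed, they can be read without unpacking them to disk first. The sketch below uses Python's standard `gzip` module and assumes tab-separated rows with "!" marking header and comment lines, as in GAF; the function name and the sample row are hypothetical.

```python
import gzip
import io


def iter_annotation_rows(path_or_file):
    """Yield each non-comment line of a gzipped annotation file as a list."""
    # gzip.open accepts a filename or an already open binary file object.
    with gzip.open(path_or_file, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            # Skip blank lines and "!" header/comment lines.
            if not line or line.startswith("!"):
                continue
            yield line.split("\t")


# In-memory demonstration with made-up content instead of a real download.
raw = "!gaf-version: 2.0\nDB\tOBJ0001\tsym1\n".encode("utf-8")
compressed = io.BytesIO(gzip.compress(raw))
rows = list(iter_annotation_rows(compressed))
```

For a downloaded file, the call would take the local path in place of the `io.BytesIO` object.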


    The GO Tools Registry is no longer supported