software

Software-related questions (e.g. AmiGO, OBO-Edit, database, scripts, OWL, file formats...).

What is an OWL file?

OWL stands for Web Ontology Language, a standard produced by the W3C. The OWL version of GO is produced by translating the OBO file to OWL and is available for download here. OWL files can be opened in an editing tool such as Protégé.

What are the file formats used by the Gene Ontology?

A general introduction to the project's file formats is available. This page provides information about the file formats relevant to the ontology and the files used to express ontology-gene product annotations.

What is a GPAD file?

GPAD (Gene Product Association Data) is an alternative to the Gene Association File (GAF) format for exchanging annotations. The GPAD format is designed to be more normalized than GAF, and is intended to work in conjunction with a separate format for exchanging gene product information. The GPAD specification is defined in detail here.

What is a GAF file?

A GAF file is a GO annotation file containing annotations made to the GO by a contributing resource such as FlyBase or PomBase. There are two versions of the file format; the most recent is GAF version 2.0. An explanation of the differences between versions 1.0 and 2.0 is available, and the 1.0 specification is described here.
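As a rough illustration of what a GAF 2.0 file looks like to a program, the sketch below reads tab-separated annotation lines into dictionaries and skips the header lines that begin with "!". The column keys are our own shorthand for the 17 GAF 2.0 fields, and the sample annotation (database "EX", gene "geneA", reference "EX_REF:0000001") is invented for the example; check the specification before relying on the column order.

```python
import io

# Shorthand keys for the 17 tab-separated GAF 2.0 columns (names are ours).
GAF2_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "db_reference", "evidence_code", "with_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type", "taxon",
    "date", "assigned_by", "annotation_extension", "gene_product_form_id",
]


def read_gaf(handle):
    """Yield one dict per annotation line, skipping blank and '!' lines."""
    for line in handle:
        line = line.rstrip("\r\n")
        if not line or line.startswith("!"):
            continue
        fields = line.split("\t")
        # Pad short rows so the optional trailing columns are present as "".
        fields += [""] * (len(GAF2_COLUMNS) - len(fields))
        yield dict(zip(GAF2_COLUMNS, fields))


# Invented sample data: one header line and one annotation line.
sample_fields = [
    "EX", "EX:0001", "geneA", "", "GO:0008150", "EX_REF:0000001", "IEA", "",
    "P", "example gene A", "", "protein", "taxon:9606", "20200101", "EX",
    "", "",
]
sample = "!gaf-version: 2.0\n" + "\t".join(sample_fields) + "\n"

annotations = list(read_gaf(io.StringIO(sample)))
for ann in annotations:
    print(ann["db_object_symbol"], ann["go_id"], ann["evidence_code"])
```

For real files, pass an open file handle instead of the io.StringIO wrapper used here.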

I want to use the database files but...

The *.txt files must be imported into a MySQL instance using mysqlimport. They are not intended to be loaded into Excel or parsed using custom tools.

Why are so many of the .txt files empty?

For some exports of the GO database, some tables will necessarily be empty. For example, the termdb dump by design omits any data on genes or gene associations, so the corresponding tables are empty. MySQL requires a file to be present for every table, hence some of the .txt files contain 0 rows.
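If you want to confirm which tables in a dump are empty before importing, a small helper along the following lines may be useful. It is only a sketch: the function name split_dump_files is ours, and the two file names in the demonstration are invented stand-ins created in a temporary directory, not a real dump.

```python
import tempfile
from pathlib import Path


def split_dump_files(dump_dir):
    """Return (non_empty, empty) lists of *.txt file names in dump_dir."""
    non_empty, empty = [], []
    for path in sorted(Path(dump_dir).glob("*.txt")):
        # A zero-byte file corresponds to a table with 0 rows.
        if path.stat().st_size == 0:
            empty.append(path.name)
        else:
            non_empty.append(path.name)
    return non_empty, empty


# Demonstration on a throwaway directory with invented table files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "term.txt").write_text("1\tbiological_process\n")
    Path(tmp, "gene_product.txt").write_text("")
    loaded, skipped = split_dump_files(tmp)

print("tables with rows:", loaded)
print("empty tables:", skipped)
```

The empty files should still be kept alongside the others, since a file is expected for every table.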

Why do the IDs in the database not match the GO IDs?

The GO SQL database employs the common practice of using surrogate IDs for primary keys. These are intended to be internal to the database, and not exposed to the casual user. In addition, they are not stable and will change with each release. For example, the term table has columns including:

  • id -- internal numeric identifier
  • acc -- public GO ID
  • name -- term label

The id column is the primary key for the term table and is used as a foreign key in tables that reference it, such as term2term.
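The relationship between the internal id and the public acc can be sketched with a toy database. The example below uses Python's built-in sqlite3 module as a stand-in for MySQL and a heavily simplified schema: the real term2term table has additional columns (such as the relationship type), the surrogate id values 101 and 205 are invented, and we assume that term1_id holds the parent term and term2_id the child. Treat it as an illustration of joining through surrogate keys, not as the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE term (id INTEGER PRIMARY KEY, acc TEXT, name TEXT);
    CREATE TABLE term2term (
        id INTEGER PRIMARY KEY,
        term1_id INTEGER,  -- assumed: parent term (surrogate id)
        term2_id INTEGER   -- assumed: child term (surrogate id)
    );
    -- Surrogate ids 101 and 205 are invented; only acc is a public GO ID.
    INSERT INTO term VALUES (101, 'GO:0008150', 'biological_process');
    INSERT INTO term VALUES (205, 'GO:0008152', 'metabolic process');
    INSERT INTO term2term VALUES (1, 101, 205);
    """
)


def parents_of(connection, acc):
    """Look up direct parents by public GO ID, joining on internal ids."""
    return connection.execute(
        """
        SELECT parent.acc, parent.name
        FROM term AS child
        JOIN term2term AS link ON link.term2_id = child.id
        JOIN term AS parent ON parent.id = link.term1_id
        WHERE child.acc = ?
        ORDER BY parent.acc
        """,
        (acc,),
    ).fetchall()


print(parents_of(conn, "GO:0008152"))
```

Note that the query takes and returns acc values only; the surrogate ids are used solely inside the joins, which is why they can change between releases without affecting queries written this way.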

How do I find terms, annotations, or gene products in the database?

We maintain a set of examples that cover, or can be used as a base to cover, most common queries. This is also the set used with GOOSE.

How do I query, access, install/mirror the GO database?

To avoid repeating ourselves and risking our documentation getting out of sync, we refer you to the database overview and database guide pages, which should answer all of these questions.

Where can I find software to allow me to make or edit GO annotations?

GO annotations can be made and edited using various database-specific tools. Please contact the relevant database to find out how their GO annotation is done. The GMOD online tool, Canto, supports functional gene annotation by community researchers as well as by professional curators.