database

Questions and answers concerning the legacy Gene Ontology MySQL database.

How do I annotate a de novo assembled transcriptome against the GO database?

You can annotate the coding sequences in your transcripts using InterProScan. You can do this using WebServices or by downloading the tool and running it locally. Details can be found at: http://www.ebi.ac.uk/interpro/search/sequence-search/

InterProScan will predict GO terms from the domains it detects, using the mapping file available here: http://geneontology.org/page/download-mappings
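If you want to apply the mapping yourself, the following is a minimal sketch of reading an InterPro-to-GO mapping file into a lookup table. The line layout shown in the comment is an assumption about the file format rather than something stated on this page, and parse_interpro2go is a hypothetical helper name, so check both against the file you actually download.

```python
import re
from collections import defaultdict

# Assumed line layout of the mapping file (verify against your download):
#   InterPro:IPR000003 Retinoid X receptor/HNF4 > GO:DNA binding ; GO:0003677
# Lines starting with "!" are treated as header/comment lines.
LINE_RE = re.compile(r"^InterPro:(IPR\d+)\s.*;\s*(GO:\d{7})\s*$")


def parse_interpro2go(lines):
    """Map each InterPro accession to the set of GO IDs listed for it."""
    mapping = defaultdict(set)
    for line in lines:
        if line.startswith("!"):
            continue
        match = LINE_RE.match(line.strip())
        if match:
            mapping[match.group(1)].add(match.group(2))
    return dict(mapping)


# Illustrative sample lines only; a real run would pass an open file handle.
sample = [
    "!date: example header line",
    "InterPro:IPR000003 Retinoid X receptor/HNF4 > GO:DNA binding ; GO:0003677",
    "InterPro:IPR000003 Retinoid X receptor/HNF4 > GO:nucleus ; GO:0005634",
]
interpro_to_go = parse_interpro2go(sample)
```

The resulting dictionary can then be used to look up GO IDs for each InterPro accession reported for your transcripts.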

What is the best way to obtain the GO annotations for a list of UniProt Accession Numbers in batch?

With UniProt accession numbers, you can obtain all GO annotations by parsing a GOA gene association file. These files are provided in a simple tab-delimited format and are available from the GOA FTP site.
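The parsing step can be sketched as follows. This assumes the usual gene association layout, in which comment lines start with "!", the second tab-delimited column holds the accession, and the fifth holds the GO ID. The function name go_terms_for and the sample rows are made up for illustration and are truncated to the first few columns.

```python
import io
from collections import defaultdict


def go_terms_for(accessions, gaf_handle):
    """Collect GO IDs per accession from a tab-delimited association file."""
    wanted = set(accessions)
    found = defaultdict(set)
    for line in gaf_handle:
        if line.startswith("!"):  # header/comment line
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:  # skip malformed or blank lines
            continue
        accession, go_id = cols[1], cols[4]
        if accession in wanted:
            found[accession].add(go_id)
    return dict(found)


# Toy rows for illustration only; real files have more columns per row.
toy_gaf = io.StringIO(
    "!gaf-version: 2.1\n"
    "UniProtKB\tP00001\tGENE1\t\tGO:0003677\tPMID:0\tIDA\n"
    "UniProtKB\tP00001\tGENE1\t\tGO:0005634\tPMID:0\tIDA\n"
    "UniProtKB\tP00002\tGENE2\t\tGO:0005515\tPMID:0\tIPI\n"
)
annotations = go_terms_for(["P00001"], toy_gaf)
```

For a real file, replace the StringIO object with an open file handle (for a gzipped download, gzip.open in text mode) and pass your own list of accessions.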

I want to use the database files but...

The *.txt files must be imported into a MySQL instance using mysqlimport. They are not intended to be loaded into Excel or parsed using custom tools.

Why are so many of the .txt files empty?

For some exports of the GO database, some tables will necessarily be empty. For example, the termdb dump by design omits any data on genes or gene associations, so the corresponding tables are empty. MySQL requires a file to be present for every table, hence some of the .txt files contain 0 rows.

Why do the IDs in the database not match the GO IDs?

The GO SQL database employs the common practice of using surrogate IDs for primary keys. These are intended to be internal to the database, and not exposed to the casual user. In addition, they are not stable and will change with each release. For example, the term table has columns including:

  • id -- internal numeric identifier
  • acc -- public GO ID
  • name -- term label

The id column is the primary key of the term table and is used as a foreign key in tables that link to it, such as term2term.
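To illustrate how the internal id and the public acc relate, here is a small sketch that uses an in-memory SQLite database as a stand-in for MySQL. The schema is deliberately simplified, the id values are invented, and the assumption that term1_id points at the parent term and term2_id at the child should be checked against the database guide. The point is only that you look terms up by acc and let the joins handle the surrogate ids.

```python
import sqlite3

# Toy stand-in for the GO database; the real tables have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE term (id INTEGER PRIMARY KEY, acc TEXT, name TEXT);
    CREATE TABLE term2term (
        id INTEGER PRIMARY KEY,
        term1_id INTEGER,  -- assumed here: parent term
        term2_id INTEGER   -- assumed here: child term
    );
    -- The ids 101 and 102 are invented surrogate keys.
    INSERT INTO term VALUES (101, 'GO:0003676', 'nucleic acid binding');
    INSERT INTO term VALUES (102, 'GO:0003677', 'DNA binding');
    INSERT INTO term2term VALUES (1, 101, 102);
    """
)


def parents_of(connection, go_id):
    """Return (acc, name) of the direct parents of a term given its GO ID."""
    cursor = connection.execute(
        """
        SELECT parent.acc, parent.name
        FROM term AS child
        JOIN term2term AS t2t ON t2t.term2_id = child.id
        JOIN term AS parent ON parent.id = t2t.term1_id
        WHERE child.acc = ?
        ORDER BY parent.acc
        """,
        (go_id,),
    )
    return cursor.fetchall()


parents = parents_of(conn, "GO:0003677")
```

Because the surrogate ids change with each release, a query written this way keeps working across releases, whereas one that hard-codes an id value would not.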

How do I find terms, annotations, or gene products in the database?

We maintain a set of examples that cover, or can be used as a base to cover, most common queries. This is also the set used with GOOSE.

How do I query, access, install/mirror the GO database?

To avoid repeating ourselves and letting our documentation get out of sync, we point you to the database overview and database guide pages, which we're pretty sure answer all of these questions.

What is the status of the GO MySQL database?

While the GO MySQL database is currently considered to be in "legacy" mode, meaning that there will likely not be any new developments on it, it is a widely used and convenient resource for many types of queries. More information about it can be found in the GO MySQL Database Guide.