The GO File Format Guide documents the structure and syntax of the files available on the GO website, to assist users who need to read, write parsers for, or create these files. The following file formats are documented separately:
- OBO 1.4 File Format Guide: the ontology file format used and recommended by the GO Consortium
- OBO 1.2 File Format Guide: previous iteration of the OBO format
- OBO 1.0 File Format Guide: previous iteration of the OBO format
- GO RDF-XML format
- GO annotation ("gene association") file format
- External mapping file format
OBO is fully supported via the OWL-API.
- OBO format tools in GitHub: a wrapper for the Java (OWL-API) implementation of a parser for OBOF1.4 syntax and an implementation of the OBOF1.4 mapping to OWL (uses the OWL API)
- OWL API in GitHub: a Java API for creating, manipulating and serialising OWL ontologies.
Ontology Flat File Formats
The GO Consortium uses the OBO flat file format to store the ontology data. The current version is OBO 1.4, although the ontology data is also available in the previous version, OBO 1.2. The GO Consortium no longer uses or supports files in the legacy GO format. Should you require a file in this format, the command-line script obo2flat can be used to convert between OBO format and the legacy GO format. obo2flat is a Java script and comes as part of the OBO-Edit package; usage instructions are provided in the OBO-Edit User Guide.
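For readers who want a feel for the OBO flat file layout before writing a parser, the following is a minimal Python sketch that reads a header followed by `[Term]`-style stanzas of `tag: value` lines. The function name `parse_obo` and the sample text are illustrative assumptions, not part of any GO tool. The sketch deliberately ignores escape sequences, quoted strings, and trailing modifiers; the OBO File Format Guides remain the reference for the full syntax, and the OWL-API based tools listed above are the better choice for production use.

```python
from collections import defaultdict

# Illustrative sample only; real GO files carry many more tags per stanza.
SAMPLE_OBO = """\
format-version: 1.4

[Term]
id: GO:0008150
name: biological_process
namespace: biological_process

[Term]
id: GO:0009987
name: cellular process
namespace: biological_process
is_a: GO:0008150 ! biological_process
"""


def parse_obo(text):
    """Return (header, stanzas) from OBO-like text.

    Simplified sketch: each tag maps to a list of values, and anything
    after " ! " on a line is treated as a comment and dropped.
    """
    header = defaultdict(list)
    stanzas = []
    current = header  # tag-value lines before the first stanza belong to the header
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and whole-line comments
        if line.startswith("[") and line.endswith("]"):
            # A new stanza such as [Term] or [Typedef] begins here.
            current = defaultdict(list)
            current["_type"].append(line[1:-1])
            stanzas.append(current)
            continue
        # Split on the first colon only, so IDs like GO:0008150 stay intact.
        tag, _, value = line.partition(":")
        value = value.split(" ! ")[0].strip()
        current[tag.strip()].append(value)
    return dict(header), [dict(s) for s in stanzas]


if __name__ == "__main__":
    header, stanzas = parse_obo(SAMPLE_OBO)
    for stanza in stanzas:
        print(stanza["id"][0], stanza["name"][0], stanza.get("is_a", []))
```

Values are stored as lists because a tag such as `is_a` may appear more than once in a stanza.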
OBO-XML is a direct XML serialization of the OBO 1.2 format specification. The schema is specified using RELAX-NG compact syntax: obo-xml.rnc. Currently, only the ontology is available as OBO-XML.
OWL RDF/XML Format
OWL is a standard for ontology languages, produced by the W3C. Details of the translation used for GO are available on the official OboInOwl page.
Sequence data for gene products in the GO database is available in standard FASTA format from the GO database archives.
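As a rough illustration of reading such sequence files, the sketch below parses FASTA text into (header, sequence) pairs using only the Python standard library. The function name `parse_fasta` and the sample records are hypothetical; the header layout used in the actual GO archive files is not described here and should be checked against the files themselves.

```python
# Hypothetical records for illustration; not taken from the GO archives.
SAMPLE_FASTA = """\
>example_1 hypothetical gene product
MKTAYIAKQR
QISFVKSHFS
>example_2 another hypothetical gene product
MSDNE
"""


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new record starts; store the previous one, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    for header, sequence in parse_fasta(SAMPLE_FASTA):
        print(header, len(sequence))
```

Sequence lines that appear before the first `>` header are silently skipped in this sketch; a stricter parser might raise an error instead.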