ISO: Inferred from Sequence Orthology

  • Pairwise or multiple alignments between a query protein and experimentally characterized match proteins when the proteins are established to be orthologs of each other.
  • Phylogenetic analysis of a set of proteins to define orthologous groups.
  • An entry in the with field is mandatory.

The ISO code is a sub-category of the ISS code. Orthology is a relationship between genes in different species indicating that the genes derive from a common ancestor. Orthology is established by multiple criteria generally including amino acid and/or nucleotide sequence comparisons and one or more of the following:

  • phylogenetic analysis
  • coincident expression
  • conserved map location
  • functional complementation
  • immunological cross-reaction
  • similarity in subcellular localization
  • subunit structure
  • substrate specificity
  • response to specific inhibitors

Note that a gene in one organism can differ significantly in size from its ortholog(s) in other species. For example, the U2 snRNA in S. cerevisiae is much larger than vertebrate U2 snRNAs because it contains several additional domains. However, both the S. cerevisiae and vertebrate U2 snRNAs have been shown to share the same conserved core and to perform the same basic role in the spliceosome, even though a simplistic sequence comparison might miss this relationship because of the large size difference between U2 in S. cerevisiae and U2 in mammalian species.

When making an annotation using the ISO evidence code, an entry in the with field is mandatory. This entry is the accession number of an experimentally characterized orthologous gene product. The matching orthologous gene product must have substantiating experimental evidence to support the annotation. In addition, there will be cases where a gene product in one species is the ortholog of several closely related paralogous genes in another species. In these cases, the IDs of all of these paralogs should be included in the with field. Annotations made with ISO that lack an entry in the with field will be filtered out by the Annotation File Format Quality Control script.
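The with-field rule above can be illustrated with a minimal sketch. This is not the Annotation File Format Quality Control script itself; it is a hypothetical check written under the assumption that annotations arrive as tab-separated GAF 2.x lines, with the evidence code in column 7, the with field in column 8, and multiple IDs separated by "|". The function name and constants are made up for illustration.

```python
ISO_CODE = "ISO"
EVIDENCE_COL = 6  # assumption: GAF column 7 (0-based index 6) holds the evidence code
WITH_COL = 7      # assumption: GAF column 8 (0-based index 7) holds the with field


def passes_iso_with_check(gaf_line: str) -> bool:
    """Return False for an ISO annotation whose with field is empty.

    Hypothetical helper: header lines and annotations with other
    evidence codes are passed through unchanged.
    """
    if gaf_line.startswith("!"):
        return True  # GAF header or comment line, nothing to check
    cols = gaf_line.rstrip("\n").split("\t")
    if len(cols) <= WITH_COL:
        return False  # too few columns to contain a with field at all
    if cols[EVIDENCE_COL] != ISO_CODE:
        return True  # this sketch only enforces the rule for ISO
    # Several paralog IDs may be listed; at least one must be non-empty.
    return any(part.strip() for part in cols[WITH_COL].split("|"))
```

A caller could keep only the lines for which the function returns True, which mirrors the filtering behaviour described above without claiming to reproduce the real script.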

If the paper being used to make the annotation demonstrates the orthology, then that paper is used as the reference for that annotation. However, if the group doing the annotation is establishing orthology and there is no published reference, a reference can be used from the GO Consortium's collection of GO references. If there is nothing appropriate in this set, the annotating group should write a description of the methods of data collection and evaluation used and submit it to the GO Consortium. This description will be added to the reference collection and will receive a GO_REF accession number for use in annotations. For example, GO_REF:0000096 describes MGI's practice of transferring experimental GO annotations from rat and human to mouse genes based on orthology evidence (i.e. ISO).

If revised predictions of orthologous protein sets are produced after the original annotation was made, the annotations should be updated accordingly.

Example of when to use ISO:

  • PMID:12507466 describes a set of proteins containing both experimentally confirmed and predicted N-terminal acetyltransferases (NATs) that were collected and assigned to orthologous groups based on phylogenetic analysis. Three of the groups, Ard1, Mak3, and Nat3, were named after the well-characterized S. cerevisiae gene of that name that is a member of the group. Proteins in these orthologous groups without experimental characterization can be assigned the function term peptide alpha-N-acetyltransferase activity based on orthology to the experimentally characterized proteins within the orthologous group. The evidence code for this annotation is ISO, the reference is the paper that performed the analysis, and the accession numbers of the experimentally characterized members of the orthologous group should be placed in the with field. The paper also makes it clear that the genes ARD1, MAK3, and NAT3 are well characterized experimentally, so the relevant one of these genes can be used in the with field for annotations of members of its orthology group without further reading. There may be additional characterized genes in each group, but this is not obvious from the paper. Note also that this paper describes a putative Nat5 family based only on sequence similarity of Nat5p (YOR253Wp) to other NATs. As there is no experimentally characterized member of the Nat5 family, no annotations may be made based on the Nat5 orthology grouping, though see the ISA section for a description of the annotation which may be made for NAT5.
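
The reasoning in this example can be sketched in code. The following is only an illustration under stated assumptions: the `Member` class, the `iso_annotations` function, and the dictionary layout are hypothetical and do not correspond to any GO Consortium tool. Each uncharacterized member of an orthologous group receives an ISO annotation whose with field lists the experimentally characterized members of the same group, and a group with no characterized member (as with Nat5) yields no annotations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    gene_id: str         # accession or gene name (illustrative values only)
    characterized: bool  # True if experimentally characterized


def iso_annotations(groups, go_term, reference):
    """Build ISO annotations for uncharacterized members of each group.

    Hypothetical sketch: 'groups' maps a group name to a list of Member.
    """
    annotations = []
    for members in groups.values():
        supported = [m.gene_id for m in members if m.characterized]
        if not supported:
            continue  # no experimental evidence in this group: nothing to transfer
        for m in members:
            if not m.characterized:
                annotations.append({
                    "gene": m.gene_id,
                    "term": go_term,
                    "evidence": "ISO",
                    "reference": reference,
                    "with": "|".join(supported),  # with field is mandatory for ISO
                })
    return annotations
```

For instance, given a group "Ard1" containing the characterized gene ARD1 plus one uncharacterized protein, and a group "Nat5" containing only uncharacterized proteins, the function would return a single annotation for the uncharacterized Ard1 member and none for the Nat5 group.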