Connecting together multiple GO annotations: a report on the LEGO jamboree

In December, a group of us gathered at the Swiss Institute of Bioinformatics in Geneva to beta test a new GO annotation tool called Noctua and to develop curation guidelines for it. Noctua advances on the current GO annotation paradigm by allowing curators to connect multiple annotations to give a cohesive picture of a biological process or mechanism.

To illustrate, consider the following two annotations, which follow the current datamodel of associating genes with GO terms:

  • TEM1 enables GTPase activity
  • BFA1 enables GTPase inhibitor activity
This gives a correct but incomplete picture of the biology: the piece that is missing is the fact that in the context of some biological process (such as exit from mitosis), it's the activity of BFA1 that inhibits the GTPase activity of TEM1. There is no explicit connection between these annotations in the current GO datamodel.

The following screenshot shows how we would curate this using Noctua; here the two annotations are connected via a directly inhibits relationship:

Noctua screenshot

We sometimes refer to the resulting set of annotations as a LEGO model.
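The difference between the two styles of annotation can be sketched in a few lines of Python. This is only an illustration of the idea, not Noctua's actual data structures: the `Activity` and `Model` classes and their method names are hypothetical, and only the gene names, GO terms and the "directly inhibits" relation come from the example above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Activity:
    """One conventional annotation: a gene enables a molecular function."""
    gene: str
    function: str


@dataclass
class Model:
    """Hypothetical LEGO-style model: activities plus relations between them."""
    activities: List[Activity] = field(default_factory=list)
    # Each edge is (source activity, relation, target activity).
    edges: List[Tuple[Activity, str, Activity]] = field(default_factory=list)

    def connect(self, source: Activity, relation: str, target: Activity) -> None:
        """Record a relation, adding either activity if it is not yet in the model."""
        for activity in (source, target):
            if activity not in self.activities:
                self.activities.append(activity)
        self.edges.append((source, relation, target))

    def describe(self) -> List[str]:
        """Render each edge as a readable sentence."""
        return [
            f"{s.gene} ({s.function}) {relation} {t.gene} ({t.function})"
            for s, relation, t in self.edges
        ]


# The two stand-alone annotations from the current datamodel.
tem1 = Activity("TEM1", "GTPase activity")
bfa1 = Activity("BFA1", "GTPase inhibitor activity")

# The explicit connection that a flat list of annotations cannot express.
model = Model()
model.connect(bfa1, "directly inhibits", tem1)

for line in model.describe():
    print(line)
```

On their own, `tem1` and `bfa1` carry the same information as the two bullet points; it is the edge added by `connect` that records which activity acts on which.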

Meeting Report

The meeting took place over three days, commencing with a talk from Paul Thomas, describing the need for a more expressive model for describing the biological role of genes.

We then took a tour of the different editing capabilities of Noctua - some of these can be seen in the demo videos. After this, we all put the tool through its paces, using it to curate a number of different papers which we had assembled in advance. We would periodically reassemble and go over the models we had made - one of the nice features of Noctua is real-time collaborative editing, so more than one person can be editing a model at any time, with changes made by others immediately reflected on your display.

This was very informative as it helped us identify a number of bugs, as well as curation best practice. Moving away from a one-annotation-at-a-time, one-paper-at-a-time model is a big shift for GO, and we weren't sure how curators would adapt to this more expressive mode of curating. Happily, this turned out to be much easier for curators - rather than having to request a long complicated term from the ontology to describe a gene's function, it was easier to build up that description visually, using smaller components that can be connected together.

We gathered a large number of requirements (see the issue tracker for details). Some of these are blockers for some groups: for example, SIB need to be able to annotate to any UniProt accession, a feature we hope to deliver soon. MGI, like many MODs, needed the ability to export any model to the corresponding (lossy) Gene Association File representation so that it can be loaded back into the main MGI database, so this has become a priority.
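To see why such an export is lossy, here is a minimal sketch. It assumes that what gets lost is the set of connections between annotations, and it uses a simplified, hypothetical model layout and a three-column row. It is not the real Gene Association File column layout, nor Noctua's exporter.

```python
# Hypothetical, simplified model: activities are (gene, GO term) pairs,
# edges are (source index, relation, target index) triples.
activities = [
    ("BFA1", "GTPase inhibitor activity"),
    ("TEM1", "GTPase activity"),
]
edges = [(0, "directly inhibits", 1)]


def flatten(activities, edges):
    """Return one flat (gene, relation, term) row per activity.

    The edges have no place in a one-row-per-annotation format, so they
    are only counted, not exported.
    """
    rows = [(gene, "enables", term) for gene, term in activities]
    dropped = len(edges)
    return rows, dropped


rows, dropped = flatten(activities, edges)
for gene, relation, term in rows:
    print(f"{gene}\t{relation}\t{term}")
print(f"{dropped} connecting relation(s) omitted from the flat export")
```

The flat rows can be loaded into a database that expects one gene-term association per record, but the model cannot be rebuilt from them.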

Some groups want the ability to embed the Noctua model within gene pages on their main database, like this mockup: mockup

Curator Documentation

We identified curator documentation as a key need, and have started on a LEGO curator guide -- comments welcome in the Google Doc.

Next steps

As a result of the meeting, we have released the beta version of Noctua, and will continue to develop it as resources permit over the course of 2016. Some groups are currently in the process of switching over to Noctua as their main curation tool. This means that we can expect to see a large growth in the number of models deposited in the Noctua LEGO models repository over this year.

For now, the best place to see current developments is on the Noctua Web Application itself.

Participants

Thanks to all the participants who made the meeting a success!

LEGO attendees