Web feeds

Noctua

GO wiki (new pages) - Wed, 01/17/2018 - 10:54

Paul Thomas:

[[Category: Annotation]]
=Summary=
Noctua is an online system for making extensible GO annotations, which we call "GO-CAM models". Anything from simple annotations to complicated pathways is supported. However, the overall goal should be for a model to represent a unit that roughly corresponds to a biological pathway. This document describes how to make GO-CAM models using Noctua.

=Requirements=
A web browser. Chrome is recommended.

=Launching Noctua=
* Go to http://geneontology.org/cam.

=Setup=
* Before using Noctua to edit or create models, please follow this procedure to request edit access. You will need an ORCID (https://orcid.org), so you can be uniquely identified. Each part of a Noctua model is individually attributed to an editor, as well as the project that provided their funding (if applicable).

=Using Noctua=

==Login==
You can view models without logging in, but you must log in before creating new annotations (by editing an existing model, or creating a new model). Click on the Login button in the upper right corner of the page. There are several options for logging in. We recommend using GitHub (if you don't already have an account, just go to http://github.com).

==Editing an existing model==
Just click on the "Edit" button in the rightmost column of the model list. The model list can be filtered using the search box just above the list of available models.

==Starting a new model==
Just click on the blue "Create Noctua Model" button.

==What is a Noctua model?==
===Molecular activity===
Each Noctua model consists of at least one MOLECULAR ACTIVITY (FUNCTION), carried out by at least one GENE PRODUCT. Ideally, all of the following “aspects” of the gene product’s function will be specified in the model. However, in cases where some or most of these aspects are unknown, a model may still be constructed with details added as more information becomes available. Users should attempt to specify functions as fully as possible, but partial models are expected and still contribute to the GO knowledgebase. The following aspects are represented in a model:
*Molecular function (MF): the molecular activity carried out by a gene product as part of a larger biological process; this is specified by a term from the GO molecular function ontology. MF may be qualified, using defined relations, as follows:
**If the function acts upon another “target” molecule, this can be specified using a gene product identifier (for a protein or a gene) or term from the ChEBI ontology (for a small molecule)
**If the function acts during a particular “biological phase” (e.g. a particular stage in organism development), this can be specified using a term from an appropriate ontology
*Cellular component (CC): the location of the gene product when it is carrying out its activity; this is specified by a term from the GO cellular component ontology. CC may be qualified, using defined relations, as follows:
**If the activity occurs in a specific cell type, this can be specified using a term from a Cell Type or Anatomy Ontology.
**If the activity occurs in a specific anatomical structure, this can be specified using a term from the Uberon, or other organismal Anatomy, ontology.
*Biological process (BP): the larger “biological program” to which the activity contributes; this is specified by a term from the GO biological process ontology. BP may be qualified, using defined relations, as follows:
**If the process is a part of a larger biological program, it can be linked to the larger biological program with another GO biological process term.
All of these aspects together constitute a unit of annotation, which we call an “annoton”. Each annoton is centered on the molecular activity, as this is the most basic description of gene product function.
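
As a rough illustration of the unit described above, an annoton can be pictured as a record centered on the molecular activity, with the other aspects optional. This is only a sketch: the class name, field names and the "EX:" placeholder identifiers are hypothetical and are not part of Noctua.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Annoton:
    """Hypothetical record for one unit of annotation (not Noctua's own data model)."""
    gene_product: str                          # gene product carrying out the activity
    molecular_function: str                    # GO molecular function term
    evidence: str                              # evidence for the function
    reference: str                             # reference supporting the evidence
    target: Optional[str] = None               # MF qualifier: gene product or ChEBI term
    biological_phase: Optional[str] = None     # MF qualifier: phase during which it acts
    cellular_component: Optional[str] = None   # GO cellular component term
    cell_type: Optional[str] = None            # CC qualifier: cell type term
    anatomy: Optional[str] = None              # CC qualifier: anatomical structure term
    biological_process: Optional[str] = None   # GO biological process term
    larger_process: Optional[str] = None       # BP qualifier: larger biological program

    def missing_aspects(self) -> List[str]:
        """List the CC and BP aspects that have not been specified yet."""
        optional = ("cellular_component", "biological_process")
        return [name for name in optional if getattr(self, name) is None]


# A partial annoton: only the activity, its gene product and its evidence are known.
partial = Annoton(
    gene_product="EX:gene1",          # placeholder identifier
    molecular_function="EX:mf1",      # placeholder identifier
    evidence="EX:evidence1",          # placeholder identifier
    reference="EX:ref1",              # placeholder identifier
)
print(partial.missing_aspects())
```

A partial record like this is still valid; the remaining aspects can be filled in as more information becomes available.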
===Molecular activities can be linked by causal relations===
Activities can be linked together by relations that describe their causal dependence. The most common relations are “regulates” and “provides input for”, but there are other relations of greater and lesser specificity, depending on what is known. “Regulates” should be used to denote biological control of a downstream activity. “Provides input for” should be used when there is no control, but an upstream function creates a molecular entity that is the target of the downstream function, such as in a metabolic pathway.
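
The choice between the two most common relations can be sketched as a small decision function. The function and parameter names are hypothetical, and the sketch covers only the two relations named above, not the more or less specific relations that also exist.

```python
from typing import NamedTuple


class CausalLink(NamedTuple):
    """Hypothetical edge between two activities in a model."""
    upstream: str
    relation: str
    downstream: str


def choose_relation(controls_downstream: bool, output_is_downstream_target: bool) -> str:
    """Pick one of the two common causal relations from the guidance above."""
    if controls_downstream:
        # Biological control of the downstream activity.
        return "regulates"
    if output_is_downstream_target:
        # No control, but the upstream activity creates the downstream target.
        return "provides input for"
    raise ValueError("neither common relation applies; another relation may be needed")


# Two consecutive steps of a metabolic pathway (placeholder activity names).
link = CausalLink("activity A", choose_relation(False, True), "activity B")
print(link)
```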


==Creating a new activity and its properties==
After either selecting an existing model or starting a new one, you will see the graph view by default. To create new activities, you should use the “Simple annoton editor” tool, available in the Workbench menu:
Workbench -> Simple annoton editor
[[File:Launch_noctua_activity_form.png|thumb|Fig. 1 Launching the simple annoton editor|400px]]
This will launch a new browser tab.
===Step 1. Fill in the form===
Fill in as many fields as possible in the form, by typing in the field, and then selecting from the autocomplete suggestions by moving the mouse over your selection and clicking. Tips:
*The required fields are gene product, molecular function and evidence for the function. All other fields are optional.
*You can annotate a complex instead of a single gene product, by choosing "macromolecular complex" from the drop-down menu
*For gene products, you can type in the gene symbol, e.g. Wnt3a. If you need to narrow down the choices, type a space after the symbol and enter the three-letter code for the species (the first letter of the genus and the first two letters of the species name, e.g. mmu for Mus musculus). Each entry in the autocomplete will also show the associated unique database identifier or accession, so curators can confirm that they are selecting the appropriate entity for annotation.
* In general, enter a space after a complete word, to narrow down the choices.
Just as in conventional GO annotation, you must fill in the Evidence and Reference fields for each line of annotation, or an annoton will not be created.
We recommend that you fill in as many fields as possible before creating the annoton. Once it is created, further edits must be made from the graph canvas, which takes more steps.
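
The three-letter species code mentioned in the tips above can be derived mechanically from a species name. The helper names below are hypothetical, and the sketch only builds the text to type into the form; it does not query Noctua.

```python
def species_code(binomial: str) -> str:
    """First letter of the genus plus first two letters of the species name."""
    genus, species = binomial.split()[:2]
    return (genus[0] + species[:2]).lower()


def search_text(symbol: str, binomial: str) -> str:
    """Gene symbol, a space, then the species code, as typed into the form."""
    return f"{symbol} {species_code(binomial)}"


print(species_code("Mus musculus"))
print(search_text("Wnt3a", "Mus musculus"))
```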
===Step 2. Add the new activity to a model===
Press the CREATE button. A new activity will appear on the graph canvas (the main window). Tips:
*Each new activity will appear on the same part of the canvas, so if you add more than one activity you will need to move them around on the canvas (by clicking and dragging) to see the ones underneath.
*If the CREATE button is grayed out, some information that you still need to fill in is missing from the form. You can press "why is the save button disabled?" for a list of missing fields.

The analogy is to a library. You will first find and check out (lock) the families you want to curate, and then select a family to curate from your list of locked families. All families now have a curation status (curated, partially curated, uncurated).

--------- Paul Thomas
Categories: GO Internal

Annotation File Pipelines

GO wiki (new pages) - Tue, 01/16/2018 - 08:48

Vanaukenk:

= Annotation File Pipelines =
== SVN ==
*[[How to run Mike Cherry's filtering script locally before checking the GAF into SVN?]]
== Jenkins Jobs ==
== LBNL File Retrieval from GOC Members ==

Back to: [[Annotation]]

[[Category: Annotation]] Vanaukenk
Categories: GO Internal

Annotation Extensions

GO wiki (new pages) - Tue, 01/16/2018 - 08:32

Vanaukenk:

= Annotation Extensions =
*[[Annotation Extension Relation Subsets]]
*[[Column 16: Cell Type]]
*[[Column 16: Targets]]
*[[Issues with Annotation Extension relations]]
*[[Annotation Extension Relation Documentation Jamboree]]
*[[Guidance for updating deprecated Annotation Extension Relations]]

Back to: [[Annotation]]


[[Category: Annotation]] Vanaukenk
Categories: GO Internal

Cellular Component

GO wiki (new pages) - Tue, 01/16/2018 - 08:28

Vanaukenk: /* Annotation Relations for Cellular Component Annotations */

= Cellular Component Annotation Guidelines =
== Meaning of a GO Cellular Component Annotation ==
== Annotation Relations for Cellular Component Annotations ==
=== part of ===
=== colocalizes with ===

== Annotation Extensions for Cellular Component Annotations ==
== Term-specific Guidelines for Cellular Component ==


Back to: [[Annotation]]

[[Category: Annotation]] Vanaukenk
Categories: GO Internal

Molecular Function

GO wiki (new pages) - Tue, 01/16/2018 - 08:25

Vanaukenk: /* Molecular Function */

= Molecular Function Annotation Guidelines =

== Meaning of a GO Molecular Function Annotation ==
== Annotation Relations for Molecular Function Annotations ==
=== enables ===
=== contributes_to ===

== Annotation Extensions for Molecular Function Annotations ==
== Term-specific Guidelines for Molecular Function ==

Back to: [[Annotation]]

[[Category: Annotation]] Vanaukenk
Categories: GO Internal

Archived Pages

GO wiki (new pages) - Tue, 01/16/2018 - 07:38

Vanaukenk:

=== GO Annotation Meetings ===
[http://gocwiki.geneontology.org/index.php/2010_GO_camp_Meeting_Logistics 2010 GO Annotation Camp in Geneva, Switzerland, June 16-18 2010 ]

[[Guidelines from Annotation Camp]]

[http://wiki.geneontology.org/index.php/2012_Annotation_Meeting_Stanford Annotation Meeting Stanford, CA, Feb 2012]

=== Migration of annotating groups to Protein2GO ===

[[Procedure for migration of protein annotations to Protein2GO]]

[[Extension of Protein2GO to non-UniProtKB Identifiers]]

=== Activities ===

*[http://wiki.geneontology.org/index.php/Annotation_Advocacy_Roadmap_2010 Roadmap 2010]
*[http://wiki.geneontology.org/index.php/Annotation_Advocacy_Roadmap_2011 Roadmap 2011]
*[[Media:AnnotationRoadMap.pdf | RoadMap]] (PDF file - provides project goals with time line)
*Educate GOC curators about best annotation practice
*Enforce annotation rules/policies within GOC
*Maintain the annotation/evidence code documentation
**[[Mock-ups of new GOC Annotation pages]]
*Train/assist new groups with annotations: see [[How_External_Communities_can_contribute_annotations_to_the_GO_Consortium]]
*Educate and keep all the annotating groups up-to-date with changes in GAF format

*Keep all curators up-to-date with ontology development and how it affects annotations

*[[Chain of Evidence]]- Proposal to represent chain/summation of evidence in an annotation.
*[[Evidence Code Ontology (ECO)]] - Proposal to represent evidence in an ontology
*[[Proposal for cron tabs]] - Proposal for setting up cron tabs for generating PAINT and MF-BP inferences
*[[Examples for phylogeny based evidence codes]]
*[[Ideas for GOC community curation tool]]
*[[Protein Complex ids as GO annotation objects]]
** [[Annotation guidelines for annotating complexes as annotation objects]]
** [[Protein Complex Conference Call June19, 2015]]
** [[Protein Complex Conference Call July15, 2015]]
*[[LEGO-style annotation ideas]]
* [[Proposed Developments to the GAF annotation format]]
*[[InterPro2GO Session October 4th 2011]]
*[[MF-BP inferences]]
*[[Protein Binding clean up]]
*[[Annotation Guidance Pages]]
*[[Evidence Code proposals]]
*[[With/From field restrictions for evidence codes]]
*[[Core Consortium annotation activities]]
*[[:Category:GPAD|GPAD]]

*[[Common Annotation Framework Specification]]
*[[gp2protein file]]
*[[gp2rna file]]
*[[gp_unlocalized file]]
*[[mechanisms for reducing annotation redundancy]]
*[[Annotations to Cell Fraction-type terms]]
*[[TermEnrichment: Gold Standard Data Sets]]
*[["Response to" terms]]
*[[GAF 2.1 specs]]

=== Monthly Reports ===
*[[September2011_Annotation_Advocacy_Report|September 2011]]
*[[August2011_Annotation_Advocacy_Report|August 2011]]
*[[July2011_Annotation_Advocacy_Report|July 2011]]
*[[June2011_Annotation_Advocacy_Report|June 2011]]
*[[May2011_Annotation_Advocacy_Report|May 2011]]
*[[April2011_Annotation_Advocacy_Report|April 2011]]
*[[March2011_Annotation_Advocacy_Report|March 2011]]
*[[February2011_Annotation_Advocacy_Report|February 2011]]
*[[January2011_Annotation_Advocacy_Report|January 2011]]
*[[December2010_Annotation_Advocacy_Report|December 2010]]
*[[November2010_Annotation_Advocacy_Report|November 2010]]
*[[October2010_Annotation_Advocacy_Report|October 2010]]
*[[September2010_Annotation_Advocacy_Report|September 2010]]
*[[August2010_Annotation_Advocacy_Report|August 2010]]
*[[July2010_Annotation_Advocacy_Report|July 2010]]
*[[Jun2010_Annotation_Advocacy_Report|June 2010]]

=== Meeting calendar ===
[[Annotation_topics for next meeting]]

[[Annotation_31Jan10]]

[[Annotation_21Dec10]]

[[Annotation_19Oct10]]

[[Annotation_21Sept10]]

[[Annotation_17Aug10]]

[[Annotation_04Aug10]]

[[Annotation_19July10]]

[[Annotation_22June10]]

[[Annotation_08June10]]

[[Annotation_01June10]]

[[Annotation_17May10]]

[[Annotation_29Apr10]]

[[Annotation_15Apr10]]

[[Annotation_08Apr10]]

[[Annotation_22Mar10]]

[[Annotation_15Mar10]]

[[Annotation_10Apr10]]

=== Process ===

[http://gocwiki.geneontology.org/index.php/Annotation_Issues_and_Management Annotation Issues]

=== Issues for the Annotation Group ===
# Division of annotation and GO content development effort and feedback between participating databases (UniProtKB and MODs)

[[Annotations to Catalytic activity with IPI]]

[[Annotation Quality Control Checks]]

[[Action items from March 2010 GOC meeting]]

====How to load the ontology file into OBO-Edit directly from the web without downloading the file from CVS etc.?====
OBO-Edit can load files from the disk OR from a URL.<br>
Choose the "File -> Load..." menu option<br>
Choose the OBO File Adapter<br>
Type http://www.geneontology.org/ontology/gene_ontology.obo into the filename box<br>
(From: http://www.geneontology.org/newsletter/archive/200705.shtml#tip)

To see the has_part relationships, use this URL for the extended GO version:<br>

http://www.geneontology.org/ontology/obo_format_1_2/gene_ontology_ext.obo

[[Category: Annotation]] [[Category:Working Groups]] Vanaukenk
Categories: GO Internal

GO-CAM and Noctua 2017 Agenda and Minutes

GO wiki (new pages) - Tue, 01/16/2018 - 07:31

Vanaukenk:

[[GO-CAM May 10th, 2017]]

[[GO-CAM May 24th, 2017]]

[[GO-CAM June 14th, 2017]]

[[GO-CAM June 28th, 2017]]

[[GO-CAM July 12th, 2017]]

[[GO-CAM ectopic meeting July 19th 2017]]

[[GO-CAM July 26th, 2017]]

[[GO-CAM August 9th, 2017]]

[[GO-CAM August 23, 2017]]

[[GO-CAM Sept 13, 2017]]

[[GO-CAM Sept 27, 2017]]

[[GO-CAM October 25th, 2017]]

[[GO-CAM November 8th, 2017]]

[[GO-CAM November 22, 2017]]

[[GO-CAM December 13, 2017]]

Back to: [[Annotation]]


[[Category: Annotation]] Vanaukenk
Categories: GO Internal

GO-CAM and Noctua 2016 Agenda and Minutes

GO wiki (new pages) - Tue, 01/16/2018 - 07:24

Vanaukenk:

[[LEGO January 4, 2016]]

[[LEGO January 18, 2016]]

[[LEGO February 1, 2016]]

[[LEGO February 15, 2016]]

[[LEGO March 7, 2016]]

[[LEGO March 21, 2016]]

[[LEGO March 28, 2016]]

[[LEGO April 4, 2016]]

[[LEGO April 25, 2016]]

[[LEGO May 2, 2016]]

[[LEGO May 9, 2016]]

[[LEGO May 16, 2016]]

[[LEGO May 23, 2016]]

[[LEGO June 6, 2016]]

[[LEGO June 13, 2016]]

[[LEGO June 20, 2016]]

[[LEGO June 27, 2016]]

[[LEGO July 11, 2016]]

[[LEGO July 18, 2016]]

[[LEGO July 25, 2016]]

[[LEGO August 8, 2016]]

[[LEGO August 15, 2016]]

[[LEGO August 22, 2016]]

[[LEGO August 29, 2016]]

[[LEGO September 12, 2016]]

[[LEGO September 19, 2016]]

[[LEGO September 26, 2016]]

[[LEGO GAF/GPAD September 28, 2016]]

[[LEGO October 5, 2016]]

[[LEGO GAF/GPAD October 5, 2016]]

[[LEGO October 10, 2016]]

[[LEGO October 17, 2016]]

[[LEGO October 24, 2016]]

[[LEGO October 31, 2016]]


Back to: [[Annotation]]

[[Category: Annotation]] Vanaukenk
Categories: GO Internal

Manager Call 2018-01-18

GO wiki (new pages) - Tue, 01/16/2018 - 02:57

Pascale:

[[Category:GO Managers Meetings]]

= Call in info=
https://stanford.zoom.us/j/754529609

= Agenda =

==Noctua V1.0 progress report==


==Docathon==
*Review draft agenda for docathon
**http://wiki.geneontology.org/index.php/2018_Berkeley_GO_Docathon
**Create use cases and user stories to focus documentation efforts
**Who are the users?
***Curators
***MODs, other curation projects (e.g. Reactome)
***Bioinformaticians
***Systems biologists
***Software developers
***Bench scientists
**Post schedule for coming day so people can videoconference



==NYU meeting==
*Do we have any info on logistics?
*Start working on agenda?


==EC2GO mappings==
*We are missing about 1,500 reactions (25%) of EC (it's possible that we are also missing the corresponding terms)
*What is the goal: (a) keep up to date? (b) remove incorrect (moved) mappings?
*Back in 2013 there was a Plan of action with respect to EC and Rhea, is this still our plan? What's the priority? http://wiki.geneontology.org/index.php/Enzymes_and_EC_mappings#Plan_of_Action


= Minutes =
*On call: Pascale
Categories: GO Internal

Inferred from High Throughput Expression Pattern (HEP)

GO wiki (new pages) - Wed, 01/10/2018 - 14:06

Vanaukenk:

'''HEP: Inferred from High Throughput Expression Pattern'''

No data at:

http://www.evidenceontology.org/term/ECO:0007007/


[http://wiki.geneontology.org/index.php/Guide_to_GO_Evidence_Codes Back to: Guide to GO Evidence Codes]

[[Category: Annotation]]
[[Category: Evidence Codes]] Vanaukenk
Categories: GO Internal

Inferred from High Throughput Genetic Interaction (HGI)

GO wiki (new pages) - Wed, 01/10/2018 - 14:04

Vanaukenk:

'''HGI: Inferred from High Throughput Genetic Interaction'''

No data at:

http://www.evidenceontology.org/term/ECO:0007003/


[http://wiki.geneontology.org/index.php/Guide_to_GO_Evidence_Codes Back to: Guide to GO Evidence Codes]

[[Category: Annotation]]
[[Category: Evidence Codes]] Vanaukenk
Categories: GO Internal

Inferred from High Throughput Mutant Phenotype (HMP)

GO wiki (new pages) - Wed, 01/10/2018 - 14:00

Vanaukenk:

'''HMP: Inferred from High Throughput Mutant Phenotype'''

No data at:

http://www.evidenceontology.org/term/ECO:0007001/

[http://wiki.geneontology.org/index.php/Guide_to_GO_Evidence_Codes Back to: Guide to GO Evidence Codes]

[[Category: Annotation]]
[[Category: Evidence Codes]] Vanaukenk
Categories: GO Internal

Inferred from High Throughput Direct Assay (HDA)

GO wiki (new pages) - Wed, 01/10/2018 - 13:59

Vanaukenk:

'''HDA: High Throughput Direct Assay'''

No data at:

http://www.evidenceontology.org/term/ECO:0007005/



[http://wiki.geneontology.org/index.php/Guide_to_GO_Evidence_Codes Back to: Guide to GO Evidence Codes]

[[Category: Annotation]]
[[Category: Evidence Codes]] Vanaukenk
Categories: GO Internal

Inferred from High Throughput Experiment (HTP)

GO wiki (new pages) - Wed, 01/10/2018 - 13:57

Vanaukenk:

'''HTP: High Throughput Experiment'''

[http://www.evidenceontology.org/term/ECO:0006056/ ECO:0006056 high throughput evidence used in manual assertion]


[http://wiki.geneontology.org/index.php/Guide_to_GO_Evidence_Codes Back to: Guide to GO Evidence Codes]

[[Category: Annotation]]
[[Category: Evidence Codes]] Vanaukenk
Categories: GO Internal

Inferred from Electronic Annotation (IEA)

GO wiki (new pages) - Wed, 01/10/2018 - 12:40

Vanaukenk:

Automatically-assigned Evidence Codes

The Automatically-assigned Evidence Code is:

IEA: Inferred from Electronic Annotation

Note: Annotations using the IEA code should be reviewed after one year; any older than that will be deleted.

*Annotations based on "matches" in sequence similarity comparisons if they have not been reviewed by a curator
*Annotations transferred from database records, if not reviewed by a curator
*Annotations made on the basis of keyword mapping files, if not reviewed by a curator

If annotations based on sequence-similarity-based methods have been reviewed by a curator, use ISS instead and change the reference from the one that describes the computational analysis to one that says that the curator reviewed the sequence similarity and approved it.

Used for annotations that depend directly on computation or automated transfer of annotations from a database, particularly when the analysis is performed internally and not published. A key feature that distinguishes this evidence code from others is that it is not made by a curator; use IEA when no curator has checked the specific annotation to verify its accuracy. The actual method used (BLAST search, Swiss-Prot keyword mapping, etc.) doesn't matter.

When the method used to make annotations using the IEA code is performed internally by the annotating group and is not published, a short description of the method should be written and added to the GO Consortium's collection of GO references, where it will be given a GO_REF ID which can be used to cite the reference in gene association files.
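
The one-year review note for IEA annotations could be checked with a few lines of code. This is a minimal sketch: the function name is hypothetical, and treating "one year" as 365 days is an assumption, not something stated in the guideline.

```python
from datetime import date, timedelta

# Assumption: "one year" is taken to be 365 days.
REVIEW_WINDOW = timedelta(days=365)


def iea_needs_review(evidence_code: str, annotation_date: date, today: date) -> bool:
    """True if this is an IEA annotation older than the review window."""
    return evidence_code == "IEA" and (today - annotation_date) > REVIEW_WINDOW


print(iea_needs_review("IEA", date(2016, 1, 1), date(2018, 1, 10)))
print(iea_needs_review("IEA", date(2017, 12, 1), date(2018, 1, 10)))
```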

Examples where the IEA evidence code should be used:

*Annotations based on "matches" in sequence similarity comparisons if they have not been reviewed by a curator. If annotations based on sequence similarity based methods have been reviewed by a curator, use ISS instead.
*Annotations transferred from database records, if not reviewed by a curator. If such annotations are reviewed by a curator and the database record has no linked publication, consider the NAS code.
*Annotations made on the basis of keyword mapping files, if not reviewed by a curator

Examples where the IEA evidence code should not be used:

*Annotations based on "matches" in sequence similarity comparisons and which have been reviewed by a curator should be made with ISS code.
*Annotations transferred from database records, where the annotation is reviewed by a curator should not receive the IEA code. If the source is not traceable and the annotation is worth making, NAS should be used.

Usage of the With/From Column for IEA

At the January 2007 GOC meeting, it was agreed that all annotations made with this evidence code after May 1, 2007 must include an entry in the with/from column indicating which individual sequences, sequence objects, methods, keyword mapping files, etc. are the basis of the annotation. When multiple entries are placed in the with/from field, they are separated by pipes.

{| class="wikitable"
! ... !! 2. DB Object ID !! 3. DB Object Symbol !! 4. Qualifier !! 5. GO ID !! 6. DB:Reference !! 7. Evidence Code !! 8. With/From !! ...
|-
| ... || UniProt:A0A7W6 || A0A7W6_9PARI || || GO:0006118 || <nowiki>GOA:interpro|GO_REF:0000002</nowiki> || IEA || InterPro:IPR005797 || ...
|-
| ... || UniProt:A0A7W4 || A0A7W6_9PARI || || GO:0006118 || <nowiki>GOA:spkw|GO_REF:0000004</nowiki> || IEA || SP_KW:KW-0496 || ...
|-
| ... || UniProt:A0K8M1 || A0K8M1_BURCH || || GO:0004830 || <nowiki>GOA:spec|GO_REF:0000003</nowiki> || IEA || EC:6.1.1.2 || ...
|-
| ... || UniProt:A0KAB8 || Y2695_BURCH || || GO:0008237 || <nowiki>GOA:hamap|GO_REF:0000020</nowiki> || IEA || HAMAP:MF_00009 || ...
|-
| ... || UniProt:O77797 || AKAP3_BOVIN || || GO:0009434 || <nowiki>GOA:compara|GO_REF:0000019</nowiki> || IEA || Ensembl:ENSMUSP00000093091 || ...
|}
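
The with/from rule for IEA can be sketched as a small check: split the field on pipes and require at least one entry for annotations made after May 1, 2007. The function names are hypothetical, and the sketch assumes the annotation date is already available as a date object.

```python
from datetime import date
from typing import List

# From the guideline: entries are required for annotations made after this date.
WITH_REQUIRED_AFTER = date(2007, 5, 1)


def split_with_from(field: str) -> List[str]:
    """Split a with/from field into its pipe-separated entries."""
    return [entry for entry in field.split("|") if entry]


def iea_with_from_ok(with_from: str, annotation_date: date) -> bool:
    """Check that an IEA annotation carries a with/from entry when one is required."""
    if annotation_date <= WITH_REQUIRED_AFTER:
        return True  # the requirement does not apply to older annotations
    return len(split_with_from(with_from)) > 0


print(split_with_from("InterPro:IPR005797"))
print(iea_with_from_ok("", date(2008, 1, 1)))
```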

[http://wiki.geneontology.org/index.php/Guide_to_GO_Evidence_Codes Back to: Guide to GO Evidence Codes]

[[Category: Annotation]]
[[Category: Evidence Codes]] Vanaukenk
Categories: GO Internal

No biological Data available (ND) evidence code

GO wiki (new pages) - Wed, 01/10/2018 - 12:39

Vanaukenk:

ND: No Biological Data Available
Updated November 9, 2007

Used for annotations when information about the molecular function, biological process, or cellular component of the gene or gene product being annotated is not available.

Use of the ND evidence code indicates that the annotator at the contributing database found no information that allowed making an annotation to any term indicating specific knowledge from the ontology in question (molecular function, biological process, or cellular component) as of the date indicated. This code should be used only for annotations to the root terms, molecular function ; GO:0003674, biological process ; GO:0008150, or cellular component ; GO:0005575, which, when used in annotations, indicate that no knowledge is available about a gene product in that aspect of GO.

Annotations made with the ND evidence code should be accompanied by a reference that explains that curators looked but found no information. Note that some groups check only published literature while other groups also make sequence comparisons to see if an annotation can be made on the basis of a sequence comparison. The GO Reference collection includes a reference that can be used with ND when both literature and sequence have been checked; to use it, put "GO_REF:0000015" in the reference column of a gene association file.
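
The two ND rules described so far (root terms only, and an accompanying reference) lend themselves to a simple check. This is an illustrative sketch with hypothetical function and variable names, not an official GO quality-control rule.

```python
from typing import List

# Root terms named in the text above.
ROOT_TERMS = {
    "GO:0003674": "molecular function",
    "GO:0008150": "biological process",
    "GO:0005575": "cellular component",
}


def nd_annotation_problems(go_id: str, evidence_code: str, reference: str) -> List[str]:
    """Return a list of problems with an ND annotation (empty if none found)."""
    problems = []
    if evidence_code == "ND":
        if go_id not in ROOT_TERMS:
            problems.append("ND should be used only for annotations to a root term")
        if not reference:
            problems.append("ND annotations should be accompanied by a reference")
    return problems


# GO_REF:0000015 is the reference named above for ND annotations.
print(nd_annotation_problems("GO:0005575", "ND", "GO_REF:0000015"))
print(nd_annotation_problems("GO:0005634", "ND", "GO_REF:0000015"))
```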

Note that use of the ND evidence code with an annotation to one of the root nodes to indicate lack of knowledge in that aspect makes a statement about the lack of knowledge only with respect to that particular aspect of the ontology. Use of the ND evidence code to indicate lack of knowledge in one particular aspect does not make any statement about the availability of knowledge or evidence in the other GO aspects.

Even if an author states in a paper that there is no data available or nothing is known about the gene product in a particular GO aspect, annotation to the corresponding root node should be made with ND evidence code citing either the annotating group's internal reference or the GOC's reference on use of the ND evidence code, not a specific paper.

Note: The ND evidence code, unlike other evidence codes, should be considered a code that indicates curation status/progress rather than the method used to derive an annotation.

When a gene product is annotated to a GO term using the NOT qualifier, this is a statement that it is not appropriate to associate that specific GO term with that particular gene product. However, such a negative annotation does not make any positive statements about the role of that gene product. Thus, there should always be a positive annotation, in addition to the NOT annotation. If nothing is known about the role of the gene product in a given aspect (molecular function, biological process, or cellular component) of GO, then the positive annotation should be made to the root node for that aspect using the ND evidence code.

[http://wiki.geneontology.org/index.php/Guide_to_GO_Evidence_Codes Back to: Guide to GO Evidence Codes]

[[Category: Annotation]]
[[Category: Evidence Codes]] Vanaukenk
Categories: GO Internal

Inferred by Curator (IC)

GO wiki (new pages) - Wed, 01/10/2018 - 12:39

Vanaukenk:

IC: Inferred by Curator
Updated September 22, 2011

The IC evidence code is to be used for those cases where an annotation is not supported by any direct evidence, but can be reasonably inferred by a curator from other GO annotations, for which evidence is available.

An example would be when there is evidence (be it direct assay, sequence similarity or even from electronic annotation) that a particular gene product has the function RNA polymerase II transcription factor activity ; GO:0003702. There is no direct evidence showing that this gene product is located in the nucleus, but this would be a perfectly reasonable inference for a curator to make since the curator is annotating a eukaryotic gene product that is associated with a specific nuclear RNA polymerase. This inference will be linked to the annotation to the term RNA polymerase II transcription factor activity ; GO:0003702 in two ways: both annotations will share the same reference; and the annotation inferred by a curator will include one or more with/from statements pointing to the GO term(s) used by the curator for the inference.

In many cases a GO term can be inferred from just one other annotation, as described above. Occasionally, a curator has to infer the GO term from multiple GO annotations. The 'with/from' field in these annotations will therefore supply more than one GO identifier, obtained from the set of supporting GO annotations assigned to the same gene/gene product identifier which cite publicly-available references. In addition, such IC annotations will use reference GO_REF:0000036.

Usage of the With/From Column for IC

Note that the with/from field must always be filled in with a GO ID when using this evidence code.

For example, Noel et al., 1998 (PMID:9651335) provides evidence that the protein encoded by the S. cerevisiae UGA3 gene has the function "specific RNA polymerase II transcription factor activity" ; GO:0003704. From this, the curator deduces it is located in the nucleus and thus makes an annotation to the cellular component term "nucleus" ; GO:0005634 with the GO ID for the function term in the with/from for the component annotation.

The second example shown below illustrates the use of IC with GO_REF:0000036. In this case, a curator has inferred an annotation for the CUP9 gene to the GO Term "RNA polymerase II transcription factor activity, sequence-specific transcription regulatory region DNA binding"; GO:0001133 based on evidence from PMID:9427760 that CUP9 is involved in "RNA polymerase II core promoter proximal region sequence-specific DNA binding" (GO:0000978), as well as evidence from PMID:18708352 that CUP9 is involved in "negative regulation of transcription from RNA polymerase II promoter" (GO:0000122). The with/from column supplies the GO IDs derived from these two publications separated by commas (meaning AND) because both of these GO terms are required to support the inferred annotation to GO:0001133. If either of the GO terms could support the inference, they should be separated with a pipe (meaning OR).

{| class="wikitable"
! ... !! 2. DB Object ID !! 3. DB Object Symbol !! 4. Qualifier !! 5. GO ID !! 6. DB:Reference !! 7. Evidence Code !! 8. With/From !! ...
|-
| ... || SGDID:S000002329 || UGA3 || || GO:0003704 || PMID:9651335 || IPI || || ...
|-
| ... || SGDID:S000002329 || UGA3 || || GO:0005634 || PMID:9651335 || IC || GO:0003704 || ...
|}

{| class="wikitable"
! ... !! 2. DB Object ID !! 3. DB Object Symbol !! 4. Qualifier !! 5. GO ID !! 6. DB:Reference !! 7. Evidence Code !! 8. With/From !! ...
|-
| ... || SGDID:S000006098 || CUP9 || || GO:0000122 || PMID:18708352 || IMP || || ...
|-
| ... || SGDID:S000006098 || CUP9 || || GO:0000978 || PMID:9427760 || IDA || || ...
|-
| ... || SGDID:S000006098 || CUP9 || || GO:0001133 || GO_REF:0000036 || IC || GO:0000122,GO:0000978 || ...
|}
Where:

*GO:0003704 specific RNA polymerase II transcription factor activity
*GO:0005634 nucleus
*GO:0000122 negative regulation of transcription from RNA polymerase II promoter
*GO:0000978 RNA polymerase II core promoter proximal region sequence-specific DNA binding
*GO:0001133 RNA polymerase II transcription factor activity, sequence-specific transcription regulatory region DNA binding
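
The comma (AND) and pipe (OR) convention for the IC with/from column can be sketched as follows. The function names are hypothetical. The sketch treats the field as pipe-separated alternatives, each a comma-separated set of GO IDs that must all be present among the gene product's existing annotations.

```python
from typing import List, Set


def parse_with_from(field: str) -> List[List[str]]:
    """Parse a with/from field into OR-alternatives, each a list of AND-ed GO IDs."""
    return [group.split(",") for group in field.split("|") if group]


def inference_supported(field: str, annotated_terms: Set[str]) -> bool:
    """True if at least one alternative is fully covered by existing annotations."""
    return any(
        all(term in annotated_terms for term in group)
        for group in parse_with_from(field)
    )


# The CUP9 example: both supporting terms are required (comma means AND).
cup9_terms = {"GO:0000122", "GO:0000978"}
print(parse_with_from("GO:0000122,GO:0000978"))
print(inference_supported("GO:0000122,GO:0000978", cup9_terms))
```

With only one of the two supporting annotations present, the comma-separated form is not satisfied, whereas the pipe-separated form would be.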

[http://wiki.geneontology.org/index.php/Guide_to_GO_Evidence_Codes Back to: Guide to GO Evidence Codes]

[[Category: Annotation]]
[[Category: Evidence Codes]] Vanaukenk
Categories: GO Internal

Non-traceable Author Statement (NAS)

GO wiki (new pages) - Wed, 01/10/2018 - 12:37

Vanaukenk:

NAS: Non-traceable Author Statement
Updated November 9, 2007

* Database entries that don't cite a paper (e.g. UniProt Knowledgebase records, YPD protein reports)
* Statements in papers (abstract, introduction, or discussion) that a curator cannot trace to another publication

The NAS evidence code should be used in all cases where the author makes a statement that a curator wants to capture but for which there are neither results presented nor a specific reference cited in the source used to make the annotation. The source of the information may be peer reviewed papers, textbooks, or database records. For some annotations using the NAS code, there will not be an entry in the with/from field.

The NAS code is also used for making annotations from database entries when a curator reviews the annotations that result. Typically such annotations will refer to an unpublished reference describing what was done, either a reference with a GO_REF id or an internal reference from the specific annotating database.

Cases where the NAS code should be used:

In Ladd et al., 2001 (PMID:11158314), the authors state that:
"All of the CELF proteins contain multiple potential protein kinase C and casein kinase II phosphorylation sites. All are predicted to have predominantly nuclear localization, and CELF3, CELF4, and CELF5 each possess a consensus nuclear localization signal sequence near the C terminus."
Because this paper provides no reference to support the authors' assertion that CELF3 is localized to the nucleus (nor any sequence analysis related to this statement), and no better published data were available at the time of curation, CELF3 has been annotated to the GO term nucleus with the NAS evidence code.
{| class="wikitable"
! ... !! 2. DB Object ID !! 3. DB Object Symbol !! 4. Qualifier !! 5. GO ID !! 6. DB:Reference !! 7. Evidence Code !! 8. With/From !! ...
|-
| ... || UniProt:Q5SZQ8 || CELF3_HUMAN || || GO:0005634 || PMID:11158314 || NAS || || ...
|}
Cases where the NAS code should not be used:

* When an author makes a statement that is attributed to a source cited in the reference list, use the TAS evidence code.
* When an annotator makes an annotation based on a combination of another GO annotation and common knowledge, use the IC evidence code. For example, if a curator annotates a gene product to the cellular component term nucleus because it is already annotated to the molecular function term general RNA polymerase II transcription factor activity, and it is common knowledge that transcription factors interacting with RNA polymerase II act in the nucleus, then the IC evidence code should be used. The GO ID of the term from which the annotation was derived goes in the with/from field, and the reference cited should be the same one used for the annotation to that term.
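
The choice between NAS, TAS, and IC described in these cases can be summarized as a small decision function. This is a rough sketch of the guidance above, not an official rule engine; the function name `suggest_statement_code` and its parameters are hypothetical.

```python
def suggest_statement_code(results_shown, source_cited, inferred_from_go_annotation=False):
    """Suggest NAS, TAS, or IC for a statement, following the guidance above.

    Returns None when results are presented in the source, because an
    experimental or other evidence code applies and is outside this sketch.
    """
    if inferred_from_go_annotation:
        # Curator inference from an existing GO annotation plus common knowledge.
        return "IC"
    if results_shown:
        return None
    # Author statement without results: traceable only if a source is cited.
    return "TAS" if source_cited else "NAS"


# The CELF3 statement: no results shown and no reference cited.
print(suggest_statement_code(results_shown=False, source_cited=False))
```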

[http://wiki.geneontology.org/index.php/Guide_to_GO_Evidence_Codes Back to: Guide to GO Evidence Codes]

[[Category: Annotation]]
[[Category: Evidence Codes]] Vanaukenk
Categories: GO Internal

Traceable Author Statement (TAS)

GO wiki (new pages) - Wed, 01/10/2018 - 12:36

Vanaukenk:

TAS: Traceable Author Statement
Updated November 9, 2007

* Any statement in an article where the original evidence (experimental results, sequence comparison, etc.) is not directly shown, but is referenced in the article and therefore can be traced to another source.

The TAS evidence code covers author statements that are attributed to a cited source. Typically this type of information comes from review articles. Material from the introductions and discussion sections of non-review papers may also be suitable if another reference is cited as the source of experimental work or analysis.

When annotating with this code, the curator should be aware that authors often cite papers describing experiments performed in organisms other than the one discussed in the paper at hand. Following up the references may therefore reveal that no experiments were performed on the gene in the organism being characterized in the primary paper. For this reason we recommend (when time and resources allow) that curators track down the cited paper and annotate directly from it using the appropriate experimental evidence code. When this is not possible and it is necessary to annotate from reviews, TAS is the appropriate code for statements that are associated with a cited reference.

Once an annotation has been made to a given term using an experimental evidence code, we recommend removing any annotations made to the same term using the TAS evidence code.
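
The recommendation above can be checked mechanically. The sketch below finds TAS annotations that are superseded by an experimental annotation of the same gene product to the same term. It assumes annotations are simple `(gene, go_id, evidence_code)` tuples and treats EXP, IDA, IPI, IMP, IGI, and IEP as the experimental codes; the function name and the gene identifiers in the example are hypothetical.

```python
# Evidence codes treated as experimental in this sketch (an assumption).
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def redundant_tas(annotations):
    """Return TAS annotations whose (gene, term) pair also has experimental evidence."""
    experimental_pairs = {
        (gene, go_id)
        for gene, go_id, code in annotations
        if code in EXPERIMENTAL_CODES
    }
    return [
        (gene, go_id, code)
        for gene, go_id, code in annotations
        if code == "TAS" and (gene, go_id) in experimental_pairs
    ]


# Hypothetical gene identifiers, for illustration only.
example = [
    ("geneA", "GO:0005634", "TAS"),
    ("geneA", "GO:0005634", "IDA"),
    ("geneB", "GO:0005634", "TAS"),
]
print(redundant_tas(example))
```

Only the `geneA` TAS annotation is reported, because `geneB` has no experimental annotation to the same term.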

Note that prior to July 2006, the TAS evidence code could be used for annotations based on information found in a textbook or dictionary, since textbook material has often become common knowledge (e.g. "everybody" knows that enolase is a glycolytic enzyme). However, at the 2006 GO Annotation Camp, it was concluded that this sort of information is not traceable to its source and is thus not suitable for the TAS evidence code. When annotating on the basis of common knowledge possessed by the curator, consider the IC code. When annotating an author statement that is not associated with a cited reference, use the NAS code.

Examples where the TAS evidence code should be used:

* Annotating the twelve S. cerevisiae genes (RPO21, RPB2, RPB3, RPB4, RPB5, RPO26, RPB7, RPB8, RPB9, RPB10, RPC10, and RPB11) that are part of the core complex of RNA polymerase II to the GO term DNA-directed RNA polymerase II, core complex ; GO:0005665, based on a table in Meyer and Young, 1998 (PMID:9774381) that lists each of these genes as encoding a subunit of the enzyme and gives one or more references for each subunit.
* Annotating the human myo9b gene to the GO term Rho GTPase activator activity ; GO:0005100, based on this statement in the introduction of a research article, Post et al., 2002 (PMID:11801597):
"Biochemical characterization of both bacterially expressed Myr5 and Myr7 tail domains and tissue-purified human Myo9b demonstrate that these myosins IX are active GAPs for Rho but not Rac or CDC 42 (3,4,7)."

Examples where the TAS evidence code should not be used:

In Ladd et al., 2001 (PMID:11158314), the authors state:
"All of the CELF proteins contain multiple potential protein kinase C and casein kinase II phosphorylation sites. All are predicted to have predominantly nuclear localization, and CELF3, CELF4, and CELF5 each possess a consensus nuclear localization signal sequence near the C terminus."
Because this paper provides no reference to support the authors' assertion that CELF3 is localized to the nucleus (nor any sequence analysis related to this statement), and no better published data were available at the time of curation, CELF3 has been annotated to the GO term nucleus with the NAS evidence code and not the TAS evidence code.
{| class="wikitable"
! ... !! 2. DB Object ID !! 3. DB Object Symbol !! 4. Qualifier !! 5. GO ID !! 6. DB:Reference !! 7. Evidence Code !! 8. With/From !! ...
|-
| ... || UniProt:Q5SZQ8 || CELF3_HUMAN || || GO:0005634 || PMID:11158314 || NAS || || ...
|}
* When an annotator makes an annotation based on a combination of another GO annotation and common knowledge, use the IC evidence code. For example, if a curator annotates a gene product to the cellular component term nucleus because it is already annotated to the molecular function term general RNA polymerase II transcription factor activity, and it is common knowledge that transcription factors interacting with RNA polymerase II act in the nucleus, then the IC evidence code should be used. The GO ID of the term from which the annotation was derived goes in the with/from field, and the reference cited should be the same one used for the annotation to that term.

[http://wiki.geneontology.org/index.php/Guide_to_GO_Evidence_Codes Back to: Guide to GO Evidence Codes]

[[Category: Annotation]]
[[Category: Evidence Codes]] Vanaukenk
Categories: GO Internal

Inferred from Reviewed Computational Analysis (RCA)

GO wiki (new pages) - Wed, 01/10/2018 - 12:36

Vanaukenk:

RCA: inferred from Reviewed Computational Analysis
Updated November 9, 2007

Note: Annotations using the RCA code should be reviewed after one year; any older than this will be deleted.

* Predictions based on computational analyses of large-scale experimental data sets
* Predictions based on computational analyses that integrate datasets of several types, including experimental data (e.g. expression data, protein-protein interaction data, genetic interaction data, etc.), sequence data (e.g. promoter sequence, sequence-based structural predictions, etc.), or mathematical models

The RCA code should be used for annotations made from predictions based on computational analyses of large-scale experimental data sets, or on computational analyses that integrate multiple types of data. Acceptable experimental data types include protein-protein interaction data (e.g. two-hybrid results, mass spectroscopic identification of proteins from affinity tag purifications, etc.), synthetic genetic interactions, and microarray expression results. Sequence-based data derived from the sequence of the gene product, including structural predictions based on sequence, may be included provided that the analysis also includes non-sequence-based data. Sequence information related to promoter sequence features may also be included as a data type within these analyses. Predictions based on mathematical modelling that attempts to duplicate existing experimental results are also appropriate for this evidence code.

Analyses based purely on comparisons of the gene product sequence should use the ISS evidence code (or the IEA code if not reviewed by a curator). This includes:

* sequence similarity with experimentally characterized gene products, as determined by pairwise or multiple alignment
* prediction methods for non-coding RNA genes
* recognized functional domains, as determined by tools such as InterPro, Pfam, SMART, etc., including the use of files such as interpro2go, pfam2go, smart2go to convert the domain hits to GO terms
* predicted protein features, e.g. transmembrane regions, signal sequence, etc.
* structural similarity with experimentally characterized gene products, as determined by crystallography, nuclear magnetic resonance, or computational prediction
* analyses combining multiple types of data based on the gene product sequence

Similarly for experimental data: if the annotation was made purely on the basis of an experimental result, e.g. a protein-protein interaction with a characterized protein, a genetic interaction with a characterized gene, or a microarray expression pattern similar to that of a characterized gene, then the appropriate experimental evidence code (IPI, IGI, or IEP, respectively) should be used instead.

Examples where the RCA evidence code should be used:

* Samanta and Liang, 2003 (PMID:14566057) analyzed all interactions for S. cerevisiae present in the Database of Interacting Proteins (DIP) and made predictions about the roles of genes that were uncharacterized at the time. GO Annotations resulting from this publication include the process term 'rRNA processing' for both UTP30 and NOP6, neither of which was experimentally characterized at the time. A role for NOP6 in the biogenesis of the small ribosomal subunit has subsequently been indicated via a genetic interaction with the experimentally characterized gene EMG1.
* Troyanskaya et al., 2003 (PMID:12826619) ...

Examples where the RCA evidence code should not be used:

* Annotations based on more than one type of gene product sequence-based evidence, including such things as BLAST, profile HMMs, TMHMM, SignalP, PROSITE, InterPro, and mapping files such as interpro2go, should use the ISS code.
* Annotations based on integrated computational analyses, if they have not been reviewed by a curator, should receive the IEA code.
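
The boundaries between RCA, ISS, IEA, and the experimental codes described on this page can be sketched as a decision function. This is a simplified reading of the guidance, not an official procedure: the function name, the data type labels, and the `single_result` flag (marking an annotation made purely from one experimental result rather than from a large-scale analysis) are all hypothetical.

```python
# Codes for annotations made purely from one experimental result (labels are assumptions).
SINGLE_RESULT_CODES = {
    "protein_interaction": "IPI",
    "genetic_interaction": "IGI",
    "expression_pattern": "IEP",
}


def suggest_computational_code(data_types, curator_reviewed, single_result=False):
    """Suggest an evidence code from the data types behind a prediction."""
    kinds = set(data_types)
    if not kinds:
        raise ValueError("at least one data type is required")
    if single_result and len(kinds) == 1:
        # A single experimental result gets the matching experimental code.
        code = SINGLE_RESULT_CODES.get(next(iter(kinds)))
        if code is not None:
            return code
    if not curator_reviewed:
        return "IEA"  # computational analysis not reviewed by a curator
    if kinds == {"sequence"}:
        return "ISS"  # purely sequence-based analysis
    return "RCA"      # reviewed large-scale or integrated analysis


# A reviewed large-scale analysis of protein interaction data.
print(suggest_computational_code(["protein_interaction"], curator_reviewed=True))
```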

[http://wiki.geneontology.org/index.php/Guide_to_GO_Evidence_Codes Back to: Guide to GO Evidence Codes]

[[Category: Annotation]]
[[Category: Evidence Codes]] Vanaukenk
Categories: GO Internal