Glycomics paper curation
Revision as of 20:52, 17 August 2023
This article describes the curation of glycomics papers as part of the GlyGen project.
Curation workflow
- Identify a paper of interest and look for glycan structures reported in both the primary and supplemental data sections.
- Draw the glycans in GRITS Toolbox.
  - Draw them exactly as they are presented in the paper, with no assumptions added by the curator. This includes:
    - Reducing end type (reduced, free, etc.)
    - Derivatization status (permethylated C12, native, etc.)
    - Topology (which arm the monosaccharides are placed on)
    - Linkages
- Export the glycan drawings to gws format and put them in a folder in SharePoint.
- Request that the gws file be processed (by Sena) into an Excel sheet with GlyTouCan IDs added. Structures without GlyTouCan IDs are submitted to GlyTouCan and registered. The finished Excel file is placed in SharePoint and a notification is sent.
- Once this is complete, retrieve the Excel file from SharePoint.
- Check that each GlyTouCan structure matches the structure you submitted.
- Add the metadata using the specified format (see the Curation table section) and fill in those columns.
- Make a copy of the tab and delete the columns not specified in the curation table (e.g. file name, row number, both cartoons, status, error).
- Save a tab as Final-GlyGen to designate which tab to use in downstream steps.
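The column-pruning step near the end of the workflow could be scripted once the sheet has been loaded as a list of row dictionaries. The following is only a sketch: the function name `keep_curation_columns` is hypothetical, and the exact header strings are an assumption copied from the curation table, so they may differ from the headers in the real spreadsheet.

```python
# Header strings are an ASSUMPTION: they are copied from the curation table
# on this page and may not match the real spreadsheet headers exactly.
CURATION_COLUMNS = [
    "GlyTouCan ID", "Paper", "Species", "Strain (fly, yeast, mouse)",
    "Tissue", "Cell line ID", "Disease", "has_abundance", "has_expression",
    "Functional annotation/Keyword", "Glycan dictionary term ID",
    "Contributor", "Experimental technique", "Variant (Fly, yeast, mouse)",
    "Organismal/cellular Phenotype", "Molecular Phenotype",
]


def keep_curation_columns(rows, allowed=CURATION_COLUMNS):
    """Return new rows holding only the allowed columns, in table order.

    Columns missing from a row are filled with an empty string; any other
    column (file name, row number, cartoons, status, error) is dropped.
    """
    return [{col: row.get(col, "") for col in allowed} for row in rows]


# Example: processing columns such as "file name" and "status" are removed.
processed = [{"GlyTouCan ID": "G17689DH", "file name": "x.gws", "status": "ok"}]
final_rows = keep_curation_columns(processed)
```

The original rows are left untouched, which mirrors the instruction to work on a copy of the tab rather than on the processed sheet itself.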
Curation table
Table structure for glycomics information
The file will be a CSV file using “,” as cell delimiter and all cells will be quoted.
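As an illustration of this file format, the sketch below writes comma-delimited rows with every cell quoted, using Python's standard `csv` module. The function name `write_curation_csv` is hypothetical, and including a header row is an assumption; the text above does not say whether one is expected.

```python
import csv
import io


def write_curation_csv(rows, fieldnames, handle):
    """Write rows as CSV with ',' as delimiter and every cell quoted."""
    writer = csv.DictWriter(
        handle,
        fieldnames=fieldnames,
        delimiter=",",
        quoting=csv.QUOTE_ALL,  # quote all cells, not only those that need it
        lineterminator="\n",
    )
    writer.writeheader()  # ASSUMPTION: the file starts with a header row
    writer.writerows(rows)


buffer = io.StringIO()
write_curation_csv(
    [{"GlyTouCan ID": "G17689DH", "Species": "9606"}],
    ["GlyTouCan ID", "Species"],
    buffer,
)
print(buffer.getvalue())
```

Quoting every cell keeps values that themselves contain commas, such as a contributor entry with an email and institution, inside a single cell.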
List of columns in the curation file that need to be filled in for glycomics information. Bold indicates mandatory information:
Column | Example | Source / notes |
GlyTouCan ID | G17689DH | From Sena's spreadsheet |
Paper | PMID:25753706 or DOI:10.1007/978-1-4939-2343-4_8 | PMID or DOI |
Species | 9606 | From the NCBI Taxonomy browser |
Strain (fly, yeast, mouse) | Oregon-R | FlyBase for fly; SGD for yeast; Mouse; or text from the paper if not in a dictionary |
Tissue | UBERON:0002107 | From Uberon; if it cannot be found, we will discuss. |
Cell line ID | Cellosaurus:CVCL_A4VI | From Cellosaurus |
Disease | DOID:3571 | From the Human Disease Ontology |
has_abundance | yes / no | Are there numbers associated with the amount present in a sample? |
has_expression | yes / no | Do not use for now, pending discussion with Karina |
Functional annotation/Keyword | <term1>|<term2> | We (Mike) will provide a dictionary (~15 terms) that Mindy will use. |
Glycan dictionary term ID | GSD000011 | From the Glycan structure dictionary. |
Contributor | createdBy:Mindy Porterfield(mindy@something.de, CCRC)|createdBy:Name(email,institution) | From ticket 42 |
Experimental technique | LC-MS|MS profile | Free text; we will create a dictionary to avoid variants such as LC/MS and LC-MS |
Variant (fly, yeast, mouse) | Wild type, Tollo | FlyBase for fly; SGD for yeast |
Organismal/cellular Phenotype | Eye color, blood type | Mondo; HPO; Fly anatomy FBBD; or text in the paper. Observable features, including disease phenotypes such as diabetes, autism spectrum disorder and cancer, or traits such as height, hair color and blood type. |
Molecular Phenotype | APOE | Direct effect of a variant at the molecular level |
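Before a file is handed on, the identifier columns could be checked for obvious format errors. The patterns in this sketch are assumptions inferred only from the example values in the table above (for instance, `G17689DH` suggests "G", five digits, two capital letters); they are not taken from the official specifications of the respective databases. The function name `invalid_cells` is hypothetical.

```python
import re

# ASSUMPTION: every pattern is inferred from the single example value shown
# in the curation table, not from an official identifier specification.
ID_PATTERNS = {
    "GlyTouCan ID": re.compile(r"G\d{5}[A-Z]{2}"),
    "Paper": re.compile(r"PMID:\d+|DOI:10\.\S+"),
    "Species": re.compile(r"\d+"),
    "Tissue": re.compile(r"UBERON:\d{7}"),
    "Cell line ID": re.compile(r"Cellosaurus:CVCL_[A-Z0-9]{4}"),
    "Disease": re.compile(r"DOID:\d+"),
    "Glycan dictionary term ID": re.compile(r"GSD\d{6}"),
}


def invalid_cells(row):
    """Return the names of columns whose non-empty value looks malformed.

    Cells are split on '|' so that multi-valued cells are checked entry
    by entry; empty cells are skipped.
    """
    bad = []
    for column, pattern in ID_PATTERNS.items():
        value = row.get(column, "")
        if not value:
            continue
        entries = [entry.strip() for entry in value.split("|")]
        if not all(pattern.fullmatch(entry) for entry in entries):
            bad.append(column)
    return bad


row = {"GlyTouCan ID": "17689DH", "Disease": "DOID:3571|3571"}
print(invalid_cells(row))
```

A check like this only catches malformed identifiers; it cannot confirm that an identifier exists in the source database or refers to the intended entity.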
The following columns can have multiple entries per line/cell:
- Disease
- Functional annotation
- Keywords
- Glycan dictionary term ID
- Contributor
- Experimental technique
The following format will be used for these cells: <term1>|<term2>
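A small sketch of how such multi-entry cells might be read and written in Python follows. The helper names `split_cell` and `join_terms` are hypothetical, and trimming whitespace around each entry is an assumption rather than a stated rule.

```python
SEPARATOR = "|"


def split_cell(cell):
    """Split a multi-entry cell into a list of trimmed, non-empty terms."""
    return [term.strip() for term in cell.split(SEPARATOR) if term.strip()]


def join_terms(terms):
    """Join terms into one cell; reject terms that contain the separator."""
    cleaned = [term.strip() for term in terms if term.strip()]
    for term in cleaned:
        if SEPARATOR in term:
            raise ValueError(f"term contains '{SEPARATOR}': {term!r}")
    return SEPARATOR.join(cleaned)


print(split_cell("LC-MS|MS profile "))
print(join_terms(["DOID:3571", "DOID:1612"]))
```

Rejecting terms that already contain "|" avoids silently turning one entry into two when the cell is read back.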
For experimental techniques the following (non-comprehensive) dictionary will be used:
- MS
- MS/MS
- LC-MS/MS
- LC-MS
- CE-MS
- CE-MS/MS
- CE
- HPLC
- GC
- GC-MS
This list will be extended if new experimental techniques are detected in the papers.
For Functional annotation use the following (non-comprehensive) dictionary:
- adhesion
- homing
- inflammation
- protein targeting
- protein secretion
- protein stability
- protein folding
- ER stress
- protein degradation
- circulating half-life
- clearance
- internalization
- metastasis
- shielding
- recognition
- toxin receptor
- viral receptor
- microbial receptor
- receptor signaling
- sperm maturation
- Added terms (Mike):
  - differentiation
  - biomarker
This list will be extended if new functional annotation terms are encountered in the papers.
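To find candidate terms for extending the list, a curated cell could be compared against the current dictionary. In this sketch the function name `unknown_terms` is hypothetical, case-insensitive matching is an assumption, and "angiogenesis" is used purely as a made-up example of a term that is not in the list.

```python
# Terms copied from the list above; "Added terms (Mike)" is a label, not a term.
FUNCTIONAL_TERMS = {
    term.lower()
    for term in [
        "adhesion", "homing", "inflammation", "protein targeting",
        "protein secretion", "protein stability", "protein folding",
        "ER stress", "protein degradation", "circulating half-life",
        "clearance", "internalization", "metastasis", "shielding",
        "recognition", "toxin receptor", "viral receptor",
        "microbial receptor", "receptor signaling", "sperm maturation",
        "differentiation", "biomarker",
    ]
}


def unknown_terms(cell):
    """Return the entries of a '|'-separated cell that are not in the dictionary."""
    terms = [term.strip() for term in cell.split("|") if term.strip()]
    return [term for term in terms if term.lower() not in FUNCTIONAL_TERMS]


# "angiogenesis" is a hypothetical term used only to show the unknown case.
print(unknown_terms("adhesion|ER stress|angiogenesis"))
```

Terms reported here are not necessarily errors; they are the ones to discuss before adding them to the dictionary.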