Protein details/Glycosylation: Difference between revisions

From GlyGen Wiki
(First draft of the subpage.)
 
 
(15 intermediate revisions by 4 users not shown)
Line 1: Line 1:
The glycosylation section of the [[Protein information]] page in GlyGen provides the detailed list of glycosylation sites, reported glycans attached to the protein and predicted sites.
The Glycosylation section of the [[Protein details]] page in GlyGen provides a detailed list of glycosylation sites, reported glycans attached to the protein, and sites text mined from publications.


== GlyGen portal ==
==Glycosylation==
[[File:Protein Glycosylation Screenshot.jpg|thumb|upright|Screenshot of the Glycosylation section on the [[Protein information]] page in GlyGen.]]
[[File:Protein_Glycosylation_Screenshot.jpg|thumb|upright|Screenshot of the Glycosylation section on the [[Protein details]] page in GlyGen.]]
Glycosylation is presented in 4 tabs on the [[Protein information]] page in GlyGen.
Glycosylation is presented in 4 tabs on the [[Protein details]] page in GlyGen. The Glycosylation summary gives an overview of the section: the total number of sites (O-, N-, S-, and C-linked), and the number of N-linked and O-linked sites together with the number of glycan annotations (structures) observed on the protein, e.g. 18 site(s) total, 169 N-linked annotation(s) at 17 site(s), 1 O-linked annotation(s) at 1 site(s).
 
===Reported Sites with Glycan===
This tab shows reported glycosylation sites with glycan information. The following columns are presented in the table:
 
*'''Source''' - GlyGen evidence linking to the databases and papers that provided the glycosylation information
*'''Type''' - Type of glycosylation, e.g. N-linked, O-linked
*'''GlyTouCan ID''' - Unique accession assigned to the registered glycan structure in the [https://glytoucan.org/ GlyTouCan] database, e.g. G01543ZX
*'''Glycan Image''' - Image of the glycan in SNFG format
*'''Residue''' - Amino acid residue of the given protein along with its position, e.g. Asn294
*'''Note''' - Additional information about the entry such as curation notes, O-glycosylation subtype, remarks, etc.
 
===Reported Sites===
This tab shows reported glycosylation sites without glycan information. The following columns are presented in the table:
 
*'''Source''' - GlyGen evidence linking to the databases and papers that provided the glycosylation information
*'''Type''' - Type of glycosylation, e.g. N-linked, O-linked
*'''Residue''' - Amino acid residue of the given protein along with its position, e.g. Asn294
*'''Note''' - Additional information about the entry such as curation notes, O-glycosylation subtype, remarks, etc.
 
===Predicted Only===
This tab shows predicted glycosylation sites. The following columns are presented in the table:


=== Reported Sites with Glycan ===
{{Expand section|small=no}}
{{Expand section|small=no}}
This tab shows a table with the following columns:
* '''Source''' - [[GlyGen evidence linking]] to the databases and papers that provided the glycosylation information
* '''Type''' - ...
* '''GlyTouCan ID''' - ...
* '''Glycan Image''' - ...
* '''Residue''' - ...
* '''Note''' - ...


=== Reported Sites ===
===Text Mining===
{{Expand section|small=no}}
{{Expand section|small=no}}
<br />
==Source of information==
Glycosylation data is collected and integrated from resources such as '''[https://www.uniprot.org/ UniProtKB]''', '''[https://glyconnect.expasy.org/ GlyConnect]''', '''[http://www.unicarbkb.org/ UniCarbKB]''', '''[https://www.rcsb.org/ RCSB PDB]''', and '''[https://www.oglcnac.mcw.edu/ The O-GlcNAc Database]'''.
*'''[https://www.uniprot.org/ UniProtKB]''' - only reported sites and predicted sites are downloaded from UniProtKB
*'''[https://glyconnect.expasy.org/ GlyConnect]''' - reported sites with glycan information on known and unknown residues are downloaded from GlyConnect
*'''[http://unicarbkb.org/ UniCarbKB]''' - reported sites with glycan information on known and unknown residues are downloaded from UniCarbKB
*'''[https://www.rcsb.org/ RCSB PDB]''' - only reported sites are downloaded from RCSB PDB
*'''[https://www.oglcnac.mcw.edu/ The O-GlcNAc Database]''' - only reported O-GlcNAcylation sites with glycan information on known and unknown residues are downloaded from The O-GlcNAc Database
==Data access==
The collected data is processed and stored in '''[https://data.glygen.org data.glygen.org]''' in the following datasets.
Homo sapiens (Human) Datasets
*Human Glycosylation Sites (UniProtKB; [https://data.glygen.org/GLY_000038 GLY_000038])
*Glycosylation Sites (UniCarbKB [Human proteins]; [https://data.glygen.org/GLY_000040 GLY_000040])
*Human Glycosylation Sites (RCSB PDB; [https://data.glygen.org/GLY_000042 GLY_000042])
*Human Glycosylation Sites (GlyConnect; [https://data.glygen.org/GLY_000329 GLY_000329])
*Human Glycosylation Sites ([GPTwiki]; [https://data.glygen.org/GLY_000480 GLY_000480])
*Human Glycosylation Sites ([Automatic Literature Mining] [Automatically verified]; [https://data.glygen.org/GLY_000481 GLY_000481])
*Human O-GlcNAc Glycosylation Sites (MCW; [https://data.glygen.org/GLY_000517 GLY_000517])
*Human Glycosylation Sites UniCarbKB Glycomics Study ([https://data.glygen.org/GLY_000611 GLY_000611])


== Source of information ==
Hepatitis C Virus Datasets
{{Expand section|small=no}}
 
The following resources provided information that was integrated in GlyGen
*HCV1a Glycosylation Sites (Literature + UniCarbKB; [https://data.glygen.org/GLY_000335 GLY_000335])
* [https://www.uniprot.org/ UniProtKB] - only reported sites information and predicted information is downloaded from UniProtKB
*HCV1a Glycosylation Sites (UniProtKB; [https://data.glygen.org/GLY_000382 GLY_000382])
* ...
*HCV1b Glycosylation Sites (UniProtKB; [https://data.glygen.org/GLY_000383 GLY_000383])
 
SARS Coronavirus Datasets
 
*SARS-CoV2 Glycosylation Sites (UniProtKB; [https://data.glygen.org/GLY_000473 GLY_000473])
*Glycosylation Sites (UniCarbKB [SARS CoV 2 proteins]; [https://data.glygen.org/GLY_000479 GLY_000479])
*SARS-CoV1 Glycosylation Sites (UniProtKB; [https://data.glygen.org/GLY_000495 GLY_000495])
*SARS-CoV1 Glycosylation Sites (Literature; [https://data.glygen.org/GLY_000612 GLY_000612])
 
Mus musculus (Mouse) Datasets
 
*Mouse Glycosylation Sites (UniProtKB; [https://data.glygen.org/GLY_000039 GLY_000039])
*Glycosylation sites (UniCarbKB [Mouse proteins]; [https://data.glygen.org/GLY_000041 GLY_000041])
*Mouse Glycosylation Sites (RCSB PDB; [https://data.glygen.org/GLY_000043 GLY_000043])
*Mouse Glycosylation Sites (GlyConnect; [https://data.glygen.org/GLY_000330 GLY_000330])
 
Rattus norvegicus (Rat) Datasets


== Data access ==
*Glycosylation sites (UniCarbKB [Rat proteins]; [https://data.glygen.org/GLY_000221 GLY_000221])
{{Expand section|small=no}}
*Rat Glycosylation Sites (UniProtKB; [https://data.glygen.org/GLY_000224 GLY_000224])
Data from '''[https://www.uniprot.org/ UniProtKB]''' was downloaded using the UniProt API....
*Rat Glycosylation Sites (RCSB PDB; [https://data.glygen.org/GLY_000226 GLY_000226])
*Rat Glycosylation Sites (GlyConnect; [https://data.glygen.org/GLY_000331 GLY_000331])


== Data harmonization ==
==Data harmonization==
{{Expand section|small=no}}
{{Expand section|small=no}}


== Data filtering ==
==Data filtering==
{{Expand section|small=no}}
{{Expand section|small=no}}

Latest revision as of 20:16, 19 May 2022

The Glycosylation section of the Protein details page in GlyGen provides a detailed list of glycosylation sites, reported glycans attached to the protein, and sites text mined from publications.

Glycosylation

Screenshot of the Glycosylation section on the Protein details page in GlyGen.

Glycosylation is presented in 4 tabs on the Protein details page in GlyGen. The Glycosylation summary gives an overview of the section: the total number of sites (O-, N-, S-, and C-linked), and the number of N-linked and O-linked sites together with the number of glycan annotations (structures) observed on the protein, e.g. 18 site(s) total, 169 N-linked annotation(s) at 17 site(s), 1 O-linked annotation(s) at 1 site(s).
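
As an illustration of how a summary line of this shape could be assembled, the sketch below counts sites and annotations from a list of records. The record layout (type, residue, GlyTouCan ID), the function name, and the counting rules are assumptions made for this example; the wiki does not describe how GlyGen actually computes the summary.

```python
from collections import defaultdict


def glycosylation_summary(records):
    """Build a summary string from (type, residue, glytoucan_id) tuples.

    Assumption: a "site" is a distinct residue label, and an "annotation"
    is any record that carries a GlyTouCan ID.
    """
    all_sites = set()
    sites_by_type = defaultdict(set)
    annotations_by_type = defaultdict(int)

    for glyco_type, residue, glytoucan_id in records:
        all_sites.add(residue)
        sites_by_type[glyco_type].add(residue)
        if glytoucan_id:  # only rows with a glycan structure count as annotations
            annotations_by_type[glyco_type] += 1

    parts = [f"{len(all_sites)} site(s) total"]
    for glyco_type in sorted(sites_by_type):
        parts.append(
            f"{annotations_by_type[glyco_type]} {glyco_type} annotation(s) "
            f"at {len(sites_by_type[glyco_type])} site(s)"
        )
    return ", ".join(parts)


# Hypothetical records; only Asn294 and G01543ZX appear on this page.
records = [
    ("N-linked", "Asn294", "G01543ZX"),
    ("N-linked", "Asn294", None),
    ("O-linked", "Ser10", None),
]
print(glycosylation_summary(records))
```

With other counting rules (for example, counting only sites that have at least one glycan attached) the numbers would differ, so the output should be read as a format illustration only.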

Reported Sites with Glycan

This tab shows reported glycosylation sites with glycan information. The following columns are presented in the table:

  • Source - GlyGen evidence linking to the databases and papers that provided the glycosylation information
  • Type - Type of glycosylation, e.g. N-linked, O-linked
  • GlyTouCan ID - Unique accession assigned to the registered glycan structure in the GlyTouCan database, e.g. G01543ZX
  • Glycan Image - Image of the glycan in SNFG format
  • Residue - Amino acid residue of the given protein along with its position, e.g. Asn294
  • Note - Additional information about the entry such as curation notes, O-glycosylation subtype, remarks, etc.
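
A value in the Residue column, such as Asn294, combines an amino acid and its position. When working with exported rows it can help to split the two. The following is a minimal sketch that assumes every label is a three-letter amino acid code followed by an integer position; the function name is hypothetical and other label formats are not handled.

```python
import re

# Assumed format: three-letter amino acid code followed by the position.
_RESIDUE_RE = re.compile(r"([A-Z][a-z]{2})(\d+)")


def parse_residue(label):
    """Split a residue label such as 'Asn294' into ('Asn', 294)."""
    match = _RESIDUE_RE.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"unrecognized residue label: {label!r}")
    return match.group(1), int(match.group(2))


amino_acid, position = parse_residue("Asn294")
print(amino_acid, position)
```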

Reported Sites

This tab shows reported glycosylation sites without glycan information. The following columns are presented in the table:

  • Source - GlyGen evidence linking to the databases and papers that provided the glycosylation information
  • Type - Type of glycosylation, e.g. N-linked, O-linked
  • Residue - Amino acid residue of the given protein along with its position, e.g. Asn294
  • Note - Additional information about the entry such as curation notes, O-glycosylation subtype, remarks, etc.

Predicted Only

This tab shows predicted glycosylation sites. The following columns are presented in the table:

Text Mining


Source of information

Glycosylation data is collected and integrated from resources such as UniProtKB, GlyConnect, UniCarbKB, RCSB PDB, and The O-GlcNAc Database.

  • UniProtKB - only reported sites and predicted sites are downloaded from UniProtKB
  • GlyConnect - reported sites with glycan information on known and unknown residues are downloaded from GlyConnect
  • UniCarbKB - reported sites with glycan information on known and unknown residues are downloaded from UniCarbKB
  • RCSB PDB - only reported sites are downloaded from RCSB PDB
  • The O-GlcNAc Database - only reported O-GlcNAcylation sites with glycan information on known and unknown residues are downloaded from The O-GlcNAc Database

Data access

The collected data is processed and stored in data.glygen.org in the following datasets.

Homo sapiens (Human) Datasets

  • Human Glycosylation Sites (UniProtKB; GLY_000038)
  • Glycosylation Sites (UniCarbKB [Human proteins]; GLY_000040)
  • Human Glycosylation Sites (RCSB PDB; GLY_000042)
  • Human Glycosylation Sites (GlyConnect; GLY_000329)
  • Human Glycosylation Sites ([GPTwiki]; GLY_000480)
  • Human Glycosylation Sites ([Automatic Literature Mining] [Automatically verified]; GLY_000481)
  • Human O-GlcNAc Glycosylation Sites (MCW; GLY_000517)
  • Human Glycosylation Sites UniCarbKB Glycomics Study (GLY_000611)

Hepatitis C Virus Datasets

  • HCV1a Glycosylation Sites (Literature + UniCarbKB; GLY_000335)
  • HCV1a Glycosylation Sites (UniProtKB; GLY_000382)
  • HCV1b Glycosylation Sites (UniProtKB; GLY_000383)

SARS Coronavirus Datasets

  • SARS-CoV2 Glycosylation Sites (UniProtKB; GLY_000473)
  • Glycosylation Sites (UniCarbKB [SARS CoV 2 proteins]; GLY_000479)
  • SARS-CoV1 Glycosylation Sites (UniProtKB; GLY_000495)
  • SARS-CoV1 Glycosylation Sites (Literature; GLY_000612)

Mus musculus (Mouse) Datasets

  • Mouse Glycosylation Sites (UniProtKB; GLY_000039)
  • Glycosylation sites (UniCarbKB [Mouse proteins]; GLY_000041)
  • Mouse Glycosylation Sites (RCSB PDB; GLY_000043)
  • Mouse Glycosylation Sites (GlyConnect; GLY_000330)

Rattus norvegicus (Rat) Datasets

  • Glycosylation sites (UniCarbKB [Rat proteins]; GLY_000221)
  • Rat Glycosylation Sites (UniProtKB; GLY_000224)
  • Rat Glycosylation Sites (RCSB PDB; GLY_000226)
  • Rat Glycosylation Sites (GlyConnect; GLY_000331)
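
Each accession in the lists above links to a dataset page of the form https://data.glygen.org/GLY_000038. The sketch below builds such links from accessions. It assumes that all accessions follow the pattern "GLY_" plus six digits, as the ones on this page do; the function name and the small lookup table are hypothetical, and the code does not download any data.

```python
import re

BASE_URL = "https://data.glygen.org"

# Assumed accession pattern, inferred from the accessions listed on this page.
_ACCESSION_RE = re.compile(r"GLY_\d{6}")


def dataset_url(accession):
    """Return the data.glygen.org page URL for a dataset accession."""
    if _ACCESSION_RE.fullmatch(accession) is None:
        raise ValueError(f"unexpected accession format: {accession!r}")
    return f"{BASE_URL}/{accession}"


# A few of the human datasets listed above, keyed by source.
human_datasets = {
    "UniProtKB": "GLY_000038",
    "RCSB PDB": "GLY_000042",
    "GlyConnect": "GLY_000329",
}

for source, accession in human_datasets.items():
    print(source, dataset_url(accession))
```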

Data harmonization

Data filtering