Protein details/Sequence
Latest revision as of 19:22, 9 December 2021
The Sequence section of the Protein details page in GlyGen displays the canonical protein sequence. When an annotation such as N-linked sites, sequons, or phosphorylation is selected, the corresponding residues are highlighted in the sequence.
Sequence
This section contains the following information:
- Sequence - The canonical protein sequence in FASTA format from the UniProtKB database
- N-Linked Sites -
- O-Linked Sites -
- Variation from mutation -
- Sequon -
- Phosphorylation -
- Glycation -
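As an illustration of what sequon highlighting involves, the sketch below locates N-glycosylation sequons in a protein sequence. It assumes the commonly used definition N-X-S/T, where X is any residue except proline; GlyGen's own definition may differ (for example, some definitions also accept cysteine in the third position). The function name `find_sequons` and the sample sequence are hypothetical and not part of GlyGen.

```python
import re

# Assumed sequon definition: Asn, any residue except Pro, then Ser or Thr.
# The lookahead makes overlapping sequons show up as separate matches.
SEQUON_PATTERN = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return (1-based position, tripeptide) for each N-X-S/T sequon."""
    sequence = sequence.upper()
    return [
        (match.start() + 1, sequence[match.start():match.start() + 3])
        for match in SEQUON_PATTERN.finditer(sequence)
    ]


# Made-up sequence for illustration only.
print(find_sequons("MNGTNPSANNTS"))
# [(2, 'NGT'), (9, 'NNT'), (10, 'NTS')]
```

Positions are reported as 1-based residue numbers, which is the usual convention for protein sequences. The tripeptide NPS at position 5 is skipped because its middle residue is proline.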
Source of information
The sequence data is collected and integrated from the UniProtKB, Glycam, and GlyTouCan databases.
- UniProtKB - Protein FASTA sequences gathered from the UniProtKB database.
- Glycam - Glycan sequences gathered in Glycam IUPAC format for associated glycans.
- GlyTouCan - Glycan sequences in IUPAC extended format for associated glycans (GlyTouCan Accessions).
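For readers unfamiliar with FASTA, the following minimal sketch shows how FASTA text can be read into header and sequence pairs. It is a generic parser, not GlyGen's ingestion code; the function name `parse_fasta` and the sample records are hypothetical, and the headers only imitate the general UniProtKB style.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping header line to sequence."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; join them back together.
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Made-up records for illustration only.
sample = """>sp|P00000|EXAMPLE_HUMAN Hypothetical protein
MKT
AYI
>sp|P00001|OTHER_HUMAN Another hypothetical protein
GGS
"""
records = parse_fasta(sample)
```

Each key of `records` is the header line without the leading `>`, and each value is the full sequence with line wrapping removed.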
Data access
The collected data is processed and stored at data.glygen.org in the following datasets:
Homo sapiens (Human) Datasets
- Human Protein Canonical sequences (UniProtKB; GLY_000002)
- Human Protein Isoform sequences (UniProtKB; GLY_000053)
- Human Protein Sequence Info (UniProtKB; GLY_000398)
Hepatitis C Virus Datasets
- HCV1a Protein Canonical sequences (UniProtKB; GLY_000346)
- HCV1b Protein Canonical sequences (UniProtKB; GLY_000347)
- HCV1a Protein Isoform Sequences (UniProtKB; GLY_000348)
- HCV1b Protein Isoform Sequences (UniProtKB; GLY_000349)
- HCV1a Protein Sequence Info (UniProtKB; GLY_000350)
- HCV1b Protein Sequence Info (UniProtKB; GLY_000351)
Mus musculus (Mouse) Datasets
- Mouse Protein Canonical sequences (UniProtKB; GLY_000012)
- Mouse Protein Isoform sequences (UniProtKB; GLY_000054)
- Mouse Protein Sequence Info (UniProtKB; GLY_000399)
Rattus norvegicus (Rat) Datasets
- Rat Protein Canonical sequences (UniProtKB; GLY_000240)
- Rat Protein Isoform sequences (UniProtKB; GLY_000255)
- Rat Protein Sequence Info (UniProtKB; GLY_000400)
SARS Coronavirus Datasets
- SARS-CoV1 Protein Isoform sequences (UniProtKB; GLY_000411)
- SARS-CoV2 Protein Isoform sequences (UniProtKB; GLY_000412)
- SARS-CoV1 Protein Canonical sequences (UniProtKB; GLY_000415)
- SARS-CoV2 Protein Canonical sequences (UniProtKB; GLY_000416)
- SARS-CoV1 Protein Sequence Info (UniProtKB; GLY_000440)
- SARS-CoV2 Protein Sequence Info (UniProtKB; GLY_000441)
Glycan Sequence Datasets
- Glycan Sequences Glycam IUPAC (Glycam; GLY_000287)
- Glycan Sequences GlycoCT (GlyTouCan; GLY_000288)
- Glycan Sequences InChI (GlyTouCan, PubChem; GLY_000289)
- Glycan Sequences IUPAC Extended (GlyTouCan; GLY_000290)
- Glycan Sequences SMILES Isomeric (GlyTouCan, PubChem; GLY_000291)
- Glycan Sequences WURCS (GlyTouCan; GLY_000292)
- Glycan Sequences Byonic (Glycam; GLY_000559)
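Each dataset identifier above corresponds to a page on data.glygen.org whose address is the site URL followed by the identifier (for example, https://data.glygen.org/GLY_000002). The sketch below builds such links from identifiers. The helper name `dataset_url` and the assumption that every identifier is `GLY_` followed by six digits are illustrative, inferred from the identifiers listed here rather than from a documented rule.

```python
import re

DATASET_BASE_URL = "https://data.glygen.org/"

# Assumed identifier shape, based on the IDs listed on this page.
DATASET_ID_PATTERN = re.compile(r"GLY_\d{6}")


def dataset_url(dataset_id):
    """Return the data.glygen.org page URL for a dataset identifier."""
    if not DATASET_ID_PATTERN.fullmatch(dataset_id):
        raise ValueError(f"Unexpected dataset identifier: {dataset_id!r}")
    return DATASET_BASE_URL + dataset_id


# Human protein sequence datasets from the list above.
human_datasets = {
    "Human Protein Canonical sequences": "GLY_000002",
    "Human Protein Isoform sequences": "GLY_000053",
    "Human Protein Sequence Info": "GLY_000398",
}
human_links = {name: dataset_url(gly_id) for name, gly_id in human_datasets.items()}
```

Validating the identifier before building the link catches typos such as a missing digit early, instead of producing a link to a page that does not exist.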