Protein details/Sequence
The Sequence section of the Protein details page in GlyGen displays the canonical protein sequence. When an annotation such as N-linked sites, O-linked sites, sequons, or phosphorylation is selected, the corresponding residues are highlighted in the sequence.
Sequence
This section contains the following information:
- Sequence - The canonical protein sequence in FASTA format from the UniProtKB database
- N-Linked Sites -
- O-Linked Sites -
- Variation from mutation -
- Sequon -
- Phosphorylation -
- Glycation -
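As an illustration of what sequon highlighting involves, the sketch below locates N-glycosylation sequons using the commonly cited N-X-S/T pattern (X is any residue except proline). This is an assumption about the motif, not a description of how GlyGen computes its annotations; the function names `find_sequons` and `highlight` and the example sequence are hypothetical.

```python
import re

# Commonly cited N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The zero-width lookahead lets overlapping sequons (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return the 1-based position of the Asn in each N-X-S/T sequon."""
    seq = "".join(sequence.split()).upper()  # drop whitespace and line breaks
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


def highlight(sequence: str, positions: list[int]) -> str:
    """Wrap the residues at the given 1-based positions in square brackets."""
    marked = set(positions)
    return "".join(
        f"[{residue}]" if index in marked else residue
        for index, residue in enumerate(sequence, start=1)
    )


if __name__ == "__main__":
    example = "MNGTAANPSNNTS"  # made-up sequence, not a real protein
    sites = find_sequons(example)
    print(sites)                      # [2, 10, 11]
    print(highlight(example, sites))  # M[N]GTAANPS[N][N]TS
```

The asparagine at position 7 is skipped because it is followed by proline. A sequon match only marks a potential site; reported N-linked sites come from curated annotations, not from the motif alone.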
Source of information
The sequence data is collected and integrated from the UniProtKB, Glycam, and GlyTouCan databases.
- UniProtKB - protein FASTA sequences gathered from the UniProtKB database.
- Glycam - Glycan sequences gathered in Glycam IUPAC format for associated glycans.
- GlyTouCan - Glycan sequences in IUPAC extended format for associated glycans (GlyTouCan Accessions).
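For readers working with the protein FASTA sequences directly, here is a minimal sketch of reading FASTA text into a dictionary keyed by the first token of each header line. It assumes plain multi-record FASTA input; the function name `parse_fasta` and the example record (accession `P00000`) are hypothetical and do not correspond to a real UniProtKB entry.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Map each record identifier (first token after '>') to its sequence."""
    chunks: dict[str, list[str]] = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            chunks[current] = []
        elif current is not None:
            chunks[current].append(line)  # sequence may span several lines
    return {identifier: "".join(parts) for identifier, parts in chunks.items()}


if __name__ == "__main__":
    # Hypothetical record written in the UniProtKB-style header layout.
    example = (
        ">sp|P00000|EXAMPLE_HUMAN Example protein\n"
        "MNGTAA\n"
        "NPSNNTS\n"
    )
    for identifier, sequence in parse_fasta(example).items():
        print(identifier, len(sequence), sequence)
```

Sequence lines are concatenated in order, so a record wrapped across several lines yields a single string. Lines that appear before the first header are ignored.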
Data access
The collected data is processed and stored at data.glygen.org in the following datasets:
Homo sapiens (Human) Datasets
- Human Protein Canonical sequences (UniProtKB; GLY_000002)
- Human Protein Isoform sequences (UniProtKB; GLY_000053)
- Human Protein Sequence Info (UniProtKB; GLY_000398)
Hepatitis C Virus Datasets
- HCV1a Protein Canonical sequences (UniProtKB; GLY_000346)
- HCV1b Protein Canonical sequences (UniProtKB; GLY_000347)
- HCV1a Protein Isoform Sequences (UniProtKB; GLY_000348)
- HCV1b Protein Isoform Sequences (UniProtKB; GLY_000349)
- HCV1a Protein Sequence Info (UniProtKB; GLY_000350)
- HCV1b Protein Sequence Info (UniProtKB; GLY_000351)
Mus musculus (Mouse) Datasets
- Mouse Protein Canonical sequences (UniProtKB; GLY_000012)
- Mouse Protein Isoform sequences (UniProtKB; GLY_000053)
- Mouse Protein Sequence Info (UniProtKB; GLY_000399)
Rattus norvegicus (Rat) Datasets
- Rat Protein Canonical sequences (UniProtKB; GLY_000240)
- Rat Protein Isoform sequences (UniProtKB; GLY_000255)
- Rat Protein Sequence Info (UniProtKB; GLY_000400)
SARS Coronavirus Datasets
- SARS-CoV1 Protein Isoform sequences (UniProtKB; GLY_000411)
- SARS-CoV2 Protein Isoform sequences (UniProtKB; GLY_000412)
- SARS-CoV1 Protein Canonical sequences (UniProtKB; GLY_000415)
- SARS-CoV2 Protein Canonical sequences (UniProtKB; GLY_000416)
- SARS-CoV1 Protein Sequence Info (UniProtKB; GLY_000400)
- SARS-CoV2 Protein Sequence Info (UniProtKB; GLY_000441)
Glycan Sequence Datasets
- Glycan Sequences Glycam IUPAC (Glycam; GLY_000287)
- Glycan Sequences GlycoCT (GlyTouCan; GLY_000288)
- Glycan Sequences InChI (GlyTouCan, PubChem; GLY_000289)
- Glycan Sequences IUPAC Extended (GlyTouCan; GLY_000290)
- Glycan Sequences SMILES Isomeric (GlyTouCan, PubChem; GLY_000291)
- Glycan Sequences WURCS (GlyTouCan; GLY_000292)
- Glycan Sequences Byonic (Glycam; GLY_000559)