Protein details/Sequence: Difference between revisions

From GlyGen Wiki
Revision as of 16:49, 6 December 2021

Function

This section contains the following information:

  • Sequence: The protein sequence in FASTA format from the UniProtKB database. Selecting an annotation such as N-linked Sites, Sequon, or Phosphorylation highlights the corresponding residues in the sequence. The sequence can also be viewed in the ProtVista tool, which displays various annotations alongside it.
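
As an illustration of what sequon highlighting involves, the sketch below parses FASTA text and marks candidate N-linked sequons. It assumes the commonly used N-X-S/T pattern (X being any residue except proline); the rule GlyGen itself applies may differ. The function names and the toy record are hypothetical and are not part of GlyGen or UniProtKB.

```python
import re


def parse_fasta(text):
    """Parse FASTA text into a dict mapping each header line to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


# Assumed sequon definition: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps the match to the Asn itself so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence)]


def highlight(sequence, positions):
    """Wrap the residues at the given 1-based positions in square brackets."""
    marked = set(positions)
    return "".join(
        f"[{aa}]" if i in marked else aa
        for i, aa in enumerate(sequence, start=1)
    )


# Toy record for illustration only; not a real UniProtKB entry.
example = """>sp|EXAMPLE|TOY_PROTEIN hypothetical record
MKNGSANPTL
QNNTSW
"""

for header, seq in parse_fasta(example).items():
    sites = find_sequons(seq)
    print(header)
    print(sites)                 # [3, 12, 13]
    print(highlight(seq, sites)) # MK[N]GSANPTLQ[N][N]TSW
```

The asparagine at position 7 is skipped because it is followed by proline, which is why the pattern excludes P in the middle position.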

Source of information

The sequence data is collected and integrated from the UniProtKB, Glycam, and GlyTouCan databases.

  • UniProtKB - protein FASTA sequences gathered from the UniProtKB database.
  • Glycam - Glycan sequences gathered in Glycam IUPAC format for associated glycans.
  • GlyTouCan - Glycan sequences in IUPAC extended format for associated glycans (GlyTouCan Accessions).

Data Access

The collected data is processed and stored at data.glygen.org in the following datasets:

Homo sapiens (Human) Datasets

  • Human Protein Canonical sequences (UniProtKB; GLY_000002)
  • Human Protein Isoform sequences (UniProtKB; GLY_000053)
  • Human Protein Sequence Info (UniProtKB; GLY_000398)

Hepatitis C Virus Datasets

  • HCV1a Protein Canonical sequences (UniProtKB; GLY_000346)
  • HCV1b Protein Canonical sequences (UniProtKB; GLY_000347)
  • HCV1a Protein Isoform Sequences (UniProtKB; GLY_000348)
  • HCV1b Protein Isoform Sequences (UniProtKB; GLY_000349)
  • HCV1a Protein Sequence Info (UniProtKB; GLY_000350)
  • HCV1b Protein Sequence Info (UniProtKB; GLY_000351)

Mus musculus (Mouse) Datasets

  • Mouse Protein Canonical sequences (UniProtKB; GLY_000012)
  • Mouse Protein Isoform sequences (UniProtKB; GLY_000053)
  • Mouse Protein Sequence Info (UniProtKB; GLY_000399)

Rattus norvegicus (Rat) Datasets

  • Rat Protein Canonical sequences (UniProtKB; GLY_000240)
  • Rat Protein Isoform sequences (UniProtKB; GLY_000255)
  • Rat Protein Sequence Info (UniProtKB; GLY_000400)

SARS Coronavirus Datasets

  • SARS-CoV1 Protein Isoform sequences (UniProtKB; GLY_000411)
  • SARS-CoV2 Protein Isoform sequences (UniProtKB; GLY_000412)
  • SARS-CoV1 Protein Canonical sequences (UniProtKB; GLY_000415)
  • SARS-CoV2 Protein Canonical sequences (UniProtKB; GLY_000416)
  • SARS-CoV1 Protein Sequence Info (UniProtKB; GLY_000440)
  • SARS-CoV2 Protein Sequence Info (UniProtKB; GLY_000441)

Glycan Sequence Datasets

  • Glycan Sequences Glycam IUPAC (Glycam; GLY_000287)
  • Glycan Sequences GlycoCT (GlyTouCan; GLY_000288)
  • Glycan Sequences InChI (GlyTouCan, PubChem; GLY_000289)
  • Glycan Sequences IUPAC Extended (GlyTouCan; GLY_000290)
  • Glycan Sequences SMILES Isomeric (GlyTouCan, PubChem; GLY_000291)
  • Glycan Sequences WURCS (GlyTouCan; GLY_000292)
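
The accessions listed above can be turned into dataset page addresses programmatically. The following is a minimal sketch, assuming each dataset page lives at https://data.glygen.org/<accession>; the helper name dataset_url and the DATASETS mapping are hypothetical, and no file format or download endpoint is implied.

```python
import re

BASE_URL = "https://data.glygen.org"  # assumed base address for dataset pages
ACCESSION = re.compile(r"GLY_\d{6}")  # accession shape used in the lists above

# A few accessions copied from the dataset lists on this page.
DATASETS = {
    "Human Protein Canonical sequences": "GLY_000002",
    "Mouse Protein Canonical sequences": "GLY_000012",
    "Rat Protein Canonical sequences": "GLY_000240",
    "Glycan Sequences WURCS": "GLY_000292",
}


def dataset_url(accession):
    """Build the assumed dataset page URL for a GlyGen accession."""
    if not ACCESSION.fullmatch(accession):
        raise ValueError(f"not a GlyGen dataset accession: {accession!r}")
    return f"{BASE_URL}/{accession}"


for name, accession in DATASETS.items():
    print(f"{name}: {dataset_url(accession)}")
```

Validating the accession before building the address catches transposed or truncated identifiers early, which is useful when the list is maintained by hand.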