Protein details/Sequence: Difference between revisions

From GlyGen Wiki
(Created page with "From GlyGen Wiki == Function == This section contains the following information: * '''Sequence''': The protein sequence in FASTA format from UniProtKB database. The Sequence...")
 
No edit summary
Line 1: Line 1:
From GlyGen Wiki
==Function==
 
== Function ==
This section contains the following information:
This section contains the following information:


* '''Sequence''': The protein sequence in FASTA format from the UniProtKB database. The Sequence section highlights annotations such as N-linked sites, sequons, and phosphorylation sites when the annotation is selected. It also lets you view the protein sequence in the ProtVista tool, which displays various annotations.
*'''Sequence''': The protein sequence in FASTA format from the UniProtKB database. The Sequence section highlights annotations such as N-linked sites, sequons, and phosphorylation sites when the annotation is selected. It also lets you view the protein sequence in the ProtVista tool, which displays various annotations.


== Source of information ==
==Source of information==
The sequence data is collected and integrated from the '''[https://www.uniprot.org/ UniProtKB], [https://glycam.org/ Glycam],''' and '''[https://glytoucan.org/ GlyTouCan]''' databases.
The sequence data is collected and integrated from the '''[https://www.uniprot.org/ UniProtKB], [https://glycam.org/ Glycam],''' and '''[https://glytoucan.org/ GlyTouCan]''' databases.


* '''[https://www.uniprot.org/ UniProtKB] -''' protein FASTA sequences gathered from the UniProtKB database.  
*'''[https://www.uniprot.org/ UniProtKB] -''' protein FASTA sequences gathered from the UniProtKB database.
* '''[https://glycam.org/ Glycam] -''' Glycan sequences gathered in Glycam IUPAC format for associated glycans.  
*'''[https://glycam.org/ Glycam] -''' Glycan sequences gathered in Glycam IUPAC format for associated glycans.
* [https://glytoucan.org/ '''GlyTouCan'''] - Glycan sequences in IUPAC extended format for associated glycans (GlyTouCan Accessions).
*[https://glytoucan.org/ '''GlyTouCan'''] - Glycan sequences in IUPAC extended format for associated glycans (GlyTouCan Accessions).


== Data Access ==
==Data Access==
The collected data is processed and stored at '''[https://data.glygen.org/ data.glygen.org]''' in the following datasets:
The collected data is processed and stored at '''[https://data.glygen.org/ data.glygen.org]''' in the following datasets:


Homo sapiens (Human) Datasets
Homo sapiens (Human) Datasets


* Human Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000002 GLY_000002])
*Human Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000002 GLY_000002])
* Human Protein Isoform sequences (UniProtKB; [https://data.glygen.org/GLY_000053 GLY_000053])
*Human Protein Isoform sequences (UniProtKB; [https://data.glygen.org/GLY_000053 GLY_000053])
* Human Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000398 GLY_000398])
*Human Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000398 GLY_000398])


Hepatitis C Virus Datasets
Hepatitis C Virus Datasets


* HCV1a Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000346 GLY_000346])
*HCV1a Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000346 GLY_000346])
* HCV1b Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000347 GLY_000347])
*HCV1b Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000347 GLY_000347])
* HCV1a Protein Isoform Sequences (UniProtKB; [https://data.glygen.org/GLY_000348 GLY_000348])
*HCV1a Protein Isoform Sequences (UniProtKB; [https://data.glygen.org/GLY_000348 GLY_000348])
* HCV1b Protein Isoform Sequences (UniProtKB; [https://data.glygen.org/GLY_000349 GLY_000349])
*HCV1b Protein Isoform Sequences (UniProtKB; [https://data.glygen.org/GLY_000349 GLY_000349])
* HCV1a Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000350 GLY_000350])
*HCV1a Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000350 GLY_000350])
* HCV1b Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000351 GLY_000351])
*HCV1b Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000351 GLY_000351])


Mus musculus (Mouse) Datasets
Mus musculus (Mouse) Datasets




 
*Mouse Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000012 GLY_000012])
* Mouse Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000012 GLY_000012])
*Mouse Protein Isoform sequences (UniProtKB; [https://data.glygen.org/GLY_000054 GLY_000054])
* Mouse Protein Isoform sequences (UniProtKB; [https://data.glygen.org/GLY_000054 GLY_000054])
*Mouse Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000399 GLY_000399])
* Mouse Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000399 GLY_000399])


Rattus norvegicus (Rat) Datasets
Rattus norvegicus (Rat) Datasets




 
*Rat Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000240 GLY_000240])
* Rat Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000240 GLY_000240])
*Rat Protein Isoform sequences (UniProtKB; [https://data.glygen.org/GLY_000255 GLY_000255])
* Rat Protein Isoform sequences (UniProtKB; [https://data.glygen.org/GLY_000255 GLY_000255])
*Rat Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000400 GLY_000400])
* Rat Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000400 GLY_000400])


SARS Coronavirus Datasets
SARS Coronavirus Datasets


* SARS-CoV1 Protein Isoform sequences (UniProtKB; [https://data.glygen.org/GLY_000411 GLY_000411])
*SARS-CoV1 Protein Isoform sequences (UniProtKB; [https://data.glygen.org/GLY_000411 GLY_000411])
* SARS-CoV2 Protein Isoform sequences (UniProtKB; [https://data.glygen.org/GLY_000412 GLY_000412])
*SARS-CoV2 Protein Isoform sequences (UniProtKB; [https://data.glygen.org/GLY_000412 GLY_000412])
* SARS-CoV1 Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000415 GLY_000415])
*SARS-CoV1 Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000415 GLY_000415])
* SARS-CoV2 Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000416 GLY_000416])
*SARS-CoV2 Protein Canonical sequences (UniProtKB; [https://data.glygen.org/GLY_000416 GLY_000416])
* SARS-CoV1 Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000440 GLY_000440])
*SARS-CoV1 Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000440 GLY_000440])
* SARS-CoV2 Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000441 GLY_000441])
*SARS-CoV2 Protein Sequence Info (UniProtKB; [https://data.glygen.org/GLY_000441 GLY_000441])




Line 60: Line 56:
Glycan Sequence Datasets  
Glycan Sequence Datasets  


* Glycan Sequences Glycam IUPAC (Glycam; [https://data.glygen.org/GLY_000287 GLY_000287])
*Glycan Sequences Glycam IUPAC (Glycam; [https://data.glygen.org/GLY_000287 GLY_000287])
* Glycan Sequences GlycoCT (GlyTouCan; [https://data.glygen.org/GLY_000288 GLY_000288])
*Glycan Sequences GlycoCT (GlyTouCan; [https://data.glygen.org/GLY_000288 GLY_000288])
* Glycan Sequences InChI (GlyTouCan, PubChem; [https://data.glygen.org/GLY_000289 GLY_000289])
*Glycan Sequences InChI (GlyTouCan, PubChem; [https://data.glygen.org/GLY_000289 GLY_000289])
* Glycan Sequences IUPAC Extended (GlyTouCan; [https://data.glygen.org/GLY_000290 GLY_000290])
*Glycan Sequences IUPAC Extended (GlyTouCan; [https://data.glygen.org/GLY_000290 GLY_000290])
* Glycan Sequences SMILES Isomeric (GlyTouCan, PubChem; [https://data.glygen.org/GLY_000291 GLY_000291])
*Glycan Sequences SMILES Isomeric (GlyTouCan, PubChem; [https://data.glygen.org/GLY_000291 GLY_000291])
* Glycan Sequences WURCS (GlyTouCan; [https://data.glygen.org/GLY_000292 GLY_000292])
*Glycan Sequences WURCS (GlyTouCan; [https://data.glygen.org/GLY_000292 GLY_000292])


* Glycan sequences Byonic (Glycam; [https://data.glygen.org/GLY_000559 GLY_000559])
*Glycan sequences Byonic (Glycam; [https://data.glygen.org/GLY_000559 GLY_000559])

Revision as of 16:36, 6 December 2021

Function

This section contains the following information:

  • Sequence: The protein sequence in FASTA format from the UniProtKB database. The Sequence section highlights annotations such as N-linked sites, sequons, and phosphorylation sites when the annotation is selected. It also lets you view the protein sequence in the ProtVista tool, which displays various annotations.
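
As an illustration of the kind of annotation the Sequence section highlights, the following minimal Python sketch locates N-glycosylation sequons, assuming the common N-X-S/T definition (X is any residue except proline). The function names and the toy sequence are hypothetical, and this is not GlyGen's implementation.

```python
import re

# Assumed sequon definition: Asn, any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return the 1-based position of the Asn residue of each sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


def highlight(sequence: str, positions: list[int]) -> str:
    """Wrap the residue at each 1-based position in square brackets."""
    marked = set(positions)
    return "".join(
        f"[{aa}]" if i in marked else aa
        for i, aa in enumerate(sequence, start=1)
    )


# Hypothetical toy sequence, not a real protein.
toy = "MKNGSAANPTLNNTS"
sites = find_sequons(toy)
print(sites)                  # [3, 12, 13]
print(highlight(toy, sites))  # MK[N]GSAANPTL[N][N]TS
```

The asparagine at position 8 is skipped because it is followed by proline.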

Source of information

The sequence data is collected and integrated from the UniProtKB, Glycam, and GlyTouCan databases.

  • UniProtKB - protein FASTA sequences gathered from the UniProtKB database.
  • Glycam - Glycan sequences gathered in Glycam IUPAC format for associated glycans.
  • GlyTouCan - Glycan sequences in IUPAC extended format for associated glycans (GlyTouCan Accessions).
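
For readers who want to work with the protein FASTA sequences directly, here is a small Python sketch of reading FASTA text and pulling the accession out of a UniProtKB-style header (`db|ACCESSION|ENTRY_NAME description`). The record shown is a made-up placeholder, and the helper names are illustrative only.

```python
from typing import Iterator


def parse_fasta(text: str) -> Iterator[tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def uniprot_accession(header: str) -> str:
    """Extract the accession from a 'db|ACCESSION|ENTRY_NAME ...' header."""
    parts = header.split("|")
    if len(parts) < 3:
        raise ValueError(f"not a UniProtKB-style header: {header!r}")
    return parts[1]


# Placeholder record for illustration; not a real UniProtKB entry.
example = """>sp|P00000|TOY_HUMAN Toy protein OS=Homo sapiens OX=9606
MKNGSAAN
PTLNNTS
"""

for header, sequence in parse_fasta(example):
    print(uniprot_accession(header), len(sequence))  # P00000 15
```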

Data Access

The collected data is processed and stored at data.glygen.org in the following datasets:

Homo sapiens (Human) Datasets

  • Human Protein Canonical sequences (UniProtKB; GLY_000002)
  • Human Protein Isoform sequences (UniProtKB; GLY_000053)
  • Human Protein Sequence Info (UniProtKB; GLY_000398)

Hepatitis C Virus Datasets

  • HCV1a Protein Canonical sequences (UniProtKB; GLY_000346)
  • HCV1b Protein Canonical sequences (UniProtKB; GLY_000347)
  • HCV1a Protein Isoform Sequences (UniProtKB; GLY_000348)
  • HCV1b Protein Isoform Sequences (UniProtKB; GLY_000349)
  • HCV1a Protein Sequence Info (UniProtKB; GLY_000350)
  • HCV1b Protein Sequence Info (UniProtKB; GLY_000351)

Mus musculus (Mouse) Datasets


  • Mouse Protein Canonical sequences (UniProtKB; GLY_000012)
  • Mouse Protein Isoform sequences (UniProtKB; GLY_000054)
  • Mouse Protein Sequence Info (UniProtKB; GLY_000399)

Rattus norvegicus (Rat) Datasets


  • Rat Protein Canonical sequences (UniProtKB; GLY_000240)
  • Rat Protein Isoform sequences (UniProtKB; GLY_000255)
  • Rat Protein Sequence Info (UniProtKB; GLY_000400)

SARS Coronavirus Datasets

  • SARS-CoV1 Protein Isoform sequences (UniProtKB; GLY_000411)
  • SARS-CoV2 Protein Isoform sequences (UniProtKB; GLY_000412)
  • SARS-CoV1 Protein Canonical sequences (UniProtKB; GLY_000415)
  • SARS-CoV2 Protein Canonical sequences (UniProtKB; GLY_000416)
  • SARS-CoV1 Protein Sequence Info (UniProtKB; GLY_000440)
  • SARS-CoV2 Protein Sequence Info (UniProtKB; GLY_000441)


Glycan Sequence Datasets

  • Glycan Sequences Glycam IUPAC (Glycam; GLY_000287)
  • Glycan Sequences GlycoCT (GlyTouCan; GLY_000288)
  • Glycan Sequences InChI (GlyTouCan, PubChem; GLY_000289)
  • Glycan Sequences IUPAC Extended (GlyTouCan; GLY_000290)
  • Glycan Sequences SMILES Isomeric (GlyTouCan, PubChem; GLY_000291)
  • Glycan Sequences WURCS (GlyTouCan; GLY_000292)
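
Each dataset accession listed above corresponds to a page under data.glygen.org (for example, GLY_000002 is linked at https://data.glygen.org/GLY_000002). A minimal Python sketch of building that link from an accession follows; the six-digit accession pattern is an assumption inferred from the accessions listed here, and `dataset_url` is a hypothetical helper.

```python
import re

BASE_URL = "https://data.glygen.org"

# Assumed accession shape, inferred from the list above: "GLY_" plus six digits.
ACCESSION = re.compile(r"GLY_\d{6}")


def dataset_url(accession: str) -> str:
    """Return the dataset page URL for an accession such as 'GLY_000002'."""
    if not ACCESSION.fullmatch(accession):
        raise ValueError(f"unexpected accession format: {accession!r}")
    return f"{BASE_URL}/{accession}"


print(dataset_url("GLY_000002"))  # https://data.glygen.org/GLY_000002
```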