Protein details

From GlyGen Wiki

In the GlyGen portal, the information collected for a protein is presented on the protein details page (also called the protein information page). The page is subdivided into sections that each present a different kind of information, such as glycosylation data, protein function, protein expression, cross-references to other databases, and publications about the protein.

Information sections

The collected information is presented in different sections, mainly as tables or text. Some sections are interactive and can be manipulated by users.

General

The General section of the Protein Details page contains identifying information about the protein from reference databases and sources. The Gene Name and Gene Location describe the gene that encodes the protein and the location with a link to the chromosome visualizer in Ensembl Gene. Additional fields retrieved from UniProtKB and NCBI RefSeq include UniProtKB ID, UniProtKB Accession, Protein Length, UniProtKB Entry Name, Chemical Mass, RefSeq Accession, RefSeq Name and Organism.

Glycosylation

Glycosylation data is presented in the GlyGen portal as tables in four tabs of the Glycosylation section. The first tab, "Reported Sites with Glycan", lists glycosylation sites together with their glycan structures. These glycans can be fully defined structures, structures with missing information, or compositions. The second tab, "Reported Sites", lists all glycosylation sites that have been extracted from other databases. These sites have been reported to be glycosylated, but the glycan structure has not been identified or reported. The third tab, "Predicted Only", shows sites that have been predicted using different tools. These sites likewise have no associated glycan structures. The last tab, "Text Mining", shows site information extracted automatically from PubMed abstracts by text mining.
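The way sites fall into the four tabs can be sketched in a few lines of Python. This is only an illustration of the rules described above, not GlyGen's implementation: the `Site` record, its `source` labels and the `glycan` field are hypothetical names chosen for this example, and the glycan value is a placeholder string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    """Hypothetical record for one glycosylation site (not a GlyGen data type)."""
    position: int
    source: str                   # assumed labels: "reported", "predicted", "text_mining"
    glycan: Optional[str] = None  # placeholder glycan identifier, if one is known


def tab_for(site: Site) -> str:
    """Return the name of the Glycosylation tab a site would be listed under."""
    if site.source == "text_mining":
        return "Text Mining"
    if site.source == "predicted":
        return "Predicted Only"
    if site.source == "reported":
        # Reported sites are split by whether a glycan is attached to the record.
        if site.glycan is not None:
            return "Reported Sites with Glycan"
        return "Reported Sites"
    raise ValueError(f"unknown source label: {site.source!r}")


if __name__ == "__main__":
    examples = [
        Site(10, "reported", glycan="placeholder-glycan-id"),
        Site(25, "reported"),
        Site(40, "predicted"),
        Site(77, "text_mining"),
    ]
    for s in examples:
        print(s.position, "->", tab_for(s))
```

The sketch assumes each site carries exactly one source label; a real record may combine several kinds of evidence.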

Phosphorylation

The Phosphorylation section of the Protein Details page provides a list of phosphorylation sites that have been extracted from other databases. The Kinase Protein and Kinase Gene columns list the enzymes responsible for phosphorylation, if available. Annotations for experimentally determined phosphorylation sites are retrieved from UniProtKB.

Glycation

The Glycation section of the Protein Details page in GlyGen provides a detailed list of sites with non-enzymatic, covalently linked glucose residues. This section contains information about the type of attachment and the site of glycation. Annotations for experimentally determined glycation sites are retrieved from UniProtKB.

Names

The Names section of the Protein Details page in GlyGen lists the recommended full name of the gene and protein from the UniProtKB database. All other names that are used to represent the gene or protein are listed as synonyms.

Function

The Function section of the Protein Details page in GlyGen lists information about the biological function of the protein. This information is retrieved from the UniProtKB database and from the Gene Summary and GeneRIF sections of the NCBI RefSeq database.

Sequence

Single Nucleotide Variation

Single nucleotide variation data is presented in the GlyGen portal as tables in two tabs of the Single Nucleotide Variation section. The variations listed are most commonly nonsynonymous mutations, which cause a different amino acid to be produced at a given position. The first tab, "Disease associated Mutations", lists mutations in the genetic sequence that result in a disease. The second tab, "Non-disease associated Mutations", lists all mutations in the genetic sequence that are not associated with a disease. This information is retrieved from the BioMuta database and from EBI-EMBL-UniProtKB.
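To make the term "nonsynonymous" concrete, the following minimal Python sketch translates a codon before and after a single-nucleotide change using the standard genetic code and reports whether the encoded amino acid changes. The function name `classify_snv` and its 0-based `pos` argument are assumptions made for this example; the code is not part of GlyGen or BioMuta.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G with the third base varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snv(codon: str, pos: int, new_base: str):
    """Apply a single-nucleotide change to a DNA codon.

    pos is the 0-based position within the codon (0, 1 or 2).
    Returns (reference amino acid, altered amino acid, classification).
    """
    codon = codon.upper()
    new_base = new_base.upper()
    if len(codon) != 3 or pos not in (0, 1, 2) or new_base not in BASES:
        raise ValueError("expected a 3-base codon, pos in 0..2 and a base in TCAG")
    altered = codon[:pos] + new_base + codon[pos + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[altered]
    kind = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return ref_aa, alt_aa, kind


if __name__ == "__main__":
    print(classify_snv("GAG", 1, "T"))  # glutamate codon changed to a valine codon
    print(classify_snv("GAA", 2, "G"))  # both codons encode glutamate
```

In the one-letter code used here, `*` denotes a stop codon, so a change that introduces a stop is also reported as nonsynonymous.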

Mutagenesis

The Mutagenesis section describes the effect of the experimental mutation of one or more amino acids on the biological properties of the protein. The mutagenesis information is retrieved from the UniProtKB database. Experiments in which a limited number of amino acid residues are altered are described, whereas gross alterations in protein structure, such as the deletion of hundreds of amino acids, are not. For each entry, the section reports the amino acid change, its position, and the effect(s) of the mutation on the protein, the cell or the complete organism.

GO Annotations

The Gene Ontology (GO) is a precisely defined, common controlled vocabulary for describing the roles of genes and gene products in any organism. It includes three categories: Molecular function, Biological process, and Cellular component.
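As a small illustration of the three categories, the Python sketch below groups a list of GO annotations by category. The tuple layout `(GO ID, term name, category)` and the function name `group_by_category` are assumptions made for this example, and the sample annotations are illustrative GO terms rather than data taken from a GlyGen protein record.

```python
# The three GO categories named in the text above.
GO_CATEGORIES = ("Molecular function", "Biological process", "Cellular component")


def group_by_category(annotations):
    """Group (go_id, term, category) tuples into a dict keyed by GO category."""
    grouped = {category: [] for category in GO_CATEGORIES}
    for go_id, term, category in annotations:
        if category not in grouped:
            raise ValueError(f"unknown GO category: {category!r}")
        grouped[category].append((go_id, term))
    return grouped


if __name__ == "__main__":
    # Illustrative annotations only; not taken from any specific protein entry.
    sample = [
        ("GO:0005515", "protein binding", "Molecular function"),
        ("GO:0006955", "immune response", "Biological process"),
        ("GO:0005576", "extracellular region", "Cellular component"),
    ]
    for category, terms in group_by_category(sample).items():
        print(category, terms)
```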

Glycan Ligands

PTM Annotation

The PTM Annotation section describes post-translational modifications (PTMs) and their effects on the protein. The PTM annotation is retrieved from the UniProtKB database.

Post-translational modifications (PTMs) are covalent chemical modifications of a protein residue. The PTM annotation in UniProtKB provides more context and description about the PTMs occurring on a protein.

Proteoform Annotation

The Proteoform Annotation section of the Protein Details page in GlyGen lists the proteoforms or glycoforms of the protein, each with a detailed description obtained from the Protein Ontology (PRO).

Pathway

The Pathway section contains pathway information for a given protein from the Reactome and KEGG databases. The pathway information is retrieved from UniProtKB.

Synthesized Glycans

Isoforms

Homologs

Disease

Expression Tissue

Expression Disease

Cross References

History

Publications

URL pattern

Programmatic access

Download options