The General section of the Protein details page in GlyGen provides identifying information about the protein from reference databases and sources.


This section provides the following information:

  • Gene Name - Name of the gene that encodes the protein. Eg. EGFR
  • Gene Location - The fixed position on a chromosome where the gene is located. The gene location consists of the chromosome number and the start and end positions of the gene on that chromosome. Gene location information is retrieved from the ENSEMBL database. Eg. Chromosome: 7 (55,019,021 - 55,211,628)
  • UniProtKB ID - The entry name (mnemonic identifier) assigned to the protein entry. It follows a defined naming convention and is up to 11 characters long for a UniProtKB/Swiss-Prot entry and up to 16 characters long for a UniProtKB/TrEMBL entry, using uppercase alphanumeric characters joined by an underscore. Eg. EGFR_HUMAN
  • UniProtKB Accession - Accession assigned to the protein in the UniProtKB database. Eg. P00533-1
  • Protein Length - The number of amino acids in the protein sequence. Proteins are built from combinations of the 20 standard amino acids. Protein length information is retrieved from the UniProtKB database. Eg. 793
  • UniProtKB Entry Name - The recommended protein name given in the UniProtKB database. Eg. Epidermal growth factor receptor
  • Chemical Mass - Molecular weight of the protein, given in daltons (Da, the unified atomic mass unit). Molecular weight information is retrieved from the UniProtKB database. Eg. 134,277 Da
  • RefSeq Accession - Accession assigned by the NCBI Reference Sequence (RefSeq) database. It begins with a two-letter prefix followed by an underscore. RefSeq accessions mapped to UniProtKB accessions are retrieved from the UniProtKB database. Eg. NP_001333828.1
  • RefSeq Name - Name assigned to the protein by the NCBI Reference Sequence (RefSeq) database. Eg. epidermal growth factor receptor isoform g precursor
  • Organism - Name of the organism that is the source of the protein sequence. The NCBI Taxonomy ID for the organism is also provided in the Organism field. Eg. Homo sapiens (Human) [9606]
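
The field formats described above can be checked or parsed with a few regular expressions. The following Python sketch is illustrative only: the patterns and helper names (parse_gene_location, split_accession, parse_organism, parse_mass) are assumptions derived from the example values on this page, not an official GlyGen, UniProtKB, or RefSeq specification.

```python
import re

# Assumed patterns, inferred only from the field descriptions and examples above.
# UniProtKB ID: uppercase alphanumerics, an underscore, then a short species code.
UNIPROT_ID = re.compile(r"[A-Z0-9]{1,10}_[A-Z0-9]{1,5}")
# RefSeq accession: two letters, an underscore, digits, optional ".version".
REFSEQ_ACC = re.compile(r"[A-Z]{2}_\d+(\.\d+)?")
# Gene location as displayed, e.g. "Chromosome: 7 (55,019,021 - 55,211,628)".
GENE_LOCATION = re.compile(r"Chromosome:\s*(\w+)\s*\(([\d,]+)\s*-\s*([\d,]+)\)")
# Organism as displayed, e.g. "Homo sapiens (Human) [9606]".
ORGANISM = re.compile(r"(.+?)\s*\[(\d+)\]")


def is_uniprot_id(text):
    """True if text looks like an entry name of at most 16 characters."""
    return len(text) <= 16 and UNIPROT_ID.fullmatch(text) is not None


def is_refseq_accession(text):
    """True if text looks like a RefSeq accession such as NP_001333828.1."""
    return REFSEQ_ACC.fullmatch(text) is not None


def parse_gene_location(text):
    """Return (chromosome, start, end) with positions as integers."""
    match = GENE_LOCATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized gene location: {text!r}")
    chromosome, start, end = match.groups()
    return chromosome, int(start.replace(",", "")), int(end.replace(",", ""))


def split_accession(accession):
    """Split 'P00533-1' into the base accession and the isoform number (or None)."""
    base, _, isoform = accession.partition("-")
    return base, int(isoform) if isoform else None


def parse_organism(text):
    """Return (organism name, NCBI Taxonomy ID as an integer)."""
    match = ORGANISM.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized organism field: {text!r}")
    return match.group(1), int(match.group(2))


def parse_mass(text):
    """Convert a displayed mass such as '134,277 Da' to an integer in daltons."""
    return int(text.replace("Da", "").replace(",", "").strip())
```

For instance, under these assumptions parse_gene_location("Chromosome: 7 (55,019,021 - 55,211,628)") yields the chromosome as a string and the two positions as integers, which makes it straightforward to compute the span of the gene on the chromosome.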

Source of Information

The data in the General section is collected and integrated from the UniProtKB, NCBI RefSeq, and ENSEMBL databases.