Protein details/General
Revision as of 19:38, 15 November 2021
The General section of the Protein Information page in GlyGen provides identifying information about the protein from reference databases and sources.
General
This section provides the following information:
- Gene Name - Name of the gene that encodes the protein. E.g. EGFR
- Gene Location - The fixed position of the gene on a chromosome, given as the chromosome number together with the start and end positions of the gene on that chromosome. Gene location information is retrieved from the ENSEMBL database. E.g. Chromosome: 7 (55,019,021 - 55,211,628)
- UniProtKB ID - The entry name (mnemonic identifier) assigned to the protein entry. It follows a defined naming convention and consists of up to 11 uppercase alphanumeric characters for a UniProtKB/Swiss-Prot entry and up to 16 uppercase alphanumeric characters for a UniProtKB/TrEMBL entry. E.g. EGFR_HUMAN
- UniProtKB Accession - Accession assigned to the protein in the UniProtKB database. E.g. P00533-1
- Protein Length - Proteins are chains built from 20 standard amino acids. The protein length is the number of amino acids in a given protein sequence. Protein length information is retrieved from the UniProtKB database. E.g. 793
- UniProtKB Entry Name - The recommended protein name in the UniProtKB database. E.g. Epidermal growth factor receptor
- Chemical Mass - Molecular weight of the protein in daltons (Da, the unified atomic mass unit). Molecular weight information is retrieved from the UniProtKB database. E.g. 134,277 Da
- RefSeq Accession - Accession assigned by the NCBI Reference Sequence (RefSeq) database. It begins with a two-character prefix followed by an underscore. RefSeq accessions mapped to UniProtKB accessions are retrieved from the UniProtKB database. E.g. NP_001333828.1
- RefSeq Name - Name assigned by the NCBI Reference Sequence (RefSeq) database. E.g. epidermal growth factor receptor isoform g precursor
- Organism - Name of the organism that is the source of the protein sequence. The Organism field also shows the NCBI Taxonomy ID for the organism. E.g. Homo sapiens (Human) [9606]
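The identifier formats described in the list can be checked programmatically. The following Python code is a minimal sketch and is not part of GlyGen. The pattern names and the helper parse_gene_location are hypothetical. The accession pattern follows the regular expression published in the UniProt help pages, with an optional isoform suffix added as an assumption. The other patterns are loose approximations of the descriptions above.

```python
import re

# UniProtKB accession, following the pattern published in the UniProt help
# pages; the optional "-<number>" isoform suffix is an added assumption.
UNIPROT_ACCESSION = re.compile(
    r"(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
    r"(?:-\d+)?"
)

# UniProtKB ID (entry name): uppercase alphanumerics around one underscore.
# Approximation only; the 11/16 character limit is checked separately below.
UNIPROT_ID = re.compile(r"[A-Z0-9]+_[A-Z0-9]+")

# RefSeq accession: two-letter prefix, underscore, digits, optional version.
REFSEQ_ACCESSION = re.compile(r"[A-Z]{2}_\d+(?:\.\d+)?")

# Gene location as displayed, e.g. "Chromosome: 7 (55,019,021 - 55,211,628)".
GENE_LOCATION = re.compile(r"Chromosome:\s*(\w+)\s*\(([\d,]+)\s*-\s*([\d,]+)\)")


def is_uniprot_id(value, reviewed=True):
    """Check an entry name against the 11 (Swiss-Prot) or 16 (TrEMBL) limit."""
    max_len = 11 if reviewed else 16
    return len(value) <= max_len and UNIPROT_ID.fullmatch(value) is not None


def parse_gene_location(text):
    """Return (chromosome, start, end) with positions as integers."""
    match = GENE_LOCATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized gene location: {text!r}")
    chromosome, start, end = match.groups()
    return chromosome, int(start.replace(",", "")), int(end.replace(",", ""))


if __name__ == "__main__":
    print(UNIPROT_ACCESSION.fullmatch("P00533-1") is not None)
    print(is_uniprot_id("EGFR_HUMAN"))
    print(REFSEQ_ACCESSION.fullmatch("NP_001333828.1") is not None)
    print(parse_gene_location("Chromosome: 7 (55,019,021 - 55,211,628)"))
```

Using fullmatch rather than search rejects values with leading or trailing characters. This helps when identifiers are copied from a page with surrounding whitespace or punctuation.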
Source of Information
The General data is collected and integrated from the UniProtKB, NCBI RefSeq, and ENSEMBL databases.