Data release notes

The GlyGen dataset collection is updated every 2 weeks.

Version 1.8.22

Main article: Data release notes/1.8.22

This version was released on April 1st, 2021

  • F protein (P0C045-1) was removed from the Hepatitis C virus (genotype 1a, isolate H) proteome
  • Hepatitis C virus (genotype 1a, isolate H) Tax ID changed from 11108 to 63746
  • NCBI GeneID and RefSeq datasets for SARS-CoV-2 created
  • Accession history files created for tracking protein and glycan accessions
  • Mouse and rat disease datasets created to add disease data in the disease section
  • Data entries for O-GlcNAc data now point to The O-GlcNAc resource instead of the dataset.
  • human_proteoform_glycosylation_sites_o_glcnac_mcw updated with new O-GlcNAc data
  • n-sequon and n-sequon type fields added to the glycosylation datasets for data validation
  • Removed predicted glycosylation data from the UniProtKB data when reported glycosylation data is present for the same data entry.
  • Updates to the GlycoMotif alignment logic, motif keywords, and publications
  • Updates to the GlyConnect data, automatic literature data, and disease data from Genomics England
  • Update to the GlyTouCan to ChEBI mapping method.
  • Addition of new glycan type and subtypes
  • Integration of GlycoTree alignment infrastructure
  • Addition of semantic names from GlycoMotif
  • Addition of ~1800 new GlyTouCan accessions
  • Addition of new sections in glycan detail pages:
    • Subsumption (related glycans through GNOme)
    • Expression (glycans expressed in cell line/tissue)
    • History (release number in which the glycan was introduced in GlyGen)
  • Addition of glycan keywords (Motif Group), Synonyms, Reducing End information to the Motif detail page.
  • Addition of citations from NCFG data for Asparagine-linked glycans
  • Addition of GNOme and SandBox references
  • Addition of Disease data from Glycosmos
  • Addition of evidence to the "Biosynthetic Enzyme" section on the glycan detail page.
  • Addition of Rhea and Reactome cross-references to glycan detail pages
  • Addition of GlyGen links in ChEBI and GlyTouCan databases.  
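
As an illustration of what the n-sequon validation fields listed above can capture, the following is a minimal Python sketch that scans a protein sequence for the canonical N-linked glycosylation sequon (Asn, any residue except Pro, then Ser or Thr). It is not GlyGen's implementation: the function names `find_n_sequons` and `site_has_sequon` are hypothetical, and rarer sequon types are deliberately ignored.

```python
import re

# Canonical N-linked sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_n_sequons(sequence):
    """Return (1-based Asn position, sequon string) pairs found in the sequence."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence.upper())]


def site_has_sequon(sequence, position):
    """Return True if the 1-based position is the Asn of a canonical sequon."""
    return any(pos == position for pos, _ in find_n_sequons(sequence))
```

For example, `find_n_sequons("MNGSANPTNNTS")` returns `[(2, "NGS"), (9, "NNT"), (10, "NTS")]`; the `NPT` at position 6 is skipped because proline occupies the middle position. A reported N-glycosylation site whose position fails `site_has_sequon` would be a candidate for manual review under this simplified rule.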

Version 1.5.36

Main article: Data release notes/1.5.36

Related API version: 1.5.43

This version was released on July 20th, 2020

  1. Added O-GlcNAc data extracted from the literature by Stephanie Olivier’s group (GLYDS000518)
  2. Added germline and somatic variation data that affect glycosites (loss of a glycosylation site and gain of a glycosylation sequon) to the mutation section
  3. Added the literature-extracted glycosite for the SARS-CoV1 M protein
  4. Added glycosylation subtypes to the *_proteoform_glycosylation_sites_uniprotkb.csv files
  5. Added species annotation via subsumption (for human and mouse). Rat and HCV to follow in the next release.
  6. Added updated GlyConnect data (additional O-GlcNAc sites).
  7. Added glycosylation sites through text mining (first iteration).

Version 1.5.18

Main article: Data release notes/1.5.18

Related API version: 1.5.26

This version was released on April 15th, 2020

  1. Added UniProtKB Gene synonyms (search and details)
  2. Added RefSeq Gene names and synonyms
  3. Added RefSeq Protein synonyms
  4. Added UniProtKB Protein synonyms
  5. Updated the FASTA headers of the protein sequences, which now resemble the FASTA headers of UniProtKB sequences
  6. Updated BioMuta data with the addition of comments that show which filters were passed
  7. Created new datasets: mutation literature mining dataset, dbSNP somatic and germline mutation datasets.
  8. Added GlyGen to Pharos Xref in the protein detail cross-reference section
  9. Added HCV 1a and 1b, SARS-CoV1 and 2 proteomes
  10. Added the MIM disease name where DO names were not available
  11. Updated glycan species annotations.
  12. Added Human, Mouse, Rat glycosylation data from GlyConnect.
  13. Added HCV1a glycosylation data from 1 publication.
  14. Added human glycosylation data from 2 publications.
  15. Added glycan x-refs to MatrixDB, GlycoEpitope
  16. Added MatrixDB protein-GAG interaction data (at GlyGen Data).
  17. Included GlyTouCan-composition accessions.
  18. Added protein x-refs to GlycoProtDB
  19. Retired GlycO and GlycomeDB xrefs.
  20. Added SNFG glycans (at GlyGen Data)
  21. Added an animated GIF and an .mp4 video of a 3D model of the SARS-CoV-2 spike glycoprotein (at GlyGen Data).
  22. Updated the synthesized glycan list from Dr. Boons' group.
  23. Included 2 additional FAQs: "How do I find a GlyTouCan Boons accession for my glycan composition?" and "How can I convert my glycan sequence to different formats (e.g., IUPAC, WURCS, GlycoCT, LinearCode, etc.)?"

Version 1.0

Main article: Data release notes/1.0

This version was released on Nov 22, 2019

  • Isoform Alignment.
  • Homolog Alignment.
  • New use case added to the quick search.
  • Composition Search.
  • GO ID search.
  • PMID search.
  • Batch search on advanced protein and glycan search page.
  • Multi-select option for amino acids on advanced glycoprotein search.
  • Multi-select option for organisms on advanced glycan search.
  • Integrated subsumption browser.

External links

https://data.glygen.org/history