Batch retrieval tool

From GlyGen Wiki
Revision as of 16:12, 9 May 2025 by Sujeetk (talk | contribs)

GlyGen is developing a new tool for the 2.9 release (scheduled for the end of August 2025) that enriches tables of proteins or genes with additional information derived from GlyGen. The tool will map user-provided protein identifiers (UniProtKB accession, protein name, RefSeq ID, and gene symbol) to canonical protein accessions and add the user-requested information as additional columns alongside these identifiers. The generated table can be downloaded as an Excel-like file and used to merge the additional information into an existing data table. For example, the tool can enrich protein tables generated by mass spectrometry software with information that helps interpret and preprocess the data, such as columns that identify the location of a protein (subcellular, membrane-bound) or its functional classification (kinases, transporters, ...).
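The merge step described above could look roughly like the following sketch. It is not part of the tool. It assumes the downloaded table has been saved as CSV, and the column names `accession` and `intensity`, the sample values, and the helper `merge_tables` are all hypothetical.

```python
import csv
import io

# Hypothetical existing data table, e.g. from mass spectrometry software.
# The column names and intensity values are made up for illustration.
EXISTING = """accession,intensity
Q8NE71-1,1520.5
K7GQZ4-1,310.0
"""

# Hypothetical download from the batch retrieval tool, assumed saved as CSV.
ENRICHED = """UniProtKB Accession,GO Name
Q8NE71-1,membrane
K7GQZ4-1,
"""


def merge_tables(existing_csv, enriched_csv, existing_key, enriched_key):
    """Left-join the enriched columns onto the existing rows by accession."""
    reader = csv.DictReader(io.StringIO(enriched_csv))
    # Every enriched column except the key becomes a new column.
    extra_columns = [c for c in reader.fieldnames if c != enriched_key]
    by_accession = {row[enriched_key]: row for row in reader}

    merged = []
    for row in csv.DictReader(io.StringIO(existing_csv)):
        match = by_accession.get(row[existing_key], {})
        for column in extra_columns:
            # Rows without a match keep an empty cell.
            row[column] = match.get(column, "")
        merged.append(row)
    return merged


merged = merge_tables(EXISTING, ENRICHED, "accession", "UniProtKB Accession")
for row in merged:
    print(row)
```

A left join keeps every row of the existing table, so identifiers that receive no annotation remain in the result with an empty cell.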

Please see examples below:

1. Annotate a UniProtKB Accession list with the GO Name "membrane".

Input Table
UniProtKB Accession
Q8NE71-1
O60220-1
Q9NX14-1
Q14699-1
O75832-1
K7GQZ4-1
P55058-1

Output Table
UniProtKB Accession    GO Name
Q8NE71-1               membrane
O60220-1               membrane
Q9NX14-1               membrane
Q14699-1               membrane
O75832-1               membrane
K7GQZ4-1
P55058-1
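The annotation in this example can be sketched as a simple lookup. This is an illustration and not the tool's implementation. The dictionary `GO_NAMES` is a hard-coded stand-in that mirrors the output table, recording only whether "membrane" applies to each accession, and the function `annotate` is a hypothetical name.

```python
# Toy lookup mirroring the example output; the real tool derives this
# information from GlyGen. Only the "membrane" term is tracked here, so an
# empty set means "membrane was not reported", not "no GO names at all".
GO_NAMES = {
    "Q8NE71-1": {"membrane"},
    "O60220-1": {"membrane"},
    "Q9NX14-1": {"membrane"},
    "Q14699-1": {"membrane"},
    "O75832-1": {"membrane"},
    "K7GQZ4-1": set(),
    "P55058-1": set(),
}


def annotate(accessions, go_names, term):
    """Return (accession, term or "") rows, preserving the input order."""
    rows = []
    for accession in accessions:
        # Unknown accessions are treated like accessions without the term.
        has_term = term in go_names.get(accession, set())
        rows.append((accession, term if has_term else ""))
    return rows


INPUT = ["Q8NE71-1", "O60220-1", "Q9NX14-1", "Q14699-1",
         "O75832-1", "K7GQZ4-1", "P55058-1"]

rows = annotate(INPUT, GO_NAMES, "membrane")
for accession, go_name in rows:
    print(f"{accession}\t{go_name}")
```

Accessions without the requested term stay in the output with an empty cell, so the output has the same number of rows as the input, in the same order.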

2. Annotate a UniProtKB Accession list with the GO ID GO:0016020.

Input Table
UniProtKB Accession
Q86UP2-1
P08575-3
Q13308-1
O95196-1
Q8NC56-1
K7GQZ4-1
P55058-1

Output Table
UniProtKB Accession    GO ID
Q8NE71-1               GO:0016020
O60220-1               GO:0016020
Q9NX14-1               GO:0016020
Q14699-1               GO:0016020
O75832-1               GO:0016020
K7GQZ4-1
P55058-1