Batch retrieval tool

From GlyGen Wiki
Revision as of 20:48, 9 May 2025 by Mazumder (talk | contribs)

GlyGen is developing a new tool for release 2.9 (planned for the end of August 2025) that enriches tables of proteins or genes with additional information derived from GlyGen. The tool will map user-provided protein identifiers (UniProtKB accession, protein name, RefSeq ID, and gene symbol) to canonical protein accessions and add the user-requested information as additional columns alongside these identifiers. The generated table can be downloaded as an Excel-like file and used to merge the additional information into an existing data table. For example, protein tables generated by mass spectrometry software can be enriched with additional information that helps interpret and preprocess the data, such as columns that identify the location of a protein (subcellular, membrane-bound) or its functional classification (kinases, transporters ...).
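The merge step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the tool: it assumes the downloaded table has been saved as CSV, the column names "Accession" and "Intensity" in the user's table are hypothetical, the intensity values are made-up placeholders, and the function name merge_annotations is invented for this sketch.

```python
import csv
import io


def merge_annotations(data_rows, annotation_rows, data_key, annotation_key):
    """Left-join annotation columns onto data rows, matching on accession."""
    # Index the annotation rows by accession for direct lookup.
    by_accession = {row[annotation_key]: row for row in annotation_rows}
    # Every annotation column except the key becomes a new column.
    extra_columns = (
        [c for c in annotation_rows[0] if c != annotation_key]
        if annotation_rows
        else []
    )
    merged = []
    for row in data_rows:
        match = by_accession.get(row[data_key], {})
        out = dict(row)
        for column in extra_columns:
            # Rows without a matching annotation get an empty cell.
            out[column] = match.get(column, "")
        merged.append(out)
    return merged


# Hypothetical user table (for example, exported from mass spec software).
# The intensity values are placeholders, not real measurements.
DATA_CSV = """Accession,Intensity
Q86UP2-1,1200
P55058-1,340
"""

# Assumed CSV export of the annotated table produced by the tool.
ANNOTATION_CSV = """UniProtKB Accession,GO Name
Q86UP2-1,membrane
P55058-1,
"""

data_rows = list(csv.DictReader(io.StringIO(DATA_CSV)))
annotation_rows = list(csv.DictReader(io.StringIO(ANNOTATION_CSV)))
merged = merge_annotations(
    data_rows, annotation_rows, "Accession", "UniProtKB Accession"
)
for row in merged:
    print(row)
```

A left join keeps every row of the user's table, so proteins without an annotation stay in the result with an empty cell instead of being dropped.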

Please see annotation examples below:

1. Annotate UniProtKB Accession list with GO Name -> membrane.

Input List
UniProtKB Accession
Q8NE71
O60220-1
Q9NX14-1
Q14699
Q86UP2-1
Q96EK5-1
P55058-1
XYZXX-1
Output Table
UniProtKB Accession  UniProtKB mapping  GO Name
Q8NE71-1             Yes                membrane
O60220-1             Yes                membrane
Q9NX14-1             Yes                membrane
Q14699-1             Yes                membrane
Q86UP2-1             Yes                membrane
Q96EK5-1             Yes
P55058-1             Yes
XYZXX-1              NO
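
The logic behind example 1 can be approximated as follows. This is a rough sketch, not the tool's implementation: the REFERENCE dictionary is a hypothetical stand-in for GlyGen's data, filled only with the values visible in the output table above, and the function name annotate and the rule of looking up the part of the accession before the hyphen are assumptions made for illustration.

```python
# Hypothetical stand-in for GlyGen's data, copied from the output table of
# example 1: base accession -> (canonical accession, GO names).
# Only the "membrane" term from the example is recorded here; any other
# GO names these proteins may have are omitted.
REFERENCE = {
    "Q8NE71": ("Q8NE71-1", {"membrane"}),
    "O60220": ("O60220-1", {"membrane"}),
    "Q9NX14": ("Q9NX14-1", {"membrane"}),
    "Q14699": ("Q14699-1", {"membrane"}),
    "Q86UP2": ("Q86UP2-1", {"membrane"}),
    "Q96EK5": ("Q96EK5-1", set()),
    "P55058": ("P55058-1", set()),
}


def annotate(accessions, go_name):
    """Map each accession to its canonical form and flag the requested GO name."""
    rows = []
    for accession in accessions:
        # Assumption: the part before the hyphen identifies the protein.
        base = accession.split("-", 1)[0]
        entry = REFERENCE.get(base)
        if entry is None:
            # Unknown identifier: keep the input and mark it as unmapped,
            # spelled "NO" as in the example table.
            rows.append((accession, "NO", ""))
            continue
        canonical, go_names = entry
        rows.append((canonical, "Yes", go_name if go_name in go_names else ""))
    return rows


INPUT = [
    "Q8NE71", "O60220-1", "Q9NX14-1", "Q14699",
    "Q86UP2-1", "Q96EK5-1", "P55058-1", "XYZXX-1",
]

rows = annotate(INPUT, "membrane")
for accession, mapped, name in rows:
    print(accession, mapped, name)
```

Accessions entered without an isoform suffix (Q8NE71, Q14699) come back with the canonical suffix, while the unrecognized XYZXX-1 is returned unchanged and flagged as unmapped.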

2. Annotate UniProtKB Accession list with GO ID -> GO:0016020.

Input List
UniProtKB Accession
Q86UP2-1
P08575-3
Q13308-1
O95196-1
Q8NC56-1
K7GQZ4-1
P55058-1
Output Table
UniProtKB Accession  GO ID
Q86UP2-1             GO:0016020
P08575-3             GO:0016020
Q13308-1             GO:0016020
O95196-1             GO:0016020
Q8NC56-1             GO:0016020
K7GQZ4-1
P55058-1