Batch retrieval tool: Difference between revisions

From GlyGen Wiki
(10 intermediate revisions by 2 users not shown)
Line 1: Line 1:
GlyGen is developing a new tool in 2.9 (release at the end of August 2025) that allows enriching tables of proteins/genes with additional information derived from GlyGen. The tool will map user provided protein identifiers (UniProt Accession, Protein Names, RefSeq ID, and Gene symbol) with canonical protein accession and add user requested information in the form of additional columns to these identifiers. The generated table can be downloaded as an Excel like file and can be used to merge the additional information into an existing data table. As an example, this can be used to enrich protein tables generated by mass spec software tools with additional information to help interpret and preprocess the data. e.g., by adding columns that identify the location of the protein (subcellular, membrane-bound) or the functional classification of proteins (kinases, transporters ...).
GlyGen is developing a new tool in 2.9 (release at the end of August 2025) that allows enriching tables of proteins/genes with additional information derived from GlyGen. The tool will map user provided protein identifiers (UniProtKB Accession, Protein Names, RefSeq ID, and Gene symbol) with canonical protein accession and add user requested information in the form of additional columns to these identifiers. The generated table can be downloaded as an Excel like file and can be used to merge the additional information into an existing data table. As an example, this can be used to enrich protein tables generated by mass spec software tools with additional information to help interpret and preprocess the data. e.g., by adding columns that identify the location of the protein (subcellular, membrane-bound) or the functional classification of proteins (kinases, transporters ...).


Examples:
Please see annotation examples below:


1. Tool input
1. Annotate UniProtKB Accession list with GO Name -> membrane.


<div style=display:inline-grid>
<div style=display:inline-grid>
{| class="wikitable"
{| class="wikitable"
|+ Input Table
|+ Input List
|-
|-
! UniProt Accession
! UniProtKB Accession
|-
|-
| Q8NE71-1
| Q8NE71
|-
|-
| O60220-1
| O60220-1
Line 17: Line 17:
| Q9NX14-1
| Q9NX14-1
|-
|-
| Q14699-1
| Q14699
|-
|-
| O75832-1
| Q86UP2-1
|-
|-
| K7GQZ4-1
| Q96EK5-1
|-
|-
| P55058-1
| P55058-1
|-
| XYZXX-1
|}
|}
</div>
</div>
Line 35: Line 37:
|+ Output Table
|+ Output Table
|-
|-
! UniProt Accession
! UniProtKB Accession
! UniProtKB Mapping
! GO Name
! GO Name
|-
|-
| Q8NE71-1
| Q8NE71-1
| Yes
| membrane
| membrane
|-
|-
| O60220-1
| O60220-1
| Yes
| membrane
| membrane
|-
|-
| Q9NX14-1
| Q9NX14-1
| Yes
| membrane
| membrane
|-
|-
| Q14699-1
| Q14699-1
| Yes
| membrane
| membrane
|-
|-
| O75832-1
| Q86UP2-1
| Yes
| membrane
| membrane
|-
| Q96EK5-1
| Yes
|
|-
| P55058-1
| Yes
|
|-
| XYZXX-1
| NOT FOUND
|
|}
</div>
2. Annotate UniProtKB Accession list with GO ID -> GO:0016020.
<div style=display:inline-grid>
{| class="wikitable"
|+ Input List
|-
! UniProtKB Accession
|-
| Q86UP2-1
|-
| P08575-3
|-
| Q13308-1
|-
| O95196-1
|-
| Q8NC56-1
|-
| K7GQZ4-1
|-
| P55058-1
|}
</div>
<div style="display:inline-grid">
{| style="vertical-align:center"
| ⇒
|}
</div>
<div style=display:inline-grid>
{| class="wikitable"
|+ Output Table
|-
! UniProtKB Accession
! UniProtKB Mapping
! GO ID
|-
| Q86UP2-1
| Yes
| GO:0016020
|-
| P08575-3
| Yes
| GO:0016020
|-
| Q13308-1
| Yes
| GO:0016020
|-
| O95196-1
| Yes
| GO:0016020
|-
| Q8NC56-1
| Yes
| GO:0016020
|-
|-
| K7GQZ4-1
| K7GQZ4-1
|  
| Yes
|
|-
|-
| P55058-1
| P55058-1
|  
| Yes
|
|}
|}
</div>
</div>

Latest revision as of 22:22, 9 May 2025

GlyGen is developing a new tool for version 2.9 (release planned for the end of August 2025) that enriches tables of proteins/genes with additional information derived from GlyGen. The tool will map user-provided protein identifiers (UniProtKB Accession, Protein Name, RefSeq ID, and Gene Symbol) to canonical protein accessions and add user-requested information to these identifiers as additional columns. The generated table can be downloaded as an Excel-like file and used to merge the additional information into an existing data table. For example, protein tables generated by mass spectrometry software tools can be enriched with additional information that helps interpret and preprocess the data, such as columns that identify the location of the protein (subcellular, membrane-bound) or the functional classification of proteins (kinases, transporters, ...).

Please see annotation examples below:

1. Annotate UniProtKB Accession list with GO Name -> membrane.

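{| class="wikitable"
|+ Input List
|-
! UniProtKB Accession
|-
| Q8NE71
|-
| O60220-1
|-
| Q9NX14-1
|-
| Q14699
|-
| Q86UP2-1
|-
| Q96EK5-1
|-
| P55058-1
|-
| XYZXX-1
|}
{| class="wikitable"
|+ Output Table
|-
! UniProtKB Accession
! UniProtKB Mapping
! GO Name
|-
| Q8NE71-1
| Yes
| membrane
|-
| O60220-1
| Yes
| membrane
|-
| Q9NX14-1
| Yes
| membrane
|-
| Q14699-1
| Yes
| membrane
|-
| Q86UP2-1
| Yes
| membrane
|-
| Q96EK5-1
| Yes
|
|-
| P55058-1
| Yes
|
|-
| XYZXX-1
| NOT FOUND
|
|}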
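
Example 1 can be read as a two-step lookup: each input accession is resolved to a canonical accession, and the requested annotation is then filled in where it applies. The following is a minimal Python sketch of that logic, not the tool's actual implementation. The `CANONICAL` and `MEMBRANE` tables and the `annotate` function are hypothetical stand-ins for GlyGen data, populated only with the values shown in example 1, and resolving an input by stripping its isoform suffix is an assumption made for illustration.

```python
# Hypothetical lookup tables standing in for GlyGen data.
# Values are copied from example 1; the real tool queries GlyGen instead.
CANONICAL = {
    "Q8NE71": "Q8NE71-1",
    "O60220": "O60220-1",
    "Q9NX14": "Q9NX14-1",
    "Q14699": "Q14699-1",
    "Q86UP2": "Q86UP2-1",
    "Q96EK5": "Q96EK5-1",
    "P55058": "P55058-1",
}
# Canonical accessions that carry the GO Name "membrane" in example 1.
MEMBRANE = {"Q8NE71-1", "O60220-1", "Q9NX14-1", "Q14699-1", "Q86UP2-1"}


def annotate(accessions, go_name="membrane"):
    """Return (accession, mapping status, GO Name) rows for the input list."""
    rows = []
    for acc in accessions:
        # Assumption: an input is resolved by its base accession,
        # i.e. the part before any isoform suffix such as "-1".
        base = acc.split("-", 1)[0]
        canonical = CANONICAL.get(base)
        if canonical is None:
            # Unknown identifiers are kept as entered and flagged.
            rows.append((acc, "NOT FOUND", ""))
        else:
            # The annotation cell stays empty when the term does not apply.
            value = go_name if canonical in MEMBRANE else ""
            rows.append((canonical, "Yes", value))
    return rows


rows = annotate(["Q8NE71", "O60220-1", "Q9NX14-1", "Q14699",
                 "Q86UP2-1", "Q96EK5-1", "P55058-1", "XYZXX-1"])
for row in rows:
    print("\t".join(row))
```

Rows that map successfully but lack the annotation keep an empty cell, which matches the rows for Q96EK5-1 and P55058-1 in the output table.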

2. Annotate UniProtKB Accession list with GO ID -> GO:0016020.

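{| class="wikitable"
|+ Input List
|-
! UniProtKB Accession
|-
| Q86UP2-1
|-
| P08575-3
|-
| Q13308-1
|-
| O95196-1
|-
| Q8NC56-1
|-
| K7GQZ4-1
|-
| P55058-1
|}
{| class="wikitable"
|+ Output Table
|-
! UniProtKB Accession
! UniProtKB Mapping
! GO ID
|-
| Q86UP2-1
| Yes
| GO:0016020
|-
| P08575-3
| Yes
| GO:0016020
|-
| Q13308-1
| Yes
| GO:0016020
|-
| O95196-1
| Yes
| GO:0016020
|-
| Q8NC56-1
| Yes
| GO:0016020
|-
| K7GQZ4-1
| Yes
|
|-
| P55058-1
| Yes
|
|}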
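
Once downloaded, an output table like the one in example 2 can be joined back onto an existing data table by accession. The sketch below shows one way to do that with Python's standard `csv` module. It assumes the download has been saved as CSV (the exact export format is not specified here). The `merge_annotation` function, the `accession` and `intensity` column names, and the intensity values are hypothetical and made up for illustration.

```python
import csv
import io

# Hypothetical mass spec result table; the intensity values are invented.
EXISTING = """accession,intensity
Q86UP2-1,1200
K7GQZ4-1,340
"""

# Annotation table shaped like the example 2 output, assumed saved as CSV.
ANNOTATION = """UniProtKB Accession,UniProtKB Mapping,GO ID
Q86UP2-1,Yes,GO:0016020
K7GQZ4-1,Yes,
"""


def merge_annotation(existing_csv, annotation_csv, key="accession"):
    """Left-join the annotation columns onto the existing table by accession."""
    annotation = {
        row["UniProtKB Accession"]: row
        for row in csv.DictReader(io.StringIO(annotation_csv))
    }
    merged = []
    for row in csv.DictReader(io.StringIO(existing_csv)):
        # Rows without a matching annotation get empty cells.
        extra = annotation.get(row[key], {})
        row["UniProtKB Mapping"] = extra.get("UniProtKB Mapping", "")
        row["GO ID"] = extra.get("GO ID", "")
        merged.append(row)
    return merged


merged = merge_annotation(EXISTING, ANNOTATION)
for row in merged:
    print(row)
```

A left join keeps every row of the existing table, so proteins without an annotation remain in the data with empty annotation cells.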