GlycoSiteMiner FAQ
Latest revision as of 21:02, 11 September 2025
This page collects frequently asked questions about the GlyGen GlycoSiteMiner tool.
What is GlycoSiteMiner?
GlycoSiteMiner (GSM) is an automated literature-mining pipeline designed to extract experimentally verified, protein sequence–specific glycosylation sites from PubMed abstracts. It uses ML/AI algorithms to filter out false positives and improve data accuracy. Applied to over 33 million PubMed abstracts, GlycoSiteMiner has uncovered over a thousand new sequence-specific glycosylation sites that were not previously available in the GlyGen resource. For details, see "GlycoSiteMiner: an ML/AI-assisted literature mining-based pipeline for extracting glycosylation sites from PubMed abstracts".
How to identify glycosylation sites predicted by GSM on GlyGen?
To check whether a protein has predicted glycosylation sites, open its protein details page and select the Glycosylation tab from the left-hand menu. This takes you to the Glycosylation section (for example, https://glygen.org/protein/P48048-1#Glycosylation), where several sub-tabs are available. Predicted sites are listed in the Text Mining tab, along with details about the tool used to generate each prediction.
How to use the GlycoSiteMiner curation tool?
The curation tool is available only to GlyGen and GlySpace team members and to class/workshop attendees. You can log in to the tool at https://data.glygen.org/glycositeminer. Request login credentials from a GlyGen team member. Once you are logged in, you will see three files: high_confidence_sites.csv, medium_confidence_sites.csv, and low_confidence_sites.csv. Click one of the files to start curating the LLM/ML results. The same abstract can be reviewed by multiple people. This allows us to compare the reviewers' results, resolve differences, and decide which abstracts need extra attention.
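The comparison of results from multiple reviewers can be pictured with a small sketch. This is only an illustration, not how the GlyGen team processes curation results: the record fields (pmid, site, reviewer, verdict), the verdict labels, and the sample values are all hypothetical, since the actual layout of the curation files is not described here.

```python
from collections import defaultdict


def find_disagreements(reviews):
    """Return the (pmid, site) pairs on which reviewers gave different verdicts.

    `reviews` is a list of dicts with the hypothetical keys
    "pmid", "site", "reviewer" and "verdict".
    """
    verdicts = defaultdict(set)
    for row in reviews:
        key = (row["pmid"], row["site"])
        # Normalize case and whitespace so "Correct " and "correct" match.
        verdicts[key].add(row["verdict"].strip().lower())
    # A pair needs extra attention when more than one distinct verdict exists.
    return sorted(key for key, seen in verdicts.items() if len(seen) > 1)


# Made-up sample records; the identifiers are placeholders, not real data.
reviews = [
    {"pmid": "PMID_A", "site": "N123", "reviewer": "r1", "verdict": "correct"},
    {"pmid": "PMID_A", "site": "N123", "reviewer": "r2", "verdict": "Correct "},
    {"pmid": "PMID_B", "site": "S45", "reviewer": "r1", "verdict": "correct"},
    {"pmid": "PMID_B", "site": "S45", "reviewer": "r2", "verdict": "incorrect"},
]

for pmid, site in find_disagreements(reviews):
    print(f"Needs extra attention: {pmid} {site}")
```

Grouping by abstract and site, rather than by abstract alone, keeps a disagreement about one site from hiding agreement on other sites in the same abstract.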