Extended GAL File: Difference between revisions

From GlyGen Wiki

Revision as of 18:08, 13 December 2021

The extended GAL file is a file format that extends the GAL file with the additional information required to exchange glycan array slide layout data and metadata. It remains compatible with the standard GAL file, so the same file can be used in scanner software as well as for data exchange.

Motivation

For gene arrays, the 5 columns of the GAL file are sufficient to describe the location and identity of a feature/spot; for glycan arrays they are not. The molecule printed on a spot cannot be described by an identifier or name alone, since it consists of multiple parts: for example a linker and a glycan or a protein, up to multiple linkers and multiple glycans. A single spot can also carry a mixture of glycoconjugates. In addition, to use the file for exchanging the minimum reporting requirements, the dependent metadata for each spot must be stored as well. To compensate for these shortcomings of the GAL file, an extended version was developed that is compatible with the original file and its use in an array experiment.

Format

The extended GAL file follows the format of the GAL file but adds columns and rows to the header and record sections.

Record section

The ID column of the file holds the feature IDs from the array data repository. If multiple features have been printed on the same spot, their IDs are separated by "||". A repository feature ID describes a molecule completely, including all its parts (glycans, linker, protein, etc.).
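As a minimal sketch of the "||" rule, the ID cell of a record could be split into individual feature IDs as follows. The function name and the sample IDs (F0001, F0002) are hypothetical and only serve as illustration; the actual ID format is defined by the array data repository.

```python
def split_feature_ids(id_cell: str) -> list[str]:
    """Split the ID column of one record into individual feature IDs."""
    # Features printed on the same spot are separated by "||".
    # Surrounding whitespace and empty fragments are dropped (an assumption).
    return [part.strip() for part in id_cell.split("||") if part.strip()]


# Hypothetical IDs, for illustration only.
print(split_feature_ids("F0001||F0002"))
print(split_feature_ids("F0001"))
```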

Beyond the common columns of the GAL file, several columns have been added to the extended GAL file to capture glycan information and metadata. These columns are added after the ID column of the GAL file, in the order described below.


Group (optional)

Identification number that groups features in this array. The average intensity value is calculated from the intensity values of the features that share the same group number. For example, this is used to distinguish features whose printing solutions were prepared on different dates, or which were printed on different dates (such as reprinting).
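The averaging by group number can be sketched as below. This is an illustration only: it assumes the records are already available as (group, intensity) pairs, and the function name is hypothetical. How intensities are obtained from the scanner output is outside the scope of this sketch.

```python
from collections import defaultdict
from statistics import mean


def average_by_group(records):
    """Map each group number to the mean intensity of its features.

    records: iterable of (group, intensity) pairs (assumed input shape).
    """
    by_group = defaultdict(list)
    for group, intensity in records:
        by_group[group].append(intensity)
    return {group: mean(values) for group, values in by_group.items()}


# Two features share group "1", one feature is in group "2".
print(average_by_group([("1", 100.0), ("1", 200.0), ("2", 50.0)]))
```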

Concentration

This is the concentration of each feature on the spot. If the concentration column is empty, the default concentration value is used. #N/A can be used to indicate that the concentration value is "not available".

Available units of the concentration are:

  • fmol/spot
  • ul/spot
  • mM
  • uM
  • ug/ml
  • mg/ml
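A reader or validator could check a concentration unit against the list above. The sketch below assumes the unit is already available as a separate string and that it is compared exactly as spelled in the list (case-sensitive); both points are assumptions, not part of the format description.

```python
# Units exactly as listed above.
CONCENTRATION_UNITS = {"fmol/spot", "ul/spot", "mM", "uM", "ug/ml", "mg/ml"}


def is_known_concentration_unit(unit: str) -> bool:
    """Return True if the unit is one of the listed concentration units."""
    # Exact, case-sensitive match after trimming whitespace (an assumption).
    return unit.strip() in CONCENTRATION_UNITS


print(is_known_concentration_unit("uM"))
print(is_known_concentration_unit("M"))
```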

Ratio (optional)

If more than one feature is printed on the spot, this column specifies the ratio between the features (e.g., 1:0.1 or 2:2:1). The order of the ratio values corresponds to the order of the features in the ID column. If more than one feature is printed and the ratio column is empty, the ratio is considered unknown. If only one feature is printed, an empty column is equivalent to 100%.
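The ratio rules can be sketched as a small parser. The function name, the use of None for an unknown ratio, and the representation of 100% as [1.0] are choices made for this illustration. Rejecting a ratio whose number of parts differs from the number of features is also an assumption.

```python
def parse_ratio(ratio_cell: str, feature_count: int):
    """Return one ratio part per feature, or None if the ratio is unknown."""
    cell = ratio_cell.strip()
    if not cell:
        # Empty column: 100% for a single feature, unknown for several.
        return [1.0] if feature_count == 1 else None
    parts = [float(p) for p in cell.split(":")]
    if len(parts) != feature_count:
        # The parts follow the order of the features in the ID column,
        # so the counts are expected to match (an assumption).
        raise ValueError("ratio parts do not match the number of features")
    return parts


print(parse_ratio("1:0.1", 2))
print(parse_ratio("", 3))
```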

Buffer

This is the buffer composition of the solution used for printing. If this is empty, the default buffer composition information will be used. #N/A can be used to indicate that the buffer composition information is “not available”.

Volume (optional)

The volume of solution deposited per spot per drop (if known). The default volume will be used for the empty volume column. #N/A can be used for representing “unknown” or “not available” volume. Available units of the volume are uL, nL and pL.

Dispenses

The number of drops, or the number of times the pin makes contact, per spot (drops / contacts / strikes), for example 3 drops or 1 contact. If the dispenses column is empty, the default value will be used. Users can use #N/A to indicate that the dispense information is “not available”.

Carrier (optional)

Name of the carrier reagent used in the formulation. If a default carrier is defined in the header, an empty carrier column is treated as the default. #N/A can be used to indicate that the carrier information is “not available”.
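Concentration, buffer, volume, dispenses and carrier share the same pattern: an empty cell falls back to the default and #N/A marks the value as not available. A minimal sketch of that resolution is shown below. The function names are hypothetical, the default is assumed to have been read from the header already, and "PBS" and "Tris" are example values only.

```python
from typing import Optional

NOT_AVAILABLE = "#N/A"


def resolve_cell(cell: str, default: Optional[str] = None) -> Optional[str]:
    """Return the effective value of a record cell."""
    value = cell.strip()
    if value == "":
        # Empty cell: use the default, which is None if no default is defined.
        return default
    # Explicit value, including "#N/A", which is kept as-is.
    return value


def is_not_available(value: Optional[str]) -> bool:
    """True if the resolved value is the explicit "not available" marker."""
    return value == NOT_AVAILABLE


print(resolve_cell("", default="PBS"))
print(is_not_available(resolve_cell("#N/A", default="PBS")))
```

Keeping #N/A distinct from an empty cell matters here: an empty cell inherits the default, while #N/A states that the information is missing even when a default exists.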

Method

Reference

Comment (optional)

Identification number of features in this array. The intensity values of the features that have the same group number are used for calculating the average intensity value.

PrintingFlags