Exclusion list file


The exclusion list file provides a list of the features that were excluded from the raw data when generating the processed data, together with the reason for excluding each feature. The file is an Excel table that may consist of multiple sheets. However, one sheet has to be labeled "Technical" and another "Filtered data". Both sheets have to follow the format instructions below. Only these two sheets are read, and only their contents are loaded into the repository.

File Format

The exclusion list file is a standard Excel spreadsheet in *.xlsx format. It must contain at least two sheets labeled "Technical" and "Filtered data". These two sheets will be used to extract the exclusion information. Any other sheet will be ignored.
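The sheet requirement above can be checked before any data is extracted. The following is a minimal sketch, not the repository's actual loader: the function names are hypothetical, and it assumes that sheet labels must match exactly (including case), which the format description does not state explicitly.

```python
# Sheet labels required by the file format description.
REQUIRED_SHEETS = ("Technical", "Filtered data")


def missing_required_sheets(sheet_names):
    """Return the required sheet labels that are absent from sheet_names."""
    present = set(sheet_names)
    return [name for name in REQUIRED_SHEETS if name not in present]


def sheets_to_process(sheet_names):
    """Return only the sheets that will be read; all others are ignored.

    Assumption: labels are compared exactly, including case.
    """
    missing = missing_required_sheets(sheet_names)
    if missing:
        raise ValueError(f"exclusion list file lacks required sheets: {missing}")
    return [name for name in sheet_names if name in REQUIRED_SHEETS]


print(sheets_to_process(["Notes", "Technical", "Filtered data"]))
```

With a library such as openpyxl, the list of sheet names would typically come from `load_workbook(path).sheetnames`; the check itself only needs the names.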

Technical exclusion

Example of the exclusion list for technical reasons.

The sheet labeled "Technical" lists the features that were excluded for technical reasons. Three static columns are required: "Spot issues", "Artifact" and "Missing spot". If no feature fits one of these categories, or if no features were filtered for technical reasons, the columns can be left empty except for the header. Features excluded for one of these reasons are listed one per row in the corresponding column. Each feature is identified by the "Internal ID" given to it during feature creation. Additional columns can be added after the three static columns if other reasons apply. For each additional reason, a short description (no more than 1024 characters) should be given in the column header.

Spot issues: Signals from misprinted or misshapen spots.

Artifact: Signals caused by a defect on the slide (artifact on slide).

Missing spot: No signal because a probe was not printed (missing spots due to a printer fault).
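The column layout of the "Technical" sheet can be illustrated with a small parser. This is a sketch under stated assumptions: the sheet is already available as a list of row tuples with the header in the first row (for example from openpyxl's `iter_rows(values_only=True)`), the function name is hypothetical, and the 1024-character limit is applied only to the headers of additional columns.

```python
TECHNICAL_COLUMNS = ("Spot issues", "Artifact", "Missing spot")
MAX_DESCRIPTION_LENGTH = 1024  # limit given for additional-reason headers


def parse_technical_sheet(rows):
    """Map each column header (reason) to the Internal IDs listed below it.

    `rows` is a list of row tuples; rows[0] is the header row.
    Empty cells (None or blank strings) are skipped.
    """
    if not rows:
        raise ValueError("sheet is empty; a header row is required")

    headers = ["" if h is None else str(h).strip() for h in rows[0]]
    missing = [c for c in TECHNICAL_COLUMNS if c not in headers]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    exclusions = {}
    for index, header in enumerate(headers):
        if not header:
            continue  # assumption: columns without a header are ignored
        is_additional = header not in TECHNICAL_COLUMNS
        if is_additional and len(header) > MAX_DESCRIPTION_LENGTH:
            raise ValueError("reason description exceeds 1024 characters")
        ids = []
        for row in rows[1:]:
            cell = row[index] if index < len(row) else None
            if cell is not None and str(cell).strip():
                ids.append(str(cell).strip())
        exclusions[header] = ids
    return exclusions


# Hypothetical sheet content: three static columns plus one custom reason.
example_rows = [
    ("Spot issues", "Artifact", "Missing spot", "Dust on slide"),
    ("F1", "F3", None, "F9"),
    ("F2", None, None, None),
]
print(parse_technical_sheet(example_rows))
```

Returning an empty list for a static column that has only a header reflects the rule that these columns may be left empty but must still be present.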

Filtered data

Example of the exclusion list for user defined reasons.

The sheet labeled "Filtered data" lists the features that were excluded by user decision. Three static columns are required: "Probe unqualified", "Unrelated feature" and "Lack of Signals". If no feature fits one of these categories, or if no features were filtered, the columns can be left empty except for the header. Features excluded for one of these reasons are listed one per row in the corresponding column. Each feature is identified by the "Internal ID" given to it during feature creation. Additional columns can be added after the three static columns if other reasons apply. For each additional reason, a short description (no more than 1024 characters) should be given in the column header.

Probe unqualified: Signals from a questionable probe (probe did not pass QC).

Unrelated feature: Signals from glycan probes that belong to unrelated studies.

Lack of Signals: Lack of signals from a non-arrayed area on the slide (empty).
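As an illustration of how the "Filtered data" sheet could be turned into records for loading, the sketch below flattens the columns into one record per excluded feature. The record shape `(internal_id, sheet_label, reason)` and the function name are assumptions made for this example; the repository's actual import format is not described here. The sheet is again assumed to be a list of row tuples with the header in the first row.

```python
FILTERED_COLUMNS = ("Probe unqualified", "Unrelated feature", "Lack of Signals")


def filtered_data_records(rows, sheet_label="Filtered data"):
    """Return (internal_id, sheet_label, reason) tuples, one per listed feature.

    `rows` is a list of row tuples; rows[0] holds the column headers.
    Records are produced row by row, left to right.
    """
    if not rows:
        raise ValueError("sheet is empty; a header row is required")

    headers = ["" if h is None else str(h).strip() for h in rows[0]]
    missing = [c for c in FILTERED_COLUMNS if c not in headers]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    records = []
    for row in rows[1:]:
        # zip stops at the shorter sequence, so short rows are handled.
        for header, cell in zip(headers, row):
            if not header or cell is None:
                continue
            internal_id = str(cell).strip()
            if internal_id:
                records.append((internal_id, sheet_label, header))
    return records


# Hypothetical sheet content; "Lack of Signals" has a header but no entries.
example_rows = [
    ("Probe unqualified", "Unrelated feature", "Lack of Signals"),
    ("P12", "U7", None),
    (None, "U8", None),
]
for record in filtered_data_records(example_rows):
    print(record)
```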