BAM
A BAM file stores sequence alignment and gene annotation data in binary format. The BAM produced by SAW count uses custom tags in the BAM optional fields to record read coordinates, CID, and MID information. Annotation information is also added to the BAM as tags.
Tags
Custom tags are described in BAM custom tags.
Cx:i
x coordinate of the Coordinate ID (CID).
Cy:i
y coordinate of the Coordinate ID (CID).
UR:Z
The hexadecimal representation of uncorrected binary-encoded MID.
XF:i
Mapping region on the reference genome. Valid values: 0=EXONIC, 1=INTRONIC, 2=INTERGENIC, 3=rRNA.
GI:Z
Annotated gene ID.
GE:Z
Annotated gene name.
GS:Z
‘+’ or ‘-’, indicating forward/reverse strand respectively.
UB:Z
The hexadecimal representation of count corrected binary-encoded MID.
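The tags above can be read from the text (SAM) form of a record, for example after converting the BAM with a tool such as samtools. The following is a minimal sketch, not the SAW implementation: the record in `example_line` is made up for illustration, and `parse_optional_tags` and `XF_REGIONS` are hypothetical helper names. It relies only on the standard SAM layout, in which optional fields follow the 11 mandatory columns as `TAG:TYPE:VALUE`.

```python
# Meaning of the XF:i values, as listed in the tag descriptions above.
XF_REGIONS = {0: "EXONIC", 1: "INTRONIC", 2: "INTERGENIC", 3: "rRNA"}


def parse_optional_tags(sam_line):
    """Return the optional TAG:TYPE:VALUE fields of one SAM text line as a dict."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # The first 11 columns of a SAM record are mandatory; optional tags follow.
    for field in fields[11:]:
        tag, type_code, value = field.split(":", 2)
        # Type 'i' is an integer; every other type is kept as a string here.
        tags[tag] = int(value) if type_code == "i" else value
    return tags


# Hypothetical record, invented for illustration only (not real SAW output).
example_line = "\t".join([
    "read1", "0", "chr1", "1000", "255", "10M", "*", "0", "0",
    "ACGTACGTAC", "FFFFFFFFFF",
    "Cx:i:1200", "Cy:i:3400", "UR:Z:1A2B3", "XF:i:0",
    "GI:Z:GENE0001", "GE:Z:GeneA", "GS:Z:+", "UB:Z:1A2B3",
])

tags = parse_optional_tags(example_line)
coordinate = (tags["Cx"], tags["Cy"])   # CID position on the chip
region = XF_REGIONS[tags["XF"]]         # decode the XF integer code
mid_as_int = int(tags["UB"], 16)        # hexadecimal string to integer
print(coordinate, region, tags["GE"], tags["GS"], mid_as_int)
```

The sketch stops at converting the hexadecimal MID string to an integer; how that integer maps back to bases is not described here, so no decoding is attempted.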
Example of the raw BAM:
Example of the annotated BAM:
Statistics for alignment
After alignment of FASTQ reads, a statistics file recording details and output information is saved to /STEREO_ANALYSIS_WORKFLOW/ALIGNMENT/<lane>.CIDMap.stat.
Number of CID in chip mask
Number of CIDs in the chip mask file
Number of unique CID in FASTQ
Number of unique CIDs in FASTQs
Number of total reads
Number of total reads in FASTQs
Q10 in CID %
Ratio of Q10 CID bases
Q20 in CID %
Ratio of Q20 CID bases
Q30 in CID %
Ratio of Q30 CID bases
Number of mapped CID
Number of reads mapped to CID
% of mapped CID
Ratio of reads mapped to CID
Number of exactly mapped CID
Number of reads exactly mapped to CID
% of exactly mapped CID
Ratio of reads exactly mapped to CID
Number of CID with mismatch
Number of reads mapped to CID with mismatch
% of CID with mismatch
Ratio of reads mapped to CID with mismatch
Q10 in RNA %
Ratio of Q10 RNA bases
Q20 in RNA %
Ratio of Q20 RNA bases
Q30 in RNA %
Ratio of Q30 RNA bases
Number of reads with polyA
Number of reads with polyA sequence
% of reads with polyA
Ratio of reads with polyA sequence
Number of short reads (trim polyA)
Number of short reads after trimming polyA sequence
% of short reads (trim polyA)
Ratio of short reads after trimming polyA sequence
Number of reads with adapter
Number of reads with adapter sequence
% of reads with adapter
Ratio of reads with adapter sequence
Number of short reads (trim adapter)
Number of short reads after trimming adapter sequence
% of short reads (trim adapter)
Ratio of short reads after trimming adapter sequence
Number of reads filtered with DNB
Number of reads with DNB sequence
% of reads filtered with DNB
Ratio of reads with DNB sequence
Q10 in clean RNA %
Ratio of Q10 RNA bases after filtering
Q20 in clean RNA %
Ratio of Q20 RNA bases after filtering
Q30 in clean RNA %
Ratio of Q30 RNA bases after filtering
Q10 in MID %
Ratio of Q10 MID bases
Q20 in MID %
Ratio of Q20 MID bases
Q30 in MID %
Ratio of Q30 MID bases
Number of low quality MID
Number of MID with low quality bases
% of low quality MID
Ratio of MID with low quality bases
Number of MID with N
Number of MID with N base
% of MID with N
Ratio of MID with N base
Number of MID in specific sequence
Number of MID mapped to specific sequences
% of MID with specific sequence
Ratio of MID mapped to specific sequences
Q10 in clean MID %
Ratio of Q10 MID bases after filtering
Q20 in clean MID %
Ratio of Q20 MID bases after filtering
Q30 in clean MID %
Ratio of Q30 MID bases after filtering
Number of exact MID
Number of reads exactly mapped to MID
% of exact MID
Ratio of reads exactly mapped to MID
Number of inexact MID
Number of reads inexactly mapped to MID
% of inexact MID
Ratio of reads inexactly mapped to MID
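The Q10, Q20, and Q30 rows above can be illustrated with a short calculation. This sketch assumes that "Ratio of Q30 bases" means the percentage of bases whose Phred quality is at least 30, and that quality strings use the common Phred+33 encoding. Both are assumptions rather than statements about how SAW computes these values, and `q_fractions` is a hypothetical helper name.

```python
def q_fractions(quality_strings, thresholds=(10, 20, 30), offset=33):
    """Percentage of bases with Phred quality >= each threshold.

    Assumes Phred+offset encoded quality strings (offset 33 by default).
    """
    total = 0
    passing = {t: 0 for t in thresholds}
    for qual in quality_strings:
        for ch in qual:
            score = ord(ch) - offset  # decode one quality character
            total += 1
            for t in thresholds:
                if score >= t:
                    passing[t] += 1
    if total == 0:
        # No bases at all: report 0.0 rather than dividing by zero.
        return {t: 0.0 for t in thresholds}
    return {t: 100.0 * passing[t] / total for t in thresholds}


# Hypothetical quality strings: 'I' is Q40, '5' is Q20, '+' is Q10, '#' is Q2.
result = q_fractions(["II5+", "I#"])
for threshold in sorted(result):
    print(f"Q{threshold}: {result[threshold]:.2f}%")
```

The same function could be applied separately to the CID, MID, and RNA portions of each read to obtain the per-segment rows, provided the segment boundaries are known.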
Statistics for annotation
After annotation of reads, a statistics file recording details and output information is saved to /STEREO_ANALYSIS_WORKFLOW/ANNOTATION/*.bam.summary.stat.
Number of total reads
Number of total reads aligned to the genome
Number of reads to be annotated
Number of reads that will be annotated with GTF/GFF annotation database
% of reads to be annotated
% of reads that will be annotated with GTF/GFF annotation database
Number of uniquely mapped reads to be annotated
Number of reads to be annotated which are uniquely mapped to genome
% of uniquely mapped reads to be annotated
Ratio of reads to be annotated which are uniquely mapped to genome
Number of multi-mapped reads to be annotated
Number of reads to be annotated which are multi-mapped to genome
% of multi-mapped reads to be annotated
Ratio of reads to be annotated which are multi-mapped to genome
Number of multi-mapped reads
Number of reads multi-mapped to genome
Number of reads mapped to transcriptome
Number of reads mapped to transcriptome, including exonic and intronic regions.
% of reads mapped to transcriptome
% of reads mapped to transcriptome, including exonic and intronic regions.
Number of unique captures (on CID, gene and MID)
Number of unique captures for reads, based on CID, gene and MID information
% of unique captures (on CID, gene and MID)
% of unique captures for reads, based on CID, gene and MID information
Number of duplicated reads
Number of duplicated captures for reads, based on CID, gene and MID information
% of duplicated reads
% of duplicated captures for reads, based on CID, gene and MID information
Number of reads to be annotated
Number of reads that will be annotated with GTF/GFF annotation database
Number of reads mapped to exonic regions
Number of reads mapped to exonic regions
% of reads mapped to exonic regions
% of reads mapped to exonic regions
Number of reads mapped to intronic regions
Number of reads mapped to intronic regions
% of reads mapped to intronic regions
% of reads mapped to intronic regions
Number of reads mapped to intergenic regions
Number of reads mapped to intergenic regions
% of reads mapped to intergenic regions
% of reads mapped to intergenic regions
Number of reads mapped antisense to gene
Number of reads mapped antisense to gene
% of reads mapped antisense to gene
% of reads mapped antisense to gene
Number of reads mapped to rRNA
Number of reads mapped to rRNA regions
Number of rRNA reads in uniquely mapped
Number of uniquely mapped reads mapped to rRNA regions
% of rRNA reads in uniquely mapped
% of uniquely mapped reads mapped to rRNA regions
Number of rRNA reads in multi-mapped
Number of multi-mapped reads mapped to rRNA regions
% of rRNA reads in multi-mapped reads
% of multi-mapped reads mapped to rRNA regions
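The unique-capture and duplicated-read rows above are both defined on the combination of CID, gene, and MID. One way to picture the relationship is the sketch below. It assumes that a unique capture is a distinct (CID, gene, MID) combination, that every further read with the same combination counts as duplicated, and that percentages are taken over the reads supplied. These are illustrative assumptions, not the documented SAW computation, and `capture_stats` is a hypothetical helper name.

```python
from collections import Counter


def capture_stats(reads):
    """Summarize unique and duplicated captures.

    reads: iterable of (cid, gene, mid) tuples, one per annotated read.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    unique = len(counts)          # distinct (CID, gene, MID) combinations
    duplicated = total - unique   # reads beyond the first of each combination
    pct = (lambda n: 100.0 * n / total) if total else (lambda n: 0.0)
    return {
        "total": total,
        "unique": unique,
        "duplicated": duplicated,
        "pct_unique": pct(unique),
        "pct_duplicated": pct(duplicated),
    }


# Hypothetical reads: CID given as (Cx, Cy), gene name, hexadecimal MID.
reads = [
    ((10, 20), "GeneA", "1A2B"),
    ((10, 20), "GeneA", "1A2B"),  # same CID, gene and MID: a duplicate
    ((10, 20), "GeneA", "FFFF"),  # different MID: a separate capture
    ((11, 20), "GeneB", "1A2B"),  # different CID and gene
]
stats = capture_stats(reads)
print(stats)
```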