realign outputs
Overview of output structure
The SAW realign pipeline runs in a directory named by --id (or by --sn in the absence of --id). Output files are organized into several folders under the outs/ directory.
The exact output files generated from the analysis depend on:
the version of SAW used
which pipeline was used, SAW count or SAW realign
whether microscope image(s) were provided as input
the specific parameters added to the analysis
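Because the exact set of files varies with these factors, it can help to list what a given run actually produced. The following is a minimal sketch, not part of SAW itself: it walks an outs/ directory and groups the files by their top-level folder. The function name group_outputs is an assumption made for illustration.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List, Union


def group_outputs(outs_dir: Union[str, Path]) -> Dict[str, List[str]]:
    """Group every file under outs_dir by its top-level folder.

    Files that sit directly in outs_dir are grouped under ".".
    Paths are returned relative to outs_dir, with forward slashes.
    """
    outs = Path(outs_dir)
    grouped: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(outs.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(outs)
        # A single-part relative path means the file is directly in outs/.
        top = rel.parts[0] if len(rel.parts) > 1 else "."
        grouped[top].append(rel.as_posix())
    return dict(grouped)
```

Calling group_outputs on the outs/ directory of a run returns a dictionary whose keys are folder names, which can then be compared against the folders described on this page.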
Spatial Gene Expression
After running SAW realign on data from the Stereo-seq T FF and Stereo-seq N FFPE kits, the following files can be found under the outs/ directory:
bam/
Files in BAM format.
annotated_bam/
BAM file after alignment and annotation.
<SN>.*.bam
Indexed BAM file containing position-sorted reads mapped to CIDs, aligned to the genome, and annotated with GTF/GFF.
<SN>.*.bam.csi
Index for <SN>.*.bam.
image/
Images generated by the automatic or manual workflows.
<SN>_<stainType>_regist.tif
The panoramic image aligned with the raw.gef matrix.
<SN>_<stainType>_tissue_cut.tif
The tissue segmentation image, based on the aligned panoramic image.
<SN>_<stainType>_mask.tif
The cell segmentation image, based on the aligned panoramic image.
<SN>_<stainType>_mask_edm_dis_<distance>.tif
The adjusted image, based on the cell segmentation image.
feature_expression/
Feature expression matrices in HDF5 format at different dimensions.
<SN>.raw.gef
Feature expression matrix covering the complete chip region. It contains only bin1 expression counts.
<SN>.tissue.gef
Feature expression matrix for the tissue-covered region. It also serves as a visualization GEF, with expression counts for bin1, 5, 10, 20, 50, 100, 150, and 200.
<SN>.cellbin.gef
Cellbin feature expression matrix that records information for each cell individually, including the centroid coordinate, boundary coordinates, gene expression, and cell area.
<SN>.adjusted.cellbin.gef
Cellbin expression matrix with expanded cell borders, based on <SN>_<stainType>_mask_edm_dis_<distance>.tif.
<SN>.merge.barcodeReadsCount.txt
A mapped CID list file with read counts for each CID, including three columns (x, y, count).
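The three-column layout described above can be read with a few lines of Python. This is a minimal sketch under stated assumptions: the page does not specify the delimiter or whether a header row exists, so the code assumes whitespace-separated integer fields and skips any line that does not fit. The function names iter_cid_counts and total_reads are illustrative, not part of SAW.

```python
from typing import Iterable, Iterator, Tuple


def iter_cid_counts(lines: Iterable[str]) -> Iterator[Tuple[int, int, int]]:
    """Yield (x, y, count) tuples from lines of a three-column text file.

    Assumption: fields are separated by whitespace (tabs or spaces).
    Blank lines and lines that are not three integers are skipped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue
        try:
            x, y, count = (int(field) for field in fields)
        except ValueError:
            # Not three integers, for example a header row.
            continue
        yield x, y, count


def total_reads(lines: Iterable[str]) -> int:
    """Sum the count column over all valid rows."""
    return sum(count for _, _, count in iter_cid_counts(lines))
```

Both functions accept any iterable of strings, so an open file handle can be passed directly and large files are processed one line at a time.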
<SN>_raw_barcode_gene_exp.txt
An annotated list file with coordinate, gene, MID, and read count information, prepared as the sampling input for sequencing saturation analysis.
analysis/
Secondary analysis files.
<SN>.bin200_1.0.h5ad
An AnnData H5AD file that records preprocessing, filtering, normalization, dimensionality reduction, clustering, and differential expression analysis, based on <SN>.tissue.gef.
This output H5AD is named in the format <SN>.<binN>_<leiden_res>.h5ad. In the file name, <SN> stands for the Stereo-seq chip serial number, <N> for the bin size, and <leiden_res> for the resolution of Leiden clustering.
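The naming scheme above can be parsed programmatically, for example to pick the right H5AD out of an analysis/ folder. The sketch below is an illustration, not a SAW utility: the regular expression is inferred from the file names listed on this page (including the cellbin and .adjusted variants), and the names H5adName and parse_h5ad_name are assumptions. CHIP01 in the usage note is a placeholder serial number.

```python
import re
from typing import NamedTuple, Optional


class H5adName(NamedTuple):
    sn: str            # Stereo-seq chip serial number
    bin_label: str     # e.g. "bin200" or "cellbin"
    leiden_res: float  # Leiden clustering resolution
    adjusted: bool     # True for the ".adjusted" cellbin variant


# Pattern inferred from the file names on this page; other SAW versions
# may name files differently.
_H5AD_PATTERN = re.compile(
    r"^(?P<sn>.+?)\.(?P<bin>bin\d+|cellbin)_(?P<res>\d+(?:\.\d+)?)"
    r"(?P<adj>\.adjusted)?\.h5ad$"
)


def parse_h5ad_name(filename: str) -> Optional[H5adName]:
    """Split an H5AD file name into its components, or return None."""
    match = _H5AD_PATTERN.match(filename)
    if match is None:
        return None
    return H5adName(
        sn=match["sn"],
        bin_label=match["bin"],
        leiden_res=float(match["res"]),
        adjusted=match["adj"] is not None,
    )
```

For instance, parse_h5ad_name("CHIP01.bin200_1.0.h5ad") yields a bin label of "bin200" and a Leiden resolution of 1.0, while a name that does not follow the scheme returns None.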
bin200_marker_features.csv
Format-integrated differential expression analysis results, using <SN>.tissue.gef at bin200.
<SN>.cellbin_1.0.h5ad
An AnnData H5AD file that records preprocessing, filtering, normalization, dimensionality reduction, clustering, and differential expression analysis, using <SN>.cellbin.gef.
cellbin_marker_features.csv
Format-integrated differential expression analysis results, using <SN>.cellbin.gef.
<SN>.cellbin_1.0.adjusted.h5ad
An AnnData H5AD file that records preprocessing, filtering, normalization, dimensionality reduction, clustering, and differential expression analysis, using <SN>.adjusted.cellbin.gef.
cellbin_adjusted_marker_features.csv
Format-integrated differential expression analysis results, using <SN>.adjusted.cellbin.gef.
<SN>.report.tar.gz
Analysis summary report of metrics and plots in HTML format.
report.html
HTML report file, included in <SN>.report.tar.gz.
visualization.tar.gz
StereoMap visualization file for presentation and manual processing.
<SN>.stereo
A manifest file in JSON format that includes experiment and pipeline information, basic analysis statistics, and references to image and spatial matrix files in the SAW output visualization file folder.
visualization.tar.gz
The compressed visualization TAR file integrates all the output results needed by StereoMap for visualization. The contents of an unpacked one are listed:
.stereo
.stereo is a manifest file in JSON format that records:
information about the task in SAW pipelines,
information of the tissue sample,
basic analysis statistics,
records of image files and expression data for StereoMap exploration.
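Since the manifest is plain JSON inside a gzip-compressed TAR, it can be inspected without unpacking the whole archive. The following is a minimal sketch using only the Python standard library. It assumes the archive contains a single member whose name ends in .stereo and makes no assumption about where in the archive that member sits or which keys the manifest holds. The function name read_stereo_manifest is illustrative.

```python
import json
import tarfile
from typing import Any, Dict


def read_stereo_manifest(archive_path: str) -> Dict[str, Any]:
    """Return the parsed JSON of the first *.stereo member in the archive.

    Raises FileNotFoundError if the archive has no such member.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            if member.isfile() and member.name.endswith(".stereo"):
                handle = archive.extractfile(member)
                if handle is None:
                    continue
                with handle:
                    # json.load accepts a binary file object.
                    return json.load(handle)
    raise FileNotFoundError(f"no .stereo manifest found in {archive_path}")
```

Printing sorted(manifest.keys()) on the returned dictionary is a quick way to see which of the categories listed above are present in a given run.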
*More details about these files can be found in other parts of Outputs.