SAW commands
To display descriptions of the available subcommands, run saw --help (or saw -h). Check the software version with saw --version.
Snakemake is used to construct the workflow and to run the analysis.
SAW count
Count gene expression reads and generate expression matrices from the Stereo-seq chip.
Usage: saw count [Parameters] --id <ID> --sn <SN> --omics <OMICS> --kit-version <TEXT> --sequencing-type <TEXT> --reference <PATH> --image <IMG> --fastqs <PATH>
saw count -h | --help
--id <ID>
(Optional, default to None) A unique task ID ([a-zA-Z0-9_-]+), used as the output folder name and as the title of the HTML report. If this parameter is absent, --sn plays the same role.
--sn <SN>
(Required, default to None) SN (serial number) of the Stereo-seq chip.
--omics <OMICS>
(Required, default to "transcriptomics") Omics information.
--kit-version <TEXT>
--sequencing-type <TEXT>
(Required, default to None) Sequencing type of FASTQs which is recorded in the sequencing report.
--chip-mask <MASK>
(Required, default to None) Stereo-seq chip mask file.
--organism <TEXT>
(Optional, default to None) Organism type of sample, usually referring to species.
--tissue <TEXT>
(Optional, default to None) Physiological tissue of sample.
--reference <PATH>
--ref-libraries <CSV>
--fastqs <PATH>
(Required, default to None) Path(s) to the folder(s) containing all required FASTQs. If the FASTQs are stored in multiple directories, pass them as --fastqs=/path/to/directory1,/path/to/directory2,... Note that all FASTQ files under these directories will be loaded for analysis.
--microorganism-detect
(Optional, default to None) Whether to perform analysis related to microorganisms. Notice that the detection only works for FFPE assay currently.
--uniquely-mapped-only
(Optional, default to None) Only annotate on uniquely mapped reads during read annotation.
--rRNA-remove
--clean-reads-fastq
(Optional, default to None) Whether to output the Clean Reads (before RNA alignment) in FASTQ format, which have undergone CID mapping, RNA filtering, and MID filtering.
--unmapped-STAR-fastq
(Optional, default to None) Whether to output unmapped reads in FASTQ format.
--unmapped-fastq
(Optional, default to None) Whether to output unmapped reads in FASTQ format (not including "too many loci" reads from STAR).
--image <TIFF>
(Optional, default to None) TIFF image for QC (quality control), combined with expression matrix for analysis.
Naming rule for the input TIFF:
a. <SN>_<stain_type>.tif
b. <SN>_<stain_type>.tiff
c. <SN>_<stain_type>.TIF
d. <SN>_<stain_type>.TIFF
<stain_type> is one of:
a. ssDNA
b. DAPI
c. HE (referring to H&E)
d. <IF_name1>_IF, <IF_name2>_IF, ...
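The naming rule above can be checked before a run. The following is a minimal sketch, not part of SAW itself: the function name is_valid_image_name is hypothetical, the SN used in the example is made up, and the pattern accepted for an IF name (letters, digits, hyphens) is an assumption, since the manual does not restrict it.

```python
import re

# Stain types listed above; IF images use the form <IF_name>_IF.
FIXED_STAINS = ("ssDNA", "DAPI", "HE")
EXTENSIONS = (".tif", ".tiff", ".TIF", ".TIFF")


def is_valid_image_name(filename: str, sn: str) -> bool:
    """Return True if filename follows <SN>_<stain_type> plus a TIFF extension."""
    for ext in EXTENSIONS:
        if filename.endswith(ext):
            stem = filename[: -len(ext)]
            break
    else:
        return False  # none of the four accepted extensions matched

    prefix = sn + "_"
    if not stem.startswith(prefix):
        return False
    stain = stem[len(prefix):]
    if stain in FIXED_STAINS:
        return True
    # Assumption: an IF name is a non-empty run of letters, digits, or hyphens.
    return re.fullmatch(r"[A-Za-z0-9-]+_IF", stain) is not None


# Hypothetical SN, for illustration only.
print(is_valid_image_name("SS200000135TL_D1_ssDNA.tif", "SS200000135TL_D1"))
```

Checking the name up front gives a clearer error than a failed pipeline step later on.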
--image-tar <TAR>
(Optional, default to None) The compressed image .tar.gz file from StereoMap, which has already passed image QC (quality control).
--output <PATH>
(Optional, default to None) Set a specific output directory for the run.
--threads-num <NUM>
(Optional, default to 8) Allowed local cores to run the pipeline.
--memory <NUM>
(Optional, default to detected) Allowed local memory to run the pipeline.
--gpu-id <NUM>
(Optional, default to -1) Set the GPU ID according to the GPU resources in the computing environment. The default, -1, runs the pipeline on the CPU.
-h, --help
(Optional, default to None) Print help information.
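As a rough illustration of how the parameters above fit together, the sketch below assembles a saw count argument list without executing it. The helper build_count_command is hypothetical, all paths and the SN are placeholders, and the --kit-version and --sequencing-type values are illustrative only (this section does not list the accepted values). The --flag=value form follows the --fastqs example above and is assumed to work for the other flags as well.

```python
import re
import shlex

# Pattern for --id as documented above.
TASK_ID_PATTERN = re.compile(r"[a-zA-Z0-9_-]+")


def build_count_command(sn, kit_version, sequencing_type, chip_mask,
                        reference, fastq_dirs, task_id=None, image=None,
                        threads_num=8):
    """Return the argv list for a `saw count` run (nothing is executed)."""
    if task_id is not None and not TASK_ID_PATTERN.fullmatch(task_id):
        raise ValueError(f"invalid task id: {task_id!r}")
    if not fastq_dirs:
        raise ValueError("at least one FASTQ directory is required")

    cmd = ["saw", "count"]
    if task_id is not None:
        cmd.append(f"--id={task_id}")
    cmd += [
        f"--sn={sn}",
        "--omics=transcriptomics",
        f"--kit-version={kit_version}",
        f"--sequencing-type={sequencing_type}",
        f"--chip-mask={chip_mask}",
        f"--reference={reference}",
        # Multiple FASTQ folders are joined with commas, as described for --fastqs.
        "--fastqs=" + ",".join(fastq_dirs),
    ]
    if image is not None:
        cmd.append(f"--image={image}")
    cmd.append(f"--threads-num={threads_num}")
    return cmd


# Hypothetical inputs, for illustration only.
cmd = build_count_command(
    sn="SS200000135TL_D1",
    kit_version="Stereo-seq T FF V1.2",
    sequencing_type="PE100_50+100",
    chip_mask="/path/to/chip_mask.h5",
    reference="/path/to/reference",
    fastq_dirs=["/path/to/directory1", "/path/to/directory2"],
    task_id="Demo_run",
)
print(shlex.join(cmd))
```

The resulting list could be handed to subprocess.run(cmd, check=True) once the inputs are real.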
SAW makeRef
Prepare a reference for use in SAW count. GTF/GFF and FASTA files, or additional specific rRNA FASTA files, are needed.
Usage: saw makeRef [Parameters] --mode <MODE> --fasta <FASTA> --gtf <GTF/GFF> --genome <PATH>
saw makeRef -h | --help
--mode <MODE>
(Required, default to "STAR") Set the mode to build index files, used for the alignment. There are three modes, including STAR, Bowtie2 and Kraken2 for specific analysis scenarios.
--fasta <FASTA>
(Optional, default to None) Path to FASTA, to build index files. When it comes to multiple FASTAs, they will be integrated in order of input beforehand.
--rRNA-fasta <FASTA>
(Optional, default to None) Path to rRNA FASTA that will be added to the --fasta file, with the elimination of redundant rRNA fragments.
--gtf <GTF/GFF>
(Optional, default to None) Path to input GTF/GFF to build index files.
--basename <TEXT>
(Optional, default to "host") Basename for the Bowtie2 index files when --mode=Bowtie2. If not specified, "host" is used, reflecting that the index serves to remove host information in the next step.
--database <DATABASE>
(Optional, default to None) Path to the Kraken2 reference database. If this parameter is set, the output index files are saved at the same directory level.
--genome <PATH>
(Optional, default to detected) Path to the output reference genome with index information.
--params-csv <CSV>
--threads-num <INT>
(Optional, default to 8) Set the number of threads to use.
-h, --help
(Optional, default to None) Print help information.
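For the STAR mode shown in the usage line, the arguments could be assembled as in this sketch. The helper name build_star_ref_command and all paths are hypothetical; the Bowtie2 and Kraken2 modes are left out because this section does not spell out which flags they need.

```python
import shlex


def build_star_ref_command(fasta, gtf, genome, rrna_fasta=None, threads_num=8):
    """Return the argv list for `saw makeRef` in STAR mode (nothing is executed)."""
    cmd = ["saw", "makeRef", "--mode=STAR", f"--fasta={fasta}"]
    if rrna_fasta is not None:
        # Optional rRNA FASTA that is added to the --fasta file.
        cmd.append(f"--rRNA-fasta={rrna_fasta}")
    cmd += [
        f"--gtf={gtf}",
        f"--genome={genome}",
        f"--threads-num={threads_num}",
    ]
    return cmd


# Hypothetical paths, for illustration only.
ref_cmd = build_star_ref_command(
    fasta="/path/to/genome.fa",
    gtf="/path/to/genes.gtf",
    genome="/path/to/reference",
)
print(shlex.join(ref_cmd))
```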
SAW checkGTF
Check whether an annotation file (GTF/GFF) used in SAW count is in the standard format. In addition, extract specific information from the GTF/GFF.
Usage: saw checkGTF [Parameters] --input-gtf <GTF/GFF> --attribute <key:value> --output-gtf <GTF/GFF>
saw checkGTF -h | --help
--input-gtf <GTF/GFF>
(Required, default to None) Path to input GTF/GFF, for a necessary format check.
--attribute <key:value>
(Optional, default to None) Extract specific annotation information from GTF/GFF. Input as <gene_biotype:protein_coding>.
--output-gtf <GTF/GFF>
(Required, default to None) Path to the output GTF/GFF after the format check, or after additional filtering when --attribute is used.
-h, --help
(Optional, default to None) Print help information.
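A minimal sketch of a checkGTF call that filters on one attribute follows. The helper build_checkgtf_command and the paths are hypothetical, and it assumes the angle brackets in the --attribute description are notation rather than literal characters, so the value is passed as key:value.

```python
import shlex


def build_checkgtf_command(input_gtf, output_gtf, attribute=None):
    """Return the argv list for `saw checkGTF` (nothing is executed).

    attribute is an optional (key, value) pair, for example
    ("gene_biotype", "protein_coding").
    """
    cmd = ["saw", "checkGTF", f"--input-gtf={input_gtf}"]
    if attribute is not None:
        key, value = attribute
        if not key or not value or ":" in key:
            raise ValueError(f"invalid attribute pair: {attribute!r}")
        cmd.append(f"--attribute={key}:{value}")
    cmd.append(f"--output-gtf={output_gtf}")
    return cmd


# Hypothetical paths, for illustration only.
gtf_cmd = build_checkgtf_command(
    "/path/to/genes.gtf",
    "/path/to/genes.protein_coding.gtf",
    attribute=("gene_biotype", "protein_coding"),
)
print(shlex.join(gtf_cmd))
```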
SAW realign
Accept the manually processed compressed image file to restart the analysis with adjusted images. In the absence of images, lasso GeoJSON from StereoMap will be available.
Usage: saw realign [Parameters] --id <ID> --sn <SN> --count-data <PATH> --realigned-image-tar <TAR>
saw realign -h | --help
--id <ID>
(Optional, default to None) A unique task ID ([a-zA-Z0-9_-]+), used as the output folder name and as the title of the HTML report. If this parameter is absent, --sn plays the same role.
--sn <SN>
(Required, default to None) SN (serial number) of the Stereo-seq chip.
--count-data <PATH>
(Required, default to None) Output folder of the corresponding SAW count result, which mainly contains the expression matrices and other related datasets.
--realigned-image-tar <TAR>
(Required, default to None) Compressed image file from StereoMap, which has been manually processed, including stitching, tissue segmentation, cell segmentation, calibration and registration.
--lasso-geojson <GEOJSON>
(Optional, default to None) Lasso GeoJSON from StereoMap, used for tissue segmentation when the analysis is run without images. It is incompatible with --realigned-image-tar.
--adjusted-distance <INT>
(Optional, default to 10) Distance, in pixels, by which the cell border is expanded outward from the cellular contour of the cell segmentation image. If --adjusted-distance=0, the pipeline does not expand the cell border.
--no-matrix
(Optional, default to None) Whether to output feature expression matrices.
--no-report
(Optional, default to None) Whether to output HTML report.
--output <PATH>
(Optional, default to None) Set a specific output directory for the run.
--threads-num <NUM>
(Optional, default to 8) Set the number of threads to use.
-h, --help
(Optional, default to None) Print help information.
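Because --realigned-image-tar and --lasso-geojson are incompatible, a wrapper can reject bad combinations before launching the run. The sketch below is one reading of the descriptions above (exactly one of the two inputs is supplied); the helper build_realign_command and all paths are hypothetical.

```python
import shlex


def build_realign_command(sn, count_data, realigned_image_tar=None,
                          lasso_geojson=None, adjusted_distance=None,
                          task_id=None):
    """Return the argv list for `saw realign` (nothing is executed)."""
    # Interpretation: images and lasso GeoJSON are mutually exclusive inputs.
    if (realigned_image_tar is None) == (lasso_geojson is None):
        raise ValueError(
            "provide exactly one of realigned_image_tar or lasso_geojson")
    if adjusted_distance is not None and adjusted_distance < 0:
        raise ValueError("adjusted_distance must be 0 or a positive integer")

    cmd = ["saw", "realign"]
    if task_id is not None:
        cmd.append(f"--id={task_id}")
    cmd += [f"--sn={sn}", f"--count-data={count_data}"]
    if realigned_image_tar is not None:
        cmd.append(f"--realigned-image-tar={realigned_image_tar}")
    else:
        cmd.append(f"--lasso-geojson={lasso_geojson}")
    if adjusted_distance is not None:
        cmd.append(f"--adjusted-distance={adjusted_distance}")
    return cmd


# Hypothetical inputs, for illustration only.
realign_cmd = build_realign_command(
    sn="SS200000135TL_D1",
    count_data="/path/to/count_output",
    realigned_image_tar="/path/to/realigned.tar.gz",
    adjusted_distance=0,  # 0 keeps the cell border unexpanded
)
print(shlex.join(realign_cmd))
```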
SAW reanalyze
Perform secondary analysis, including clustering, differential expression analysis and lasso.
Usage: saw reanalyze [Parameters] --gef <GEF> --bin-size <INT> --marker --output <PATH>
saw reanalyze -h | --help
--gef <GEF>
(Optional, default to None) Input bin GEF file for analysis.
--cellbin-gef <GEF>
(Optional, default to None) Input cellbin GEF file for analysis.
--bin-size <INT or LIST>
(Optional, default to 200) Bin size for analysis.
--Leiden-resolution <FLOAT>
(Optional, default to 1.0) The resolution parameter controls the coarseness of the clustering when performing Leiden. Higher values lead to more clusters.
--marker
(Optional, default to None) Whether to perform differential expression analysis.
--count-data <PATH>
(Optional, default to None) Output folder of the corresponding SAW count result, which mainly contains the expression matrices and other related datasets.
--diffexp-geojson <GEOJSON>
(Optional, default to None) GeoJSON from StereoMap to analyze differential expression.
--lasso-geojson <GEOJSON>
(Optional, default to None) GeoJSON from StereoMap to lasso sub expression matrices of targeted regions.
--output <PATH>
(Optional, default to None) Path to the output folder, to save analysis results.
--threads-num <NUM>
(Optional, default to 8) Set the number of threads to use.
-h, --help
(Optional, default to None) Print help information.
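The usage line for clustering with optional differential expression could be assembled as follows. This is only a sketch: build_reanalyze_command and the paths are hypothetical, and a single integer bin size is used because the list syntax for --bin-size is not described in this section.

```python
import shlex


def build_reanalyze_command(gef, output, bin_size=200,
                            leiden_resolution=1.0, marker=False):
    """Return the argv list for `saw reanalyze` on a bin GEF (nothing is executed)."""
    cmd = [
        "saw", "reanalyze",
        f"--gef={gef}",
        f"--bin-size={bin_size}",
        # Higher resolution values lead to more clusters.
        f"--Leiden-resolution={leiden_resolution}",
    ]
    if marker:
        # Flag without a value: enables differential expression analysis.
        cmd.append("--marker")
    cmd.append(f"--output={output}")
    return cmd


# Hypothetical paths, for illustration only.
reanalyze_cmd = build_reanalyze_command(
    "/path/to/sample.gef", "/path/to/reanalyze_output", marker=True)
print(shlex.join(reanalyze_cmd))
```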
SAW convert
Carry out file format conversions. There are several modules under the pipeline to implement the analysis.
Usage: saw convert gef2gem [Parameters] --gef <GEF> --bin-size <INT> --gem <GEM>
saw convert -h | --help
--threads-num <NUM>
(Optional, default to 8) Set the number of threads to use.
-h, --help
(Optional, default to None) Print help information.
Matrix related
gef2gem
--gef <GEF>
(Required, default to None) Path to input bin GEF file.
--bin-size <INT>
(Optional, default to 1) Bin size used during conversion.
--cellbin-gef <GEF>
(Optional, default to None) Path to input cellbin GEF file.
--gem <GEM>
(Optional, default to None) Path to output GEM file.
--cellbin-gem <GEM>
(Optional, default to None) Path to output cellbin GEM file.
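The gef2gem parameters above might be combined as in this sketch. The helper build_gef2gem_command and the paths are hypothetical, and two points are assumptions rather than documented rules: that at least one output (--gem or --cellbin-gem) is wanted, and that --cellbin-gef and --cellbin-gem are given together.

```python
import shlex


def build_gef2gem_command(gef, gem=None, bin_size=1,
                          cellbin_gef=None, cellbin_gem=None):
    """Return the argv list for `saw convert gef2gem` (nothing is executed)."""
    if gem is None and cellbin_gem is None:
        raise ValueError("request at least one output: gem or cellbin_gem")
    # Assumption: the cellbin input and output are supplied as a pair.
    if (cellbin_gef is None) != (cellbin_gem is None):
        raise ValueError("cellbin_gef and cellbin_gem must be given together")

    cmd = ["saw", "convert", "gef2gem", f"--gef={gef}"]
    if gem is not None:
        cmd += [f"--bin-size={bin_size}", f"--gem={gem}"]
    if cellbin_gem is not None:
        cmd += [f"--cellbin-gef={cellbin_gef}", f"--cellbin-gem={cellbin_gem}"]
    return cmd


# Hypothetical paths, for illustration only.
gef2gem_cmd = build_gef2gem_command(
    "/path/to/sample.gef", gem="/path/to/sample.bin20.gem", bin_size=20)
print(shlex.join(gef2gem_cmd))
```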
gem2gef
--gem <GEM>
(Optional, default to None) Path to input GEM file.
--gef <GEF>
(Optional, default to None) Path to output bin GEF file.
--cellbin-gem <GEM>
(Optional, default to None) Path to input cellbin GEM file.
--cellbin-gef <GEF>
(Optional, default to None) Path to output cellbin GEF file.
bin2cell
--gef <GEF>
(Required, default to None) Path to input bin GEF file.
--image <TIFF>
(Required, default to None) Path to the image of cell segmentation.
--cellbin-gef <GEF>
(Required, default to None) Path to output cellbin GEF file.
--cellbin-gem <GEM>
(Optional, default to None) Path to output cellbin GEM file.
gef2h5ad
--gef <GEF>
(Optional, default to None) Path to input bin GEF file.
--bin-size <INT>
(Optional, default to 20) Bin size used during conversion.
--cellbin-gef <GEF>
(Optional, default to None) Path to input cellbin GEF file.
--h5ad <H5AD>
(Required, default to None) Path to output AnnData H5AD file.
gem2h5ad
--gem <GEM>
(Optional, default to None) Path to input GEM file.
--bin-size <INT>
(Optional, default to 20) Bin size used during conversion.
--cellbin-gem <GEM>
(Optional, default to None) Path to input cellbin GEM file.
--h5ad <H5AD>
(Required, default to None) Path to output AnnData H5AD file.
gef2img
--gef <GEF>
(Required, default to None) Path to input bin GEF.
--bin-size <INT>
(Required, default to 1) Bin size used to plot expression heatmap.
--image <TIFF>
(Required, default to None) Path to output heatmap image.
visualization
--gef <GEF>
(Required, default to None) Path to input raw bin GEF file.
--bin-size <INT>
(Required, default to 1,5,10,20,50,100,150,200) Bin sizes used during conversion.
--visualization-gef <GEF>
(Required, default to None) Path to output visualization GEF file.
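The default --bin-size value for visualization is a comma-separated list, so a wrapper would typically join integers before passing them on. A minimal sketch, with a hypothetical helper name and placeholder paths:

```python
import shlex

# Default bin sizes as documented for the visualization module.
DEFAULT_BIN_SIZES = (1, 5, 10, 20, 50, 100, 150, 200)


def build_visualization_command(gef, visualization_gef,
                                bin_sizes=DEFAULT_BIN_SIZES):
    """Return the argv list for `saw convert visualization` (nothing is executed)."""
    if not bin_sizes or any(size < 1 for size in bin_sizes):
        raise ValueError("bin_sizes must be a non-empty list of positive integers")
    joined = ",".join(str(size) for size in bin_sizes)
    return [
        "saw", "convert", "visualization",
        f"--gef={gef}",
        f"--bin-size={joined}",
        f"--visualization-gef={visualization_gef}",
    ]


# Hypothetical paths, for illustration only.
vis_cmd = build_visualization_command(
    "/path/to/sample.raw.gef", "/path/to/sample.visualization.gef")
print(shlex.join(vis_cmd))
```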
Image related
tar2img
--image-tar <TAR>
(Required, default to None) Path to input image compressed tar file.
--image <PATH>
(Required, default to None) Path to output folder of images.
img2rpi
--image <TIFF>
(Required, default to None) Path to the input images. Note that the order of the input images must correspond to the order of the --layers names.
--layers <TEXT>
(Required, default to None) Layer names recorded in the output RPI file; each name corresponds to one input image, in order. Layer names can be chosen freely but must follow the format <stain_type>/<image_type>, such as DAPI/TissueMask.
--rpi <RPI>
(Required, default to None) Path to output RPI file.
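Since images and layer names are matched by position, it can help to validate the pairing before calling img2rpi. The sketch below only checks the two rules stated above (one layer per image, names in the form <stain_type>/<image_type>); the function name, the file paths, and the layer name DAPI/Image are hypothetical.

```python
import re

# <stain_type>/<image_type>: two non-empty parts separated by a single slash.
LAYER_PATTERN = re.compile(r"[^/\s]+/[^/\s]+")


def pair_images_with_layers(images, layers):
    """Return (image, layer) pairs after checking count and layer-name format."""
    if len(images) != len(layers):
        raise ValueError(
            f"{len(images)} images but {len(layers)} layer names were given")
    for layer in layers:
        if not LAYER_PATTERN.fullmatch(layer):
            raise ValueError(f"layer name not in <stain_type>/<image_type> form: {layer!r}")
    return list(zip(images, layers))


# Hypothetical inputs, for illustration only.
pairs = pair_images_with_layers(
    ["/path/to/DAPI.tif", "/path/to/DAPI_tissue_mask.tif"],
    ["DAPI/Image", "DAPI/TissueMask"],
)
for image, layer in pairs:
    print(layer, "<-", image)
```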
merge
--image <TIFF>
(Required, default to None) Path to input images (up to 3), to be merged into one image, in the color order of R-G-B.
--merged-image <TIFF>
(Required, default to None) Path to output multichannel image.
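The R-G-B ordering rule for merge can be made explicit with a small helper that maps each input image to its channel. This is a sketch only; assign_channels and the file names are hypothetical and nothing here calls SAW.

```python
def assign_channels(images):
    """Map up to three input images to the R, G, B channels in input order."""
    if not 1 <= len(images) <= 3:
        raise ValueError("merge accepts between 1 and 3 input images")
    # zip stops at the shorter sequence, so two images fill only R and G.
    return dict(zip("RGB", images))


# Hypothetical file names, for illustration only.
channels = assign_channels(["/path/to/DAPI.tif", "/path/to/marker1_IF.tif"])
for channel, image in channels.items():
    print(channel, "<-", image)
```

Listing the mapping before the run makes it easy to confirm which stain ends up in which color.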
overlay
--image <TIFF>
(Required, default to None) Path to the image used as the base layer.
--template <TXT>
(Required, default to None) Point information of matrix template.
--overlaid-image <TIFF>
(Required, default to None) Path to the output image, with the template overlaid on the base image.