# realign outputs

## Overview of output structure

The `SAW realign` pipeline runs in a directory named by `--id` (or by `--sn` if `--id` is not given). Output files are organized into several folders under the `outs/` directory.

The exact output files generated from the analysis depend on:

* the version of SAW used
* which pipeline was used, `SAW count` or `SAW realign`
* whether microscope image(s) were provided as input
* the specific parameters added to the analysis

## Spatial Gene Expression

After running `SAW realign` on data from Stereo-seq T FF and Stereo-seq N FFPE kits, the following files can be found under the `outs/` directory:

| Directory/File Name                            | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **`bam/`**                                     | Files in BAM format.                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| `annotated_bam/`                               | BAM file after alignment and annotation.                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| `<SN>.*.bam`                                   | Indexed BAM file containing position-sorted reads mapped to CIDs, aligned to the genome, and annotated with GTF/GFF.                                                                                                                                                                                                                                                                                                                                                                      |
| `<SN>.*.bam.csi`                               | Index for `<SN>.*.bam`.                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| **`image/`**                                   | Images generated by the automatic or manual workflows. |
| `<SN>_<stainType>_regist.tif`                  | The panoramic image aligned with the `raw.gef` matrix. |
| `<SN>_<stainType>_tissue_cut.tif`              | The tissue segmentation image, based on the aligned panoramic image.                                                                                                                                                                                                                                                                                                                                                                                                                      |
| `<SN>_<stainType>_mask.tif`                    | The cell segmentation image, based on the aligned panoramic image.                                                                                                                                                                                                                                                                                                                                                                                                                        |
| `<SN>_<stainType>_mask_edm_dis_<distance>.tif` | The adjusted image, based on the cell segmentation image. |
| **`feature_expression/`**                      | Feature expression matrices in HDF5 format at different dimensions.                                                                                                                                                                                                                                                                                                                                                                                                                       |
| `<SN>.raw.gef`                                 | Feature expression matrix covering the complete chip region. It contains bin1 expression counts only. |
| `<SN>.tissue.gef`                              | Feature expression matrix restricted to the tissue coverage region. It also serves as a visualization GEF, with expression counts for bin1, 5, 10, 20, 50, 100, 150, and 200. |
| `<SN>.cellbin.gef`                             | Cellbin feature expression matrix that records per-cell information, including the centroid coordinates, boundary coordinates, gene expression, and cell area. |
| `<SN>.adjusted.cellbin.gef`                    | Cellbin expression matrix with expanded cell borders, based on `<SN>_<stainType>_mask_edm_dis_<distance>.tif`. |
| `<SN>.merge.barcodeReadsCount.txt`             | A mapped CID list file with read counts for each CID, including three columns (x, y, count).                                                                                                                                                                                                                                                                                                                                                                                              |
| `<SN>_raw_barcode_gene_exp.txt`                | An annotated list file with coordinate, gene, MID, and read count information, used as the sampling file for the sequencing saturation calculation. |
| **`analysis/`**                                | Secondary analysis files.                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| `<SN>.bin200_1.0.h5ad`                         | <p>An AnnData H5AD file that records preprocessing, filtering, normalization, dimensionality reduction, clustering, and differential expression analysis, based on <code>\<SN>.tissue.gef</code>. </p><p>This output H5AD is named in the format <code>\<SN>.\<binN>\_\<leiden\_res>.h5ad</code>. In the file name, <code>\<SN></code> stands for the Stereo-seq chip serial number, <code>\<N></code> in <code>\<binN></code> for the bin size, and <code>\<leiden\_res></code> for the resolution of Leiden clustering.</p> |
| `bin200_marker_features.csv`                   | Format-integrated differential expression analysis results, using `<SN>.tissue.gef` of bin200.                                                                                                                                                                                                                                                                                                                                                                                            |
| `<SN>.cellbin_1.0.h5ad`                        | An AnnData H5AD file that records preprocessing, filtering, normalization, dimensionality reduction, clustering, and differential expression analysis, using `<SN>.cellbin.gef`. |
| `cellbin_marker_features.csv`                  | Format-integrated differential expression analysis results, using `<SN>.cellbin.gef`.                                                                                                                                                                                                                                                                                                                                                                                                     |
| `<SN>.cellbin_1.0.adjusted.h5ad`               | An AnnData H5AD file that records preprocessing, filtering, normalization, dimensionality reduction, clustering, and differential expression analysis, using `<SN>.adjusted.cellbin.gef`. |
| `cellbin_adjusted_marker_features.csv`         | Format-integrated differential expression analysis results, using `<SN>.adjusted.cellbin.gef`.                                                                                                                                                                                                                                                                                                                                                                                            |
| **`<SN>.report.tar.gz`**                       | Analysis summary report of metrics and plots in HTML format.                                                                                                                                                                                                                                                                                                                                                                                                                              |
| `report.html`                                  | HTML file contained in `<SN>.report.tar.gz`. |
| **`visualization.tar.gz`**                     | StereoMap visualization archive for presentation and manual processing. |
| `<SN>.stereo`                                  | A manifest file in JSON format that includes experiment and pipeline information, basic analysis statistics, and references to the image and spatial matrix files in the SAW output visualization folder. |

{% hint style="info" %}
Expression-related data comes from the previous `SAW count` output directory, supplied through the `--count-data` parameter.
{% endhint %}
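
Because the H5AD names in `analysis/` follow a fixed pattern, a downstream script can recover the bin size and Leiden resolution from the file name alone. The following is a minimal Python sketch and not part of SAW: the `parse_h5ad_name` helper, the `H5adName` type, and the regular expression are illustrative assumptions derived only from the file names shown in the table above.

```python
import re
from typing import NamedTuple, Optional


class H5adName(NamedTuple):
    """Fields recovered from an analysis H5AD file name (hypothetical type)."""
    sn: str            # Stereo-seq chip serial number
    bin_label: str     # e.g. "bin200" or "cellbin"
    leiden_res: float  # resolution of Leiden clustering
    adjusted: bool     # True for the ".adjusted" cellbin variant


# Assumed pattern: <SN>.<binN|cellbin>_<leiden_res>[.adjusted].h5ad
_H5AD_RE = re.compile(
    r"^(?P<sn>[^.]+)\."
    r"(?P<bin>bin\d+|cellbin)_"
    r"(?P<res>\d+(?:\.\d+)?)"
    r"(?P<adj>\.adjusted)?"
    r"\.h5ad$"
)


def parse_h5ad_name(filename: str) -> Optional[H5adName]:
    """Return the parsed fields, or None if the name does not match the pattern."""
    match = _H5AD_RE.match(filename)
    if match is None:
        return None
    return H5adName(
        sn=match["sn"],
        bin_label=match["bin"],
        leiden_res=float(match["res"]),
        adjusted=match["adj"] is not None,
    )
```

For example, `parse_h5ad_name("C04144D5.bin200_1.0.h5ad")` yields the serial number `C04144D5`, the label `bin200`, and a resolution of `1.0`. Names that do not follow the assumed pattern return `None` rather than raising an error.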

## `visualization.tar.gz`

The compressed visualization TAR file bundles all the output results that StereoMap needs for visualization. The contents of an unpacked archive are listed below:

```
visualization
├── C04144D5.adjusted.cellbin.gef
├── C04144D5.bin200_1.0.h5ad
├── C04144D5.cellbin_1.0.adjusted.h5ad
├── C04144D5.rpi
├── ssDNA_matrix_template.txt
├── C04144D5_SC_20240509_174202_4.0.0.tar.gz
├── C04144D5.stereo
└── C04144D5.tissue.gef
```
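
To check what an archive contains without unpacking it, its members can be listed with Python's standard `tarfile` module. This is a hedged sketch: the helper names `list_visualization_files` and `find_manifest` are hypothetical, and the archive is assumed to be a regular gzip-compressed TAR as the `.tar.gz` extension suggests.

```python
import tarfile
from typing import List, Optional


def list_visualization_files(archive_path: str) -> List[str]:
    """Return the sorted paths of all regular files stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


def find_manifest(archive_path: str) -> Optional[str]:
    """Return the path of the first `.stereo` member, or None if absent."""
    for name in list_visualization_files(archive_path):
        if name.endswith(".stereo"):
            return name
    return None


# Usage (the path is an example only):
#   for name in list_visualization_files("outs/visualization.tar.gz"):
#       print(name)
```

Listing members instead of extracting them avoids writing anything to disk, which is convenient when only verifying that a run produced the expected files.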

### `.stereo`

`.stereo` is a manifest file in JSON format that records:

* information about the task in the SAW pipelines,
* information about the tissue sample,
* basic analysis statistics,
* records of image files and expression data for **StereoMap** exploration.

*\*More details about these files can be found in other parts of Outputs.*
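
Since `.stereo` is plain JSON, it can be inspected with Python's standard `json` module. The manifest's key names are not documented on this page, so the sketch below makes no assumption about them and only reports which top-level keys are present and the type of each value. `summarize_manifest` is a hypothetical helper name, and the manifest is assumed to be a JSON object at the top level.

```python
import json
from typing import Dict


def summarize_manifest(path: str) -> Dict[str, str]:
    """Map each top-level key of a `.stereo` manifest to its value's type name."""
    with open(path, encoding="utf-8") as handle:
        manifest = json.load(handle)
    if not isinstance(manifest, dict):
        # Assumption: the manifest is a JSON object, not a list or scalar.
        raise ValueError("expected a JSON object at the top level")
    return {key: type(value).__name__ for key, value in manifest.items()}


# Usage (the path is an example only):
#   for key, kind in summarize_manifest("visualization/C04144D5.stereo").items():
#       print(key, kind)
```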
