# Format conversion

This tutorial shows how to perform basic format conversions with the complementary pipeline `SAW convert`. To keep the utility straightforward and concise, `SAW convert` is split into several sub-pipelines. A sub-pipeline is usually named "A2B", meaning a conversion from format A to format B (or from dimension A to dimension B).

Select the one you need for format conversion.

{% hint style="success" %}
To make this pipeline more intuitive and user-friendly, **examples based on real demo datasets** follow each function description. The files and filenames used in the examples are for reference only, so do not feel restricted to them.
{% endhint %}

## Matrix related

### gef2gem

Conversion from a [bin GEF](/saw-user-manual-v8.2/advanced/expression-matrix-format.md#bin-gef) to a bin [GEM](/saw-user-manual-v8.2/advanced/expression-matrix-format.md#gene-expression-matrix-gem).

A bin GEM is a text file that primarily contains gene information, spatial coordinates, and MID counts. A GEM file records feature expression at a single bin size, so you have to set `--bin-size` for the conversion.

```sh
saw convert gef2gem \
    --gef=/path/to/input/bin/GEF \
    --bin-size=1 \
    --gem=/path/to/output/bin/GEM
    
##example test
saw convert gef2gem \
    --gef=./C04144D5.tissue.gef \
    --bin-size=20 \
    --gem=./C04144D5.bin20.tissue.gem
```
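Because a bin GEM holds a single bin size, exporting several sizes means running `gef2gem` once per size. The loop below is a minimal sketch of that pattern. The helper names `run_saw` and `gem_name_for_bin`, the `DRY_RUN` switch, and the output naming scheme are assumptions made for illustration and are not part of SAW. By default the script only prints the commands it would run.

```sh
#!/bin/sh
# Sketch: one gef2gem run per bin size (a bin GEM stores a single bin size).

# DRY_RUN is a hypothetical switch for this sketch: by default the commands
# are only printed; run with DRY_RUN=0 to execute them.
run_saw() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo saw "$@"
    else
        saw "$@"
    fi
}

# Hypothetical helper: build an output name such as C04144D5.bin20.tissue.gem
gem_name_for_bin() {
    # $1 = chip serial number (SN), $2 = bin size
    printf '%s.bin%s.tissue.gem\n' "$1" "$2"
}

sn="C04144D5"
for bin in 20 50 100; do
    run_saw convert gef2gem \
        --gef="./${sn}.tissue.gef" \
        --bin-size="${bin}" \
        --gem="./$(gem_name_for_bin "$sn" "$bin")"
done
```

Keeping the bin size in the filename makes it easier to tell the resulting GEM files apart later.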

Conversion from a [cellbin GEF](/saw-user-manual-v8.2/advanced/expression-matrix-format.md#cell-bin-gef) to a cellbin [GEM](/saw-user-manual-v8.2/advanced/expression-matrix-format.md#gene-expression-matrix-gem).

{% hint style="warning" %}
When converting a cellbin GEF to a cellbin GEM, note that the corresponding bin GEF is required to obtain DNB information.

**Due to the format limitations of cellbin GEF**, a small amount of expression information may be lost during the conversion. If you plan to generate a cellbin GEM, please use the improved [`bin2cell`](#bin2cell) instead.
{% endhint %}

```sh
saw convert gef2gem \
    --cellbin-gef=/path/to/input/cellbin/GEF \
    --gef=/path/to/input/bin/GEF \
    --cellbin-gem=/path/to/output/cellbin/GEM
```

### gem2gef

Conversion from a bin [GEM](/saw-user-manual-v8.2/advanced/expression-matrix-format.md#gene-expression-matrix-gem) to a [bin GEF](/saw-user-manual-v8.2/advanced/expression-matrix-format.md#bin-gef).

{% hint style="info" %}
If your input GEM is of bin1, the output GEF will be a visualization GEF that includes expression counts of \[bin1, 5, 10, 20, 50, 100, 150, 200].

If your input GEM is not of bin1, the output GEF will contain the expression counts of that specific bin size.
{% endhint %}

```sh
saw convert gem2gef \
    --gem=/path/to/input/GEM \
    --gef=/path/to/output/GEF
    
##example test
saw convert gem2gef \
    --gem=./C04144D5.tissue.gem \
    --gef=./C04144D5.tissue.gef
```

Conversion from a cellbin [GEM](/saw-user-manual-v8.2/advanced/expression-matrix-format.md#gene-expression-matrix-gem) to a [cellbin GEF](/saw-user-manual-v8.2/advanced/expression-matrix-format.md#cellbin-gef).

```sh
saw convert gem2gef \
    --cellbin-gem=/path/to/input/cellbin/GEM \
    --cellbin-gef=/path/to/output/cellbin/GEF
    
##example test
saw convert gem2gef \
    --cellbin-gem=./C04144D5.adjusted.cellbin.gem \
    --cellbin-gef=./C04144D5.adjusted.cellbin.gef
```

### bin2tissue

Extract tissue-coverage expression information from a raw bin GEF.

{% hint style="info" %}
The tissue segmentation mask is essential for defining the tissue boundaries of a sample, enabling the generation of an expression matrix at the tissue dimension.
{% endhint %}

```sh
saw convert bin2tissue \
    --gef=/path/to/input/GEF \
    --image=/path/to/tissue/segmentation/image \
    --output=/path/to/output/directory
    
##example test
saw convert bin2tissue \
    --gef=./C04144D5.raw.gef \
    --image=./C04144D5_ssDNA_tissue_cut.tif \
    --output=./tissue_area_result
```

If no microscope images were captured during the experiment, this sub-module can still be applied directly: it derives the tissue segmentation from the transcriptomic expression matrix.

```sh
saw convert bin2tissue \
    --gef=/path/to/input/GEF \
    --output=/path/to/output/directory
    
##example test
saw convert bin2tissue \
    --gef=./C04144D5.tissue.gef \
    --output=./tissue_area_result
```

The output directory contains the tissue segmentation image `bin1_img_tissue_cut.tif` and the tissue-coverage matrix `<SN>.tissue.gef`.
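Before passing the results to a later step, it can help to confirm that both output files named above exist and are not empty. The function below is a minimal sketch of such a check. Its name and messages are illustrative assumptions, and it relies only on the two filenames given in this section.

```sh
#!/bin/sh
# Sketch: confirm that a bin2tissue output directory holds the two expected
# files. The function name and messages are illustrative, not part of SAW.
check_bin2tissue_outputs() {
    # $1 = output directory, $2 = chip serial number (SN)
    status=0
    for f in "bin1_img_tissue_cut.tif" "$2.tissue.gef"; do
        # -s is true only when the file exists and is not empty
        if [ ! -s "$1/$f" ]; then
            echo "missing or empty: $1/$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage (after running bin2tissue):
#   check_bin2tissue_outputs ./tissue_area_result C04144D5 && echo "outputs present"
```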

### bin2cell

Extract cellbin expression information from a raw bin GEF.

{% hint style="info" %}
A cell segmentation mask is used to delineate the boundaries of individual cells, which is then utilized to generate an expression matrix at the cell dimension.
{% endhint %}

```sh
saw convert bin2cell \
    --gef=/path/to/input/GEF \
    --image=/path/to/cell/segmentation/image \
    --cellbin-gef=/path/to/output/cellbin/GEF

##example test
## alternatives: --image=./C04144D5_ssDNA_mask_edm_dis_10.tif
##               --cellbin-gef=./C04144D5.adjusted.cellbin.gef
saw convert bin2cell \
    --gef=./C04144D5.tissue.gef \
    --image=./C04144D5_ssDNA_mask.tif \
    --cellbin-gef=./C04144D5.cellbin.gef
```

{% hint style="success" %}
From SAW 8.2, `bin2cell` offers an improved method to generate a cellbin GEM directly from a raw bin GEF. It is highly recommended that you prioritize this conversion method.
{% endhint %}

```sh
## if concerned about tissue area, use <SN>.tissue.gef as --gef
saw convert bin2cell \
    --gef=/path/to/input/bin/GEF \
    --image=/path/to/input/cell/segmentation/TIFF \
    --cellbin-gef=/path/to/output/cellbin/GEF \
    --cellbin-gem=/path/to/output/cellbin/GEM

##example test
## alternatives: --image=./C04144D5_ssDNA_mask_edm_dis_10.tif
##               --cellbin-gef=./C04144D5.adjusted.cellbin.gef
##               --cellbin-gem=./C04144D5.adjusted.cellbin.gem
saw convert bin2cell \
    --gef=./C04144D5.tissue.gef \
    --image=./C04144D5_ssDNA_mask.tif \
    --cellbin-gef=./C04144D5.cellbin.gef \
    --cellbin-gem=./C04144D5.cellbin.gem
```

### visualization

Conversion from a raw GEF to a visualization GEF.

A raw GEF typically records the spatial expression matrix of bin1 in a sparse matrix format to reduce file size. Because of the high 500 nm precision of the Stereo-seq chip, the amount of expression matrix data generated is very large. Unless necessary, the software only retains the bin1 dimension data in a shuffle matrix. **For a GEF to be visualized in StereoMap**, it must contain comprehensive information on various bin sizes, usually with a bin list of \[1, 5, 10, 20, 50, 100, 150, 200].

```sh
saw convert visualization \
    --gef=/path/to/input/GEF \
    --bin-size=1,5,10,20,50,100,150,200 \
    --visualization-gef=/path/to/output/visualization/GEF
    
##example test
saw convert visualization \
    --gef=./C04144D5.raw.gef \
    --bin-size=1,5,10,20,50,100,150,200 \
    --visualization-gef=./C04144D5.gef
```
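To get a feel for what each bin size in that list means physically, the sketch below multiplies the bin size by the 500 nm figure quoted above. It assumes that a bin of size N spans N bin1 units along each side, so its side length is N × 500 nm. The function name `bin_side_nm` is an illustrative assumption.

```sh
#!/bin/sh
# Sketch: approximate side length of each bin, assuming 500 nm per bin1 unit.
bin_side_nm() {
    # $1 = bin size; prints the side length in nanometres
    echo $(( $1 * 500 ))
}

for bin in 1 5 10 20 50 100 150 200; do
    nm=$(bin_side_nm "$bin")
    # Print micrometres with one decimal digit using integer arithmetic only
    printf 'bin%-3s side = %d.%d um\n' "$bin" $(( nm / 1000 )) $(( (nm % 1000) / 100 ))
done
```

Under this assumption bin20 corresponds to a side of 10 µm and bin200 to 100 µm.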

### gef2h5ad

Conversion from a bin GEF to an AnnData H5AD.

{% hint style="info" %}
[AnnData H5AD](https://anndata.readthedocs.io/en/latest/index.html#) is a widely used data format for downstream analysis. The AnnData package version should be >= 0.8.0.
{% endhint %}

```sh
saw convert gef2h5ad \
    --gef=/path/to/input/GEF \
    --bin-size=20 \
    --h5ad=/path/to/output/h5ad
    
##example test
saw convert gef2h5ad \
    --gef=./C04144D5.tissue.gef \
    --bin-size=20 \
    --h5ad=./C04144D5.bin20.h5ad
```

Conversion from a cellbin GEF to an AnnData H5AD.

```bash
saw convert gef2h5ad \
    --cellbin-gef=/path/to/input/cellbin/GEF \
    --h5ad=/path/to/output/h5ad
    
##example test
saw convert gef2h5ad \
    --cellbin-gef=./C04144D5.cellbin.gef \
    --h5ad=./C04144D5.cellbin.h5ad
```

### gem2h5ad

Conversion from a bin GEM to an AnnData H5AD.

```sh
saw convert gem2h5ad \
    --gem=/path/to/input/bin/GEM \
    --bin-size=20 \
    --h5ad=/path/to/output/h5ad

##example test
saw convert gem2h5ad \
    --gem=./C04144D5.tissue.gem \
    --bin-size=20 \
    --h5ad=./C04144D5.bin20.h5ad
```

Conversion from a cellbin GEM to an AnnData H5AD.

```sh
saw convert gem2h5ad \
    --cellbin-gem=/path/to/input/cellbin/GEM \
    --h5ad=/path/to/output/h5ad

##example test
saw convert gem2h5ad \
    --cellbin-gem=./C04144D5.cellbin.gem \
    --h5ad=./C04144D5.cellbin.h5ad
```

### gef2rds

Conversion from a bin GEF to an RDS file.

{% hint style="info" %}
The RDS file format is a serialized data structure that saves and loads [Seurat](https://satijalab.org/seurat/) objects in R.
{% endhint %}

```sh
saw convert gef2rds \
    --gef=/path/to/input/bin/GEF \
    --bin-size=20 \
    --rds=/path/to/output/seurat/rds
 
##example test  
saw convert gef2rds \
    --gef=./C04144D5.tissue.gef \
    --bin-size=20 \
    --rds=./C04144D5.bin20.tissue.rds
```

Conversion from a cellbin GEF to an RDS file, for analysis in Seurat.

```bash
saw convert gef2rds \
    --cellbin-gef=/path/to/input/cellbin/GEF \
    --rds=/path/to/output/seurat/rds
    
##example test  
saw convert gef2rds \
    --cellbin-gef=./C04144D5.cellbin.gef \
    --rds=./C04144D5.cellbin.rds
```

### gem2rds

Conversion from a bin GEM to an RDS file.

```sh
saw convert gem2rds \
    --gem=/path/to/input/bin/GEM \
    --bin-size=20 \
    --rds=/path/to/output/seurat/rds
    
##example test  
saw convert gem2rds \
    --gem=./C04144D5.tissue.gem \
    --bin-size=20 \
    --rds=./C04144D5.bin20.tissue.rds
```

Conversion from a cellbin GEM to an RDS file, for analysis in Seurat.

```bash
saw convert gem2rds \
    --cellbin-gem=/path/to/input/cellbin/GEM \
    --rds=/path/to/output/seurat/rds
    
##example test  
saw convert gem2rds \
    --cellbin-gem=./C04144D5.cellbin.gem \
    --rds=./C04144D5.cellbin.rds
```

### h5ad2rds

Conversion from an AnnData H5AD to an RDS file, for analysis in Seurat.

```sh
saw convert h5ad2rds \
    --h5ad=/path/to/input/anndata/h5ad \
    --rds=/path/to/output/rds/file
    
##example test  
saw convert h5ad2rds \
    --h5ad=./C04144D5.cellbin.h5ad \
    --rds=./C04144D5.cellbin.rds
```

### gef2img

Plot a heatmap of a bin GEF.

The feature expression matrix is rendered as a grayscale heatmap image of the spatial expression.

```sh
saw convert gef2img \
    --gef=/path/to/input/GEF \
    --bin-size=1 \
    --image=/path/to/output/heatmap/TIFF/image

##example test  
saw convert gef2img \
    --gef=./SS200000135TL_D1.raw.gef \
    --bin-size=1 \
    --image=./res_bin1.tif
```

<figure><img src="/files/supuZGghw66KyKkjzJL6" alt="" width="348"><figcaption></figcaption></figure>

## Image related

### tar2img

Extract TIFF images from an image `.tar.gz` file. The extracted images usually include a microscope image aligned with the matrix, plus a tissue segmentation image and a cell segmentation image if the corresponding algorithmic or manual processing results are recorded in the image `.tar.gz` file.

```sh
saw convert tar2img \
    --image-tar=/path/to/input/image/tar \
    --image=/path/to/output/folder

##example test
saw convert tar2img \
    --image-tar=./SS200000135TL_D1_SC_20240711_105908_4.1.0.tar.gz \
    --image=./SS200000135TL_D1_image_results
```

### img2rpi

Conversion from TIFF images to an RPI file, used in StereoMap.

{% hint style="info" %}
Layer names can be set arbitrarily, but must follow the format `<stain_type>/<image_type>`, like `DAPI/TissueMask`. For a cell segmentation image, we recommend giving the layer name the prefix "CellMask", so that StereoMap displays cell borders directly.
{% endhint %}

```sh
saw convert img2rpi \
    --image=/path/to/input/image1,/path/to/input/image2,/path/to/input/image3... \
    --layers=<stain_type>/Image,<stain_type>/TissueMask,<stain_type>/CellMask... \
    --rpi=/path/to/output/rpi
    
##example test
saw convert img2rpi \
    --image=./SS200000135TL_D1_ssDNA_regist.tif,./SS200000135TL_D1_ssDNA_mask.tif \
    --layers=ssDNA/Image,ssDNA/CellMask \
    --rpi=./SS200000135TL_D1.rpi
```
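A malformed layer name or a mismatch between the two comma-separated lists is easy to miss in a long command. The functions below are a minimal sketch of a pre-flight check. All function names are illustrative assumptions, and the check assumes one layer name per image, as the paired `--image` and `--layers` lists above suggest.

```sh
#!/bin/sh
# Sketch: sanity-check the --image and --layers values before calling img2rpi.

# Returns 0 if $1 looks like <stain_type>/<image_type>:
# exactly one slash with non-empty text on both sides.
is_valid_layer() {
    case "$1" in
        */*/*) return 1 ;;
        ?*/?*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Prints the number of items in a comma-separated list.
count_csv() {
    old_ifs=$IFS; IFS=','
    set -- $1
    IFS=$old_ifs
    echo "$#"
}

# $1 = comma-separated image paths, $2 = comma-separated layer names
check_rpi_inputs() {
    if [ "$(count_csv "$1")" -ne "$(count_csv "$2")" ]; then
        echo "image and layer counts differ" >&2
        return 1
    fi
    old_ifs=$IFS; IFS=','
    for layer in $2; do
        IFS=$old_ifs
        if ! is_valid_layer "$layer"; then
            echo "bad layer name: $layer" >&2
            return 1
        fi
    done
    IFS=$old_ifs
}

# Usage:
#   check_rpi_inputs "./a_regist.tif,./a_mask.tif" "ssDNA/Image,ssDNA/CellMask"
```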

### merge

Merge images (up to three) into one image.

{% hint style="info" %}
Note that the order of the image input represents its color channel, R-G-B.
{% endhint %}

```sh
saw convert merge \
    --image=/path/to/input/image1,/path/to/input/image2,/path/to/input/image3 \
    --merged-image=/path/to/output/multichannel/image

##example test 
saw convert merge \
    --image=./SS200000135TL_D1_ssDNA_regist.tif,./SS200000135TL_D1_ssDNA_tissue_cut.tif \
    --merged-image=./SS200000135TL_D1.merged.tif
```

Merged image of the microscopy image `SS200000135TL_D1_ssDNA_regist.tif` and the tissue segmentation mask file `SS200000135TL_D1_ssDNA_tissue_cut.tif`, used to evaluate the performance of tissue segmentation.

<figure><img src="/files/382UghXbdg0CoK2ghmED" alt="" width="278"><figcaption></figcaption></figure>

Part of the merged image of the microscopy image `SS200000135TL_D1_ssDNA_regist.tif` and cell segmentation mask file `SS200000135TL_D1_ssDNA_mask.tif` to evaluate the performance of cell segmentation.

<figure><img src="/files/30JjRMQfHlz4pc4NZThK" alt="" width="375"><figcaption></figcaption></figure>
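Since the input order decides the color channel, it can be useful to print the mapping before running `merge`. The function below is a minimal sketch of that idea. Its name is an illustrative assumption, and it relies only on the R-G-B ordering and the three-image limit stated in this section.

```sh
#!/bin/sh
# Sketch: show which colour channel each --image entry of `saw convert merge`
# would occupy, following the R-G-B input order. Rejects more than three images.
describe_merge_channels() {
    # $1 = comma-separated image paths (same form as the --image value)
    old_ifs=$IFS; IFS=','
    set -- $1
    IFS=$old_ifs
    if [ "$#" -lt 1 ] || [ "$#" -gt 3 ]; then
        echo "expected 1 to 3 images, got $#" >&2
        return 1
    fi
    for channel in R G B; do
        if [ "$#" -eq 0 ]; then
            break
        fi
        echo "$channel: $1"
        shift
    done
}

describe_merge_channels "./SS200000135TL_D1_ssDNA_regist.tif,./SS200000135TL_D1_ssDNA_tissue_cut.tif"
```

For the example inputs this lists the registered microscopy image under R and the tissue segmentation mask under G.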

### overlay

Stack the template points onto the image, to check whether the image template crosspoints derived by image QC are accurate.

{% hint style="info" %}
The matrix template file, `<stain_type>_matrix_template.txt`, can be found in `visualization.tar.gz`.
{% endhint %}

```sh
saw convert overlay \
    --image=/path/to/input/image \
    --template=/path/to/input/template/txt \
    --overlaid-image=/path/to/output/overlaid/image
  
##example test  
saw convert overlay \
    --image=./SS200000135TL_D1_ssDNA_regist.tif \
    --template=./ssDNA_matrix_template.txt \
    --overlaid-image=./SS200000135TL_D1.overlay.tif
```

Stack the matrix template onto `SS200000135TL_D1_ssDNA_regist.tif` image to verify the registration outcome.

<figure><img src="/files/rL42NZWmYvXvhWYnYSbe" alt="" width="375"><figcaption></figcaption></figure>

<figure><img src="/files/gnVK737P0fdFKLDrkxoL" alt="" width="375"><figcaption></figcaption></figure>


