Analysis

Clustering

SAW count, realign and reanalyze output spatial clustering results in AnnData H5AD format.

These files record preprocessing, dimensionality reduction, clustering and differential expression analysis.

Spatial clustering and UMAP projection in the H5AD can be visualized in StereoMap.

The following example lists the contents of an H5AD with h5dump:

$ h5dump -n ./C04144D5/outs/analysis/C04144D5.bin20_1.0.h5ad
HDF5 "./C04144D5/outs/analysis/C04144D5.bin20_1.0.h5ad" {
FILE_CONTENTS {
 group      /
 group      /X
 dataset    /X/data
 dataset    /X/indices
 dataset    /X/indptr
 group      /layers
 group      /layers/log1p
 dataset    /layers/log1p/data
 dataset    /layers/log1p/indices
 dataset    /layers/log1p/indptr
 group      /obs
 dataset    /obs/_index
 group      /obs/leiden
 dataset    /obs/leiden/categories
 dataset    /obs/leiden/codes
 dataset    /obs/n_genes_by_counts
 group      /obs/orig.ident
 dataset    /obs/orig.ident/categories
 dataset    /obs/orig.ident/codes
 dataset    /obs/pct_counts_mt
 dataset    /obs/total_counts
 dataset    /obs/x
 dataset    /obs/y
 group      /obsm
 dataset    /obsm/X_pca
 dataset    /obsm/X_umap
 dataset    /obsm/spatial
 group      /obsp
 group      /obsp/connectivities
 dataset    /obsp/connectivities/data
 dataset    /obsp/connectivities/indices
 dataset    /obsp/connectivities/indptr
 group      /obsp/distances
 dataset    /obsp/distances/data
 dataset    /obsp/distances/indices
 dataset    /obsp/distances/indptr
 group      /raw
 group      /raw/X
 dataset    /raw/X/data
 dataset    /raw/X/indices
 dataset    /raw/X/indptr
 group      /raw/var
 dataset    /raw/var/_index
 dataset    /raw/var/mean_umi
 dataset    /raw/var/n_cells
 dataset    /raw/var/n_counts
 group      /raw/var/real_gene_name
 dataset    /raw/var/real_gene_name/categories
 dataset    /raw/var/real_gene_name/codes
 group      /raw/varm
 group      /uns
 dataset    /uns/bin_size
 dataset    /uns/bin_type
 group      /uns/gene_exp_leiden
 dataset    /uns/gene_exp_leiden/1
 ...
 dataset    /uns/gene_exp_leiden/_index
 group      /uns/hvg
 dataset    /uns/hvg/method
 group      /uns/hvg/params
 dataset    /uns/hvg/source
 group      /uns/key_record
 dataset    /uns/key_record/cluster
 dataset    /uns/key_record/gene_exp_cluster
 dataset    /uns/key_record/hvg
 dataset    /uns/key_record/marker_genes
 dataset    /uns/key_record/neighbors
 dataset    /uns/key_record/pca
 dataset    /uns/key_record/umap
 dataset    /uns/leiden_resolution
 dataset    /uns/merged
 group      /uns/neighbors
 dataset    /uns/neighbors/connectivities_key
 dataset    /uns/neighbors/distances_key
 group      /uns/neighbors/params
 dataset    /uns/neighbors/params/method
 dataset    /uns/neighbors/params/metric
 dataset    /uns/neighbors/params/n_neighbors
 dataset    /uns/omics
 dataset    /uns/pca_variance_ratio
 group      /uns/rank_genes_groups
 dataset    /uns/rank_genes_groups/logfoldchanges
 group      /uns/rank_genes_groups/mean_count
 dataset    /uns/rank_genes_groups/mean_count/1
 ...
 dataset    /uns/rank_genes_groups/mean_count/_index
 dataset    /uns/rank_genes_groups/names
 group      /uns/rank_genes_groups/params
 dataset    /uns/rank_genes_groups/params/corr_method
 dataset    /uns/rank_genes_groups/params/groupby
 dataset    /uns/rank_genes_groups/params/layer
 dataset    /uns/rank_genes_groups/params/method
 dataset    /uns/rank_genes_groups/params/reference
 dataset    /uns/rank_genes_groups/params/use_raw
 group      /uns/rank_genes_groups/pts
 dataset    /uns/rank_genes_groups/pts/1
 ...
 dataset    /uns/rank_genes_groups/pts/_index
 group      /uns/rank_genes_groups/pts_rest
 dataset    /uns/rank_genes_groups/pts_rest/1
 ...
 dataset    /uns/rank_genes_groups/pts_rest/_index
 dataset    /uns/rank_genes_groups/pvals
 dataset    /uns/rank_genes_groups/pvals_adj
 dataset    /uns/rank_genes_groups/scores
 dataset    /uns/resolution
 dataset    /uns/result_keys
 group      /uns/sn
 dataset    /uns/sn/_index
 dataset    /uns/sn/batch
 dataset    /uns/sn/sn
 group      /var
 dataset    /var/_index
 dataset    /var/dispersions
 dataset    /var/dispersions_norm
 dataset    /var/highly_variable
 dataset    /var/mean_umi
 dataset    /var/means
 dataset    /var/n_cells
 dataset    /var/n_counts
 group      /var/real_gene_name
 dataset    /var/real_gene_name/categories
 dataset    /var/real_gene_name/codes
 group      /varm
 group      /varp
 }
}

Check datasets

Inspect the file in a Python Jupyter Notebook. The anndata package is typically used to read the metadata stored in an H5AD file.

Evaluating the variable adata in a notebook cell returns a description of the data contained in the adata object.

  • obs: Observations metadata (cell-level annotations). A table (DataFrame) storing information about cells or bins, such as total_counts (the MID count), n_genes_by_counts (the number of gene types detected), leiden cluster labels, and spatial coordinates.

  • var: Variables metadata (gene-level annotations). A table (DataFrame) storing information about genes, such as gene names, whether they are highly variable, or their biological functions.

adata.X is the preprocessed and normalized gene expression matrix, typically stored as a sparse matrix.
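The inspection steps above can be sketched in Python as follows. This is a minimal sketch, assuming the anndata package is installed; load_h5ad and summarize are hypothetical helper names introduced for illustration, and the path is the one from the h5dump example.

```python
from pathlib import Path


def load_h5ad(path):
    """Read an H5AD file into an AnnData object."""
    # Imported inside the function so that summarize() below can be used
    # even where the third-party anndata package is not installed.
    import anndata as ad

    return ad.read_h5ad(Path(path))


def summarize(adata):
    """Return the names stored in the main slots of an AnnData-like object."""
    return {
        "shape": (adata.n_obs, adata.n_vars),  # (bins or cells, genes)
        "obs": list(adata.obs.columns),
        "var": list(adata.var.columns),
        "obsm": list(adata.obsm.keys()),
        "layers": list(adata.layers.keys()),
        "uns": list(adata.uns.keys()),
    }


# Typical notebook usage (adjust the path to your own run):
#   adata = load_h5ad("./C04144D5/outs/analysis/C04144D5.bin20_1.0.h5ad")
#   adata              # shows the built-in description of the object
#   summarize(adata)   # the same slot names as a plain dict
```

The dict returned by summarize should mirror the groups listed by h5dump, which makes it a quick way to confirm that a file contains the expected slots.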

Type adata.obs to view the observations metadata.

The clusters produced by the Leiden algorithm are stored in adata.obs['leiden'].
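As a quick check on the clustering, you can count how many bins or cells fall into each Leiden cluster. The sketch below uses only the standard library; cluster_sizes is a hypothetical helper name, and it accepts adata.obs['leiden'] or any other iterable of labels.

```python
from collections import Counter


def cluster_sizes(labels):
    """Count how many bins or cells carry each cluster label.

    `labels` may be adata.obs['leiden'] or any iterable of labels.
    Returns (label, count) pairs, largest cluster first.
    """
    return Counter(str(label) for label in labels).most_common()


# Notebook usage:
#   cluster_sizes(adata.obs["leiden"])
```

With pandas available, adata.obs['leiden'].value_counts() gives the same counts as a Series.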

Type adata.var to view the variables metadata.

The results of dimensionality reduction are also available, for example in adata.obsm['X_pca'] and adata.obsm['X_umap'].
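The embeddings can be checked with a small helper like the one below. It is a sketch only: embedding_shapes is a hypothetical name, and the default keys are the ones that appear under /obsm in the h5dump listing (X_pca, X_umap, spatial).

```python
def embedding_shapes(obsm, keys=("X_pca", "X_umap", "spatial")):
    """Report the (rows, columns) shape of each embedding present in `obsm`.

    `obsm` is adata.obsm or any mapping whose values expose a `.shape`.
    Keys that are absent from the mapping are skipped silently.
    """
    return {key: tuple(obsm[key].shape) for key in keys if key in obsm}


# Notebook usage:
#   embedding_shapes(adata.obsm)
```

Each embedding should have one row per observation, so the first number in every shape is expected to equal adata.n_obs.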

Differential expression analysis

SAW count, realign and reanalyze output differential expression analysis results in CSV format.

Two result files relate to differential expression analysis: find_marker_genes.csv and <bin_size>_marker_features.csv.

  • find_marker_genes.csv is the original output file.

  • <bin_size>_marker_features.csv is a formatted CSV that records mean MID counts, L2FC, adjusted p-value, and expression ratio of marker features for each cluster.

For each feature per cluster, the following terms were computed:

  • Mean MID Count

  • Log2 fold change of expression

  • Adjusted p-value (statistical significance of the feature's expression in the current cluster relative to the others)

  • Expression proportion. For example, "Cluster 1 % of expressed = 1" means the feature is expressed in all cells/bins of Cluster 1
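To make these terms concrete, the sketch below computes them for a single feature from raw counts. It is an illustration under stated assumptions, not SAW's exact formula: the fold change here uses a pseudocount of 1 on the mean counts, and marker_stats is a hypothetical helper name. Adjusted p-values are omitted because they depend on the statistical test and the correction method.

```python
import math


def marker_stats(counts, labels, cluster, pseudocount=1.0):
    """Summarise one feature in `cluster` versus all other clusters.

    counts: per-bin (or per-cell) MID counts of the feature
    labels: cluster label of each bin, in the same order as `counts`
    """
    inside = [c for c, lab in zip(counts, labels) if lab == cluster]
    outside = [c for c, lab in zip(counts, labels) if lab != cluster]
    if not inside or not outside:
        raise ValueError("need at least one bin inside and outside the cluster")

    mean_in = sum(inside) / len(inside)
    mean_out = sum(outside) / len(outside)
    return {
        "mean_mid_count": mean_in,
        # Assumed illustrative definition: log2 ratio of pseudocounted means.
        "log2_fold_change": math.log2(
            (mean_in + pseudocount) / (mean_out + pseudocount)
        ),
        # Fraction of bins with at least one MID of this feature.
        "pct_expressed": sum(c > 0 for c in inside) / len(inside),
        "pct_expressed_rest": sum(c > 0 for c in outside) / len(outside),
    }


stats = marker_stats([4, 2, 3, 0, 2, 1], ["1", "1", "1", "2", "2", "2"], "1")
print(stats)
```

In this toy input every bin of cluster 1 has a nonzero count, so the expression proportion for cluster 1 is 1.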

To obtain differential expression results from an H5AD file, read it with the anndata package as well. The ranked-gene information is stored in adata.uns['rank_genes_groups'], which is a dictionary object. Retrieve the required values by key.
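The lookup by key can be sketched as below. This assumes that the entries names, scores, logfoldchanges, pvals and pvals_adj (all visible in the h5dump listing) can each be indexed by cluster label, as in the layout Scanpy uses; verify this against your own file. top_markers is a hypothetical helper name.

```python
from itertools import islice

# Keys listed under /uns/rank_genes_groups in the h5dump output.
FIELDS = ("names", "scores", "logfoldchanges", "pvals", "pvals_adj")


def top_markers(rank_genes_groups, cluster, n=5):
    """Return the first `n` ranked features of `cluster` as a list of dicts.

    Assumes rank_genes_groups[field][cluster] yields one value per feature,
    in ranked order, for every field in FIELDS.
    """
    columns = [rank_genes_groups[field][cluster] for field in FIELDS]
    return [dict(zip(FIELDS, values)) for values in islice(zip(*columns), n)]


# Notebook usage:
#   top_markers(adata.uns["rank_genes_groups"], "1", n=10)
```

The list of dicts can be passed directly to pandas.DataFrame if a table view is preferred.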

The CSV written under /outs/analysis records the same differential expression analysis results in a consolidated format.
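The CSV can also be read programmatically. The sketch below uses only the standard library and makes no assumption about the column names, which it reads from the header row; read_marker_table is a hypothetical helper name, and the path in the usage comment is a placeholder.

```python
import csv


def read_marker_table(path):
    """Read a marker CSV and return (column_names, rows as dicts)."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list.
        return list(reader.fieldnames or []), rows


# Notebook usage (replace the placeholder with your own file):
#   columns, rows = read_marker_table("<path to marker_features CSV>")
#   print(columns)
#   print(rows[0])
```

Printing the column names first is a simple way to see which statistics are recorded for each cluster before filtering or sorting the rows.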

The differential expression analysis result <bin_size>_marker_features.csv can be opened in StereoMap and Excel.

Multiomics clustering

If you run SAW reanalyze on a Stereo-CITE T FF sample, the joint multi-omics clustering result is saved in H5MU format.

Here is an example for displaying information recorded in an H5MU:
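A minimal sketch for opening such a file in Python, assuming the third-party mudata package is installed. load_h5mu and modality_shapes are hypothetical helper names, and the modality names are read from the file rather than assumed.

```python
def load_h5mu(path):
    """Read an H5MU file into a MuData object."""
    # Imported inside the function so that modality_shapes() below can be
    # used even where the third-party mudata package is not installed.
    import mudata

    return mudata.read_h5mu(path)


def modality_shapes(mdata):
    """Map each modality name to its (n_obs, n_vars) shape."""
    return {name: (mod.n_obs, mod.n_vars) for name, mod in mdata.mod.items()}


# Notebook usage (replace the placeholder with your own file):
#   mdata = load_h5mu("<path to H5MU file>")
#   mdata                   # shows the built-in description of the object
#   modality_shapes(mdata)  # one entry per omics layer
```

Each modality in mdata.mod is an AnnData object, so the inspection steps described for H5AD files apply to it as well.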
