Obtain rRNA information
rRNA (ribosomal RNA)
What is rRNA?
rRNA (ribosomal RNA) is the primary component of ribosomes, the molecular machines responsible for protein synthesis in cells. rRNA interacts with mRNA and tRNA to catalyze peptide bond formation, playing a crucial role in translation. rRNA can be classified into the following types based on function and location:
5S rRNA: found in the large ribosomal subunit, primarily involved in maintaining ribosome structure.
16S rRNA (prokaryotes) / 18S rRNA (eukaryotes): found in the small ribosomal subunit, responsible for mRNA recognition and translation initiation.
23S rRNA (prokaryotes) / 28S rRNA (eukaryotes): found in the large ribosomal subunit, involved in peptide bond formation and translation elongation.
5.8S rRNA (eukaryotes): found in the large ribosomal subunit, working with 28S and 5S rRNA to maintain ribosome function.
Presence in RNA sequencing experiments
In RNA sequencing (RNA-seq) experiments, rRNA is present mainly due to the following reasons:
High abundance of rRNA: rRNA constitutes 80%-90% of total cellular RNA, making it the most abundant RNA type.
Non-specific capture in experimental steps: during RNA extraction and library preparation, rRNA may be non-specifically captured and included in the sequencing library.
No or incomplete rRNA removal: if specific rRNA removal kits are not used, rRNA stays in the library; even when such kits are used, some rRNA may still remain.
Because rRNA is the most abundant RNA type in cells, it is inevitably present in RNA-seq experiments, and its sequences occupy a significant portion of the sequencing data. Using rRNA removal kits during these experiments reduces the sequencing depth required, thereby lowering costs.
rRNA sequences do not contain information about target gene expression and may interfere with the quantification of target gene expression and differential expression analysis. To enhance the effective utilization of sequencing data and improve the accuracy of data analysis, it is necessary to remove rRNA during both experimental and computational steps.
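As a rough illustration of why rRNA affects sequencing depth, the sketch below estimates how many reads remain informative for a given rRNA read fraction. The function name and the example fractions are hypothetical, and the rRNA share of reads in a real library is an assumption that depends on the sample and the library preparation.

```python
def usable_reads(total_reads: int, rrna_fraction: float) -> int:
    """Estimate the number of non-rRNA reads in a sequencing run.

    rrna_fraction is the assumed share of reads that come from rRNA (0.0 to 1.0).
    """
    if not 0.0 <= rrna_fraction <= 1.0:
        raise ValueError("rrna_fraction must be between 0 and 1")
    return round(total_reads * (1.0 - rrna_fraction))


if __name__ == "__main__":
    # Illustrative fractions only: a library without depletion versus a depleted one.
    for fraction in (0.85, 0.10):
        print(f"rRNA fraction {fraction:.2f}: {usable_reads(100_000_000, fraction)} usable reads")
```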
Obtain from RNAcentral
RNAcentral is a comprehensive non-coding RNA (ncRNA) database developed by the European Bioinformatics Institute (EBI). It integrates ncRNA data from multiple expert databases (e.g., Ensembl, GENCODE, miRBase, Rfam) to provide a unified reference platform for ncRNA research.
Search rRNA information
The following three search methods are provided on the homepage:
"Text search" searches the RNA sequences based on the provided keywords.
"Sequence search" aligns the input unknown fragments with databases to retrieve specific RNA information.
"Genome browser" provides a genome browser, where analysts can select a species, specify a chromosome location, and view the distribution of genes and sequences within a target interval.
"Text search" is recommended for rRNA information. Type whatever you know about the target rRNA into the search window: its name, species, tissue type, sequence length, RNA type (such as 5S or 18S), or other text information. Then select the appropriate qualifiers based on your analysis requirements.
For example, searching for the 18S rRNA of Homo sapiens (human) displays several rRNA records. The database from which each RNA is sourced is indicated below the search record. Download the rRNA information you need in FASTA format.

The downloaded rRNA FASTA file is gzip-compressed (*.fasta.gz). Decompress it first, for example with gunzip.
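If gunzip is not available, the decompression can also be done with the Python standard library. The following is a minimal sketch; the function names and any file names passed to them are hypothetical. It also counts the FASTA records as a quick sanity check on the download.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_fasta(gz_path, out_path=None):
    """Decompress a .gz file; by default write next to it without the .gz suffix."""
    gz_path = Path(gz_path)
    if out_path is None:
        out_path = gz_path.with_suffix("")  # "x.fasta.gz" -> "x.fasta"
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream the data instead of loading it all
    return Path(out_path)


def count_fasta_records(fasta_path):
    """Count sequences by counting header lines that start with '>'."""
    with open(fasta_path, "r") as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

A call such as `count_fasta_records(gunzip_fasta("my_rRNA.fasta.gz"))` (file name assumed) returns the number of downloaded sequences.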
Compiled rRNA index files
For easy use, the STOmics R&D team has compiled common rRNA information for Homo sapiens (human) and Mus musculus (mouse). You can directly download the STAR and Bowtie2 index files, which include rRNA information, from our datasets.
File size: 28.03GB md5sum: 6fa47b14dc26321d1cab691baee4fb2f
File size: 31.47GB md5sum: a86ceda324fa300d18f48b77502e5274
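After downloading, the md5sum values listed above can be used to check that a file arrived intact. The sketch below computes the MD5 digest in chunks, so a file of tens of gigabytes is never held in memory; the function names are hypothetical, and the path of the downloaded file is whatever you saved it as.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected):
    """Compare a file's MD5 digest with the expected value from the download page."""
    return md5_of_file(path) == expected.strip().lower()
```

If `matches_md5` returns False, the download is likely incomplete or corrupted and should be repeated.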
Remove rRNA
If you plan to remove rRNA fragments during SAW analysis, make sure of the following settings:
having added specific rRNA information to the transcriptomic reference.
using the --rRNA-remove parameter to start the SAW count analysis.
Add rRNA information to reference
Use --rRNA-fasta to specify the input rRNA information, which will be added to the --fasta file(s) after redundancy removal.
Key steps of the processing:
Step 1: because the rRNA fragments given by --rRNA-fasta are short and highly repetitive, the pipeline removes their redundancy first.
Step 2: add the rRNA information to the --fasta file(s), appending the suffix '_rRNA' to the chromosome name, like '1_rRNA', to distinguish rRNA sequences from the basic genome.
Step 3: build index files using the genome integrated with de-duplicated rRNA information.
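To make Steps 1 and 2 concrete, here is a minimal sketch of the idea, not the pipeline's actual implementation. It assumes that redundancy removal means dropping exact duplicate sequences (the real pipeline may use a different criterion), and all function names are hypothetical. Index building in Step 3 is left to the aligner and is not shown.

```python
import itertools


def read_fasta(path):
    """Yield (name, sequence) pairs; the name is the first word of the header."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name = (line[1:].split() or ["unnamed"])[0]
                chunks = []
            elif name is not None:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def dedupe_and_tag(records, suffix="_rRNA"):
    """Step 1 and 2 (assumed): drop exact duplicate sequences, then tag the names."""
    seen = set()
    for name, seq in records:
        key = seq.upper()
        if key in seen:
            continue  # an identical sequence was already kept
        seen.add(key)
        yield name + suffix, seq


def write_combined(genome_fasta, rrna_fasta, out_fasta, width=60):
    """Write the genome records followed by the de-duplicated, tagged rRNA records."""
    records = itertools.chain(read_fasta(genome_fasta),
                              dedupe_and_tag(read_fasta(rrna_fasta)))
    with open(out_fasta, "w") as out:
        for name, seq in records:
            out.write(f">{name}\n")
            for start in range(0, len(seq), width):
                out.write(seq[start:start + width] + "\n")
```

Tagging the names keeps rRNA hits separable from genomic hits after alignment, which is what allows them to be filtered later.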
Run count analysis
Let's take a simple analysis of FFPE data as an example: