RNA velocity is a computational framework for inferring the likely future state of individual cells in single-cell RNA-seq experiments. RNA-seq data provide a static snapshot of transcription, so they reveal little about dynamic processes that occur during development or in response to stimuli. To address this limitation, La Manno and collaborators (2018) introduced RNA velocity, which builds on the observation that nascent (unspliced) RNA follows different dynamics from mature (spliced) RNA. From the ratio of unspliced to spliced RNA, the method predicts the direction of a cell state transition in the high-dimensional gene expression space, offering insights into cellular trajectories.
At its core, RNA velocity models transcriptional activity with splicing kinetics equations, fitted under either a steady-state or a dynamical model. A simple way to picture it is to imagine a UMAP plot as a movie in which cells move from one state to another over time; RNA velocity predicts those movements. Note that RNA velocity does not measure transcription directly. It infers it by comparing unspliced and spliced RNA.
Unspliced molecules represent RNA that has been transcribed recently; the higher their proportion relative to spliced molecules, the more actively a given gene is being transcribed. Mature (spliced) RNA represents the pool of RNA already synthesized and processed. Comparing the two indicates whether the spliced pool is about to grow or shrink, and transitions can be predicted accordingly. A useful analogy is motion in physics: the spliced RNA levels correspond to a cell’s current position, while the excess or deficit of nascent (unspliced) RNA reflects its velocity. With these two components, RNA velocity extrapolates a cell’s future position (state) at time t + 1.
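The idea of extrapolating a future state from unspliced and spliced counts can be illustrated with a toy calculation. The sketch below is a deliberately simplified steady-state model, not the procedure used by any published tool: it assumes a splicing rate of 1, fits a single ratio `gamma` per gene by least squares through the origin, and takes `unspliced - gamma * spliced` as the velocity. The function names and the toy counts are made up for illustration.

```python
import numpy as np


def steady_state_velocity(unspliced, spliced):
    """Toy steady-state velocity for arrays of shape (n_cells, n_genes).

    Assumption: at steady state, unspliced ~= gamma * spliced for each gene,
    with the splicing rate fixed to 1.
    """
    u = np.asarray(unspliced, dtype=float)
    s = np.asarray(spliced, dtype=float)
    # Per-gene least-squares slope through the origin: u ~ gamma * s
    denom = (s * s).sum(axis=0)
    gamma = np.divide((u * s).sum(axis=0), denom,
                      out=np.zeros_like(denom), where=denom > 0)
    # Positive velocity: more unspliced RNA than expected, so the spliced
    # pool is predicted to grow. Negative velocity: it is predicted to shrink.
    velocity = u - gamma * s
    return gamma, velocity


def extrapolate(spliced, velocity, dt=1.0):
    """Predicted spliced levels after a time step dt (clipped at zero)."""
    s = np.asarray(spliced, dtype=float)
    return np.clip(s + dt * velocity, 0.0, None)


# Toy data: three cells, one gene (values are invented).
spliced = np.array([[2.0], [4.0], [6.0]])
unspliced = np.array([[1.0], [3.0], [2.0]])

gamma, velocity = steady_state_velocity(unspliced, spliced)
future = extrapolate(spliced, velocity, dt=1.0)

print("gamma:", gamma)
print("velocity per cell:", velocity.ravel())
print("extrapolated spliced levels:", future.ravel())
```

In this toy data set the second cell has more unspliced RNA than the fitted ratio predicts, so its velocity is positive; the third cell has less, so its velocity is negative.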
This method has multiple applications in diverse research areas, including:
Data used in this tutorial corresponds to PBMCs from a healthy individual, described here.
mamba create -n RNAvel python=3.11 numpy=1.23 pandas=1.5 scanpy=1.9 pyroe salmon alevin-fry
mamba activate RNAvel
# Genome
wget https://ftp.ensembl.org/pub/current_fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.toplevel.fa.gz
# GTF
wget https://ftp.ensembl.org/pub/current_gtf/homo_sapiens/Homo_sapiens.GRCh38.113.chr_patch_hapl_scaff.gtf.gz
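# GENOME and GTF are shell variables that must point to the genome FASTA and GTF files downloaded above
pyroe make-splici "$GENOME" "$GTF" 151 hs_GRCh38_113_splici_python --flank-trim-length 5 --filename-prefix splici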
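The above command will produce a folder named hs_GRCh38_113_splici_python, containing files like the ones below. The fl146 in the file names is the flank length: the read length (151) minus the flank trim length (5).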
clean_gtf.gtf
gene_id_to_name.tsv
splici_fl146.fa
splici_fl146_t2g_3col.tsv
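To get a feel for what the three-column mapping file contains, it can help to tally its entries per gene. The sketch below assumes the layout is tab-separated with the columns target name, gene name, and splicing status (`S` for spliced, `U` for unspliced); that layout should be checked against the file you actually generated. The identifiers in `example` are hypothetical placeholders, not real Ensembl IDs, and `count_by_status` is a helper written for this illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt standing in for splici_fl146_t2g_3col.tsv.
# Assumed columns: target name, gene name, status (S = spliced, U = unspliced).
example = (
    "TX1\tGENE1\tS\n"
    "TX2\tGENE1\tS\n"
    "GENE1-I\tGENE1\tU\n"
    "TX3\tGENE2\tS\n"
    "GENE2-I\tGENE2\tU\n"
    "GENE2-I1\tGENE2\tU\n"
)


def count_by_status(handle):
    """Count reference targets per (gene, status) pair in a 3-column t2g file."""
    counts = Counter()
    for target, gene, status in csv.reader(handle, delimiter="\t"):
        counts[(gene, status)] += 1
    return counts


counts = count_by_status(io.StringIO(example))
for (gene, status), n in sorted(counts.items()):
    print(gene, status, n)
```

For a real run, the `io.StringIO(example)` argument would be replaced by an open handle on the generated file, for example `open("hs_GRCh38_113_splici_python/splici_fl146_t2g_3col.tsv", newline="")`.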