RNA Seq: Principle and Workflow of RNA Sequencing

CD Genomics
12 Sept 2021 · 09:28

Summary

TLDR: This video delves into RNA-Seq, a next-generation sequencing technique that uncovers RNA presence and quantity in biological samples. It traces RNA-Seq's evolution from Sanger sequencing to Illumina's dominance and emerging third-generation technologies. RNA-Seq offers high-throughput, single-base resolution profiling, overcoming microarray and Sanger sequencing limitations. The video outlines the workflow, from RNA isolation to data generation, and touches on challenges like PCR bias. It also discusses the importance of quality assessment and the technology's ability to detect sequence variations and gene regulation.

Takeaways

  • 🧬 **RNA-Seq Definition**: RNA sequencing (RNA-Seq) uses next-generation sequencing (NGS) to analyze the presence and quantity of RNA in a sample at a specific developmental stage or physiological condition.
  • 📈 **Development of RNA-Seq**: RNA-Seq evolved from Sanger sequencing and microarrays to become dominant with the advent of NGS technologies like Illumina and Roche 454.
  • 🔎 **Advantages of RNA-Seq**: RNA-Seq offers high-throughput RNA profiling at single base resolution with low background noise, surpassing microarrays and Sanger sequencing in sensitivity and specificity.
  • 🧪 **Third-Generation Sequencing**: Third-generation sequencing technologies like Pacific Biosciences provide long-read sequencing, useful for identifying new transcripts and isoforms without fragmentation.
  • 📉 **Challenges in RNA-Seq**: Challenges include short read biases and PCR amplification biases that can affect gene expression quantitation.
  • 🧪 **Workflow of RNA-Seq**: The workflow involves converting RNA into cDNA fragments, attaching sequencing adapters, generating sequence data, and aligning reads to a reference genome or transcriptome.
  • 🌟 **Types of Reads**: RNA-Seq classifies reads into exonic, junction, and poly-A reads to generate a base resolution expression profile.
  • 🧪 **Sample Collection**: Total RNA is isolated and processed, often involving ribosomal RNA depletion to enrich non-ribosomal RNA species.
  • 🔬 **Fragmentation**: RNA molecules are fragmented into smaller pieces for sequencing, with cDNA fragmentation providing information about the 3' ends and RNA fragmentation accessing the transcript body.
  • 📊 **Bioinformatics Analysis**: The first step in RNA-Seq analysis is quality assessment, followed by read mapping or assembly, and gene expression level inference.
  • 🔍 **Applications**: RNA-Seq is used to detect differential expression, SNPs, fusion genes, and post-transcriptional gene regulation.

Q & A

  • What is RNA-seq and what does it reveal about biological samples?

    -RNA-seq, short for RNA sequencing, uses next-generation sequencing (NGS) to reveal the presence and quantity of RNA in a biological sample at a specific developmental stage or physiological condition. It is a powerful tool for analyzing the dynamic cellular transcriptome, which is essential for interpreting the functional elements of the genome, revealing molecular constituents of cells and tissues, and understanding development and disease.

  • What technological advancements have influenced the development of RNA-seq?

    -The development of RNA-seq has been influenced by several technological advancements, including Sanger sequencing technology, microarrays, and the advent of next-generation sequencing technologies. The first RNA-seq paper was published in 2006 using Roche 454 technology, and the technology matured with the advent of Illumina technology in 2008. Third-generation sequencing technologies, such as Pacific Biosciences' long-read sequencing, are also emerging in RNA-seq.

  • How does third-generation sequencing technology impact RNA-seq?

    -Third-generation sequencing technology, such as Pacific Biosciences' long-read sequencing, allows for full-length transcript sequencing without the need for fragmentation. This is useful for identifying new transcripts and novel isoforms, alternative splicing sites, fusion gene expression, and allelic expression more accurately.

  • What are the advantages of RNA-seq over traditional microarray and Sanger sequencing?

    -RNA-seq has several advantages over traditional microarray and Sanger sequencing, including high throughput RNA profiling at single base resolution, low background noise, the ability to map transcribed regions and gene expression simultaneously, and the capability to distinguish different isoforms and allelic expression. It also requires a lower amount of RNA and is less costly.

  • What are the challenges faced by RNA-seq?

    -RNA-seq faces challenges such as short read biases and PCR biases. Short read lengths were a concern, but Illumina sequencing technology has increased read length and throughput. PCR amplification can impact the accuracy of gene expression quantitation, but amplification-free technologies and PCR-free methods for Illumina sequencing have been developed to address this.

  • What is the workflow of RNA-seq using high-throughput sequencing technology?

    -The workflow of RNA-seq involves converting long RNAs into a library of cDNA fragments through RNA or DNA fragmentation, attaching sequencing adapters to each cDNA fragment, and generating sequence data from both ends in a high-throughput manner. The resulting sequence reads are aligned with the reference genome or transcriptome and classified into exonic, junction, and poly-A reads to generate a base resolution expression profile.

  • How is total RNA isolated for RNA-seq?

    -Total RNA is usually isolated via organic extraction or silica membranes of spin columns. The total RNA sample is then processed either by direct selection of polyA RNA or by selective removal of ribosomal RNA, as ribosomal RNA is often not the research focus and can greatly reduce the coverage of useful transcripts.

  • Why is ribosomal RNA depletion preferred over polyA RNA selection?

    -Ribosomal RNA depletion is preferred over polyA RNA selection because it enriches all non-ribosomal RNA species, including tRNA, ncRNA, non-polyA mRNA, and pre-processed RNA. This approach captures a broader range of RNA species compared to polyA RNA selection, which may miss some RNA transcripts that lack polyA tails.

  • How are larger RNA molecules prepared for deep sequencing technologies?

    -Larger RNA molecules need to be fragmented into smaller pieces (200 to 500 nt) before deep sequencing technologies. cDNA fragmentation is biased towards the identification of sequences from the 3' ends of transcripts, while RNA fragmentation provides access to the precise identity of the transcript body.

  • What are the different platforms that dominate RNA-seq?

    -The video names Illumina, Ion Torrent, the Roche 454 pyrosequencing system, and Pacific Biosciences' SMRT as the platforms that dominate RNA-seq. The stated read lengths are 200 to 600 bp for Illumina, 400 bp for Ion Torrent, 400 to 700 bp for 454, and 15 to 20 kb for SMRT.

  • How does RNA-seq enable the detection of differential expression and other genomic features?

    -RNA-seq allows for the detection of differential expression across treatments or conditions by normalizing the differences between samples. It also enables the identification of SNPs, fusion genes, and post-transcriptional gene regulation, such as RNA editing, degradation, and translation.
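
The normalization step mentioned in the last answer can be illustrated with a minimal Python sketch. The video only says that differences such as library size and gene-specific features must be adjusted; counts per million (CPM) and transcripts per million (TPM) are two common ways to do that and are used here as an assumption, not as the method the video prescribes. The gene names, counts, and lengths are hypothetical toy values.

```python
def counts_per_million(counts):
    """Scale the raw read counts of one sample so that they sum to one million."""
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: 1e6 * n / library_size for gene, n in counts.items()}


def transcripts_per_million(counts, lengths_bp):
    """Correct for gene length first, then rescale to one million."""
    rate = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rate.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: 1e6 * r / total for gene, r in rate.items()}


# Hypothetical toy data: sample_b was sequenced twice as deeply as sample_a.
sample_a = {"geneX": 100, "geneY": 300, "geneZ": 600}
sample_b = {"geneX": 200, "geneY": 600, "geneZ": 1200}
lengths = {"geneX": 1000, "geneY": 3000, "geneZ": 2000}

cpm_a = counts_per_million(sample_a)
cpm_b = counts_per_million(sample_b)
tpm_a = transcripts_per_million(sample_a, lengths)

print(cpm_a)
print(cpm_b)
print(tpm_a)
```

After CPM scaling the two samples look identical, which is the point of library-size normalization: the doubled counts in `sample_b` reflect sequencing depth, not expression. TPM additionally accounts for gene length, one example of a gene-specific feature.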

Outlines

00:00

🔬 Introduction to RNA Sequencing (RNA-seq) and its Evolution

This paragraph introduces RNA sequencing (RNA-seq), explaining its role in using next-generation sequencing (NGS) to detect the presence and quantity of RNA in biological samples. RNA-seq helps analyze the dynamic transcriptome of cells, crucial for understanding gene functions, cell composition, and disease mechanisms. The paragraph also discusses the historical development of RNA-seq, starting from early transcriptomics methods like SAGE and microarrays, and how NGS surpassed microarrays. The first RNA-seq paper was published in 2006, with the technology becoming dominant by 2008, driven by advancements in Illumina technology and the promise of third-generation sequencing technologies like Pacific Biosciences for more comprehensive transcript identification.

05:01

🧬 RNA-seq’s Advantages and Challenges

Here, the technological advantages of RNA-seq over older methods like microarray and Sanger sequencing are discussed. RNA-seq offers high throughput, single-base resolution, low background noise, and cost efficiency. It enables accurate mapping of gene expression and identification of different isoforms and allelic expressions. However, challenges like short read lengths and PCR biases are mentioned, though improvements in sequencing technology, such as longer reads and amplification-free techniques, are mitigating these issues. Third-generation sequencing is highlighted for full-length transcript sequencing, further improving RNA-seq's accuracy.

📊 RNA-seq Workflow Overview

This section details the typical RNA-seq workflow, starting from the conversion of long RNAs into cDNA fragments. The cDNA fragments are then sequenced, and the sequence reads are aligned with reference genomes or transcriptomes. The paragraph outlines the different types of reads generated—exonic, junction, and poly(A)—which help create a base-resolution gene expression profile. The workflow includes RNA extraction methods like organic extraction or silica membranes and highlights the importance of isolating non-ribosomal RNA to improve coverage of relevant transcripts. Challenges with fragmentation and bias are also mentioned, alongside the benefits of using RNA fragmentation for more even coverage across transcripts.
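
The three read types can be made concrete with a small Python sketch. It assumes a deliberately simplified representation that is not taken from the video: each aligned read is a list of half-open `(start, end)` blocks on the reference, exons are given as intervals, and an unaligned 3' tail is passed as a string. The thresholds in `is_poly_a` are hypothetical.

```python
def is_poly_a(tail, min_length=8, min_fraction=0.9):
    """Treat an unaligned tail as poly(A) if it is long enough and mostly A."""
    if len(tail) < min_length:
        return False
    return tail.count("A") / len(tail) >= min_fraction


def classify_read(blocks, exons, clipped_tail=""):
    """Assign a read to one of the three types named in the video, or 'other'."""
    if is_poly_a(clipped_tail):
        return "poly(A) end"
    if len(blocks) > 1:
        # The alignment is split across a gap, so it spans an exon-exon junction.
        return "junction"
    start, end = blocks[0]
    if any(ex_start <= start and end <= ex_end for ex_start, ex_end in exons):
        return "exonic"
    return "other"


# Hypothetical gene model with two exons.
exons = [(100, 200), (300, 400)]

labels = [
    classify_read([(120, 170)], exons),
    classify_read([(180, 200), (300, 330)], exons),
    classify_read([(360, 400)], exons, clipped_tail="AAAAAAAAAAAA"),
    classify_read([(220, 270)], exons),
]
print(labels)
```

Real aligners report this information in their own formats; the sketch only shows the decision logic behind the three categories.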

🧪 Ribosomal RNA Depletion and Fragmentation Methods

This paragraph focuses on ribosomal RNA depletion, a key step in RNA-seq for enriching non-ribosomal RNA species like tRNA, non-coding RNA, and mRNA. It explains two popular depletion methods: hybridization with biotin-labeled anti-rRNA probes, followed by removal with magnetic beads, and selective degradation of rRNA by exonucleases. The paragraph also discusses the importance of fragmentation, which breaks down larger RNA molecules into smaller pieces for sequencing. Different methods of fragmentation provide varying biases, with RNA fragmentation covering more of the transcript body, while cDNA fragmentation focuses on transcript ends.

🧬 Preserving Strand-Specific Information and RNA-seq Platforms

This section explains how strand-specific information, crucial for understanding transcriptional direction, is often lost in classic NGS protocols. To counter this, methods such as pre-treating RNA samples with sodium bisulfite or directly ligating RNA adapters have been developed. The paragraph also reviews the major RNA-seq platforms, including Illumina, Ion Torrent, Roche 454, and Pacific Biosciences, comparing their read lengths. Longer reads are valuable for studying complex transcriptomes, revealing exon connectivity and sequence variations, which helps in understanding gene expression and regulation more comprehensively.
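
How a bisulfite pre-treatment can mark the strand of origin is easier to see in a toy simulation. The Python sketch below assumes complete conversion of every cytidine, which is a simplification, and works with DNA letters, so the converted base appears as T in the read. The reference sequence and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def bisulfite_convert(transcript):
    """Model complete conversion: every C in the transcript reads out as T."""
    return transcript.replace("C", "T")


def infer_strand(reference, aligned_read):
    """Compare a read, given in reference orientation, with the reference.

    C-to-T mismatches point to a plus-strand transcript; G-to-A mismatches
    point to a minus-strand transcript.
    """
    pairs = list(zip(reference, aligned_read))
    c_to_t = sum(1 for ref, obs in pairs if ref == "C" and obs == "T")
    g_to_a = sum(1 for ref, obs in pairs if ref == "G" and obs == "A")
    if c_to_t > g_to_a:
        return "+"
    if g_to_a > c_to_t:
        return "-"
    return "undetermined"


# Hypothetical reference segment, written 5' to 3' on the plus strand.
reference = "ATGCCGTACGGATCCA"

# A plus-strand transcript has the reference sequence itself.
plus_read = bisulfite_convert(reference)

# A minus-strand transcript is the reverse complement; after conversion the
# read is flipped back into reference orientation for comparison.
minus_read = reverse_complement(bisulfite_convert(reverse_complement(reference)))

print(infer_strand(reference, plus_read))
print(infer_strand(reference, minus_read))
```

The direction of the mismatch survives double-stranded cDNA synthesis, which is why the conversion preserves information that adapter ligation onto double-stranded fragments would otherwise lose.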

🖥️ Bioinformatics and Data Analysis in RNA-seq

This paragraph emphasizes the bioinformatics analysis required for RNA-seq. The first step involves quality assessment, where low-quality and adapter sequences are removed to ensure accurate results. Once the sequence reads are filtered and mapped to the reference genome, gene expression levels are inferred. The data can be used to generate a transcriptome map on a genome-wide scale, allowing the detection of differential gene expression across different treatments or conditions. Additional RNA-seq capabilities include identifying SNPs, fusion genes, and post-transcriptional gene regulation mechanisms like RNA editing and degradation.
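
The quality assessment step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the video: quality strings are assumed to be Phred+33 encoded, the adapter is a placeholder sequence matched exactly, and the thresholds are hypothetical. Removal of over-represented sequences, which the video also mentions, is not covered.

```python
# Placeholder adapter sequence for illustration only (an assumption).
ADAPTER = "AGATCGGAAGAGC"


def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - offset for ch in quality_string]


def trim_adapter(seq, qual, adapter=ADAPTER):
    """Cut the read at the first exact adapter match, if there is one."""
    idx = seq.find(adapter)
    if idx == -1:
        return seq, qual
    return seq[:idx], qual[:idx]


def filter_reads(reads, min_mean_quality=20, min_length=20):
    """Keep reads that are long enough and of good enough quality after trimming."""
    kept = []
    for name, seq, qual in reads:
        seq, qual = trim_adapter(seq, qual)
        if len(seq) < max(min_length, 1):
            continue
        scores = phred_scores(qual)
        if sum(scores) / len(scores) < min_mean_quality:
            continue
        kept.append((name, seq, qual))
    return kept


# Hypothetical toy reads as (name, sequence, quality string) tuples.
insert = "ACGT" * 6
reads = [
    ("read1", insert + ADAPTER, "I" * (len(insert) + len(ADAPTER))),  # adapter read-through
    ("read2", insert, "#" * len(insert)),                             # low quality throughout
    ("read3", "ACGT" + ADAPTER, "I" * (4 + len(ADAPTER))),            # too short once trimmed
]

kept = filter_reads(reads)
print([name for name, _, _ in kept])
```

Only `read1` survives: its adapter is trimmed and the remaining 24 bases have high quality, while the other two reads fail the quality and length checks.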

Keywords

💡RNA-Seq

RNA-Seq, short for RNA sequencing, is a powerful tool that uses next-generation sequencing (NGS) to reveal the presence and quantity of RNA in a biological sample at a specific developmental stage or physiological condition. It is essential for interpreting the functional elements of the genome, revealing the molecular constituents of cells and tissues, and understanding development and disease. In the script, RNA-Seq is the central theme, with the video providing a detailed introduction to its development and workflow.

💡Next Generation Sequencing (NGS)

NGS is a collective term for modern sequencing technologies that allow for the sequencing of an entire genome at a much faster pace and lower cost than traditional Sanger sequencing. The script mentions that RNA-Seq utilizes NGS to analyze the cellular transcriptome, which is crucial for understanding gene expression and regulation.

💡Transcriptome

The transcriptome refers to the complete set of RNA molecules, including mRNA, rRNA, tRNA, and other non-coding RNA produced in the cells of an organism. In the context of the video, RNA-Seq is used to analyze the continuously changing cellular transcriptome, which is essential for understanding gene expression and regulation.

💡Sanger Sequencing

Sanger sequencing is a DNA sequencing method and was the first sequencing technology applied to transcriptomics. The script mentions that Sanger sequencing enabled methods like Serial Analysis of Gene Expression (SAGE), which was one of the first attempts to quantify gene expression on a global basis.

💡Microarrays

Microarrays are a technology that uses complementary probe hybridization to measure the expression levels of thousands of genes simultaneously. The script discusses how microarrays quickly emerged and dominated the field of transcriptomics profiling for the next decade before being surpassed by NGS technologies.

💡Illumina Technology

Illumina is a sequencing platform that uses NGS technology and is widely used in RNA-Seq. The script notes that the era of RNA-Seq dominance began in 2008 with the maturity of Illumina technology, which has steadily increased read length and throughput since its introduction.

💡Third Generation Sequencing

Third generation sequencing technologies, such as Pacific Biosciences' long-read sequencing, are characterized by their ability to sequence full-length transcripts without the need for fragmentation. The script highlights that this technology is useful for identifying new transcripts and accurately identifying isoforms, alternative splicing sites, fusion gene expression, and allelic expression.

💡Isoforms

Isoforms are different versions of a protein that are produced by alternative splicing of a single gene. The script mentions that RNA-Seq can be used to distinguish different isoforms, which is crucial for understanding the complexity of gene expression and regulation.

💡Allelic Expression

Allelic expression refers to the expression of one of the two alleles of a gene. The script notes that RNA-Seq can be used to distinguish allelic expression, which is important for understanding gene regulation and the role of genetic variation in phenotypic diversity.

💡PolyA RNA

PolyA RNA refers to RNA molecules that carry a polyadenylated (polyA) tail, a characteristic feature of most eukaryotic mRNAs. The script explains that oligo-dT-based mRNA purification is widely used in eukaryotes, which makes polyA selection a common step in the RNA-Seq workflow, although transcripts that lack polyA tails are missed.

💡Ribosomal RNA Depletion

Ribosomal RNA depletion is a method used to selectively remove ribosomal RNA from a sample before sequencing. The script mentions that this approach is preferred because it enriches all non-ribosomal RNA species, including tRNA, ncRNA, nonpolyA mRNA, and pre-processed RNA, which are often the focus of research.

Highlights

RNA-Seq is a powerful tool to analyze the cellular transcriptome, essential for understanding development and disease.

Sanger sequencing and microarrays were early methods for transcriptomics, but have been surpassed by next-generation sequencing.

The first RNA-Seq paper, which used Roche 454 technology, was published in 2006.

The era of RNA-Seq dominance began in 2008, when Illumina technology matured.

Third-generation sequencing technologies like Pacific Biosciences offer long-read sequencing, useful for identifying new transcripts and isoforms.

RNA-Seq provides high throughput RNA profiling at single base resolution with low background noise.

Compared to microarray and Sanger sequencing, RNA-Seq requires less RNA and is less costly.

Challenges for RNA-Seq include short read and PCR biases, which newer technologies are addressing.

Long, paired, and strand-specific reads are used for higher mappability and de novo assembly of transcriptomes.

The workflow of RNA-Seq involves converting long RNAs into cDNA fragments, attaching sequencing adapters, and generating sequence data.

RNA samples are processed by direct selection of polyA RNA or by selective removal of ribosomal RNA.

Ribosomal RNA depletion is preferred for enriching all non-ribosomal RNA species.

RNA fragmentation is crucial for sequencing larger RNA molecules into smaller pieces suitable for deep sequencing technologies.

Strand specificity can be maintained through various methods, including sodium bisulfite treatment and direct ligation of RNA adapters.

RNA-Seq platforms include Illumina, Ion Torrent, Roche 454, and Pacific Biosciences, each with different read lengths and capabilities.

RNA-Seq allows for the detection of differential expression across treatments or conditions.

Quality assessment is a critical first step in RNA-Seq bioinformatics analysis, ensuring accurate results.

RNA-Seq enables the identification of SNPs, fusion genes, and post-transcriptional gene regulation.

For more information on RNA-Seq services, visit the website www.cdgenomics.com.

Transcripts

00:00

Welcome back to the CD Genomics next-generation sequencing video series. The topic here is the development and workflow of RNA-seq. We will give you a detailed introduction to RNA-seq, including what RNA-seq is, the development of it, and the workflow of it.

00:19

RNA-seq, the abbreviation of RNA sequencing, utilizes next-generation sequencing (NGS) to reveal the presence and quantity of RNA in a biological sample at a specific developmental stage or physiological condition. RNA-seq is a powerful tool to analyze the continuously changing cellular transcriptome, which is essential for interpreting the functional elements of the genome, revealing the molecular constituents of cells and tissues, and understanding development and disease.

00:52

Sanger sequencing technology was first used for transcriptomics, which enabled methods such as SAGE (serial analysis of gene expression). SAGE was one of the first attempts to quantify gene expression on a global basis. Almost simultaneously, microarrays utilizing complementary probe hybridization quickly emerged and came to dominate the field of transcriptomics profiling for the next decade.

01:20

The advent of next-generation technologies has enabled the sequencing approach to surpass the microarray approach. In 2006, the first RNA-seq paper was published, utilizing Roche 454 technology. The era of RNA-seq dominance began in 2008 with the maturity of Illumina technology.

01:42

Despite the popularization of NGS technologies, the application of third-generation sequencing in RNA-seq is on its way. Pacific Biosciences' long-read sequencing technology can easily cover complete transcripts without the need for fragmentation, which is useful to identify new transcripts and new introns, thereby accurately identifying isoforms, alternative splicing sites, fusion gene expression, and allelic expression.

02:13

RNA-seq is often referred to as RNA profiling using the next- or third-generation sequencing technologies. Compared to traditional microarray and Sanger sequencing, it has multiple advantages in technology specifications, applications, and practical issues. RNA-seq is a high-throughput RNA profiling technology at single-base resolution with low background noise. It can be used to simultaneously map transcribed regions and gene expression, as well as distinguish different isoforms and allelic expression. Compared to microarray and Sanger sequencing, it requires a lower amount of RNA and costs less.

02:55

However, RNA-seq is faced with several challenges, such as short-read and PCR biases. Short reads used to be one concern, but Illumina sequencing technology has steadily increased read length and throughput since its introduction in 2007. Long, paired, and strand-specific reads are commonly used for higher levels of mappability and de novo assembly of transcriptomes. Furthermore, third-generation sequencing technology such as Pacific Biosciences SMRT enables full-length transcript sequencing.

03:31

Another concern is the impact of PCR amplification on the accuracy of gene expression quantitation via RNA-seq. Helicos and some of the third-generation sequencers use an amplification-free technology. There are also PCR-free methods for Illumina sequencing.

03:51

The workflow of RNA-seq utilizing high-throughput sequencing technology is illustrated in the left figure. Briefly, long RNAs are first converted into a library of cDNA fragments through RNA or DNA fragmentation. Sequencing adapters are then attached to each cDNA fragment, and sequence data are generated in a high-throughput manner from both ends. The resulting sequence reads are subsequently aligned with the reference genome or transcriptome and are classified into three types: exonic reads, junction reads, and poly(A) end reads. A base-resolution expression profile can be generated by using these three types of sequence reads.

04:36

Following sample collection, total RNA is usually isolated via organic extraction or silica membranes of spin columns. The total RNA sample is subsequently processed either by direct selection of poly(A) RNA or by selective removal of ribosomal RNA, because the abundant ribosomal RNA is usually not the research focus and greatly reduces the coverage of the useful transcripts.

05:03

The oligo(dT)-based mRNA purification procedure is widely used in eukaryotes. However, some RNA transcripts that lack poly(A) tails are missed. Compared to poly(A) RNA selection, the ribosomal RNA depletion approach is preferred because it enriches all non-ribosomal RNA species, including tRNA, ncRNA, non-poly(A) mRNA, and pre-processed RNA.

05:32

There are two popular ribosomal RNA depletion methods. One is the hybridization of ribosomal RNA with biotin-labeled anti-ribosomal RNA probes, followed by removal with streptavidin-coated magnetic beads. The other is the selective degradation of ribosomal RNA by a 5' to 3' exonuclease that specifically recognizes ribosomal RNA with a 5' phosphate.

05:59

Fragmentation is subsequently conducted to reach the desired length for different NGS technologies. Some small RNAs, such as microRNAs, Piwi-interacting RNAs, and short interfering RNAs, can be directly sequenced without fragmentation. Larger RNA molecules need to be fragmented into smaller pieces (200 to 500 nt) before deep sequencing.

06:27

cDNA fragmentation is usually strongly biased towards the identification of sequences from the 3' ends of transcripts, while RNA fragmentation has little bias over the transcript but is depleted for transcript ends. Therefore, cDNA fragmentation provides valuable information about the precise identity of these ends, and RNA fragmentation provides access to the precise identity of the transcript body.

06:56

In the classic NGS protocols, adapters are ligated onto sheared double-stranded DNA fragments. However, a major drawback of this approach is the loss of information on transcriptional direction. The pre-treatment of the RNA samples with sodium bisulfite can convert cytidine into uridine; the widespread C-to-T transition thereby marks the coding strand of each transcript. Some other methods that maintain strand specificity have been proposed, such as direct ligation of RNA adapters to the RNA sample before reverse transcription.

07:34

RNA-seq is currently dominated by three different platforms: Illumina, Ion Torrent, Roche 454, and Pacific Biosciences SMRT. Read lengths range from 200 to 600 bp for Illumina, 400 bp for Ion Torrent, 400 to 700 bp for the 454 pyrosequencing system, and 15 to 20 kilobases for Pacific Biosciences SMRT platforms.

08:03

Longer reads or paired-end short reads can reveal connectivity between multiple exons. RNA-seq is a powerful method to study complex transcriptomes and reveal sequence variations in the transcribed regions.

08:18

Quality assessment is the first step for the bioinformatics analysis of RNA-seq, which ensures a coherent final result by removal of low-quality sequences, over-represented sequences, and adapter sequences. Once all reads have been filtered and mapped or assembled, gene expression levels can thus be inferred, leading to a genome-scale transcriptome map in terms of quality and quantity.

08:44

RNA-seq also allows detecting differential expression across treatments or conditions. Normalization has to be conducted to adjust the differences between samples, such as library size and gene-specific features. Furthermore, RNA-seq enables us to identify SNPs, fusion genes, and post-transcriptional gene regulation, such as RNA editing, degradation, and translation.

09:12

In the end, if you want more information about accurate and reliable RNA-seq service, please visit our website, www.cdgenomics.com. We are more than happy to be of assistance.
