The Beginner's Guide to RNA-Seq - #ResearchersAtWork Webinar Series

Applied Biological Materials - abm
31 Jul 2019 · 36:09

Summary

TLDR The webinar introduces RNA-seq, a technique for studying gene expression through high-throughput sequencing. It covers the basics, experimental design, and data analysis and interpretation. It stresses the importance of data normalization and cites major studies that have used RNA-seq, such as The Cancer Genome Atlas project. It also offers a promotional discount and invites the audience to take part in the Q&A session.

Takeaways

  • 📘 The webinar opened with an introduction to RNA-seq and explained, for newcomers, that ABM's webinars are designed to provide constant and systematic training on the company's products and services.
  • 🌐 Attendees were told that the slides and a recording would be shared after the webinar, and that registering ensures they receive this material.
  • 🔬 The speakers, Dr. Christopher and Battery, were introduced as specialists in scientific applications and product development, with experience in areas such as cell biology and gene therapy.
  • 🏢 Applied Biological Materials (ABM) was presented: founded in 2004 and headquartered in Vancouver, Canada, with a mission to catalyze scientific discoveries in life science and drug development.
  • 📈 ABM's growth and expanding facilities were mentioned, along with its commitment to offering world-class service to its clients.
  • 🧬 The background to RNA-seq was reviewed, from the northern blot in 1977 through microarrays to RNA-seq itself, which detects and quantifies RNA to study gene expression.
  • 🔍 The advantages of RNA-seq over earlier methods were discussed, including single-nucleotide resolution and the ability to sequence new transcriptomes without prior data.
  • 🧬 The importance of mRNA selection during sample preparation was covered, using either poly-A enrichment or ribosomal RNA (rRNA) depletion.
  • 🔬 The general RNA-seq workflow was explained: library preparation, bridge PCR, sequencing by synthesis, and data analysis.
  • 📊 Analysis and normalization of RNA-seq data were described, including fragments per kilobase of transcript per million mapped reads (FPKM) for comparing gene expression levels.
  • 🌟 Studies such as ENCODE and The Cancer Genome Atlas project were highlighted for using RNA-seq to better understand the mechanisms underlying disease and cancer.
  • 💡 A 25% promo code was offered for the RNA sequencing bioinformatics package, and attendees were invited to contact ABM for further resources and technical support.

Q & A

  • What is RNA-seq and how does it relate to the regulation of gene expression?

    -RNA-seq is a technique that identifies and quantifies the RNA expressed in a biological sample at a given moment. This makes it possible to investigate changes in gene expression and to detect alternative splicing events and novel genes.

  • What is the purpose of the Applied Biological Materials (ABM) webinars?

    -ABM's webinars are designed to provide constant and systematic training that keeps participants well informed about the company's products and services.

  • In which research areas does Christopher, ABM's scientific application specialist, have experience?

    -Christopher has nearly 10 years of experience in research and experimental design with researchers around the world, in areas ranging from cell and developmental biology to CRISPR validation and cell line engineering.

  • What is the purpose of the Q&A session at the end of the webinar?

    -The Q&A session lets participants ask questions about the webinar content and get direct answers from the speakers to clear up doubts and obtain additional information.

  • Why is data normalization important in RNA-seq analysis?

    -Normalization adjusts for differences in the number of reads per sample and per gene, so that gene expression can be compared fairly and accurately across samples and experimental conditions.

  • What is FPKM and how does it help normalize RNA-seq data?

    -FPKM (Fragments Per Kilobase of transcript per Million mapped reads) is a metric that normalizes RNA-seq data by accounting for both the number of cDNA fragments and the length of the gene, giving a relative estimate of gene expression.

  • How is messenger RNA (mRNA) selected during sample preparation for RNA-seq?

    -mRNA can be selected through poly-A enrichment or through ribosomal RNA (rRNA) depletion, depending on the goals of the study and the type of sample.

  • What is sequencing depth and why is it important in RNA-seq?

    -Sequencing depth is the total number of reads obtained in an RNA-seq experiment. It matters because it determines how much information about gene expression can be recovered and how reliable the results are.

  • What is The Cancer Genome Atlas project and how has it used RNA-seq?

    -The Cancer Genome Atlas project is an effort to analyze thousands of samples from cancer patients using RNA-seq, with the aim of better understanding the mechanisms underlying cancer transformation and progression.

  • How can RNA-seq be used in personalized medicine, and what impact can it have on genetic diseases?

    -RNA-seq can extend work in personalized medicine by providing detailed information on gene expression and regulation in response to different conditions and treatments, which can have a significant impact on the diagnosis and treatment of genetic diseases.

Outlines

00:00

😀 Introduction to the RNA-seq webinar

The webinar opens by thanking attendees for their patience and welcoming them, then introduces the main topic, RNA-seq. For newcomers, the host explains that the webinars aim to provide constant and systematic training on products and services. The slides and the recording will be shared afterwards. Attendees signed in to a Google account can ask questions in the chat, and a registration link ensures they receive the follow-up material. An outline of the topics follows, from gene expression background to interpreting RNA-seq experiments. Finally, the speakers are introduced: Dr. Christopher and Battery, specialists in scientific applications and product development with extensive research experience.

05:03

🌟 Historical development and advantages of RNA-seq

This section traces the evolution of techniques for studying gene expression, from the northern blot in 1977 to the arrival of RNA-seq. It discusses the limitations of the earlier methods, such as the need to know the gene or mRNA sequence in advance, and contrasts them with the advantages of RNA-seq, including single-nucleotide resolution and the ability to sequence transcriptomes without prior data. It also notes the near-exponential increase in RNA-seq publications since the technique was introduced, which reflects how widely used and accessible it has become.

10:04

🔬 Considerations for designing an RNA-seq experiment

This section focuses on key considerations when designing an RNA-seq experiment. It covers the composition of total RNA in cells and the need to enrich the sample for the material of interest, such as messenger RNA, through poly-A enrichment or rRNA depletion. It also addresses the choice of sequencing depth and of single-end versus paired-end sequencing, depending on the goals of the project.
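The design rules of thumb from this part of the talk can be sketched as a small decision helper. This is only an illustration: the function `plan_rnaseq`, the `RnaSeqPlan` class, and the goal labels are hypothetical names, and defaulting eukaryotic samples to poly-A enrichment is an assumption (the speaker says either option works for eukaryotes).

```python
from dataclasses import dataclass


@dataclass
class RnaSeqPlan:
    selection: str           # "poly-A enrichment" or "rRNA depletion"
    read_type: str           # "single-end" or "paired-end"
    deeper_sequencing: bool  # True when greater depth is advised
    longer_reads: bool       # True when longer reads are advised


def plan_rnaseq(goal: str, prokaryote: bool = False,
                include_small_rnas: bool = False) -> RnaSeqPlan:
    """Hypothetical helper encoding the webinar's rules of thumb."""
    # Prokaryotic mRNA typically lacks poly-A tails, and poly-A selection
    # would also drop small RNAs such as microRNAs, so both cases call
    # for rRNA depletion. Poly-A is an assumed default for eukaryotes.
    if prokaryote or include_small_rnas:
        selection = "rRNA depletion"
    else:
        selection = "poly-A enrichment"

    if goal == "differential_expression":
        # Single-end reads are generally sufficient for expression changes.
        return RnaSeqPlan(selection, "single-end", False, False)
    if goal == "novel_transcripts_or_splicing":
        return RnaSeqPlan(selection, "paired-end", True, False)
    if goal == "de_novo_assembly":
        return RnaSeqPlan(selection, "paired-end", True, True)
    raise ValueError(f"unknown goal: {goal!r}")


print(plan_rnaseq("differential_expression"))
print(plan_rnaseq("de_novo_assembly", prokaryote=True))
```

A real project would weigh budget, sample quality, and the sequencing provider's recommendations alongside these defaults.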

15:05

🛠️ General RNA-seq workflow and sequencing technologies

This section describes the general workflow of an RNA-seq project, from library preparation to data analysis. It covers fragmentation, adapter ligation, cluster generation, and sequencing by synthesis. It highlights the role of bridge PCR and explains how over- and under-clustering are avoided. It also introduces the quality control tools used, such as the Qubit, the Agilent Bioanalyzer, and qPCR, and describes how sequencing by synthesis reads the DNA sequences.

20:07

📊 Analysis and interpretation of RNA-seq data

This section focuses on analyzing and interpreting the data produced by RNA-seq. It explains the conversion of raw data to FASTQ format and walks through read alignment, normalization, and visualizations such as heat maps and principal component analysis. It introduces fragments per kilobase of transcript per million mapped reads (FPKM) as a normalization metric and describes further analyses, such as functional annotation of differentially expressed genes.
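To make the FASTQ step more concrete, here is a minimal sketch of reading FASTQ records in Python. It assumes the common four-line record layout and Phred+33 quality encoding; the function name `read_fastq` and the example read are hypothetical, and a production pipeline would normally rely on an established parser instead.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read_id, sequence, phred_scores) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        # Assumption: Phred+33 encoding (quality = ASCII code minus 33).
        scores = [ord(c) - 33 for c in qual]
        yield header[1:], seq, scores


# Hypothetical single-read example.
example = io.StringIO("@read1\nGATTACA\n+\nIIIIII#\n")
for read_id, seq, scores in read_fastq(example):
    print(read_id, seq, "lowest base quality:", min(scores))
```

Each record pairs a base call with a per-base quality score, which is what downstream trimming and alignment steps work from.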

25:07

🏥 Applications of RNA-seq in research and medicine

This section presents examples of studies that have used RNA-seq, such as the ENCODE project, the modENCODE (model organism ENCODE) project, and The Cancer Genome Atlas. It highlights how RNA-seq has enabled advances in personalized medicine and in understanding the molecular basis of diseases such as cancer. A promo code for a discount on the RNA sequencing bioinformatics package is offered, and the educational resources and technical support available from ABM are mentioned.

30:08

📞 Q&A on RNA-seq and ABM services

The final section addresses common questions about the RNA-seq process, such as how long library preparation and sequencing take, whether analysis can be run on raw data supplied by the researcher, and how samples that fail quality control are handled. ABM's ability to run custom analyses is mentioned, along with an overview of the resources and support the company provides to its clients.

Keywords

💡RNA

RNA, or ribonucleic acid, is a molecule present in all living cells that plays a crucial role in carrying genetic information and in protein synthesis. The video discusses how RNA is used in gene expression analysis: its presence and quantity in a biological sample can be measured to understand changes in gene expression over time.

💡RNA-seq

RNA-seq is a next-generation sequencing technique that measures the quantity and variety of RNA in a tissue or cell sample. It is central to understanding gene expression and is the main topic of the webinar, which introduces the technology and discusses its applications and analysis.

💡Next-generation sequencing (NGS)

Next-generation sequencing, or NGS, is a technology that sequences millions of DNA or RNA fragments simultaneously. The video presents NGS as the method underlying RNA-seq, enabling detailed study of gene expression and the identification of alternative splicing events.

💡Experimental design

Experimental design is a key part of any RNA-seq study, as the video stresses. It includes factors such as sample type, sequencing depth, and the choice between mRNA (poly-A) enrichment and ribosomal RNA depletion, all of which are essential for obtaining reliable and meaningful data.
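Sequencing depth and coverage, as explained in the talk, can be illustrated with a short sketch. It counts how many reads overlap each reference position and averages the result. The function names and the toy read positions are hypothetical; the numbers are chosen to mirror the webinar's three-base example (2x, 0x, and 4x, averaging 2x).

```python
from typing import Iterable, List, Tuple


def per_base_coverage(reference_length: int,
                      reads: Iterable[Tuple[int, int]]) -> List[int]:
    """Count reads overlapping each position.

    Each read is (start, length), with 0-based start on the reference.
    Parts of a read that fall outside the reference are ignored.
    """
    coverage = [0] * reference_length
    for start, length in reads:
        for pos in range(max(start, 0), min(start + length, reference_length)):
            coverage[pos] += 1
    return coverage


def mean_coverage(coverage: List[int]) -> float:
    """Average coverage across all positions."""
    return sum(coverage) / len(coverage)


# Toy example: positions G, A, C covered by 2, 0 and 4 reads respectively.
toy_reads = [(0, 1), (0, 1), (2, 1), (2, 1), (2, 1), (2, 1)]
cov = per_base_coverage(3, toy_reads)
print(cov, mean_coverage(cov))
```

Real experiments apply the same idea to millions of reads; the talk cites roughly 8 million reads per sample for bacteria and 20 to 40 million for mammals and plants.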

💡Data analysis

Data analysis is an essential part of the RNA-seq process covered in the video. It includes normalizing the data to correct for differences in the number of reads per sample and per gene, so that gene expression can be compared fairly between samples. The analysis can also involve techniques such as heat maps and principal component analysis.

💡Data normalization

Data normalization is a critical step in RNA-seq analysis, mentioned in the video, that corrects for differences in the number of reads per sample and per gene. FPKM, which expresses a gene's expression as fragments per kilobase of transcript per million mapped reads, is one example of how this normalization is done.
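As a rough illustration of FPKM, the sketch below applies the standard definition: fragments mapped to a transcript, divided by the transcript length in kilobases and by the total mapped fragments in millions. The function name `fpkm` and the example numbers are hypothetical.

```python
def fpkm(fragments: int, transcript_length_bp: int,
         total_mapped_fragments: int) -> float:
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments / (length_bp / 1e3) / (total_mapped / 1e6)
         = fragments * 1e9 / (length_bp * total_mapped)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments on a 2,000 bp transcript in a
# library of 20 million mapped fragments.
print(fpkm(500, 2_000, 20_000_000))  # 12.5
```

Dividing by length keeps long genes from looking more expressed simply because they collect more fragments, and dividing by library size makes samples sequenced to different depths comparable.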

💡Functional annotation

Functional annotation is the process of identifying the biological roles of genes and how they relate to specific molecular processes or pathways. The video mentions it as part of RNA-seq analysis, helping researchers understand the biological meaning of changes in gene expression.

💡ENCODE project

The ENCODE project is an ongoing effort to identify the regulatory regions of the human genome and is mentioned in the video as an example of RNA-seq in use. The project has used RNA-seq to build a detailed map of the functional elements in the genome.

💡Transcriptome

The transcriptome is the complete set of RNA produced by an organism, including mRNA, rRNA, and other types of RNA. In the video, the term describes the target of RNA-seq, which aims to give a full picture of gene expression at a given moment.
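The talk gives an approximate breakdown of total RNA: about 85% rRNA, 10 to 12% tRNA, and 2 to 3% mRNA. A back-of-the-envelope sketch shows why enrichment matters. It assumes, as a deliberate simplification, that reads would be spread in proportion to these fractions if no selection were applied; the names and midpoint values below are illustrative only.

```python
# Approximate fractions of total RNA quoted in the webinar
# (midpoints used for tRNA and mRNA; illustrative values only).
TOTAL_RNA_FRACTIONS = {"rRNA": 0.85, "tRNA": 0.11, "mRNA": 0.025}


def expected_reads(total_reads: int, rna_class: str) -> int:
    """Reads expected from one RNA class if nothing is enriched or depleted.

    Simplifying assumption: reads are distributed in proportion to each
    class's share of total RNA.
    """
    return round(total_reads * TOTAL_RNA_FRACTIONS[rna_class])


total = 20_000_000  # a typical per-sample depth mentioned for mammals
for rna_class in TOTAL_RNA_FRACTIONS:
    print(rna_class, expected_reads(total, rna_class))
```

Under these assumptions only about half a million of 20 million reads would come from mRNA, which is why poly-A enrichment or rRNA depletion is applied before library preparation.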

💡Bioinformatics

Bioinformatics is the interdisciplinary field concerned with analyzing and interpreting biological data, such as the data generated by RNA-seq. The video promotes a bioinformatics package for RNA-seq data analysis, underlining how important the discipline is for interpreting results.

Highlights

Webinar begins with an introduction to RNA-seq for new attendees, outlining the format and benefits of the training.

Post-webinar resources include PowerPoint slides and recorded webinar for continued learning.

Google account sign-in enables live Q&A during the webinar through the chat box.

Registration ensures receipt of slides and post-webinar question responses.

Webinar topics include gene expression background, NGS intro, RNA-seq experiment design, workflow analysis, and case studies.

Speaker introductions feature Dr. Christopher and Battery, specialists with extensive research experience.

ABM's mission is to catalyze scientific discoveries in life sciences and drug development.

Historical context of gene examination methods leads to the current RNA-seq technology.

Advantages of RNA-seq include single nucleotide resolution and ability to sequence new transcriptomes without prior data.

Illumina's 2007 advent correlates with an exponential increase in RNA-seq publications, indicating widespread use.

Next-generation sequencing (NGS) basics are explained, including read types and considerations for project goals.

RNA-seq experiment design considerations include enriching for mRNA and choosing sequencing depth and read length.

General RNA-seq workflow involves library preparation, bridge PCR, sequencing, and analysis.

Sequencing by synthesis technology and quality control measures are discussed for ensuring accurate results.

Data analysis post-sequencing includes converting raw data to fastq, alignment, normalization, and various analytical techniques.

FPKM is introduced as a metric for normalizing gene expression data across samples and gene lengths.

Examples of RNA-seq applications include large-scale projects like ENCODE and its impact on personalized medicine.

ABM offers promo codes, resources, and support for RNA sequencing and bioinformatics packages.

Webinar concludes with a Q&A session addressing practical questions about RNA-seq processes and services.

Transcripts

play00:01

hello everyone thank you all for your

play00:04

patience and we'll be starting the

play00:06

webinar up now so good morning and thank

play00:11

you for taking the time to join our

play00:12

webinar today today's topic will provide

play00:15

you with an introduction to RNA-seq for

play00:18

those of you who are new to our webinars

play00:20

they're designed to provide constant and

play00:22

systematic training that will help keep

play00:24

you well informed on our products

play00:26

and services following the webinar we'll

play00:29

be sharing a copy of the PowerPoint

play00:31

slides as well as the recorded webinar

play00:33

itself I would like to point out that if

play00:36

you're signed into a Google account you

play00:38

can ask any questions you may have in

play00:40

the chat box to the right of the screen

play00:42

please note that if you did not register

play00:44

earlier there's a registration link in

play00:47

the video description

play00:48

this will ensure that you get a copy of

play00:50

the slides and answers to any questions

play00:52

posted in the chat during the webinar

play00:56

to help frame this webinar outline here

play01:00

are the topics that we'll be covering

play01:01

first we'll begin with some

play01:04

background on gene expression followed

play01:07

by an intro into NGS after that we'll

play01:10

discuss some of the important

play01:11

considerations when designing your

play01:13

RNA-seq experiment followed by

play01:15

understanding the workflow analysis and

play01:18

interpretation of your experiment

play01:20

finally we'll take some time to review

play01:23

some projects that have shown the power

play01:25

and versatility of RNA-seq before we

play01:31

begin talking about RNA-seq I'd like to

play01:33

take a moment to introduce your speakers

play01:35

joining me today is my colleague dr.

play01:37

Chris Christopher's mention a scientific

play01:40

application specialist here at ABM after

play01:44

completing his PhD at UBC where he

play01:46

studied stem cell regulation he joined

play01:49

ABM with a passion for helping

play01:51

scientists achieve their goals with

play01:54

nearly 10 years of experience in

play01:56

research and experimental design with

play01:58

researchers around the world and in

play02:04

areas ranging from cell and

play02:05

developmental biology to CRISPR

play02:07

validation and cell line engineering he

play02:10

can assist with nearly every product

play02:12

and with nearly any

play02:13

project you have from initial set up to

play02:16

post sequencing data analysis alongside

play02:19

him you have myself battery battery a

play02:22

product development specialist with over

play02:24

five years research experience in areas

play02:27

such as drug delivery lipid biosynthesis

play02:29

and gene therapy obtained over the

play02:31

course of my graduate degree at SFU our

play02:34

main goal is to work closely with

play02:36

clients and provide them with the

play02:37

support that ABM has been renowned for

play02:41

before we begin with the core content of

play02:46

the webinar I'd like to take a moment to

play02:48

talk about Applied Biological Materials

play02:50

and what our goals are

play02:53

ABM was founded in 2004 and has been

play02:57

driven to catalyze scientific

play02:58

discoveries in the field of life science

play03:01

and drug development for the last 15

play03:03

years headquartered in Vancouver Canada

play03:05

we are one of the fastest growing

play03:07

biotech companies in the region and

play03:10

since our inception we have worked hard

play03:12

to become known as a reliable source for

play03:13

clients such as yourselves this hard

play03:18

work has allowed us to expand our

play03:20

facilities beginning with a branch in

play03:22

Jiangsu Province in China in 2013 and a

play03:26

new facility in Bellingham that will be

play03:28

opening up later in 2019 these

play03:31

expansions have put us in a position to

play03:33

work with each and every one of you and

play03:34

provide the world-class service you all

play03:36

deserve with our team of passionate and

play03:40

trained scientists ABM is dedicated to

play03:43

empowering researchers with the latest

play03:44

innovations for all their scientific

play03:46

needs now before we discuss the core

play03:51

details of this webinar I feel it's

play03:53

important to discuss some of the work

play03:54

that led us to RNA-seq as well as its

play03:57

importance to research in general

play04:01

when studying the role of genes in

play04:04

development of disease there are three

play04:06

distinct levels we can use for

play04:07

examination we can study them at the DNA

play04:10

level by studying mutations to determine

play04:13

their effects on gene at the RNA level

play04:15

by studying gene expression and at the

play04:18

protein level by examining folding

play04:20

patterns of genes the first link in the

play04:24

chain leading to RNA-seq is the

play04:26

northern blot

play04:27

developed in 1977 by James Alwine David

play04:31

Kemp and George Stark at Stanford

play04:34

University

play04:34

this tool was extremely useful in that

play04:37

it allowed for the study of gene

play04:39

expression via RNA detection following

play04:44

the northern blot was RT-qPCR which was

play04:48

developed in the early 80s by Kary

play04:50

Mullis and this technique allows for the

play04:52

detection characterization and

play04:54

quantification of RNA transcripts

play04:59

finally the last step prior to RNA-seq

play05:02

was the microarray which was developed

play05:05

by Patrick Brown at Stanford University

play05:08

this tool is impressive in that it can

play05:10

be used to simultaneously study the

play05:12

expression levels of thousands of genes

play05:14

at one time and with that we finally

play05:18

come to the present RNA-seq which

play05:21

allows us to reveal the presence and

play05:23

quantity of RNA in a biological sample

play05:26

at any given moment as well as to look

play05:29

at the changes in gene expression over

play05:30

time now when considering these three

play05:35

prior methods we can see that while they

play05:37

all had strengths such as the low

play05:39

reagent cost and the ability to be

play05:41

easily done in the comfort of your own

play05:43

lab there are some items that we should

play05:45

consider a key note is the fact that

play05:48

each system required prior knowledge of

play05:51

the gene or the mRNA in order to be

play05:53

used making it impossible to use these

play05:56

methods to study novel genes with

play05:59

RNA-seq on the other hand the pros are

play06:02

considerable while the initial cost of

play06:05

RNA-seq is high relative to the

play06:08

previous methods the use of single

play06:10

nucleotide resolution alongside the

play06:12

ability to sequence new transcriptomes

play06:15

without prior data is a huge bonus when

play06:18

coupled with its massively high

play06:20

throughput this creates a winning

play06:22

combination that would be an asset to

play06:24

any researcher now with the advent of

play06:28

Illumina in 2007 we can see the near

play06:32

exponential increase in the number of

play06:34

RNA-seq publications which demonstrates

play06:36

the widespread use and accessibility of

play06:39

this tool to researchers

play06:41

and now we find ourselves in a position

play06:44

to answer the question what is RNA-seq

play06:49

in essence RNA-seq is a technique that

play06:53

allows us to begin with cells or tissues

play06:55

and examine the expressed genes by

play06:58

taking advantage of next-generation

play07:00

sequencing we can learn about changes in

play07:03

gene expression and identify novel

play07:05

splicing events and genes with this

play07:09

brief background I'd like to pass things

play07:11

over to Christopher who will begin with

play07:13

a brief overview of the topics for

play07:14

today's talk but first a quick reminder

play07:17

to everyone listening to please post any

play07:19

questions you may have in the

play07:22

chat box during the course of the

play07:24

webinar

play07:28

thank you boshy for that brief

play07:30

introduction and thanks for joining us

play07:32

today for our webinar

play07:33

I'm gonna go briefly into an

play07:35

introduction to next-generation

play07:36

sequencing to give you a better idea of

play07:38

how the technology works before going

play07:41

into more details about RNA-seq so

play07:44

next-generation sequencing can be used

play07:46

for a variety of purposes including

play07:47

whole genome sequencing studying changes

play07:50

in gene expression using RNA-seq or

play07:53

performing metagenomic studies with

play07:55

environmental samples the basic workflow

play07:59

for all next-generation sequencing is

play08:01

the same where you take input material

play08:03

whether it's DNA or RNA fragment it to

play08:06

be of a similar size ligate sequencing

play08:10

adapters that can then bind to the

play08:12

sequencer before undergoing

play08:14

high-throughput sequencing now there are

play08:19

a couple important terms to know for all

play08:21

next-generation sequencing approaches

play08:23

the first is the read a read is a

play08:26

sequence of nucleotides that will be

play08:28

sequenced so on the right you can see

play08:31

that there's a double-stranded DNA

play08:32

molecule and when it's sequenced it will

play08:36

sequence along one strand which gives

play08:39

you the sequencing read for the

play08:40

nucleotides that are present we'd refer

play08:44

to this as single-end sequencing since

play08:45

you're only synthesizing one strand in

play08:48

this case the sense strand alternatively

play08:51

you can also use paired-end sequencing

play08:53

to read a fragment

play08:54

both strands of the molecule when you're

play09:00

trying to decide which option to pick

play09:02

for your project it's important to

play09:04

consider what your actual project goal

play09:06

is for instance single end sequencing is

play09:09

generally sufficient to study changes in

play09:11

gene expression whereas paired end

play09:15

sequencing is more useful for whole

play09:17

genome sequencing studying alternative

play09:20

splicing or de novo transcriptome

play09:22

studies next the read can vary in length

play09:27

the read length is considered to be the

play09:29

number of nucleotides that are sequenced

play09:31

per read to give you an idea of the read

play09:34

length sizes typically for RNA-seq 75

play09:37

nucleotides is a common read length

play09:39

which would be great for studying gene

play09:41

expression or resequencing samples 150

play09:44

nucleotides is a longer read length

play09:46

that's more suitable for assembling new

play09:48

transcriptomes

play09:49

or whole genome sequencing for

play09:50

eukaryotes and even longer reads such as

play09:54

300 nucleotides are more suitable for

play09:55

amplicon-seq as well as metagenomic

play09:58

studies next once you have a sequencing

play10:04

read you still have to figure out how it

play10:06

aligns to a reference sequence so in

play10:10

this example you can see that there's a

play10:11

reference sequence in dark blue along

play10:13

the bottom of the image and then the

play10:15

read in light blue which maps to a

play10:17

specific location in that reference

play10:19

sequence if we were to look at a

play10:23

specific nucleotide such as this G here

play10:25

highlighted with a red box you can see

play10:28

that there are two reads in light blue

play10:30

that map to this site in the reference

play10:32

sequence because there are two reads

play10:35

covering this we describe this as 2 X

play10:38

coverage because you cover that

play10:39

nucleotide with two different readings

play10:41

if we look at another region such as

play10:44

this C residue you can see that there

play10:46

are four reads that map to this specific

play10:48

location this would give it a coverage

play10:52

of 4x oftentimes there are also sites

play10:56

which don't have any reads that map to

play10:58

them at all such as this A residue and

play11:01

for this site you could describe it as

play11:03

having zero x coverage now when you take

play11:08

all

play11:08

of these sites and add the levels of

play11:10

coverage for each nucleotide together

play11:12

you can make an idea of the coverage for

play11:15

that site so in this example there are

play11:18

six reads for these three bases G A and

play11:21

C when you divide it by the number of

play11:25

nucleotides you would get 2x coverage so

play11:27

you could say that the sequencing depth

play11:29

is 6 reads with an average of 2x

play11:30

coverage for these nucleotides now for

play11:36

most samples you would typically

play11:37

sequence millions of reads to make sure

play11:39

as much of the entire transcriptome as

play11:41

possible is sequenced typically bigger

play11:44

genomes or transcriptomes require more

play11:46

reads so for instance bacteria which

play11:49

have a genome size of about 5 million

play11:51

base pairs would require fewer reads and

play11:53

less sequencing than mammals that would

play11:55

have a 3 billion base pair genome or

play11:57

plants which would even have much larger

play11:58

genome sizes in terms of RNA-seq for

play12:03

bacteria this would look like 8 million

play12:04

reads per sample versus 20 million reads

play12:07

or 40 million reads per sample from

play12:08

mammals and plants next I'm gonna go

play12:14

over a couple important things to

play12:15

consider when designing your RNA-seq

play12:16

experiment first most of the RNA that's

play12:21

present in a cell is not actually

play12:23

messenger RNA that is what most

play12:25

researchers want to sequence for

play12:27

instance if you take the total RNA

play12:29

present and look at the breakdown of

play12:31

what it's composed of you'll find that

play12:33

about 85 percent of it is ribosomal RNA

play12:36

which most researchers typically do not

play12:37

want to sequence next about 10 to 12

play12:41

percent is transfer RNA again which most

play12:44

researchers are not interested in

play12:45

sequencing for their projects mRNA

play12:48

itself is typically about 2 to 3 percent

play12:50

of the total RNA present in a cell and

play12:52

then an even smaller percentage than

play12:54

this is composed of microRNAs long

play12:58

non-coding RNAs circular RNAs and other

play13:00

sequences so if your starting point is

play13:05

this large population of total RNA you

play13:08

have to find a way to enrich for what

play13:10

you actually want to sequence in that

play13:11

sample there are two basic ways that we

play13:14

can do this the first of which is poly(A)

play13:17

enrichment where we use special beads

play13:19

that can bind to the poly(A) tails on

play13:21

mRNA

play13:21

transcripts to pull them out of the

play13:23

population of total RNA and enriched for

play13:26

poly(A) sequences alternatively if you

play13:30

want to study mRNAs as well as small

play13:33

RNAs like micro RNAs

play13:34

you can use a treatment called rRNA

play13:37

depletion which will selectively remove

play13:40

the rRNA sequences from the total RNA

play13:42

population if you're working with

play13:47

eukaryotic samples you can use either

play13:49

option poly(A) enrichment or rRNA

play13:51

depletion if you're working with

play13:53

prokaryotic samples however you

play13:55

absolutely have to use an rRNA depletion

play13:58

treatment because prokaryotic cells

play14:00

typically do not have poly(A) tails on

play14:02

their mRNA transcripts next you need to

play14:06

consider how you're designing your

play14:08

project and what your goal is if your

play14:10

goal is de novo transcriptome assembly

play14:12

for a species which previously hasn't

play14:15

had a sequenced transcriptome you

play14:18

typically want greater sequencing depth

play14:20

and longer reads to help with the

play14:21

assembly next if you only want to study

play14:25

changes in gene expression to see if a

play14:27

particular gene increases or decreases

play14:30

in response to a stimulus using single

play14:33

end sequencing is typically sufficient

play14:36

finally if you want to identify novel

play14:38

transcripts or new alternative splicing

play14:40

events you would typically want to use a

play14:43

greater sequencing depth and paired end

play14:45

sequencing next I'm gonna go over

play14:50

briefly the general RNA-seq workflow

play14:52

for most projects this is highlighted

play14:56

here and I don't want you to focus too

play14:58

much on this but there are four basic

play15:00

steps beginning with library preparation

play15:02

before a sample is input into the

play15:04

sequencer followed by bridge PCR

play15:07

sequencing by synthesis and finally

play15:09

analysis at the outset of the project

play15:13

you'd have to input your starting

play15:15

material assess its quality and convert

play15:18

it to DNA typically when we begin with a

play15:21

sample we would ask is the mRNA degraded

play15:24

if it's not degraded we can perform poly(A)

play15:27

selection before converting the RNA

play15:30

into DNA this conversion from RNA into

play15:33

DNA is done to increase the

play15:35

stability of the molecule and ensure

play15:36

success with sequencing if the mRNA

play15:40

transcript is degraded however we can do

play15:42

special treatments that can repair the

play15:44

transcript to still provide sequenceable

play15:46

material for the project next once

play15:50

the material has been converted into DNA

play15:52

it will likely be of different sizes

play15:54

because different mRNA transcripts are

play15:56

generally of different lengths so the

play15:58

next step would be to fragment the DNA

play16:00

into uniform sizes to make sure each

play16:03

fragment is equally likely to be

play16:04

sequenced the next step is to ligate

play16:08

sequencing adapters so that the DNA

play16:10

fragments can bind the sequencer itself

play16:16

once this is done you can input the

play16:19

material into the sequencer and you'll

play16:21

go through a step called cluster

play16:23

generation or bridge PCR now oh sorry

play16:29

I'm just gonna jump back briefly bridge

play16:31

PCR is one of the most important steps

play16:33

for sequencing because when you're

play16:35

sequencing these molecules it's not

play16:37

possible to sequence a single DNA

play16:39

molecule in the sequencer itself you

play16:43

first have to generate clusters of

play16:44

identical molecules that are then

play16:46

sequenced together so in this diagram in

play16:49

part one you can see a DNA molecule that

play16:53

binds to the flow cell of the sequencer

play16:56

the molecule then bends over and binds

play16:59

to the other adapter on the flow cell

play17:02

before DNA synthesis begins to form a

play17:05

double-stranded DNA molecule you can see

play17:09

this now in part 3 where the sequencing

play17:12

reaction that generated the double

play17:14

stranded DNA occurs in step four you can

play17:18

see that you now have two molecules in

play17:20

two clusters this process is repeated

play17:23

many times in panel five until you have

play17:27

sufficient DNA molecules to be able to

play17:29

sequence each cluster now importantly

play17:33

you need to know the concentration of

play17:35

your library so that you can avoid

play17:37

overclustering overclustering happens when

play17:40

the cluster density is too high this

play17:42

leads to the Machine not being able to

play17:44

accurately read each cluster to

play17:47

determine what the DNA sequence is

play17:48

and generally leads to the entire

play17:50

sequencing run failing the opposite

play17:53

problem you also want to avoid which is

play17:55

underclustering this happens when the

play17:57

cluster density is too low and it leads

play17:59

to an overall lower sequencing output

play18:02

and again makes it hard for the

play18:03

sequencer to read what the sequence is at

play18:06

that cluster next I'll briefly go over

play18:10

Illumina's technology for sequencing by

play18:12

synthesis to give you an idea of how the

play18:14

sequencing process itself actually works

play18:17

so when you have the template DNA it

play18:20

will have a primer that will prime it

play18:23

for DNA synthesis that binds to the flow

play18:25

cell next you have individual

play18:29

nucleotides that have fluorescent dyes

play18:30

bound to them which can be used during

play18:33

the synthesis reaction in the next panel

play18:43

you can see that the A nucleotide has

play18:44

been added to the sequence

play18:46

all the nucleotides that didn't bind

play18:49

because they don't have a complementary

play18:51

base on the other strand would be washed

play18:53

away after this step occurs there will be a

play18:58

fluorescent emission in response to

play19:00

light stimulus from the nucleotide

play19:03

that's been added that will be imaged by

play19:04

the sequencer after this step the

play19:07

fluorescent dye

play19:10

is cleaved from this molecule washed

play19:13

away and then the process repeats again
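
The cycle just described (incorporate one labeled nucleotide, wash, image, cleave the dye, repeat) can be illustrated with a minimal Python sketch. This is a toy model and not how the instrument software works: the dye colors in `DYE_COLOR` and the function name `sequence_by_synthesis` are hypothetical, and strand orientation is ignored for simplicity.

```python
# Toy model of sequencing by synthesis: one labeled base per cycle.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical dye colors, chosen only for illustration.
DYE_COLOR = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
COLOR_TO_BASE = {color: base for base, color in DYE_COLOR.items()}


def sequence_by_synthesis(template):
    """Return (read, signals) for a template strand, one cycle per base."""
    signals = []
    for template_base in template:
        # Only the complementary nucleotide is incorporated;
        # the other three are washed away.
        incorporated = COMPLEMENT[template_base]
        # Imaging step: record the color emitted by the incorporated base.
        signals.append(DYE_COLOR[incorporated])
        # The dye is then cleaved and washed away before the next cycle.
    # After the last cycle, the recorded colors are decoded into a sequence.
    read = "".join(COLOR_TO_BASE[color] for color in signals)
    return read, signals


read, signals = sequence_by_synthesis("TACG")
print(read, signals)
```

The read is reconstructed only from the recorded colors, which mirrors the point in the talk that the machine pieces the sequence together from the fluorescent signals.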

play19:18

once this repeats to the end of

play19:21

sequencing you will then be able to have

play19:22

the Machine go through and read each of

play19:25

the fluorescent signals that occurred to

play19:27

piece together what the sequence was

play19:29

for that molecule at every stage of

play19:33

sequencing though quality control is

play19:34

essential and there are three core

play19:38

technologies we use to perform this QC

play19:40

the first of which is Qubit that can

play19:43

measure DNA concentration the Agilent

play19:46

Bioanalyzer which can assess your DNA

play19:49

library and how well it's been

play19:51

fragmented and then qPCR which can be

play19:54

used to determine how much of the actual

play19:56

prepared library is sequenceable so

play20:00

when you're starting one of the most

play20:01

important things you can do is to run an

play20:04

RNA gel to determine if your material

play20:06

is possibly degraded if you have a high

play20:09

quality sample you should see a few

play20:11

distinct bands in your gel versus if the

play20:13

sample is degraded you'll typically see

play20:15

a large smear in that lane if you do

play20:21

have this degradation we can use special

play20:23

kits to recover degraded RNA or to

play20:26

reverse formaldehyde modification to RNA

play20:28

if your sample was fixed prior to

play20:31

extraction and this can help provide you

play20:33

with sequenceable material for

play20:35

RNA-seq next if you have a low amount of

play20:38

starting material the first thing you

play20:40

have to do is actually know what this

play20:42

quantity is and for that we can use

play20:44

Qubit to measure the concentration of

play20:46

nucleic acids that are present in a

play20:48

sample this is crucial for library

play20:50

preparation if you do have a low amount

play20:53

of starting material we can use special

play20:55

kits that can amplify the mRNA that is

play20:57

present using the poly(A) tail prior to

play21:00

RNA-seq next after you have the

play21:05

material that's been converted from RNA

play21:07

to DNA and you go through the

play21:09

fragmentation step the Agilent

play21:11

bioanalyzer comes in handy to tell you

play21:13

how efficient the fragmentation was and

play21:15

to give you an idea of the average

play21:16

fragment size in your sample and then

play21:19

after you've ligated the adapters

play21:20

successfully qPCR becomes essential and

play21:23

I'll go through these two steps next

play21:25

with Agilent when you input your sample

play21:29

the data that you get from it is

play21:31

effectively a graph that tells you how

play21:34

big the fragment sizes are that are

play21:36

present in your sample and their general

play21:37

distribution so in this graph here you

play21:40

can see that there are two peaks on the

play21:42

left and right hand side of the graph

play21:44

I've highlighted them here in red boxes

play21:47

for you what you actually want to look

play21:49

at though is the fragment size highlighted

play21:52

in this green box which tells you

play21:54

roughly that your fragments are on the

play21:56

higher end of the range of fragment

play21:58

sizes which is a good result which you

play22:00

want to have for sequencing the

play22:03

alternative is when you have an uneven

play22:06

distribution of fragment sizes with no

play22:08

large fragments and typically these

play22:12

smaller fragments are more difficult to

play22:13

sequence

play22:14

and are less likely to give you a

play22:15

high-quality sequencing result or lead

play22:17

to a failure in sequencing so we

play22:19

generally want to avoid this next with

play22:22

qPCR this gives you a helpful metric

play22:26

that you can't measure with Qubit

play22:27

now remember Qubit can tell you about

play22:29

the amount of nucleic acids present in a

play22:31

sample but qPCR can tell you how much of

play22:34

the nucleic acids are actually

play22:37

sequenceable and have adapters

play22:39

ligated to them correctly next I'm going to go

play22:44

over a bit more of rna-seq analysis and

play22:46

interpretation of the data
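
As a rough illustration of why the Qubit concentration and the Bioanalyzer fragment size are both needed before loading a library, the two values can be combined into a molar concentration. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA. That conversion is not given in the talk, and the function name and the example values are hypothetical.

```python
# Assumption: average mass of one double-stranded DNA base pair (g/mol).
GRAMS_PER_MOL_PER_BP = 660.0


def library_molarity_nM(conc_ng_per_ul, mean_fragment_bp):
    """Convert a mass concentration (ng/uL) and a mean fragment size (bp)
    into an approximate molar concentration in nM."""
    if mean_fragment_bp <= 0:
        raise ValueError("mean fragment size must be positive")
    return conc_ng_per_ul / (GRAMS_PER_MOL_PER_BP * mean_fragment_bp) * 1e6


# Hypothetical library: 2 ng/uL by Qubit, 400 bp average size by Bioanalyzer.
print(round(library_molarity_nM(2.0, 400), 2))
```

Note that this estimate counts every DNA fragment in the tube. It cannot tell whether a fragment carries correctly ligated adapters, which is the gap that qPCR fills.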

play22:49

so the general workflow once the

play22:53

next-generation sequencing has been

play22:54

complete is that you will go from raw

play22:56

data to a format called FASTQ with

play22:59

intermittent steps before you can do

play23:01

data analysis the raw data from the

play23:03

sequencer you will generally never see

play23:05

because either software or the sequencer

play23:08

itself will process it into FASTQ

play23:10

now FASTQ is simply the FASTA

play23:14

format a text-based format for depicting

play23:16

nucleotides with the quality control

play23:18

info from that sequencing run in that

play23:20

particular sequencing read if you want

play23:24

to see what this actual FASTQ data

play23:26

would look like it's shown here to the right

play23:29

and you can see that it's largely a line

play23:31

of alphanumeric text but it's kind of

play23:33

hard to understand what's going on the

play23:36

important sequence information here is

play23:38

highlighted in the red box which is the

play23:39

actual sequencing result from this

play23:42

particular sample from this particular

play23:43

cluster next you have to figure out how

play23:48

to use the actual data before you can do

play23:49

the analysis that you're interested in
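
To make the FASTQ description above more concrete, here is a minimal Python sketch that reads FASTQ records and decodes their quality strings. It assumes the usual four-line record layout and Phred+33 quality encoding, which the talk does not spell out. The record in `example` and the function name `read_fastq` are made up for illustration.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality_scores) from a four-line-per-record
    FASTQ stream. Assumes Phred+33 quality encoding."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored here
        quality = handle.readline().rstrip("\n")
        # Each quality character encodes one Phred score, offset by 33.
        scores = [ord(char) - 33 for char in quality]
        yield header[1:], sequence, scores


# Hypothetical single-record FASTQ file held in memory.
example = "@read1\nGATTACA\n+\nIIIIII#\n"

for read_id, sequence, scores in read_fastq(io.StringIO(example)):
    print(read_id, sequence, scores)
```

The second line of each record is the base sequence, the part highlighted on the slide, and the fourth line carries one quality character per base.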

play23:52

so with FASTQ this is generally

play23:55

what we provide for each next-generation

play23:57

sequencing project whether it's RNA-seq

play23:59

whole-genome seq or another type of

play24:01

sequencing if you wanted to work with

play24:07

this you'd have to take FASTQ data

play24:09

first and align it to the reference

play24:11

sequence we briefly went over the

play24:13

alignment of reads

play24:14

earlier in the presentation this is one

play24:16

of the crucial steps at the end of

play24:18

sequencing after you've done the

play24:21

alignment which can generally be done

play24:22

using different software you would then

play24:24

have to normalize your data before you

play24:26

can begin the

play24:27

analysis and generate beautiful figures

play24:29

that you can eventually publish with

play24:31

your manuscript I'm gonna go over a bit of

play24:35

how the data is normalized it's

play24:37

important to normalize your data for two

play24:40

main reasons

play24:40

the first is you need to normalize the

play24:43

number of reads per sample because some

play24:45

samples may receive different numbers of reads next

play24:49

you need to normalize the number of

play24:50

reads per gene because genes are

play24:52

different lengths and you have to take

play24:54

this into account I'll go over this a

play24:56

bit more in the next few slides for

play24:59

instance if you have three samples

play25:01

sample A, B and C you would generally

play25:05

want each sample to be sequenced an

play25:07

equal amount but because of stochastic

play25:09

differences in sequencing that cannot be

play25:12

controlled you generally have different

play25:14

sequencing outputs for each sample in

play25:15

this example sample A would have about

play25:18

20 million reads sample B about 30

play25:20

million reads and sample C would have 10

play25:22

million reads if you didn't normalize

play25:24

your data without doing any processing

play25:27

you would think that sample C has

play25:31

one-third of the gene expression of

play25:33

sample B for instance or that sample A

play25:37

and sample B's expression levels are

play25:39

different even if they may be identical
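
The effect of unequal sequencing output can be sketched with a small Python example that scales raw counts to counts per million (CPM). The library sizes are the ones from the talk (20, 30 and 10 million reads). The raw counts for the gene and the function name are hypothetical, and CPM is used here only as a simple stand-in for per-sample normalization.

```python
# Total reads per sample, as in the example from the talk.
library_sizes = {"A": 20_000_000, "B": 30_000_000, "C": 10_000_000}

# Hypothetical raw read counts for one gene in each sample.
gene_counts = {"A": 200, "B": 300, "C": 100}


def counts_per_million(count, total_reads):
    """Scale a raw count by the sample's total read count."""
    return count * 1_000_000 / total_reads


cpm = {
    sample: counts_per_million(gene_counts[sample], library_sizes[sample])
    for sample in library_sizes
}
print(cpm)
```

With these made-up counts the raw values differ threefold between samples B and C, yet the scaled values are equal, which is the situation the talk warns about.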

play25:43

secondly you need to normalize relative

play25:46

to the length of the gene and the number

play25:48

of reads for that gene in this example

play25:51

you can see gene A is about 1 kb and

play25:53

gene B is 2 kb or twice that length for

play25:59

this example gene A has 5 reads that map

play26:01

to it and gene B has 10 reads but gene B

play26:03

is twice as long if you didn't normalize

play26:06

your data you would think that gene B is

play26:08

expressed twice as highly even if that

play26:10

may not be the case so you'd then have

play26:12

to normalize the data there's a useful

play26:15

metric we can use that can account for

play26:17

differences between samples and between

play26:19

genes this is called fragments per

play26:22

kilobase of transcript per million

play26:25

mapped reads or FPKM this is a way to

play26:29

look at the relative expression of a

play26:30

transcript proportional to the number of

play26:32

cDNA fragments that it originated from

play26:35

or sorry that originated from it so FPKM

play26:38

is essentially the normalized estimation of

play26:40

gene expression based on the RNA

play26:42

sequencing data or another way to think

play26:45

of this is FPKM helps normalize the

play26:48

differences in the amount of reads per

play26:50

sample as well as the number of reads

play26:53

proportional to how long a given gene is

play26:57

in this specific example once we've

play27:01

normalized gene A and gene B you can see

play27:03

that the FPKM value is 2 for both

play27:07

indicating that both these genes despite

play27:09

having different lengths and different

play27:11

numbers of mapped reads have a

play27:13

relatively similar expression level if

play27:17

you were to compare this to a treated

play27:19

normalized sample you'd then be able to

play27:21

conclude whether or not expression has

play27:22

increased or decreased relative to

play27:24

control using this normalized data set
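
The FPKM calculation for gene A and gene B can be worked through in a few lines of Python. The gene lengths and read counts come from the talk. The total of 2.5 million mapped fragments is an assumption chosen so that the result matches the value of 2 shown on the slide, because the talk does not state the total. The treated-sample value used for the fold change is also hypothetical.

```python
import math


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 10**9 / (gene_length_bp * total_mapped_fragments)


# Assumed sequencing depth (not stated in the talk).
TOTAL_MAPPED = 2_500_000

fpkm_gene_a = fpkm(5, 1_000, TOTAL_MAPPED)   # 5 reads on a 1 kb gene
fpkm_gene_b = fpkm(10, 2_000, TOTAL_MAPPED)  # 10 reads on a 2 kb gene
print(fpkm_gene_a, fpkm_gene_b)

# Hypothetical treated sample in which gene A has an FPKM of 8.
log2_fold_change = math.log2(8.0 / fpkm_gene_a)
print(log2_fold_change)
```

A positive log2 fold change indicates higher expression in the treated sample than in the control, and a negative one indicates lower expression.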

play27:28

next once you have this what type of

play27:31

analysis can you actually do the first

play27:34

is you can use heat maps to look at

play27:37

large changes in expression between a

play27:39

control sample and a treated sample and to

play27:42

visually depict increases or

play27:44

decreases in gene expression next you

play27:48

can perform principal component analysis

play27:50

to see if any of your samples have

play27:53

different changes in gene expression

play27:54

that may correlate and be related to the

play27:56

treatment conditions they underwent next

play28:00

you can also do functional annotation of

play28:03

differentially expressed genes with this

play28:06

this allows you to look at genes that

play28:07

have increased in expression decreased

play28:09

in expression or remain the same and see

play28:12

if there's any similar pathways or

play28:14

processes that they might be involved in

play28:15

such as metabolism inflammation or even

play28:18

immune response next I'm gonna hand it

play28:23

back to boshy to go over a couple

play28:24

examples of researchers that have used

play28:26

RNA-seq in their studies

play28:34

thank you for that excellent talk

play28:36

Christopher as he just mentioned I'll be

play28:38

taking a few minutes to discuss some of

play28:40

the studies that people have used to

play28:42

demonstrate the power and versatility of

play28:44

RNA-seq and after that we'll wrap up the

play28:48

webinar and proceed to the Q&A session

play28:51

now the first of these projects is the

play28:54

ENCODE project which focused on

play28:56

identifying genome-wide regulatory

play28:58

regions in different cell lines this was

play29:01

then followed by the model organism

play29:03

ENCODE project which aimed to provide

play29:06

the biological research community with a

play29:08

comprehensive encyclopedia of genomic

play29:11

functional elements in the model

play29:13

organisms C. elegans and D. melanogaster

play29:17

this was a huge undertaking and was

play29:20

aimed at helping to better understand

play29:22

the downstream effects of regulatory

play29:24

regions next we have the Cancer Genome

play29:28

Atlas Project which took advantage of

play29:31

RNA-seq to analyze thousands of cancer

play29:34

patient samples to better understand the

play29:37

underlying mechanisms of malignant

play29:39

transformation and progression in

play29:41

cancer finally the use of RNA-seq in

play29:46

medicine has been substantial as it's

play29:49

allowed for researchers to expand their

play29:51

work into personalized medicine which

play29:53

can have a huge impact on genetic

play29:55

disease and with that we now come to the

play30:00

wrap-up portion of the webinar and as a

play30:02

thank you

play30:03

we'd like to offer you a promo code for

play30:05

25% off of your RNA sequencing

play30:08

bioinformatics package which you can see

play30:10

here and will also be included in our

play30:12

follow-up email and now that we've

play30:17

finished through with the background of

play30:18

RNA-seq let's take a moment to examine

play30:21

some of the resources that ABM offers

play30:24

first make sure to check out our website

play30:27

where all of our resources are collected

play30:29

this includes our knowledgebase articles

play30:32

as well as our YouTube videos and blog

play30:34

posts to help provide you with the tools

play30:36

you need for your experiments now not

play30:41

only do we have a diverse selection of

play30:43

educational material we also have an

play30:45

incredibly knowledgeable technical

play30:47

support

play30:47

team who can guide you through your

play30:49

experiments as well as a dedicated

play30:51

customer service team to ensure that you

play30:54

receive your items in a timely manner if

play30:57

you have any questions about our

play30:59

materials or services you can always

play31:01

reach out to us by phone email online

play31:04

chat and in addition we have a

play31:06

comprehensive frequently asked questions

play31:09

section on our website so you can always

play31:11

browse through and see if you can find

play31:13

the answers there thank you for

play31:17

taking the time to join us today and

play31:19

just to give you a heads up we'll be having

play31:21

another webinar on whole genome

play31:23

sequencing so keep an eye out for an

play31:26

invitation from us soon thank you for

play31:29

taking the time to join us today and now

play31:31

we'll proceed over to the Q&A section

play31:39

all right so you just take a moment give

play31:42

us a moment please just to go through

play31:44

all the questions and we'll start up

play31:45

with as many as we can get through all

play31:53

right so we've got one here from Ian oh

play31:56

and the question is how long does it

play31:59

take to do library prep as well as QC

play32:02

and sequencing so Chris so generally

play32:07

from the time we receive your samples to

play32:09

do a preliminary QC the library prep

play32:12

process the Agilent QC and prepare your

play32:16

samples for sequencing the process from

play32:18

the date we get your samples until you

play32:20

get your data is about four to six weeks

play32:22

now a large part of that time isn't set

play32:26

aside for the actual sequencing process

play32:28

which can generally be done in about a

play32:30

week but it's to ensure that we have

play32:31

sufficient samples in order to set up

play32:33

the sequencing run if you have a large

play32:36

number of samples we can possibly

play32:37

provide you with the results from your

play32:39

sequencing in a matter of a few weeks

play32:43

that sounds like a good answer so now let's

play32:46

look for another one we've got one here

play32:48

from Andrew Cushman he's asking if if

play32:53

he's done his experiment and he has the

play32:55

raw data can he give us the raw data and

play32:58

have us do the sequencing for him

play33:01

thanks Andrew that's a great question so

play33:03

if you do have raw data and you don't

play33:05

know what to do with it we have a

play33:07

dedicated in-house bioinformatics team

play33:09

that can effectively do nearly any type

play33:11

of analysis that you need for your

play33:12

project we have a number of

play33:14

bioinformatics services listed on each

play33:16

of our NGS pages including for RNA-seq

play33:18

if there's something that you're

play33:20

interested in that we don't list if you

play33:22

simply send us an email we can work with

play33:24

our bioinformatics team to set up a

play33:25

custom analysis package for you and

play33:27

generally get the results to you in a

play33:29

matter of a few weeks depending on how

play33:31

challenging the analysis is and how much

play33:33

custom software we have to develop for

play33:34

you that's perfect

play33:40

all right now we've got another one here

play33:42

from Kate what happens if my samples

play33:45

fail QC Chris again thanks Kate for that

play33:51

great question so if we receive your

play33:53

samples and there's any issue during the

play33:55

QC process we'll reach out to you and

play33:58

let you know if we would like to ask for

play34:00

new samples or if there's steps we can

play34:02

take to try to address it and process

play34:04

your samples and possibly still achieve

play34:06

a high-quality sequencing result if for

play34:10

whatever reason your samples don't pass

play34:12

the additional QC we'll reach out to you

play34:14

again and ask you if you'll be able to

play34:17

provide new samples or provide you with

play34:20

alternative options for proceeding

play34:30

that was a great answer Chris we're just

play34:34

scrolling through the questions all

play34:38

right so we have one more I think this I

play34:41

think we've only got time left for one

play34:43

more question

play34:44

so we've got one here from and why she

play34:49

Ann's asking if she can submit one

play34:52

sample and use it for both RNA-seq and

play34:55

miRNA-seq or do they need to

play34:58

they need to submit separate samples for

play35:00

each one it's actually quite a popular

play35:04

question so when we're trying to process

play35:07

samples for microRNA-seq we need to

play35:09

use special kits that are unique for

play35:11

small RNA sequences this would be

play35:13

different than what we would normally do

play35:14

for regular RNA-seq so if you did want

play35:17

to do both RNA-seq and microRNA-seq

play35:19

we'd either ask for twice the starting

play35:22

sample amount or two separate samples in

play35:24

order to process this that was fantastic

play35:30

and I hope that was helpful and that

play35:33

made sense let me see if we've got we

play35:37

have if we're able to answer any more

play35:41

all right

play35:43

so unfortunately I think we are a little

play35:46

up on time right now so what we'll do is

play35:49

we'll go through the questions we'll

play35:51

write up some answers and we'll be

play35:53

emailing that out to you along with the

play35:54

slides for this webinar so once again

play35:57

thank you all for joining us today and

play35:59

we hope you have a great rest of the day

play36:01

thank you
