The Beginner's Guide to RNA-Seq - #ResearchersAtWork Webinar Series
Summary
TLDREl webinar presenta una introducción a RNA-seq, una técnica avanzada para estudiar la expresión génica a través de la secuenciación de alta capacidad. Cubre desde conceptos básicos hasta el diseño de experimentos, análisis y interpretación de datos. Se destaca la importancia de la normalización de datos y se mencionan estudios significativos que han utilizado RNA-seq, como el Proyecto Cancer Genome Atlas. Además, se ofrece un descuento promocional y se invita a la audiencia a participar en la sección de preguntas y respuestas.
Takeaways
- 📘 El webinar comenzó con una introducción a RNA-seq para aquellos recién llegados, enfocándose en el uso constante y sistemático de los productos y servicios de la empresa.
- 🌐 Se compartieron detalles sobre cómo, tras el webinar, se proporcionarían diapositivas y grabaciones para los asistentes, y se señaló la importancia de registrarse para recibir esta información.
- 🔬 Se presentó a los oradores, Dr. Christopher y Battery, expertos en aplicaciones científicas y desarrollo de productos con experiencia en áreas como la biología celular y la terapia génica.
- 🏢 Se habló sobre Applied Biological Materials (ABM), fundada en 2004, con sede en Vancouver, Canadá, y su misión de catalizar descubrimientos científicos en la vida y el desarrollo de medicamentos.
- 📈 Se mencionó el crecimiento de ABM y sus instalaciones, destacando su compromiso de ofrecer un servicio de clase mundial a sus clientes.
- 🧬 Se exploraron los antecedentes de RNA-seq, desde el northern blot en 1977 hasta el desarrollo de microarrays y la aparición de RNA-seq, que permite la detección de ARN y el estudio de la expresión génica.
- 🔍 Se discutieron las ventajas de RNA-seq, como la alta resolución y la capacidad de secuenciar transcriptomas nuevos sin datos previos, en comparación con métodos anteriores.
- 🧬 Se abordó la importancia de la selección de ARN en la preparación de la muestra para RNA-seq, ya sea a través de enriquecimiento de ARN poli-A o la eliminación de ARN ribosomal.
- 🔬 Se explicó el flujo de trabajo general de RNA-seq, incluyendo la preparación de la librería, la PCR de enlace, la secuenciación por síntesis y el análisis de datos.
- 📊 Se mencionaron las técnicas de análisis y normalización de datos RNA-seq, como el uso de fragments por kilobase de transcripto por millones de lecturas mapeadas (FPKM) para comparar niveles de expresión genética.
- 🌟 Se destacaron estudios como ENCODE y el Cancer Genome Atlas Project, que han utilizado RNA-seq para entender mejor los mecanismos subyacentes de la enfermedad y el cáncer.
- 💡 Se ofreció un código promocional del 25% de descuento en el paquete de bioinformática de secuenciación de ARN y se invitó a los asistentes a contactar a ABM para obtener más recursos y soporte técnico.
Q & A
¿Qué es RNA-seq y cómo se relaciona con la regulación de la expresión génica?
-RNA-seq es una técnica que permite identificar y cuantificar la expresión deARN en una muestra biológica en un momento dado, lo que permite investigar los cambios en la expresión génica y detectar eventos de esplicación alternativa y nuevos genes.
¿Qué es el propósito de los webinars de Applied Biological Materials (ABM)?
-Los webinars de ABM están diseñados para proporcionar capacitación constante y sistemática que ayuda a mantener a los participantes informados sobre los productos y servicios de la empresa.
¿Qué áreas de investigación tiene experiencia Christopher, el especialista en aplicaciones científicas de ABM?
-Christopher tiene casi 10 años de experiencia en investigación y diseño experimental con investigadores de todo el mundo en áreas que van desde la biología celular y el desarrollo hasta la validación de CRISPR y la ingeniería de líneas celulares.
¿Qué es el propósito de la sesión de preguntas y respuestas al final del webinar?
-La sesión de preguntas y respuestas permite a los participantes hacer preguntas relacionadas con el contenido del webinar y recibir respuestas directas de los expertos para aclarar dudas y obtener información adicional.
¿Cuál es la importancia de la normalización de datos en el análisis de RNA-seq?
-La normalización de datos es crucial para ajustar las diferencias en la cantidad de lecturas por muestra y por gen, lo que permite comparar la expresión génica de manera justa y precisa entre muestras y condiciones experimentales distintas.
¿Qué es FPKM y cómo ayuda a normalizar los datos de RNA-seq?
-FPKM (Fragments Per Kilobase of transcript per Million mapped reads) es una métrica que ayuda a normalizar los datos de RNA-seq, considerando la cantidad de fragmentos de cDNA y la longitud del gen, para estimar la expresión génica de manera relativa.
¿Cómo se realiza la selección de ARN mensajero (mRNA) durante la preparación de la muestra para RNA-seq?
-La selección de mRNA se puede realizar a través de la enriquecimiento con polia (poly A enrichment) o la eliminación de ARN ribosomal (rRNA depletion), dependiendo de los objetivos del estudio y el tipo de muestra.
¿Qué es la profundidad de secuenciación y por qué es importante en RNA-seq?
-La profundidad de secuenciación se refiere a la cantidad total de lecturas de ARN que se obtienen en un experimento RNA-seq. Es importante porque determina la cantidad de información que se puede obtener sobre la expresión génica y la calidad de los resultados.
¿Qué es el Proyecto Genoma Atlas del Cáncer y cómo ha utilizado RNA-seq en sus investigaciones?
-El Proyecto Genoma Atlas del Cáncer es un esfuerzo para analizar miles de muestras de pacientes con cáncer utilizando RNA-seq, con el objetivo de entender mejor los mecanismos subyacentes de la transformación y progresión del cáncer.
¿Cómo se puede utilizar RNA-seq en la medicina personalizada y qué impacto puede tener en las enfermedades genéticas?
-RNA-seq puede ampliar el trabajo en medicina personalizada al proporcionar información detallada sobre la expresión génica y la regulación en respuesta a diferentes condiciones y tratamientos, lo que puede tener un impacto significativo en el diagnóstico y el tratamiento de enfermedades genéticas.
Outlines
😀 Introducción al webinar sobre RNA-seq
El primer párrafo presenta el inicio del webinar y agradece la paciencia de los asistentes. Se da la bienvenida y se menciona el tema principal, RNA-seq, para aquellos recién llegados. Los webinars tienen como objetivo proporcionar capacitación constante y sistemática sobre productos y servicios. Después del webinar, se compartirán diapositivas y la grabación. Se destaca la opción de hacer preguntas a través de una cuenta de Google y se menciona un enlace de registro para asegurar la recepción de material adicional. Se presenta un esquema de los temas a tratar, desde la expresión génica hasta la interpretación de experimentos RNA-seq. Finalmente, se presentan los oradores, Dr. Christopher y Battery, expertos en aplicaciones científicas y desarrollo de productos con amplia experiencia en investigación.
🌟 Desarrollo histórico y ventajas de la técnica RNA-seq
Este párrafo detalla la evolución de las técnicas para estudiar la expresión génica, desde el northern blot en 1977 hasta la aparición de RNA-seq. Se discuten las limitaciones de los métodos anteriores, como la necesidad de conocer previamente las secuencias de ADN o ARN, y se contrastan con las ventajas de RNA-seq, que incluyen una alta resolución y la capacidad de secuenciar transcriptomas sin datos previos. Además, se menciona el aumento exponencial en las publicaciones sobre RNA-seq desde su introducción, lo que demuestra su utilización y accesibilidad para los investigadores.
🔬 Consideraciones para el diseño de un experimento RNA-seq
El tercer párrafo se enfoca en aspectos importantes a considerar al diseñar un experimento RNA-seq. Se discute la composición del ARN total en las células y la necesidad de enriquecer la muestra con el material de interés, como el ARN mensajero, mediante técnicas como la enriquecimiento con polia o la depuración de ARN. También se abordan las decisiones a tomar en cuanto a la profundidad de secuenciación y el uso de secuenciación de un extremo o ambos extremos, dependiendo de los objetivos del proyecto.
🛠️ Flujo de trabajo general de RNA-seq y tecnologías de secuenciación
Aquí se describe el flujo de trabajo general de un proyecto RNA-seq, desde la preparación de la biblioteca hasta el análisis de datos. Se mencionan los pasos de fragmentación, ligación de adaptadores, generación de clusters y secuenciación por síntesis. Se destaca la importancia de la PCR puente y se explica cómo se evitan los problemas de sobre-o sub-agrupación. Además, se introducen las tecnologías de control de calidad, como el qubit, el Agilent bioanalyzer y el qPCR, y se describe cómo se utiliza la tecnología de secuenciación por síntesis para leer las secuencias ADN.
📊 Análisis y interpretación de datos RNA-seq
El quinto párrafo se centra en el análisis y la interpretación de los datos obtenidos a través de RNA-seq. Se explica el proceso de conversión de datos en formato fastq y se discuten las etapas de alineación de lecturas, normalización y generación de visualizaciones como mapas de calor y análisis de componentes principales. Se introduce el uso de métricas como las piezas por kilobase de transcripto por millones de lecturas mapeadas (FPKM) para la normalización de datos y se describen las diferentes técnicas de análisis, como la anotación funcional de genes diferencialmente expresados.
🏥 Aplicaciones de RNA-seq en la investigación y la medicina
Este párrafo presenta ejemplos de estudios que han utilizado RNA-seq, como el Proyecto Encode, el Proyecto Model Organismo Encode y el Atlas del Genome del Cáncer. Se destaca cómo RNA-seq ha permitido avances en la medicina personalizada y en la comprensión de las bases moleculares de enfermedades como el cáncer. Se ofrece un código promocional para un descuento en el paquete de bioinformática de secuenciación de ARN y se mencionan los recursos educativos y el soporte técnico que ofrece ABM.
📞 Preguntas y respuestas sobre RNA-seq y servicios de ABM
En el último párrafo, se abordan preguntas comunes relacionadas con el proceso de RNA-seq, como el tiempo que toma la preparación de la biblioteca y el proceso de secuenciación, la posibilidad de realizar análisis con datos brutos proporcionados por el investigador y el manejo de muestras que no pasan la calidad. Se menciona la capacidad de ABM de realizar análisis personalizados y se ofrece una visión general de los recursos y el soporte que la empresa proporciona a sus clientes.
Mindmap
Keywords
💡ARN
💡RNA-seq
💡Secuenciación de próxima generación (NGS)
💡Diseño del experimento
💡Análisis de datos
💡Normalización de datos
💡Annotación funcional
💡Estudio ENCODE
💡Transcriptoma
💡Bioinformática
Highlights
Webinar begins with an introduction to RNA-seq for new attendees, outlining the format and benefits of the training.
Post-webinar resources include PowerPoint slides and recorded webinar for continued learning.
Google account sign-in enables live Q&A during the webinar through the chat box.
Registration ensures receipt of slides and post-webinar question responses.
Webinar topics include gene expression background, NGS intro, RNA-seq experiment design, workflow analysis, and case studies.
Speaker introductions feature Dr. Christopher and Battery, specialists with extensive research experience.
ABM's mission is to catalyze scientific discoveries in life sciences and drug development.
Historical context of gene examination methods leads to the current RNA-seq technology.
Advantages of RNA-seq include single nucleotide resolution and ability to sequence new transcriptomes without prior data.
Illumina's 2007 advent correlates with an exponential increase in RNA-seq publications, indicating widespread use.
Next-generation sequencing (NGS) basics are explained, including read types and considerations for project goals.
RNA-seq experiment design considerations include enriching for mRNA and choosing sequencing depth and read length.
General RNA-seq workflow involves library preparation, bridge PCR, sequencing, and analysis.
Sequencing by synthesis technology and quality control measures are discussed for ensuring accurate results.
Data analysis post-sequencing includes converting raw data to fastq, alignment, normalization, and various analytical techniques.
FPKM is introduced as a metric for normalizing gene expression data across samples and gene lengths.
Examples of RNA-seq applications include large-scale projects like ENCODE and its impact on personalized medicine.
ABM offers promo codes, resources, and support for RNA sequencing and bioinformatics packages.
Webinar concludes with a Q&A session addressing practical questions about RNA-seq processes and services.
Transcripts
hello everyone thank you all for your
patience and we'll be starting the
webinar up now so good morning and thank
you for taking the time to join our
webinar today today's topic will provide
you with an introduction to RNA seek for
those of you who are new to our webinars
they're designed to provide constant and
systematic training that will help keep
you in whelmed formed on our products
and services following the webinar we'll
be sharing a copy of the PowerPoint
slides as well as the recorded webinar
itself I would like to point out that if
you're signed into a Google account you
can ask any questions you may have in
the chat box to the right of the screen
please know that if you did not register
earlier there's a registration link in
the video description
this will ensure that you get a copy of
the slides and answers to any questions
post it in the chat during the webinar
to help frame this webinar outline here
are the topics that we'll be covering
first will be we'll begin with some
background on gene expression followed
by an intro into NGS after that we'll
discuss some of the important
considerations when designing your RNA
seek experiment followed by
understanding the workflow analysis and
interpretation of your experiment
finally we'll take some time to review
some projects that have shown the power
and versatility of RNA seek before we
begin talking about RNA seek I'd like to
take a moment to introduce your speakers
joining me today is my colleague dr.
Chris Christopher's mention a scientific
application specialist here at ABM after
completing his PhD at UBC where he
studied stem cell regulation he joined
ABM with a passion for helping
scientists achieve their goals with
nearly 10 years of experience in
research and experimental design with
researchers around the world and in
areas ranging from cell and
developmental biology to CRISPR
validation and cell line engineering he
can assist with nearly every product
with M nearly any
project you have from initial set up to
post sequencing data analysis alongside
him you have myself battery battery a
product development specialist with over
five years research experience in areas
such as drug delivery lipid biosynthesis
and gene therapy obtained over the
course of my graduate degree at SFU our
main goal is to work closely with
clients and provide them with the
support that ABM has been renowned for
before we begin with the core content of
the webinar I'd like to take a moment to
talk about applied biological materials
and what our goals are
ABM was founded in 2004 and has been
driven to catalyze scientific
discoveries in the field of life science
and drug development for the last 15
years headquartered in Vancouver Canada
we are one of the fastest growing
biotech companies in the region and
since our inception we have worked hard
to become known as a reliable source for
clients such as yourselves this hard
work has allowed us to expand our
facilities beginning with a branch in
Chang su Province in China in 2013 and a
new facility in Bellingham that will be
opening up later in 2019 these
expansions have put us in a position to
work with each and every one of you and
provide the world-class service you all
deserve with our team of passionate and
trained scientists iBM is dedicated to
empowering researchers with the latest
innovations for all their scientific
needs now before we discuss the core
details of this webinar I feel it's
important to discuss some of the work
that led us to RNA seek as well as it's
important to enter research in general
when studying the role of genes in
development of disease there are three
distinct levels we can use for
examination we can study them at the DNA
level by studying mutations to determine
their effects on gene at the RNA level
by studying gene expression and at the
protein level by examining folding
patterns of genes the first link in the
chain leading to RNA seek is the
northern bloc
developed in 1977 by James Alwyn David
Kemp and George stark at Stanford
University
this tool was extremely useful and that
it allowed for the study of gene
expression via RNA detection following
the northern blot was rtq PCR which was
developed in the early 80s by Kary
Mullis and this technique allows for the
detection characterization and
quantification of RNA transcripts
finally the last step prior to RNA seek
was the microarray which was developed
by Patrick Brown at Stanford University
this tool is impressive and then it can
be used to simultaneously study the
expression levels of thousands of genes
at one time and with that we finally
come to the present RNA seek which
allows us to reveal the presence and
quantity of RNA in a biological sample
at any given moment as well as to look
at the changes in gene expression over
time now when considering these three
prior methods we can see that while they
all had strengths such as the low
reagent cost and the ability to be
easily done in the comfort of your own
lab there are some items that we should
consider a key note is the fact that
each system required prior knowledge of
the chain or the mRNA in order to be
used making it impossible to use these
methods to study novel genes with RNA
seek on the other hand the pros are
considerable well the initial cost of
RNA seek is high relative to the
previous methods the use of single
nucleotide resolution alongside the
ability to sequence new transcriptomes
without prior data is a huge bonus when
coupled with its massively high
throughput this creates a winning
combination that would be an asset to
any researcher now with the advent of
Illumina in 2007 we can see the near
exponential increase in the number of
RNA seek publications which demonstrates
the widespread use and accessibility of
this tool to researcher
and now we find ourselves in a position
to answer the question what is RNA seek
in essence RNA seek is a technique that
allows us to begin with cells or tissues
and examine the expressed genes by
taking advantage of next-generation
sequencing we can learn about changes in
gene expression and identify novel
splicing events and genes with this
brief background I'd like to pass things
over to Christopher who will begin with
a brief overview of - the topics for
today's talk but first a quick reminder
to everyone listening to please post any
questions you may have in the in the
chat box do during the course of the
webinar
thank you boshy for that brief
introduction and thanks for joining us
today for our webinar
I'm gonna go briefly into an
introduction to next-generation
sequencing to give you a better idea of
how the technology works before going
into more details about RNA seek so
next-generation sequencing can be used
for a variety of purposes including
whole genome sequencing studying changes
in gene expression using RNA seek or
performing metagenomic studies with
environmental samples the basic workflow
for all next-generation sequencing is
the same where you take input material
whether it's DNA or RNA fragment it to
be of a similar size ligate sequencing
adapters that can then bind to the
sequencer before undergoing
high-throughput sequencing now there are
a couple important terms to know for all
next-generation sequencing approaches
the first is the read a read is a
sequence of nucleotides that will be
sequenced so on the right you can see
that there's a double-stranded DNA
molecule and when it's sequenced it will
sequence along one strand which gives
you the sequencing read for the
nucleotides that are present we'd refer
this as single and sequencing since
you're only synthesizing one strand in
this case the sense strand alternatively
you can also use paired-end sequencing
to read a fragment
both strands of the molecule when you're
trying to decide which option to pick
for your project it's important to
consider what your actual project goal
is for instance single end sequencing is
generally sufficient to study changes in
gene expression whereas paired end
sequencing is more useful for whole
genome sequencing studying alternative
splicing or de novo transcriptome
studies next the read can vary in length
the read length is considered to be the
number of nucleotides that are sequenced
per read to give you an idea of the read
length sizes typically for RNA seek 75
nucleotides is a common read length
which would be great for studying gene
expression or resequencing samples 150
nucleotides is a longer read length
that's more suitable for assembling new
transcriptomes
or whole genome sequencing for
eukaryotes and even longer reads such as
300 nucleotides are more suitable for
amplicon seek as well as metagenomic
studies next once you have a sequencing
read you still have to figure out how it
aligns to a reference sequence so in
this example you can see that there's a
reference sequence in dark blue along
the bottom of the image and then the
read and light blue which maps to a
specific location in that reference
sequence if we were to look at a
specific nucleotide such as this G here
highlighted with a red box you can see
that there are two reads in light blue
that map to this eight and the reference
sequence because there are two reads
covering this we describe this as 2 X
coverage because you cover that
nucleotide with two different readings
if we look at another regions such as
this C residue you can see that there
are four reads that map to this specific
location this would give it a coverage
of 4x oftentimes there are also sites
which don't have any reads that mapped
them at all such as this a residue and
for this site you could describe it as
having zero x coverage now when you take
all
of these sites and add the levels of
coverage for each nucleotide together
you can make an idea of the coverage for
that site so in this example there are
six reads for these three bases G a and
C when you divide it by the number of
nucleotides you would get 2x coverage so
you could say that the sequencing depth
is 6 reads with an average of 2x
coverage for these nucleotides now for
most samples you would typically
sequence millions of reads to make sure
as much of the entire transcript time as
possible is sequenced typically bigger
genomes or transcript rooms require more
reads so for instance bacteria which
have a genome size of about 5 million
base pairs would require fewer reads and
less sequencing than mammals that would
have a 3 billion base pair genome or
plants which would even have much larger
genome sizes in terms of RNA seek for
bacteria this would look like 8 million
reads per sample versus 20 million reads
or 40 million reads per sample from
mammals and plants next I'm gonna go
over a couple important things to
consider when designing your rna-seq
experiment first most of the RNA that's
present in a cell is not actually
messenger RNA that is what most
researchers want to sequence for
instance if you take the total RNA
present and look at the breakdown of
what it's composed of you'll find that
about 85 percent of it is ribosomal RNA
which most researchers typically do not
want a sequence next about 10 to 12
percent is transfer RNA again which most
researchers are not interested in
sequencing for their projects mRNA
itself is typically about 2 to 3 percent
of the total RNA present in a cell and
then an even smaller percentage than
this is composed in micro rna's long
non-coding rnas circular RNAs and other
sequences so if your starting point is
this large population of total RNA you
have to find a way to enrich for what
you actually want a sequence in that
sample there are two basic ways that we
can do this the first of which is poly a
enrichment where we use special beads
that can bind to the poly a tails on
mrnh
and scripts to pull them out of the
population of total RNA and enriched for
poly a sequences alternatively if you
want to study mRNAs as well as small
RNAs like micro RNAs
you can use a treatment called our RNA
depletion which will selectively remove
the are RNA sequences from the total RNA
population if you're working with
eukaryotic samples you can use either
option poly a enrichment or RNA
depletion if you're working with
prokaryotic samples however you
absolutely have to use an RNA depletion
treatment because prokaryotic cells
typically do not have poly a tails on
their mRNA transcripts next you need to
consider how you're designing your
project and what your goal is if your
goal is to Nauvoo transcriptome assembly
for a species which previously hasn't
had a sequence transcriptome you
typically want greater sequencing depth
and longer reads to help with the
assembly next if you only want to study
changes in gene expression to see if a
particular gene increases or decreases
in response to a stimulus using single
end sequencing is typically sufficient
finally if you want to identify novel
transcripts or new alternative splicing
events you would typically want to use a
greater sequencing depth and paired end
sequencing next I'm gonna go over
briefly the general RNA seek workflow
for most projects this is highlighted
here and I don't want you to focus too
much on this but there are four basic
steps beginning with library preparation
before a sample is input into the
sequencer followed by bridge PCR
sequencing by synthesis and finally
analysis at the outset of the project
you'd have to input your starting
material assess its quality and convert
it to DNA typically when we begin with a
sample we would ask is the mRNA degraded
if it's not degraded we can perform poly
a selection before converting the RNA
into DNA this conversion from RNA into
DNA is done to increase this
ability of the molecule and ensure
success with sequencing if the mRNA
transcript is degraded however we can do
special treatments that can repair the
transcript to still provide sequence
below material for the project next once
the material has been converted into DNA
it will likely be of different sizes
because different mRNA transcripts are
generally of different lengths so the
next step would be to fragment the DNA
into uniform sizes to make sure each
fragment is equally likely to be
sequenced the next step is to ligate
sequencing adapters so that the DNA
fragments can bind the sequencer itself
once this is done you can input the
material into the sequencer and you'll
go through a step called cluster
generation or bridge PCR now oh sorry
I'm just gonna jump back briefly bridge
PCR is one of the most important steps
for sequencing because when you're
sequencing these molecules it's not
possible to sequence a single DNA
molecule in the sequencer itself you
first have to generate clusters of
identical molecules that are then
sequenced together so in this diagram in
part one you can see a DNA molecule that
binds to the flow cell of the sequencer
the molecule then bends over and binds
to the other adapter on the flow cell
before DNA synthesis begins to form a
double-stranded DNA molecule you can see
this now in part 3 where the sequencing
reaction that generated the double
strand DNA occurs in step four you can
see that you now have two molecules in
two clusters this process is repeated
many times in panel five until you have
sufficient DNA molecules to be able to
sequence each cluster now importantly
you need to know the concentration of
your library so that you can avoid over
clustering over clustering happens when
the cluster density is too high this
leads to the Machine not being able to
accurately read each cluster to
determine what the DNA sequences
and generally leads to the entire
sequencing run failing the opposite
problem you also want to avoid which is
under clustering this happens when the
cluster density is too low and it leads
to an overall lower sequencing output
and again makes it hard for the
sequencer to read what the sequences at
that cluster next I'll briefly go over
aluminous technology for sequencing by
synthesis to give you an idea of how the
sequencing process itself actually works
so when you have the template DNA it
will have a primer that will prime it
for DNA synthesis that binds to the flow
cell next you have individual
nucleotides that have fluorescent dyes
bound to them which can be used during
the synthesis reaction in the next panel
you can see that the a nucleotide has
been added to the sequence
all the nucleotides that didn't bind
because they don't have a complementary
base on the other strand would be washed
away after the step occurs there be a
fluorescent emission and response to
light stimulus from the nucleotide
that's been added that will be imaged by
the sequencer after this step the
fluorescent editor the fluorescent site
is cleaved from this molecule washed
away and then the process repeats again
once this repeats a to the end of
sequencing you will then be able to have
the Machine go through and read each of
the fluorescent signals that occurred to
piece together what the sequencing was
for that molecule at every stage of
sequencing though quality controls
essential and there are three core
technologies we use to perform this QC
the first of which is qubit that can
measure DNA concentration the Agilent
bio analyzer which can assess your DNA
library and how well it's been
fragmented and then qpcr which can be
used to determine how much of the actual
prepared library is sequence of all so
you're starting one of the most
important things you can do is to run a
RN angel to determine if your material
is possibly degraded if you have a high
quality sample you should see a few
distinct bands in your gel versus if the
sample is degraded you'll typically see
a large smear in that Lane if you do
have this degradation we can use special
kits to recover degraded RNA or to
reverse formaldehyde modification to RNA
if your sample was fixed prior to
extraction and this can help provide you
with sequence little material for RNA
seek next if you have a low amount of
starting material the first thing you
have to do is actually know what this
quantity is and for that we can use
cubit to measure the concentration of
nucleic acids that are present in a
sample this is crucial for library
preparation if you do have a low amount
of starting material we can use special
kits that can amplify the mRNA that is
present using the poly a tail prior to
RNA seek next after you have the
material that's been converted from RNA
to DNA and you go through the
fragmentation step the Agilent
bioanalyzer comes in handy to tell you
how efficient the fragmentation was and
to give you an idea of the average
fragment size in your sample and then
after you've liked it at the adapters
successfully qPCR becomes essential and
I'll go through these two steps next
with Agilent when you input your sample
the data that you get from it is
effectively a graph that tells you how
big the fragment sizes are that are
present in your sample and their general
distribution so in this graph here you
can see that there are two peaks on the
left and right hand side of the graph
I've highlighted them here in red boxes
for you what you actually want to look
at though is the fragment size highlight
in this green box which tells you
roughly that your fragments are on the
higher end of the range of fragment
sizes which is a good result which you
want to have for sequencing the
alternative is when you have an uneven
distribution of fragment sizes with no
large fragments and typically these
smaller fragments are more difficult to
see
and are less likely to give you a
high-quality sequencing result or lead
to a failure in sequencing so we
generally want to avoid this next with
qPCR this gives you a helpful metric
that you can't measure with qubit
now remember qubit can tell you about
the amount of nucleic acids present in a
sample but qPCR can tell you how much of
the nucleic acids are actually
sequencing bull and have adapters like
age them correctly next I'm going to go
over a bit more of rna-seq analysis and
interpretation of the data
so the general workflow once the
next-generation sequencing has been
complete is that you will go from raw
data to a format called fast queue with
intermittent steps before you can do
data analysis the raw data from the
sequencer you will generally never see
because either software or the sequencer
itself will process it into fast queue
now fast queue is simply the FASTA
format a text-based format for depicting
nucleotides with the quality control
info from that sequencing run in that
particular sequencing read if you want
to see what this actual fast queue data
would look like shown here to the right
and you can see that it's largely a line
of alphanumeric text but it's kind of
hard to understand what's going on the
important sequence information here is
highlighted in the red box which is the
actual sequencing result from this
particular sample from this particular
cluster next you have to figure out how
to use the actual data before you can do
the analysis that you're interested in
so with fast queue this is generally
what we provide for each next-generation
sequencing project whether it's RNA seek
whole genome seek or another type of
sequencing if you wanted to work with
this you'd have to take fast queue data
first and align it to the reference
sequence we briefly went over the
alignment of reads
earlier in the presentation this is one
of the crucial steps at the end of
sequencing after you've done the
alignment which can generally be done
using different software you would then
have to normalize your data before you
can begin the
alysus and generate beautiful figures
that you can eventually publish with
your manuscript I'm gonna over a bit of
how the data is normalized it's
important to normalize your data for two
main reasons
but first is you need to normalize the
number of reads per sample because some
samples may receive different reads next
you need to normalize the number of
reads per gene because genes are
different lengths and you have to take
this into account I'll go over this a
bit more in the next few slides for
instance if you have three samples
sample a B and C you would generally
want each sample to be sequenced an
equal amount but because of stochastic
differences in sequencing that cannot be
controlled you generally have different
sequencing outputs for each sample in
this example sample a would have about
20 million reads sample B about 30
million reads and sample C would have 10
million reads if you didn't normalize
your data without doing any processing
you would think that samples C has
one-third of the gene expression of
sample B for instance or that sample a
and sample B's expression levels are
different even if they may be identical
secondly you need to normalize relative
to the length of the gene and the number
of reads for that gene in this example
you can see gene a is about one KB and
gene B is 2 KB or twice that length for
this example gene a has 5 reads that map
to it and gene B has 10 reads but gene B
is twice as long if you didn't normalize
your data you would think that gene B is
expressed twice as highly even if that
may not be the case so you'd then have
to normalize the data there's a useful
metric we can use that can account for
differences between samples and between
genes this is called fragments per
kilobase of transcript per million
mapped reads or fpkm this is a way to
look at the relative expression of a
transcript proportional to the number of
cDNA fragments that it originated from
or sorry that originated from it so fpkm
is essentially the normalized estimation
gene expression based on the RNA
sequencing data or another way to think
of this is fpkm helps normalize the
differences in the amount of reads per
sample as well as the number of reads
proportional to how long a given gene is
in this specific example once we've
normalized gene a and gene B you can see
that the fpkm value is 2 for both
indicating that both these genes despite
having different lengths and different
numbers of mapped reads have the
relatively similar expression level if
you were to compare this to a treated
normalized sample you'd then be able to
include whether or not expression has
increased or decreased relative to
control using this normalized data set
next once you have this what type of
analysis can you actually do the first
is you can use heat maps to look at
large changes in expression between a
control sample and a tree example and to
visually detect depict increases or
decreases in gene expression next you
can perform principal component analysis
to see if any of your samples have
different changes in gene expression
that may correlate and be related to the
treatment conditions they underwent next
you can also do functional annotation of
differentially expressed genes with this
this allows you to look at genes that
have increased an expression decreased
in expression or remain the same and see
if there's any similar pathways or
processes that they might be involved in
such as metabolism inflammation or even
a mean response next I'm gonna hand it
back to boshy to go over a couple
examples of researchers that have used
RNA seek in their studies
thank you for that excellent talk
Christopher as he just mentioned I'll be
taking a few minutes to discuss some of
the studies that people have used to
demonstrate the power and versatility of
our NAC and after that we'll wrap up the
webinar and proceed to the Q&A session
now the first of these projects is the
encode project which focused on
identifying genome-wide regulatory
regions in different cell lines this was
then followed by the model organism
encode project which aimed to provide by
the biological research community with a
comprehensive encyclopedia of genomic
functional elements in the model
organism C elegans and D Milano gaster
this was a huge undertaking and was
aimed at helping to better understand
the downstream effects of regulatory
regions next we have the Cancer Genome
Atlas Project which took advantage of
RNA seek to analyze thousands of cancer
patient samples to better understand the
underlying mechanisms of malignant
transformation and progression and
cancer finally the use of RNA seek in
medicine has been substantial as it's
allowed for researchers to expand their
work into personalized medicine which
can have a huge impact on genetic
disease and with that we now come to the
wrap-up portion of the webinar and as a
thank you
we'd like to offer your promo code for
25% off of your RNA sequencing
bioinformatics package which you can see
here and will also be included in our
follow-up email and now that we've
finished through with the background of
RNA seek let's take a moment to examine
some of the resources that ABM offers
first make sure to check out our website
where all of our resources are collected
this includes our knowledgebase articles
as well as our YouTube videos and blog
posts to help provide you with the tools
you need for your experiments now not
only do we have a diverse selection of
educational material we also have an
incredibly knowledgeable technical
support
team who can guide you through your
experiments as well as a dedicated
customer service team to ensure that you
receive your items in a timely manner if
you have any questions about our
materials or services you can always
reach out to us by phone email online
chat and in addition we have a
comprehensive frequently asked questions
section on our website so you can always
browse through and see if you can find
the answers there thank you for your
taking the time to join us today you
just give you a heads up we'll be having
a for another webinar on whole genome
sequencing so keep an eye out for an
invitation from us soon thank you for
taking the time to join us today and now
we'll proceed over to the Q&A section
all right so you just take a moment give
us a moment please just to go through
all the questions and we'll start up
with as many as we can get through all
right so we've got one here from Ian oh
and the question is how long does it
take to do library prep as well as QC
and sequencing so Chris so generally
from the time we receive your samples to
do a preliminary QC the library prep
process the Agilent QC and prepare your
samples for sequencing the process from
the date we get your samples until you
get your data is about four to six weeks
now a large part of that time isn't set
aside for the actual sequencing process
which can generally be done in about a
week but it's to ensure that we have
sufficient samples in order to set up
the sequencing run if you have a large
number of samples we can possibly
provide you with the results from your
sequencing in a matter of a few weeks
that sound look good answer so now let's
look for another one we've got one here
from Andrew Cushman he's asking if if
he's done his experiment and he has the
raw data can he give us the raw data and
have us do the sequencing for him
thanks Andrew that's a great question so
if you do have raw data and you don't
know what to do with it we have a
dedicated in-house bioinformatics team
that can effectively do nearly any type
of analysis that you need for your
project we have a number of
bioinformatics services listed on each
of our NGS pages including for RNA seek
if there's something that you're
interested in that we don't list if you
simply send us an email we can work with
our bioinformatics team to set up a
custom analysis package for you and
generally get the results to you in a
matter of a few weeks depending on how
challenging the analysis is and how much
custom software we have to develop for
you that's perfect
all right now we've got another one here
from Kate what happens if my samples
fail QC Kris again thanks Kate for that
great question so if we receive your
samples and there's any issue during the
QC process we'll reach out to you and
let you know if we would like to ask for
new samples or if there's steps we can
take to try to address it and process
your samples and possibly still achieve
a high-quality sequencing results if for
whatever reason your samples don't pass
the additional QC we reach out to you
again and ask you if you'll be able to
provide new samples or provide with
alternative options for proceeding
that was a great answer Chris we're just
scrolling through the questions all
right so we have one more I think this I
think we've only got time left for one
more question
so we've got one here from and why she
Ann's asking if she can submit one
sample and use it for both RNA seek and
MI RNA seek or do they need to have do
they need to submit separate samples for
each one it's actually quite a popular
question so when we're trying to process
samples from micro RNA seek we need to
use special kits that are unique for
small RNA sequences this would be
different than what we would normally do
for regular RNA seek so if you did want
to do both rna-seq and micro RNA seek
we'd either ask for twice the starting
sample amount or two separate samples in
order to process this that was fantastic
and I hope that was helpful and that
made sense let me see if we've got we
have if we're able to answer any more
all right
so unfortunately I think we are a little
up on time right now so we'll do is
we'll go through the questions we'll
write up some answers and we'll be
emailing that out to you along with the
slides for this webinar so once again
thank you all for joining us today and
we hope you have a great rest of the day
thank you
Voir Plus de Vidéos Connexes
¿Cómo analizar los datos? | Todo lo Que Necesitas Saber para Empezar
Modelamiento de datos en 10 Minutos
QUÉ ES UNA ENTREVISTA DE INVESTIGACIÓN (ENFOQUE CUALITATIVO) | VENTAJAS, TIPOS E INSTRUMENTOS
Tablas Dinámicas en Excel - Todo lo que necesitas saber 😎
Requirements, admin portal, gateways in Microsoft Fabric | DP-600 EXAM PREP (2 of 12)
Moda para Datos Agrupados
5.0 / 5 (0 votes)