The pros and cons of GWAS

The Sheekey Science Show
15 Sept 2019 · 10:17

Summary

TLDR: This video introduces genome-wide association studies (GWAS), which identify gene loci associated with traits or phenotypes. The host explains the pros and cons of GWAS, emphasizing its hypothesis-free nature and its potential therapeutic benefits. Limitations such as missing heritability and a lack of diversity in the studies are discussed, along with the need for further experiments to enhance accuracy. The video also touches on polygenic risk scores, which predict the likelihood of traits like obesity, and stresses the importance of making these scores actionable and independent of existing information.

Takeaways

  • 🧬 Genome-wide Association Studies (GWAS) are used to identify gene loci associated with a trait or phenotype of interest.
  • 🔍 GWAS involves scanning the entire genome to find single nucleotide polymorphisms (SNPs) that may be associated with a particular trait.
  • 🌟 One of the main benefits of GWAS is that it is hypothesis-free, allowing researchers to identify potential gene associations without prior knowledge.
  • 💊 GWAS can have therapeutic potential, particularly when combined with other data like polygenic risk scores, which can predict the likelihood of certain conditions.
  • 📈 The data for GWAS is becoming more accessible due to the prevalence of sequencing data, making these studies technically easier to conduct.
  • 🔄 Combining GWAS with other studies, like single-cell RNA sequencing, can help overcome some of the limitations and provide a fuller understanding of gene expression.
  • ⚠️ A key limitation of GWAS is the issue of 'missing heritability', where identified genetic variants only account for a small percentage of the genetic influence on a phenotype.
  • 🔎 GWAS often identifies loci rather than specific genes, which can be challenging as it's the gene function that researchers are typically more interested in understanding.
  • 🌐 There is a lack of diversity in many GWAS, which can limit the applicability of findings to different populations and may contribute to 'missing heritability'.
  • 🏥 Polygenic risk scores generated from GWAS can be used to predict an individual's risk for certain diseases or conditions, potentially enabling early intervention.

Q & A

  • What are genome-wide association studies (GWAS)?

    -Genome-wide association studies (GWAS) are used to identify gene loci associated with a specific trait or phenotype by analyzing genetic variations across the entire genome. These studies look for single nucleotide polymorphisms (SNPs) that may be linked to certain traits.

  • What is the main advantage of genome-wide association studies?

    -One major advantage of GWAS is that they are hypothesis-free. Researchers do not need prior knowledge about the genome, making it possible to identify genetic variants that are likely associated with a particular trait.

  • What are some limitations of genome-wide association studies?

    -Key limitations include the inability to establish causation, the issue of missing heritability, the challenge of identifying rare variants, and the potential for identifying non-coding regions or loci without clearly linked genes.

  • What does 'missing heritability' mean in the context of GWAS?

    -Missing heritability refers to the gap between the genetic variants identified through GWAS and the full genetic explanation for a trait. The variants typically explain only a small percentage of the heritability, leaving most of the genetic factors unaccounted for.

  • How can combining GWAS with other studies improve results?

    -Combining GWAS with studies such as single-cell RNA sequencing can help researchers better understand which genes are expressed in specific tissues and confirm the association between genetic variants and phenotypes, enhancing the reliability of findings.

  • Why is the lack of diversity in GWAS a problem?

    -Most GWAS have used sequencing data primarily from European populations, which may limit the generalizability of results. Including more diverse ethnic groups could help address missing heritability and provide more comprehensive insights into genetic associations.

  • What is a polygenic risk score, and how is it related to GWAS?

    -A polygenic risk score is a measure that combines the effects of multiple genetic variants identified through GWAS to estimate an individual's likelihood of developing a specific trait or disease. It can be used for early disease prediction and prevention.

  • What are some potential applications of polygenic risk scores?

    -Polygenic risk scores can help predict the likelihood of developing conditions such as obesity, cancer, or other complex diseases. By identifying individuals at higher risk, early intervention and preventive measures can be implemented.

  • What is the main challenge with using polygenic risk scores for actionable insights?

    -The challenge lies in ensuring that polygenic risk scores provide independent and actionable information beyond what is already known. For example, if someone is already obese, a risk score confirming their likelihood of obesity offers little new information.

  • How can rare genetic variants influence the effectiveness of GWAS?

    -GWAS typically focus on common genetic variants, but rare variants may have a greater effect size on traits. These rare variants might be missed in standard GWAS but could play a significant role in explaining missing heritability.

Outlines

00:00

🧬 Introduction to Genome-Wide Association Studies

This paragraph introduces genome-wide association studies (GWAS), explaining their purpose and methodology. GWAS are used to identify gene loci associated with a specific trait or phenotype by scanning the entire genome for single nucleotide polymorphisms (SNPs) that may correlate with the trait. The paragraph provides an example of how a SNP could be associated with tallness. It also mentions previous GWAS examples, such as those related to chronotype and obesity, and a recent study on same-sex sexual behavior. The pros of GWAS include being hypothesis-free, having therapeutic potential, and being technically easy to perform due to the prevalence of sequencing data. The paragraph also touches on the importance of combining GWAS data with other studies, like single-cell RNA sequencing, to gain a fuller understanding of the genetic associations.

05:02

🔎 Limitations and Potential of Genome-Wide Association Studies

The second paragraph delves into the limitations of GWAS, such as the challenge of associating genetic variants with causation rather than just correlation. It discusses the issue of 'missing heritability,' where the sum of identified genetic variants only accounts for a small percentage of the likelihood of a phenotype. The paragraph also addresses the problem of identifying common variants over rare ones, which might have a greater effect size. It suggests that combining GWAS with other experiments, like RNA sequencing, can help overcome some of these limitations. The paragraph also highlights the importance of including diverse populations in GWAS to improve the understanding of genetic associations and heritability. It concludes by discussing the potential of GWAS in generating polygenic risk scores, which can predict the likelihood of developing certain conditions or phenotypes, and the importance of these scores being actionable and independent of other available information.

10:04

📝 Conclusion on Genome-Wide Association Studies

The final paragraph summarizes the discussion on GWAS, emphasizing their benefits and current limitations. It reiterates the potential of GWAS in conjunction with other studies to advance genetic research and the importance of addressing the limitations to fully harness their capabilities. The paragraph ends on a note of thanks to the audience for their attention, indicating the conclusion of the video script.

Keywords

💡Genome-wide Association Studies (GWAS)

Genome-wide Association Studies (GWAS) are research methods used to identify genetic variations (often single nucleotide polymorphisms, or SNPs) associated with a particular trait or phenotype across the entire genome. In the video, GWAS is described as a tool to find loci related to traits like height or disease susceptibility without needing prior hypotheses. This technique is key to understanding genetic influences on traits.
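As a rough illustration of the scan described above, the sketch below tests each SNP separately for association with a binary trait, using a chi-square test on allele counts. The function names, the example counts, and the 5e-8 cutoff (a commonly used genome-wide significance level) are assumptions for illustration and do not come from the video. Real studies typically use regression models with covariates rather than this bare test.

```python
import math

# Commonly used genome-wide significance level (an assumption, not from the video).
GENOME_WIDE_THRESHOLD = 5e-8


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Returns (chi2, p_value). Assumes every row and column total is non-zero.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_sums = [case_alt + case_ref, control_alt + control_ref]
    col_sums = [case_alt + control_alt, case_ref + control_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def scan(snp_counts):
    """Return the names of SNPs whose p-value passes the threshold.

    snp_counts maps a SNP name to
    (case_alt, case_ref, control_alt, control_ref) allele counts.
    """
    hits = []
    for name, counts in snp_counts.items():
        _, p_value = allelic_chi_square(*counts)
        if p_value < GENOME_WIDE_THRESHOLD:
            hits.append(name)
    return hits


# Hypothetical allele counts for two made-up SNPs.
example_counts = {
    "snp_strong": (700, 300, 300, 700),  # allele much more common in cases
    "snp_null": (500, 500, 500, 500),    # same frequency in cases and controls
}
example_hits = scan(example_counts)
```

Only SNPs that pass the threshold are reported, which mirrors the video's point that loci "surpassing the threshold" are the ones of interest. A hit still shows association only, not causation.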

💡Single Nucleotide Polymorphisms (SNPs)

Single Nucleotide Polymorphisms (SNPs) are positions in the genome where a single nucleotide (A, T, C, or G) differs between individuals. In the context of GWAS, SNPs are the primary focus when identifying genetic markers associated with traits like obesity or chronotype. SNPs are used as markers to locate regions in the genome that correlate with specific phenotypes.

💡Phenotype

A phenotype is an observable trait or characteristic, such as height, eye color, or susceptibility to a disease. GWAS aims to connect specific genetic variants (SNPs) to particular phenotypes. For example, the video mentions traits like obesity or same-sex sexual behavior as phenotypes studied using GWAS.

💡Polygenic Risk Scores

Polygenic risk scores estimate an individual's likelihood of developing a certain trait or condition based on the sum of multiple genetic variants. The video explains that polygenic risk scores can predict traits like body mass index (BMI) by analyzing multiple SNPs identified through GWAS, potentially aiding early disease identification and prevention.
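A minimal sketch of how such a score could be computed, assuming the common approach of summing each variant's effect size multiplied by the number of copies of the effect allele a person carries (0, 1, or 2). The SNP names, effect sizes, and function names below are hypothetical, and the video does not specify a weighting scheme.

```python
import statistics


def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of effect-allele counts.

    dosages: SNP name -> copies of the effect allele (0, 1, or 2).
    effect_sizes: SNP name -> per-allele effect estimated by a GWAS.
    SNPs missing from dosages are treated as 0 copies.
    """
    score = 0.0
    for snp, beta in effect_sizes.items():
        score += beta * dosages.get(snp, 0)
    return score


def standardize(scores):
    """Convert raw scores to z-scores relative to the given population."""
    mean = statistics.mean(scores)
    spread = statistics.pstdev(scores)
    return [(s - mean) / spread for s in scores]


# Hypothetical effect sizes and three made-up individuals.
effects = {"rs_a": 0.2, "rs_b": -0.1, "rs_c": 0.05}
people = [
    {"rs_a": 2, "rs_b": 1, "rs_c": 0},
    {"rs_a": 0, "rs_b": 2, "rs_c": 1},
    {"rs_a": 1, "rs_b": 0, "rs_c": 2},
]
raw_scores = [polygenic_risk_score(p, effects) for p in people]
z_scores = standardize(raw_scores)
```

Standardizing across a population places each person on the bell-shaped distribution of risk scores mentioned in the video. Whether a high score is actionable still depends on it adding information beyond what is already known about the person.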

💡Loci

Loci are specific, fixed positions on a chromosome where a particular gene or genetic marker is located. In GWAS, researchers look for loci where genetic variations (like SNPs) are associated with traits of interest. The video provides an example of loci linked to traits such as height or obesity.

💡Missing Heritability

Missing heritability refers to the gap between the genetic variations identified in studies like GWAS and the total heritability of a trait. In the video, it is explained that even when loci are identified through GWAS, they often account for only a small percentage of a trait’s heritability, leaving much of it unexplained. This phenomenon is known as missing heritability.
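The gap can be made concrete with a small calculation. The sketch below assumes an additive model, Hardy-Weinberg proportions, a trait standardized to variance 1, and independent SNPs, in which case one SNP with effect-allele frequency p and per-allele effect beta explains 2p(1 - p)beta² of the trait variance. The hit list and the total heritability of 0.5 are made-up numbers for illustration, not figures from the video.

```python
def variance_explained(allele_freq, beta):
    """Share of trait variance explained by one SNP.

    Assumes an additive model, Hardy-Weinberg proportions,
    and a trait standardized to variance 1.
    """
    return 2 * allele_freq * (1 - allele_freq) * beta ** 2


def missing_heritability(hits, total_heritability):
    """Return (explained, missing) for a list of (allele_freq, beta) hits.

    Summing per-SNP contributions assumes the SNPs are independent
    of each other (not in linkage disequilibrium).
    """
    explained = sum(variance_explained(p, b) for p, b in hits)
    return explained, total_heritability - explained


# Hypothetical significant hits: (effect-allele frequency, per-allele effect).
example_hits = [(0.5, 0.1), (0.2, 0.05)]
explained, missing = missing_heritability(example_hits, total_heritability=0.5)
```

With these made-up inputs the significant hits explain well under one percent of the trait variance, so nearly all of the assumed heritability is unaccounted for. That is the pattern the video describes.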

💡Hypothesis-Free

Hypothesis-free means that researchers do not need a preconceived idea or theory about which genes are linked to a trait when conducting GWAS. This is one of the benefits of GWAS mentioned in the video, as it allows for broad, unbiased searches across the genome to identify new loci associated with traits.

💡Therapeutic Potential

Therapeutic potential refers to the possibility of using research findings, such as those from GWAS, to develop new treatments or interventions for diseases. The video suggests that identifying genetic loci associated with diseases through GWAS could lead to targeted therapies in the future, especially when combined with polygenic risk scores.

💡Linkage Disequilibrium

Linkage disequilibrium describes the non-random association of alleles at different loci. It is mentioned in the video as a challenge in GWAS, where multiple genetic variants in a region may appear associated with a trait, making it difficult to pinpoint the exact variant causing the phenotype. It complicates the process of understanding the genetic basis of traits.
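One standard way to quantify this non-random association is the r² statistic between two SNPs, computed from haplotype frequencies as D² / (pA(1 - pA) pB(1 - pB)) with D = pAB - pA·pB. The sketch below is only an illustration: the 0/1 allele encoding, the function name, and the example haplotypes are assumptions, and the video does not describe a specific measure.

```python
def ld_r_squared(haplotypes):
    """r^2 between two SNPs from a list of phased haplotypes.

    Each haplotype is a pair (a, b), where a and b are 1 if the haplotype
    carries the alternate allele at SNP A / SNP B, and 0 otherwise.
    Assumes both SNPs are polymorphic in the sample.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b  # deviation from independent assortment
    return d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical samples: alleles always travel together vs. fully independent.
perfect_ld = [(1, 1)] * 50 + [(0, 0)] * 50
no_ld = [(1, 1)] * 25 + [(1, 0)] * 25 + [(0, 1)] * 25 + [(0, 0)] * 25

r2_perfect = ld_r_squared(perfect_ld)
r2_none = ld_r_squared(no_ld)
```

When r² is close to 1, the two SNPs carry almost the same information, so an association signal at one cannot be separated from the other without further experiments.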

💡RNA Sequencing

RNA sequencing is a method used to analyze the quantity and sequences of RNA in a sample, providing insights into which genes are active in different cells and tissues. In the video, RNA sequencing is suggested as a complementary tool to GWAS for better understanding gene expression and verifying the relevance of the loci identified, improving confidence in the results.

Highlights

Introduction to genome-wide association studies (GWAS) and their role in identifying gene loci associated with traits or phenotypes.

GWAS are hypothesis-free, meaning they don't require prior knowledge to identify gene loci associated with a phenotype.

GWAS can have therapeutic potential, particularly when combined with polygenic risk scores to predict disease likelihood.

Advancements in sequencing technology have made it easier to access the data required for GWAS, increasing their prevalence.

Combining GWAS with other experiments, such as single-cell RNA sequencing, provides a more comprehensive understanding of gene expression and its relationship with traits.

One major limitation of GWAS is that identifying an association between a gene and a trait does not imply causation.

The concept of 'missing heritability' refers to the fact that GWAS often explain only a small percentage of the genetic variance for complex traits.

GWAS tend to miss rare variants, which may have a greater effect on the trait of interest compared to common variants.

Another limitation of GWAS is that they often identify loci without identifying the gene itself, making it harder to understand the function and role of the gene.

There is a lack of diversity in GWAS, with many studies relying predominantly on European populations, which can limit the generalizability of findings.

Polygenic risk scores, derived from GWAS, can be used to estimate the likelihood of an individual developing a certain disease or phenotype.

Polygenic risk scores could enable earlier detection of diseases and more personalized treatments based on genetic risk.

The actionable potential of polygenic risk scores is still being evaluated, as they must offer independent predictive value beyond existing information.

An example of actionable polygenic risk scores is the use of these scores to assess the risk of obesity, which could lead to targeted interventions.

The future of GWAS and polygenic risk scores depends on addressing limitations such as rare variant identification, heritability gaps, and study diversity.

Transcripts

play00:00

hello and welcome back to the Sheekey

play00:01

Science Show so in today's video we're going

play00:04

to look at the pros and cons of

play00:06

genome-wide Association studies so

play00:08

before I jump straight in and tell you

play00:10

about the pros and cons I'll first

play00:12

introduce you to what are genome-wide

play00:14

Association studies and then after

play00:17

looking at the pros and cons we'll look

play00:18

at what polygenic risk scores are and

play00:20

how genome-wide Association studies are

play00:22

used to look at polygenic risk scores

play00:25

so firstly what are genome-wide

play00:27

Association studies so genome-wide

play00:30

Association studies are used to identify

play00:33

gene loci associated with a trait or

play00:36

phenotype of interest so the idea is

play00:38

that you look across the entire genome

play00:41

and try to identify single nucleotide

play00:43

polymorphisms as you can see here with T

play00:47

A and G and whether or not the T, the A

play00:51

or the G is associated with a phenotype

play00:54

of interest so let's say people who had

play00:57

a T instead of an A also had the

play01:01

phenotype of being really tall and so

play01:05

that could be a locus associated with

play01:10

the trait of being tall, if that makes sense

play01:12

and so if you do this across the entire

play01:14

genome you'll find that certain loci

play01:17

are more likely to be associated with

play01:18

that phenotype than others and ones that

play01:20

surpass the threshold are of interest

play01:22

because it suggests that that is a locus

play01:26

where there is a correlation between

play01:28

having that genetic variant and the

play01:32

trait so in previous videos I've

play01:34

already given examples of where

play01:36

genome-wide Association studies have

play01:38

been done for example I gave an example

play01:40

of where it's been done with chronotype

play01:42

and also with obesity but this paper

play01:45

here is a recent paper that focuses on

play01:47

a genome-wide Association study with

play01:50

loci associated with same-sex sexual

play01:53

behavior and so my point is that they

play01:56

are happening all the time and can be

play01:59

done with many different phenotypes but

play02:02

why do them what are the benefits of

play02:04

genome-wide Association studies so

play02:07

firstly the beauty of genome-wide

play02:09

Association studies are that they are

play02:11

hypothesis-free

play02:13

you don't need to know anything

play02:14

beforehand and so starting from nothing

play02:17

you can identify loci that have genes

play02:20

that are likely associated with the

play02:23

phenotype that you're studying and then

play02:24

these genes can be further characterized

play02:27

and understood to understand what causes

play02:30

the phenotype of interest

play02:32

wow that was a long-winded explanation

play02:34

but genome-wide Association studies may

play02:37

also have therapeutic potential as we'll

play02:40

see later when we look at polygenic

play02:42

risk scores and lastly the data for

play02:45

genome-wide Association studies requires

play02:47

sequencing data which is becoming ever

play02:50

more prevalent and so is technically

play02:52

quite easy to get now or at least easier

play02:54

than it was previously

play02:56

and these datasets can be even more

play02:59

informative when they're combined with

play03:00

other studies as well and actually

play03:03

combining genome-wide Association

play03:05

studies with other studies helps to

play03:07

overcome some of the limitations that I've

play03:09

seen with genome-wide Association

play03:11

studies so one experiment that I mean is

play03:14

single-cell RNA sequencing data because

play03:17

with that information you can see which

play03:20

genes are being expressed in a cell and

play03:22

which cells from a certain tissue and so

play03:25

if you can see the same subset of genes

play03:29

being expressed as the same subset of

play03:33

genes identified from a genome-wide

play03:34

Association study you have a fuller

play03:37

understanding and better confidence that

play03:40

the genes that you're looking at are

play03:43

possibly associated with the phenotype

play03:45

that you're studying it would make more

play03:47

sense probably if you had an example but

play03:50

bear with me for now but there are

play03:52

downsides to genome-wide Association

play03:54

studies the first thing is the key word

play03:56

association: association of a locus

play03:59

with the trait doesn't necessarily mean

play04:01

causation and so the genes that are

play04:03

being identified may have no real

play04:06

value in terms of understanding the

play04:08

cause of a phenotype so the next

play04:10

limitation with genome-wide Association

play04:13

studies is the missing heritability and

play04:15

so what I mean by this is that if you

play04:18

took all the loci that surpassed the

play04:20

threshold of being associated with that

play04:23

phenotype and you added them all

play04:26

together the total sum of all those

play04:28

loci in terms of the likelihood of

play04:33

you getting that phenotype, condition,

play04:35

disease, whatever you're looking at, is

play04:36

only around like two to three percent

play04:39

maximum for most of these genome-wide

play04:41

Association studies so what about you

play04:45

know that the other 97% where is this

play04:48

missing heritability and so part of this

play04:51

issue comes from the fact that the loci

play04:53

with like a strong link with the

play04:57

phenotype aren't being identified and that

play04:59

comes down to a kind of a flaw, well not

play05:02

really a flaw but a limitation of the

play05:03

genome-wide Association studies that you

play05:05

have your population of different whole

play05:08

genome sequences that you're looking at

play05:11

and you're trying to find loci in all

play05:14

of them all of the people with the

play05:16

condition, that have that trait, that

play05:18

have, you know what I mean, and the

play05:22

problem is if you're looking for

play05:24

common variants you're going to miss the

play05:26

rare variants that let's say one or two

play05:28

people who have the condition also have

play05:31

that variant and it seems possible that

play05:35

these rare variants are the ones that

play05:37

have a greater effect size as you can

play05:40

see in the graph here but there is still

play05:42

hope that these rare variants can be

play05:44

identified given that now we have the

play05:47

tools to easily perform whole genome

play05:50

sequencing and with this knowledge that

play05:53

we could be missing them we're more

play05:55

likely to look for the rare variants so

play05:57

another limitation with genome-wide

play05:58

Association studies is that it often

play06:01

identifies loci where you have a

play06:04

genetic variant but not a gene whereas

play06:08

it's the genes that we're more interested

play06:10

in because then we can understand the

play06:13

function of the gene, where the gene's

play06:14

expressed, and understand how that links it

play06:17

to the phenotype and so certain reasons

play06:20

could be that the locus is in a

play06:23

non-coding region in which case it could

play06:26

be affecting the expression also not

play06:28

just one gene but multiple genes that

play06:30

all together could be influencing the

play06:32

phenotype um there's also a linkage

play06:35

disequilibrium whereby there might be

play06:37

more than one gene

play06:39

in that same region, or rather it's

play06:44

more to do with the fact that there could be

play06:46

multiple variants within a certain locus

play06:49

and it's determining which of those

play06:51

variants is responsible and then lastly

play06:56

is looking at which tissue is important in

play07:00

terms of the gene so as I said there

play07:03

might be a gene that you've identified

play07:05

but it might only be expressed in

play07:09

certain tissues and so to get answers to

play07:13

these different points you need to

play07:15

combine genome-wide Association studies

play07:18

with other experiments such as the RNA

play07:22

sequencing data that I spoke about

play07:23

earlier and another important issue that

play07:26

is now being better addressed is the

play07:28

lack of diversity in the studies at the

play07:30

moment so a lot of the genome-wide

play07:33

Association studies have used sequencing

play07:35

data from mainly European populations

play07:38

and so including more

play07:42

ethnicities into these studies could

play07:44

also explain why there seems to be a

play07:46

lack of heritability in the results and

play07:50

so it would be valuable to include them

play07:52

and so if some of these limitations can

play07:55

be addressed there's great excitement

play07:56

for genome-wide Association studies for

play07:59

being used to generate polygenic risk scores

play08:02

whereby they can take your genetic

play08:05

information and look at which risk

play08:07

variants you've got and therefore your

play08:10

risk or likelihood of getting a certain

play08:13

disease or phenotype and so this could

play08:16

be the likelihood that you could get a

play08:19

certain type of cancer and this

play08:21

predictive information could enable

play08:23

better identification and early

play08:27

identification of certain diseases which

play08:30

will lead to more time available for

play08:32

treatment and so by taking body mass

play08:35

index as an example the greater your

play08:37

polygenic risk score the greater the

play08:41

prediction of your BMI is and so if you

play08:43

did this across a population you would

play08:45

see a bell curve for the distribution of

play08:48

risk scores and so the risk score could

play08:51

have some

play08:52

kind of predictive value for assessing

play08:54

people who are high risk of having a

play08:56

high BMI and getting obesity and

play08:58

therefore could be targeted through

play09:00

effective treatment but that's the key

play09:03

thing that I need to emphasize it's

play09:04

about whether or not a risk score can be

play09:07

actionable and so this was actually

play09:09

already done in a recent paper whereby

play09:11

they used a genome-wide polygenic risk

play09:13

score to quantify the inherited

play09:15

susceptibility to obesity and they did

play09:19

this to see if they could identify

play09:21

adults at risk of severe obesity and the

play09:24

idea is that if you can then

play09:27

you can try and help to prevent that

play09:29

from happening but the other key thing

play09:32

is that these polygenic risk scores have

play09:34

to be independent of other information

play09:37

that you can already get so if you were

play09:39

going to help somebody and that person

play09:42

was already obese then getting a polygenic

play09:44

risk score that tells you that

play09:46

they're going to be obese isn't really

play09:47

very much help then, is it, because it's

play09:49

already gonna be obvious so it's all

play09:52

about whether or not polygenic risk

play09:55

scores are going to be actionable and

play09:57

can be independent of other sources of

play09:59

information that we can already obtain

play10:01

so I hope this video has given you a

play10:03

good introduction to what genome-wide

play10:05

Association studies are and their

play10:07

benefits and the current limitations

play10:10

with them but also what we can do

play10:12

with them so thanks for listening
