Neighbour Joining

Anders Gorm Pedersen
31 Jan 2017, 15:29

Summary

TL;DR: The video script discusses the application of distance-based methods in reconstructing genetic trees, focusing on the use of least squares and minimum evolution criteria for optimal tree selection. It highlights the impracticality of exhaustive search due to the exponential growth of possible trees with more leaves, and introduces heuristic and clustering algorithms as alternative approaches. The script provides a detailed explanation of the neighbor-joining method, a popular clustering algorithm, which involves iterative steps to build a tree by combining the nearest nodes based on specific criteria. The process is illustrated with an example using a distance matrix, demonstrating how the algorithm constructs a tree with branch lengths that match observed distances, although it does not guarantee finding the best possible tree.

Takeaways

  • 🌳 The script discusses distance-based methods for reconstructing genetic trees, focusing on finding the optimal tree with the smallest sum of squared differences.
  • 🔍 It highlights the impracticality of exhaustive search for large datasets due to the explosive growth in the number of possible trees with an increase in leaves.
  • 🔄 An alternative to exhaustive search is heuristic search, which uses a hill climbing algorithm to gradually move towards better trees, starting from a random tree and improving upon it.
  • 👉 The script introduces clustering algorithms, specifically neighbor-joining, as a third method for distance-based methods, which is widely used and faster than the other two methods.
  • 📊 The starting point for clustering algorithms is a distance matrix, from which the algorithm iteratively builds the tree by combining the two nearest nodes at each step.
  • 🔬 The neighbor-joining algorithm computes a specific measure 'u' for each leaf to determine which nodes to combine first, based on the smallest pairwise observed distance minus the 'u' values.
  • 🔗 After combining nodes, neighbor-joining uses formulas to calculate the branch lengths connecting the nodes to their new ancestral node, rather than simply splitting the observed distance.
  • 📈 The algorithm iteratively updates the distance matrix by replacing the rows and columns of the combined nodes with a new row and column for the ancestral node, and recalculating distances.
  • 🔚 The process continues until all nodes are combined into a tree, with the algorithm providing a rapid method to obtain an overview of the data set, even though it does not guarantee the best possible tree.
  • 📝 The script provides a detailed step-by-step guide for implementing the neighbor-joining algorithm, which can be followed to construct a tree from a given distance matrix.
  • 📉 While neighbor-joining does not always provide the perfect match between observed and patristic distances, especially for larger datasets, it often yields reasonably good trees for an overview.

Q & A

  • What are the main methods discussed in the transcript for reconstructing genetic trees?

    -The transcript discusses three main methods for reconstructing genetic trees: exhaustive search, heuristic search, and clustering algorithms such as neighbor-joining.

  • What is the primary challenge with using exhaustive search to find the optimal genetic tree?

    -The primary challenge with exhaustive search is that the number of possible trees grows exponentially with the number of leaves, making it impractical for more than a small number of leaves.

  • How does the heuristic search method differ from exhaustive search?

    -The heuristic search method, also known as hill climbing, starts with a random tree and iteratively improves it by choosing the best neighboring tree based on the sum of squared differences, rather than examining all possible trees.

  • What is the basic premise of clustering algorithms in genetic tree reconstruction?

    -Clustering algorithms, such as neighbor-joining, work by iteratively combining the two 'nearest' nodes into a tree, gradually building up the tree structure until all nodes are included.

  • How does the neighbor-joining algorithm decide which nodes to combine first?

    -The neighbor-joining algorithm decides which nodes to combine first by calculating, for each pair of nodes, the pairwise observed distance minus the 'u' value of each node (d(i,j) - u_i - u_j), and then selecting the pair with the smallest value.

  • What is the purpose of the value 'u' in the neighbor-joining algorithm?

    -The value 'u' for a leaf is the sum of its distances to all other leaves, divided by the number of leaves minus two (roughly its average distance to the other leaves). It is used to determine which pairs of nodes are 'nearest' and should be combined first in the tree.

  • How does the neighbor-joining algorithm compute the branch lengths when combining two nodes?

    -The algorithm uses specific formulas to compute the branch lengths from the combined ancestral node to each of the original nodes, based on the observed distances and the 'u' values of the nodes being combined.

  • What is the significance of the patristic distance in the neighbor-joining algorithm?

    -The patristic distance is the distance from one node to another in the tree. The neighbor-joining algorithm aims to construct a tree where the patristic distances match the observed distances as closely as possible.

  • Why might the neighbor-joining algorithm not always provide the best possible tree?

    -The neighbor-joining algorithm does not guarantee finding the best possible tree because it is a greedy iterative procedure that builds the tree step by step; early joining decisions are never revisited, so the result need not be the globally optimal tree.

  • What is the advantage of using clustering algorithms like neighbor-joining for reconstructing genetic trees?

    -Clustering algorithms like neighbor-joining are advantageous because they are much faster than exhaustive or heuristic searches, making them suitable for large datasets where a rapid overview is needed.
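
The 'u' values described in the Q&A above are easy to check in code. A minimal Python sketch, using the four-taxon distance matrix from the worked example later on the page:

```python
# u_i = (sum of distances from leaf i to every other leaf) / (n - 2)
d = {("A", "B"): 17, ("A", "C"): 21, ("A", "D"): 27,
     ("B", "C"): 12, ("B", "D"): 18, ("C", "D"): 14}

def dist(i, j):
    # the matrix is symmetric, so look the pair up in either order
    return d.get((i, j)) or d[(j, i)]

taxa = ["A", "B", "C", "D"]
n = len(taxa)
u = {i: sum(dist(i, j) for j in taxa if j != i) / (n - 2) for i in taxa}
print(u)  # {'A': 32.5, 'B': 23.5, 'C': 23.5, 'D': 29.5}
```

These match the values computed in the video's example: 32.5 for A, 23.5 for B and C, and 29.5 for D.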

Outlines

00:00

🌲 Distance-Based Phylogenetic Tree Reconstruction

This paragraph introduces the concept of using distance-based methods to reconstruct genetic trees. It discusses the challenges of using least squares and minimum evolution criteria due to the exponential growth of possible trees with an increasing number of leaves. The paragraph explains that exhaustive search is impractical for more than 15 leaves, making heuristic search methods like hill climbing necessary. It also introduces clustering algorithms, specifically Neighbor-Joining, as a third method for tree construction. The Neighbor-Joining algorithm is highlighted as the most widely used clustering method, which iteratively builds a tree by combining the two nearest nodes based on a distance matrix. The paragraph concludes by explaining that while these methods are faster, they do not guarantee the best possible tree but seem to work well empirically.
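
The least squares criterion this paragraph refers to can be stated concretely: for a candidate tree, Q is the sum over all pairs of taxa of the squared difference between the observed and patristic distances. A small illustrative sketch (the patristic numbers here are invented for illustration, not taken from the video):

```python
# Q = sum over pairs of (observed - patristic)^2; the least-squares
# criterion prefers the tree with the smallest Q.
observed  = {("A", "B"): 17, ("A", "C"): 21, ("B", "C"): 12}
patristic = {("A", "B"): 17, ("A", "C"): 20, ("B", "C"): 13}  # some candidate tree

Q = sum((observed[pair] - patristic[pair]) ** 2 for pair in observed)
print(Q)  # 0 + 1 + 1 = 2
```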

05:01

🔍 Neighbor-Joining Algorithm: Step-by-Step Explanation

The second paragraph delves into the specifics of the Neighbor-Joining algorithm, starting with the computation of a measure 'u' for each leaf: the sum of the distances from that leaf to all others, divided by n-2. The algorithm combines the two tips for which the pairwise observed distance minus the two 'u' values is smallest. After combining nodes, the paragraph explains the process of determining the branch lengths connecting the nodes to their new ancestral node and how to update the distance matrix with the new ancestral node's distances to the remaining tips. The paragraph concludes by emphasizing that the Neighbor-Joining method is a rapid iterative procedure that does not guarantee the optimal tree but can provide a good least squares tree for large datasets.
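
Taking the speaker's suggestion literally, the steps in this paragraph can be written as a short program. A sketch, with function and variable names of my own choosing (ties are broken by the order in which pairs are enumerated, so a run may join A and B first where the video joins C and D; the resulting unrooted tree is the same):

```python
def neighbor_joining(dist, names):
    """Neighbor joining on a symmetric distance matrix.

    dist maps each unordered pair of names (given once, as a tuple)
    to a distance; returns a list of (node, node, branch_length) edges.
    """
    d = {frozenset(pair): v for pair, v in dist.items()}
    nodes = list(names)
    edges = []
    new_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # u_i: sum of distances from i to all other nodes, over n - 2
        u = {i: sum(d[frozenset((i, j))] for j in nodes if j != i) / (n - 2)
             for i in nodes}
        # join the pair minimizing d(i, j) - u_i - u_j
        pairs = [(i, j) for a, i in enumerate(nodes) for j in nodes[a + 1:]]
        i, j = min(pairs, key=lambda p: d[frozenset(p)] - u[p[0]] - u[p[1]])
        anc = "anc%d" % new_id
        new_id += 1
        dij = d[frozenset((i, j))]
        vi = 0.5 * dij + 0.5 * (u[i] - u[j])  # branch from i to ancestor
        edges.append((i, anc, vi))
        edges.append((j, anc, dij - vi))      # branch from j to ancestor
        # distance from the new ancestor to every remaining node
        for k in nodes:
            if k != i and k != j:
                d[frozenset((anc, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij)
        nodes = [k for k in nodes if k != i and k != j] + [anc]
    # finally, connect the last two nodes with a single branch
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges
```

On the four-sequence matrix from the video's example this reproduces the branch lengths 13 (A), 4 (B), 4 (C), 10 (D) and an internal branch of 4.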

10:05

📊 Applying the Neighbor-Joining Algorithm to a Distance Matrix

This paragraph provides a practical example of applying the Neighbor-Joining algorithm to a real distance matrix. It walks through the initial step of computing the 'u' values for each node and then identifies the pair of nodes for which the observed distance minus both 'u' values is smallest, using that pair to start the tree construction. The paragraph details the calculation of branch lengths for the ancestral node and the process of updating the distance matrix to include the new node. It continues by repeating the process with the reduced dataset until only two nodes remain, which are then connected to complete the tree. The example concludes by verifying that the patristic distances in the resulting tree match the observed distances, demonstrating the Neighbor-Joining algorithm's effectiveness in this instance.
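
The final check in this example, that patristic distances equal observed distances, can be verified directly from the branch lengths the video arrives at. A small sketch (node names X and Y follow the video's slides):

```python
# Branch lengths from the worked example: A and B hang off node Y,
# C and D hang off node X, and X and Y are joined by a branch of 4.
branch = {"A": 13, "B": 4, "C": 4, "D": 10}  # tip -> length to its ancestor
internal = 4                                  # length of the X-Y branch

observed = {("A", "B"): 17, ("A", "C"): 21, ("A", "D"): 27,
            ("B", "C"): 12, ("B", "D"): 18, ("C", "D"): 14}

def patristic(i, j):
    # A,B share ancestor Y and C,D share ancestor X; every other
    # pair of tips also crosses the internal branch between X and Y
    same_side = {i, j} in ({"A", "B"}, {"C", "D"})
    return branch[i] + branch[j] + (0 if same_side else internal)

for pair, dij in observed.items():
    assert patristic(*pair) == dij  # a perfect match, as in the video
print("patristic distances match the observed distances for all pairs")
```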

15:07

🤔 Limitations and Practicality of the Neighbor-Joining Algorithm

The final paragraph acknowledges the limitations of the Neighbor-Joining algorithm, noting that while it can provide good trees, it does not always guarantee the best possible tree, especially for larger datasets. It emphasizes that the algorithm is a heuristic method that offers a reasonably good solution without the need for exhaustive search. The paragraph concludes by highlighting the practicality of the Neighbor-Joining algorithm for quickly obtaining an overview of large datasets, suggesting its value in preliminary phylogenetic analysis.


Keywords

💡Distance-based methods

Distance-based methods are a class of techniques used in phylogenetics to reconstruct evolutionary trees based on the genetic distances between different species or taxa. In the context of the video, these methods involve calculating a sum of squared differences to determine the best possible tree that fits the genetic data. The script discusses how these methods can be computationally intensive, especially with a large number of leaves, leading to the development of heuristic and clustering algorithms to manage complexity.

💡Least squares

Least squares is an optimality criterion used in the construction of phylogenetic trees. It refers to minimizing the sum of the squared differences between the observed genetic distances and the distances predicted by the tree. The video script explains how this criterion is used to find the tree with the smallest sum of square differences, which is considered the optimal tree.

💡Minimum evolution

Minimum evolution is another optimality criterion that seeks to find the tree that requires the least amount of evolutionary change to explain the observed data. It is mentioned in the script as one of the methods used to find the best possible trees, although it is not the primary focus of the explanation.

💡Heuristic search

Heuristic search is a strategy used when it is impractical to examine all possible solutions to a problem, such as reconstructing a phylogenetic tree. The script describes this method as starting with a random tree and iteratively improving it by swapping branches to neighboring trees with better sum of squared differences, gradually moving towards a local or global optimum.

💡Neighbor-joining

Neighbor-joining is a specific clustering algorithm used in phylogenetics, which is highlighted in the video as a widely used method for distance-based tree reconstruction. The algorithm involves iteratively combining the two 'nearest' nodes, based on a specific measure, to build the tree step by step, starting from a distance matrix.

💡Distance matrix

A distance matrix is a square matrix that contains the genetic distances between all pairs of taxa being studied. In the script, it is described as the starting point for both heuristic search and clustering algorithms, such as neighbor-joining, where it is used to determine which nodes to combine in the tree-building process.

💡Clustering algorithms

Clustering algorithms are a set of methods used to group data points into clusters based on their similarity or distance. In the context of the video, these algorithms are used to build phylogenetic trees by iteratively combining the nearest nodes. The script explains that neighbor-joining is an example of such an algorithm.

💡Optimality criterion

An optimality criterion is a rule or standard used to evaluate the quality of a phylogenetic tree. The script explains that least squares and minimum evolution are explicit optimality criteria, whereas clustering methods such as neighbor-joining do not optimize an explicit criterion and instead rely on an iterative procedure to build the tree.

💡Parsimony

Parsimony is a principle in phylogenetics that favors the tree that requires the fewest evolutionary changes to explain the data. The script mentions that the number of possible trees grows exponentially with the number of leaves, making it impractical to find the most parsimonious tree when the number of leaves is large.

💡Hill climbing algorithm

Hill climbing is a heuristic search algorithm used to find an approximate solution to optimization problems. In the script, it is used in the context of heuristic search for tree reconstruction, where it involves iteratively moving to 'neighbor' trees with better sum of squared differences, aiming to reach a local or global optimum.
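
The loop this entry describes can be sketched abstractly. A minimal illustration, not from the video; for tree search, `neighbors` would generate small rearrangements of the current tree and `score` would be the least-squares Q (the integer toy problem below just stands in for that):

```python
def hill_climb(start, neighbors, score, max_steps=1000):
    """Move greedily to the best-scoring neighbor until no
    neighbor improves on the current solution."""
    current = start
    for _ in range(max_steps):
        best = min(neighbors(current), key=score)
        if score(best) >= score(current):
            return current  # local (and hopefully global) optimum
        current = best
    return current

# Toy usage: minimize (x - 7)^2 over the integers, stepping by 1.
result = hill_climb(0, lambda x: [x - 1, x + 1], lambda x: (x - 7) ** 2)
print(result)  # 7
```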

Highlights

Discussion on using distance-based methods to reconstruct genetic trees.

Exploration of least squares and minimum evolution optimality criteria for finding the best possible trees.

Challenge of the exponential growth of possible trees with an increasing number of leaves.

Introduction of heuristic search as an alternative to exhaustive search for large datasets.

Explanation of the hill climbing algorithm for gradually moving towards better tree structures.

Introduction of clustering algorithms as a third method for distance-based methods.

Description of the neighbor-joining method as the most widely used clustering algorithm.

Process of starting with a distance matrix in clustering algorithms.

Iterative procedure of building a tree by combining the two nearest nodes.

Explanation of the neighbor-joining algorithm steps for tree construction.

Computation of a specific measure 'u' for deciding which nodes to combine first.

Use of the smallest pairwise observed distance for node combination in neighbor-joining.

Calculation of branch lengths in the combined ancestral node.

Updating the distance matrix with a new ancestral node and its distances to other nodes.

Example walkthrough of applying the neighbor-joining algorithm to a real distance matrix.

Demonstration of how the neighbor-joining algorithm builds a tree step by step.

Final result of the neighbor-joining algorithm matching observed and patristic distances.

Emphasis on the neighbor-joining algorithm not always guaranteeing the best tree but often providing good results.

Transcripts

00:03

[Music] So we're discussing how to use distance-based methods to reconstruct genetic trees. We've seen how the least squares and minimum evolution optimality criteria can be used to find the best possible trees. In essence, the idea is that we try to look through all possible trees; for each of them we can compute this measure, the sum of squared differences, and the idea is then to pick the tree with the smallest sum of squared differences as the optimal tree. However, as we saw when we were discussing parsimony, that's a problem, in that the number of trees grows explosively with the number of leaves. So for a realistic number of leaves, if we have more than 10-15 leaves, it's impossible to look through all the trees.

00:48

For distance-based methods it is therefore possible, just as it was for parsimony-based methods, to do a heuristic search. It's exactly the same idea: you start with a random tree, you look for a set of neighboring trees by doing small rearrangements, and for each of the neighboring trees you compute the sum of squared differences. You then pick the neighbor that has the best sum of squared differences, the smallest value of Q, as your new starting point, take the neighboring trees from that, again take the neighbor with the best value, climb one more step, and that way, using this hill climbing algorithm, you will gradually move towards better and better trees, essentially reaching a local, and hopefully global, optimum where you have a good least squares tree. So exhaustive search is one option if you have few tips; heuristic search is another option if you have more tips, and that's possible both for parsimony-based and for distance-based methods.

01:48

However, in the case of distance-based methods there's actually a third way, and that is the group of methods known as clustering algorithms. Of the clustering methods, neighbor-joining is by far the most widely used, and I'm going to explain how that works. The main idea in clustering algorithms is that instead of having an actual optimality criterion, instead of being able to compute, for any given tree, how good this tree is and then use some method to find the best tree, you instead have an iterative procedure where you gradually, step by step, build your tree. In the case of clustering algorithms, as always in distance-based methods, the starting point is a distance matrix. The idea in clustering algorithms is that you then take the two nodes that are, in some sense of the word (and I'll return to that), the nearest ones. So you take two nodes that you believe to be adjacent, and you put them together in the tree, in the sense that you connect both of them to a common ancestral node. In the distance matrix you now replace the columns and rows that you had for these two original tips with a new column and row corresponding to your new ancestral node. You then just need to figure out what to put in the distance matrix as the distance from this ancestral node to the rest of the nodes, and depending on the method there are different ways of doing this. You repeat these steps, each step putting together the two nearest nodes, gradually building up the tree until all the nodes are in the tree.

03:25

This is very rapid: you just have an iterative procedure, you look at each node once, and you will end up with a tree quite quickly. There is of course no guarantee that the tree you find will be the best possible tree; it won't necessarily be the optimal least-squares tree or the optimal minimum evolution tree. However, empirically these methods do seem to work fairly well, and they are definitely a lot faster. So especially in situations where you want to rapidly get an overview, or where your data set is very large, it may be worth using one of these methods.

04:08

So, in the case of neighbor-joining, which is the method we'll look at a bit more closely: as I say, each step in a clustering algorithm is about picking two nodes to put together, and they are the nodes that are, in some sense of the word, nearest. In the case of neighbor-joining the sense of "nearest" is perhaps a bit peculiar, but it nevertheless works quite well. I've put on this slide the entire algorithm, the entire recipe for doing neighbor-joining. In principle, if you know how to program a computer, you can take this slide and write a small computer program that will take a distance matrix as its input and construct a neighbor-joining tree as its output, just by following the steps on this single slide. So I'm going to rapidly read through it; it may not make sense while I'm going through it now, but afterwards I will walk you through an example where we use the neighbor-joining algorithm on a real distance matrix to build a tree, and you will see how it works out.

05:13

The idea in neighbor-joining is that you compute a particular measure that you use as the basis for deciding which nodes to combine first. For each tip, for each leaf, you compute this value called u, which is the sum of the distances from your tip to all the other tips, divided by n minus 2, where n is the number of leaves. So this is essentially the average distance from your leaf i to all other leaves, except that you divide by n minus 2 instead of by n minus 1. There's a deeper reason why it looks like that; I'm not going to go into it now, I'm just stating the algorithm. In neighbor-joining, the idea is now to combine the two tips i and j where this expression is smallest: you combine the two tips i and j where the pairwise observed distance between the nodes, minus u_i, minus u_j, is the smallest. For reasons that I again won't go into, this turns out to be a good way of iteratively building the tree.

06:22

Once you have combined the two nodes into an ancestral node, there are some things you need to figure out. You need to figure out the length of the branch connecting each of these nodes to the ancestral node. In principle you could just divide the observed distance in half and put half on either branch; that's what's done in a method called UPGMA that we won't be discussing. But in neighbor-joining you instead use the expressions down here, which I again will not go into, for computing the two separate branch lengths. Finally, for the new ancestral node that you just put in place of the two original nodes, you need to figure out the distance from that ancestral node to the rest of the tips in your tree. There's a formula for that also, based on the observed pairwise distances; I won't go into that either. Bottom line: follow these steps. Compute u for each of the tips, compute this value for each pair of tips, pick the pair where it is smallest, compute new branch lengths from your two combined tips down to the ancestral node, compute distances from your ancestral node to all the remaining nodes, and then run one more iteration where you now treat the new ancestral node as a tip. You repeat until all the nodes are combined.

07:49

So let's walk through an example and see how that works. Let's say we start with this distance matrix: we have four sequences and these observed distances. Step number one is to compute the value u. For each node you take the sum of the distances from your node to the remaining nodes and divide by n minus 2. In this case n is 4, we have 4 tips, so n minus 2 is 2. In the case of A, the distance from A to B is 17, A to C is 21, and A to D is 27, so we add up those three numbers and divide by 2; that's 32.5. The same for B: 12 plus 18 plus 17 are the three distances from B to the remaining three nodes, and divided by two we get 23.5. We do the same for C and D. Now we compute this number for each pair of sequences: the pairwise distance between the two tips, minus u for one tip, minus u for the other tip. In the case of A and B, for instance, the observed distance from A to B is 17, u for A is 32.5, and u for B is 23.5, so 17 minus 32.5 minus 23.5 is minus 39. We do the same for all the other pairs of sequences.

09:24

We now need to find the smallest value in this table. In this particular case we actually have two equally small values, for A and B and for C and D, so we arbitrarily select C and D as the starting point and put them together on the tree. We have now decided that C and D must be adjacent in the tree, so we connect both of them to an ancestral node, which I've called X on this particular slide. We now need to figure out the lengths of the branches connecting C and D to X. I showed you the formulas on the original algorithm slide: the formula for node i is half the observed distance between, in this case, C and D, plus half the difference between u_i and u_j. The observed distance between C and D was 14, so 0.5 times 14, plus 0.5 times u of C, that's 23.5, minus u of D, that's 29.5. Plug in the numbers, do the computations, and it comes out at 4. Do the same for the other node: 0.5 times 14 plus 0.5 times, in this case, u for D minus u for C; plug in the numbers and the value is 10. So we have now combined C and D, connected them to a common ancestral node, and computed the branch lengths from that ancestral node to each of them. Notice that the sum of the branch lengths, in this case 4 plus 10, is 14; that's the same as the observed distance between them. So in this case the patristic distance between C and D actually matches the one we observed originally.

11:25

We now need, in the distance matrix, to replace the rows and columns for C and D with a new row and a new column for our single ancestral node. To do that we need to compute the distance from our new node X to the remaining nodes in the tree, A and B, the two remaining nodes that we haven't dealt with yet. There's a formula for that also; it's essentially these distances I've put up here. Plug in the numbers: the distance from X to A is 17, and the distance from X to B is 8. So we put in those numbers, and we can now delete the C and D rows and columns, giving us this new distance matrix, and we're ready to run the next iteration.

12:12

Again, for each of the nodes in the distance matrix we compute the value u: the sum of the distances from a node to the remaining nodes. In the case of A that's the distance to B plus the distance to X, 17 plus 17. We divide by n minus 2; we now have three nodes, meaning we divide by 1, and 17 plus 17 divided by 1 is 34. For B, 8 plus 17 is 25, and finally for X, 8 plus 17 is also 25. We now compute the other value for each pair of sequences: the pairwise distance minus u for one node minus u for the other node. In this case the smallest value in the table is between A and B, so we combine A and B in the tree, connecting them both to an ancestral node. Again we compute the two separate branch lengths using the formulas I showed you before; in this case they turn out to be thirteen and four. And again you will notice that the sum of these, thirteen plus four, is 17, the same as the observed distance between these nodes. So again, in this part of the tree, the patristic distance matches the observed distance.

13:40

We now have just one remaining pair of nodes. We need to replace A and B with the new ancestral node Y, and compute the distance from Y to X. That's again done using the formula that was on the original algorithm slide; it turns out to be four. And at this point we're actually done, because we now only have these two nodes in the tree, so we just need to connect them with a branch of the length we just computed.

14:09

So the result of all of this is the following. We started with this distance matrix, we went through the different steps in the algorithm, and you can see that we ended up with a tree looking like this: we have C and D together, and B and A together, with these branch lengths. The distance from A to B is 17; that's the same as we see in the tree, 13 plus 4. The distance from A to C is 21; 13 plus 4 plus 4, that's 21. The distance from A to D is 27; 13 plus 4 plus 10, that's 27. The distance from B to C is 12; 4 plus 4 plus 4, that's 12, which also fits. The distance from B to D is 18; 4 plus 4 plus 10, that's 18. And finally the distance from C to D is 14; 10 plus 4 is 14.

15:00

So in this particular case the neighbor-joining algorithm actually gave us the perfect answer: there's a perfect match between the observed distances and the patristic distances. That will not always be the case, in particular for larger data sets and for cases where it's simply not possible to find a tree that fits perfectly. But often neighbor-joining does give reasonably good trees, even though it will not guarantee that you find the best tree. [Music]


Related tags
Genetic Trees, Neighbor-Joining, Distance Matrix, Clustering Algorithm, Evolutionary Biology, Data Analysis, Optimality Criteria, Hill Climbing, Phylogenetics, Bioinformatics