Social Network Structure: Graphs

Russell Haines
22 Jun 202113:40

Summary

TLDRThis script delves into the fundamentals of graph theory, emphasizing the importance of defining nodes and edges in a network. It discusses the significance of connectivity, paths, and the breadth-first search method for determining shortest paths and graph diameter. The script also highlights the role of pivotal and gatekeeper nodes, and introduces concepts like local gatekeepers and network components. It underscores the variability in network analysis based on the researcher's definition of an edge, using examples like the 'six degrees of separation' and the 'small world' phenomenon to illustrate the impact of edge definition on network structure and behavior.

Takeaways

  • 📚 Definition of a Graph: A graph in Chapter Two represents a network, with nodes as objects/agents and edges as connections, defined by the creator of the graph.
  • 🔗 Connectivity and Paths: The concept of connectivity within a graph is based on the existence of paths between nodes, which are sequences of connected nodes.
  • 🌐 Connected Graphs: A graph is considered connected if every node is linked to every other node, forming a cohesive network.
  • 🔎 Breadth-First Search: This algorithm is used to find the shortest path from a selected node to all other nodes in the graph.
  • 📏 Graph Measurements: Key measurements include the diameter (longest distance between any pair of nodes) and the average distance across the graph.
  • 🔑 Pivotal Nodes: These are nodes that lie on a path between two other nodes and can affect the distance between them, with gatekeepers being the most pivotal.
  • 🚫 Local Gatekeepers: These are nodes that connect to neighbors who do not connect directly with each other, but may not necessarily be pivotal.
  • 🌟 Components and Giant Components: Graphs can be broken down into components, with a focus on significant subsets of nodes for analysis.
  • 🔍 Defining Edges: The definition of an edge is crucial as it determines how the graph appears and functions, affecting concepts like the 'small world' idea.
  • 🌐 Small World Phenomenon: The concept of six degrees of separation is heavily dependent on how edges are defined within a network.
  • 🔮 Predictive Modeling: Graphs can be used to predict future connections or 'flows' between nodes, such as in social networks, communication, or web page navigation.

Q & A

  • What is the primary concept introduced in Chapter Two of the transcript?

    -The primary concept introduced in Chapter Two is the definition of a graph, which represents a network with nodes and edges, and the importance of understanding the context and definition provided by the author of the graph.

  • What is the significance of nodes in a graph?

    -Nodes in a graph represent objects or agents that have the ability to act within the network, and they are crucial for understanding the connectivity and interactions within the graph.

  • What are edges in the context of a graph?

    -Edges in a graph represent the connections between nodes, and their definition can vary widely depending on the researcher's perspective and the context of the network being modeled.

  • What is the concept of a path in a graph?

    -A path in a graph is the sequence of nodes that are connected through edges, and it is fundamental to understanding connectivity and the shortest route between two nodes.

  • What does a connected graph imply?

    -A connected graph implies that every node is linked to every other node within the graph, indicating a complete network where all components are accessible from any starting point.

  • What is the purpose of a breadth-first search in graph theory?

    -The purpose of a breadth-first search is to determine the shortest path from a chosen node to every other node in the graph, which helps in understanding the overall connectivity and structure of the network.

  • What is the diameter of a graph?

    -The diameter of a graph is the longest shortest path between any pair of nodes, which provides a measure of how 'spread out' the graph is in terms of connectivity.

  • What is the significance of pivotal nodes in a graph?

    -Pivotal nodes are crucial in a path between two nodes, and their removal can increase the distance between those nodes or even disconnect them, highlighting their importance in maintaining network connectivity.

  • What is a gatekeeper node and how is it different from a pivotal node?

    -A gatekeeper node is an extreme type of pivotal node. If removed, it completely disconnects two nodes, making it impossible to find a path between them, whereas a pivotal node's removal might only increase the distance but not necessarily disconnect the nodes.

  • What is the concept of a local gatekeeper in a graph?

    -A local gatekeeper is a node that connects to its neighbors but does not necessarily make them pivotal. Removing a local gatekeeper might not change the distance to other nodes, indicating a less critical role in network connectivity.

  • How does the definition of an edge impact the properties of a graph?

    -The definition of an edge significantly impacts the graph's properties, including its connectivity, diameter, and average distance, as it determines how nodes are connected and what constitutes a valid connection within the network.

  • What is the concept of 'small world' in graph theory and how does it relate to the definition of an edge?

    -The 'small world' concept refers to the idea that nodes in a network are connected by a small number of intermediary nodes, often associated with the 'six degrees of separation'. The definition of an edge plays a crucial role in determining the 'small world' number, as it dictates what constitutes a connection between nodes.

Outlines

00:00

📚 Understanding Graphs and Network Concepts

The first paragraph delves into the fundamental concepts of graph theory as presented in Chapter Two, emphasizing the definition of a graph and its components—nodes and edges. It discusses how the graph represents a network, with nodes symbolizing objects or agents capable of action and edges representing the connections between them. The paragraph also touches on the variability in defining nodes and edges, which can alter the graph's interpretation. It introduces the concept of connectivity, explaining the significance of paths between nodes and the idea of a connected graph where every node is linked. The paragraph further explains breadth-first search as a method for determining the shortest path between nodes and how it can be used to calculate the diameter and average distance of a graph. It concludes with a note on the impracticality of applying breadth-first search to very large graphs due to computational constraints.

05:03

🔍 The Importance of Node Characteristics and Graph Components

The second paragraph focuses on the characteristics of nodes within a graph, highlighting the concepts of pivotal nodes and gatekeepers. Pivotal nodes are essential in paths between other nodes, and removing them can increase the distance between connected nodes. Gatekeepers are a special type of pivotal node, whose removal can sever connections entirely. The paragraph also introduces the concept of local gatekeepers, which may not be pivotal but still play a significant role in local connections. It discusses the idea of graph components and the practicality of focusing on specific subsets of nodes for analysis, rather than attempting to analyze an entire 'world graph.' The paragraph underscores the importance of defining edges in a graph, as this definition can greatly affect the analysis and interpretation of the network. It provides examples of how different edge definitions can lead to different interpretations of connectivity, such as in the case of co-authors or the 'six degrees of separation' concept.

10:05

🌐 The Impact of Edge Definitions on Network Analysis

The third paragraph explores the profound impact that edge definitions have on network analysis, using the concepts of 'small world' numbers and degrees of separation as examples. It discusses how the definition of an edge can vary widely, affecting the perceived closeness of connections within a network. The paragraph provides examples such as the 'Bacon number' in film networks, where the definition of an edge as co-cast members leads to a surprisingly small average distance between actors. It also touches on the Milgram experiment, where the edge was defined by personal acquaintance, and how this can vary culturally. The speaker emphasizes the importance of researchers defining edges based on what is useful for their analysis, likening edges to potential flows of information or interactions. The paragraph concludes by encouraging an understanding of edges as representing not just current connections, but also the likelihood of future interactions.

Mindmap

Keywords

💡Graph

A graph in the context of this video represents a network, consisting of nodes and edges. Nodes are objects or agents with the ability to act, while edges are the connections between these nodes. The definition of what constitutes a node or an edge is determined by the researcher or author of the graph. The graph is central to the video's theme as it serves as a model for various types of networks, such as social, informational, or economic systems.

💡Node

A node is a fundamental element of a graph, representing an object or an agent within a network. Nodes have agency, meaning they can act and influence the network through their connections. In the video, nodes are used to illustrate the concept of connectivity and the roles different elements play within a network, such as pivotal nodes or gatekeepers.

💡Edge

Edges in the video are the connections between nodes in a graph. They represent the relationships or interactions between different elements of a network. The definition of an edge is crucial as it determines the structure and connectivity of the graph, and it can significantly impact the analysis of the network, such as in the small-world phenomenon or the six degrees of separation concept.

💡Connectivity

Connectivity refers to the extent to which nodes in a graph are linked by edges. A connected graph ensures that there is a path between every pair of nodes, which is a key concept in understanding how information or influence can spread within a network. The video discusses different types of connectivity, including connected and disconnected graphs.

💡Breadth-First Search

Breadth-first search is an algorithm mentioned in the video for determining the shortest path from a selected node to all other nodes in a graph. It is used to measure distances within the network and helps in understanding the network's structure, such as calculating the diameter or the average distance between nodes.

💡Diameter of a Graph

The diameter of a graph is the longest shortest path between any two nodes in the network. It provides insight into how 'spread out' the network is and is an important metric when analyzing the efficiency of information transfer within the network. The video uses this concept to discuss the properties of different types of graphs.

💡Pivotal Node

A pivotal node is a node that lies on a path between two other nodes. Removing this node or its connecting edge can increase the distance between the two nodes it connects. The concept is important in the video as it highlights the importance of certain nodes in maintaining the connectivity and efficiency of a network.

💡Gatekeeper

A gatekeeper is a type of pivotal node that is so crucial that removing it would break the connection between two nodes entirely. Gatekeepers are the extreme form of pivotal nodes and represent control points within a network, as illustrated in the video with examples like co-authorship graphs.

💡Local Gatekeeper

A local gatekeeper is a node that connects to its neighbors but does not necessarily play a pivotal role in the connectivity of the entire network. The concept is discussed in the video to differentiate between nodes that are pivotal on a larger scale and those that are only important within their immediate neighborhood.

💡Component

In graph theory, a component refers to a subgraph in which any two nodes are connected to each other by paths, and which is connected to no additional nodes in the supergraph. The video emphasizes the importance of focusing on components to make sense of large and complex networks, such as the world graph.

💡Small World

The 'small world' concept, as discussed in the video, refers to the phenomenon where most nodes in a network can be reached from every other node through a small number of steps or connections. This idea is closely tied to the definition of edges and is exemplified by the six degrees of separation theory.

Highlights

Definition of a graph as a representation of a network with nodes and edges defined by the creator.

Nodes in a graph represent objects with agency, capable of acting within the network.

Edges signify connections between nodes, with various ways to define these connections.

Connectivity in graphs involves the concept of paths and the sequence of nodes connected by edges.

A connected graph implies every node is linked, whereas a disconnected graph has unlinked nodes.

Breadth-first search is introduced as a method to determine the shortest path from a node to all others.

The diameter of a graph is the longest distance between any pair of nodes.

The average distance in a graph is calculated by averaging the distances between all pairs of nodes.

The impracticality of performing breadth-first search on very large graphs due to excessive computation.

Characteristics of nodes, including pivotal nodes and gatekeepers, which play crucial roles in connectivity.

Gatekeepers are pivotal nodes that, if removed, disrupt all connections between certain nodes.

Local gatekeepers are pivotal within their immediate neighborhood but do not affect overall graph connectivity.

The importance of defining components in a graph to focus on relevant subsets of nodes.

The significance of defining an edge in a graph, as it greatly influences the graph's structure and interpretation.

The concept of 'small world' and its dependency on the definition of an edge, as illustrated by the six degrees of separation.

The variability in defining edges, such as including directors or producers in a movie network, affecting the 'small world' number.

The importance of considering cultural differences when defining an edge, like the meaning of knowing someone on a first-name basis.

The idea of edges representing potential future connections or flows, such as information or transactions.

The concept of edges as a statistical probability of future interactions, emphasizing their predictive nature.

Transcripts

play00:00

all right so I'm gonna talk here about

play00:01

chapter twos first I'm gonna talk about

play00:04

the basic vocabulary and the concepts

play00:07

you need to know out of Chapter two

play00:10

the first main idea in Chapter two that

play00:12

you need to understand is the definition

play00:14

of a graph so chapter two they go

play00:17

through a whole bunch of graphs and

play00:20

they're putting nodes and they're

play00:21

putting edges between them and what you

play00:24

need to make sure you understand is the

play00:27

graph itself so whatever graph you're

play00:28

seeing is a representation of this

play00:33

network and the network definition is

play00:37

defined by whoever created the graph so

play00:42

everything you see there is something

play00:45

that the author whoever chose to put

play00:49

there so in this context here I mean the

play00:53

context of that graph you need to think

play00:56

here about the nodes in that graph being

play01:00

some sort of object and usually that

play01:03

object needs to have some sort of agency

play01:06

meaning they have sort of a will or an

play01:09

ability to act in the network and

play01:11

choosing to activate these edges and the

play01:16

edges being the connections between the

play01:18

nodes there's lots of different ways to

play01:20

define the nodes and lots of different

play01:22

ways to define the connections so again

play01:25

the definition here is up to the

play01:28

researcher or whoever produced the graph

play01:31

when you talk about connectivity from

play01:33

Chapter two the idea that's most

play01:37

important in this sense is this path

play01:39

idea and if we're going to say you know

play01:42

we're following a path between two nodes

play01:46

in my graph you know this path is the

play01:48

sequence of nodes that are connected the

play01:51

connection or the connectivity of the

play01:54

graph

play01:57

has something to do with the way the

play01:59

whole graph works together so if we're

play02:01

going to say a connected graph meaning

play02:04

the researcher drew a graph here and

play02:06

every item or every node is linked that

play02:10

means your graph you're showing is

play02:13

nected I'll talk in a minute about a

play02:14

disconnected graph and why you would

play02:17

graph that so we had the different ways

play02:19

of sort of measuring distances and a few

play02:22

other things so the first idea was this

play02:25

breadth-first search and you can see

play02:28

here you know i've kind of put it in the

play02:30

slide but the breadth first search is

play02:32

where you pick a node in the graph and

play02:34

you measure a distance from that node to

play02:37

every other node it's got to be the

play02:40

shortest path here so a distance is

play02:42

always the shortest path I mean aren't

play02:43

taken you know sort of there's no long

play02:45

scenic route between two nodes

play02:48

there's the shortest path and that's

play02:51

what the breadth-first search needs to

play02:53

find for us once we've done a breadth

play02:55

breadth-first search for every node

play02:58

every pair of nodes in our graph we then

play03:01

can figure out or you and then we can

play03:05

determine what the diameter of the graph

play03:07

is so the diameter of the graph is gonna

play03:09

be the longest distance between any pair

play03:12

of nodes in the graph and then the

play03:14

average distance would be if we averaged

play03:17

every distance between every pair of

play03:19

nodes in a graph that would be the

play03:21

average distance hopefully you're

play03:23

already thinking right now if we see

play03:25

these really large graphs with thousands

play03:28

and thousands of nodes you actually

play03:30

would not ever do even a breadth-first

play03:32

search and determine any you know every

play03:35

distance in the graph it's just too much

play03:36

computation we'll talk about that issue

play03:39

in a little bit or yeah so usually

play03:41

wouldn't do it all right speaking of

play03:44

nodes so nodes do have characteristics

play03:47

and that was actually the main focus in

play03:49

this chapter so we had some terms here

play03:52

the ones you need to be clear on and I

play03:55

think these are clear and the exercises

play03:57

the main point here is a pivotal node

play04:00

that is you have a pivotal node if we

play04:02

have a path between two pairs of nodes a

play04:06

pivotal node falls on that path and if

play04:09

you take that pivotal node out or remove

play04:12

an edge to the pivotal node that would

play04:15

increase the distance between those two

play04:19

it's a gatekeeper is basically a type of

play04:24

pivotal node that is so pivotal that if

play04:26

you take it out you have no possible way

play04:30

of connecting those two nodes again so a

play04:33

gatekeeper is basically the extreme type

play04:37

of pivotal node and that you know again

play04:41

would make it so you can't connect to

play04:43

nodes or there's some nodes that

play04:46

wouldn't be connected and in that case

play04:47

going back a couple here we're talking

play04:50

about a graph if that's no longer

play04:52

connected if you and again it's not

play04:55

necessarily removing the gatekeeper it's

play04:57

removing one of the edges between a

play04:59

gatekeeper and other nodes a local

play05:03

gatekeeper is sort of the junior version

play05:07

of a pivotal node a local gatekeeper

play05:10

does connect to its neighbors its

play05:12

neighbors don't connect directly with

play05:13

each other but in the case of a local

play05:16

gatekeeper it's possible that they're

play05:19

not a pivotal node that is if you take

play05:22

out an edge to a local gatekeeper you

play05:25

can still reach the other node and the

play05:29

distance could be the same length okay

play05:32

so it's possible to be a local

play05:34

gatekeeper and not a pivotal node now so

play05:38

that's basically what you know that's in

play05:40

the chapter and that stuff that exercise

play05:42

is kind of emphasized I want to go

play05:44

through a couple of things now that

play05:47

needs are two different emphasizing that

play05:49

I think in the book this book is by

play05:51

economists and we're here to kind of as

play05:55

technologists and social scientists

play05:57

trying to get graphs to work for us so

play06:01

when here we talk about components and

play06:04

giant components again you're the one as

play06:08

the researcher that's defining what's

play06:10

relevant it's really pointless to think

play06:13

about a network graph of everything or

play06:15

everyone in the world because you would

play06:18

never be able to analyze that graph so

play06:21

focusing on a component is very normal

play06:24

so you're gonna break the graph up

play06:26

you're going

play06:26

do a subset of nodes and focus on that

play06:29

set of nodes so it's possible also once

play06:32

we're focusing on a set of nodes that we

play06:36

can actually say we have various

play06:37

components in what would be considered a

play06:39

quote world graph maybe and in that case

play06:42

it's possible to graph the components

play06:44

separately meaning we zoom up a level

play06:47

okay we aren't necessarily going to do

play06:49

that much I mean at this point it's

play06:51

we've got a lot going on but that's kind

play06:55

of one of the options here to reduce the

play06:57

complexity of what's going on the next

play07:00

thing that's super super important is

play07:02

how they define an edge defining an edge

play07:06

is a huge deal and I mean what your

play07:08

graph is going to look like everything

play07:10

about it certainly how you define a node

play07:13

is something but it's really enough to

play07:15

start saying node is a object slash

play07:18

agent and at that point you know people

play07:22

is easy enough to make a node so here we

play07:25

had a graph this was a graph of the what

play07:30

was this this was co-authors so people

play07:33

that appeared on the same paper together

play07:36

and you look at the graph over here and

play07:39

you can see there's actually a sort of

play07:41

large central component in the middle

play07:44

and then there's these three other sort

play07:46

of isolated nodes together now I want to

play07:49

emphasize here the reason this is all

play07:51

drawn together is because whoever the

play07:53

researchers is think of all these nodes

play07:56

as being connected in another way

play07:58

already okay now I think these are

play08:01

people who work at the National

play08:04

Institutes of Health's and a few

play08:06

different research centers what I

play08:08

emphasize here is this is people that

play08:10

have co-authored an article together the

play08:14

sub underlying graph the implied

play08:17

connections might include people that

play08:19

ate lunch together people that had

play08:20

coffee together people that met each

play08:23

other in the parking lot I mean there's

play08:25

a whole lot of ways that our nodes might

play08:28

be connected behind the scenes

play08:31

which is actually why we're seeing this

play08:33

graph here with these two or really

play08:36

three different components

play08:38

with two that are really tiny and one

play08:40

that's really big so defining an edge is

play08:43

a huge yield so switching over to their

play08:48

small world idea in the book and boy

play08:50

they sure love that one the idea of the

play08:53

six degrees of separation entirely

play08:56

depends on your definition of an edge so

play09:00

if you think about the bacon number

play09:01

that's actually a pretty low number and

play09:04

if you remember that from the book I

play09:05

think it was something like most of them

play09:07

are less than five the bacon number edge

play09:10

definition is castmembers so that means

play09:12

this is someone who appeared in a movie

play09:14

and maybe appears in the credits they

play09:16

spoke something or in some way acted and

play09:19

got included in the credits so just

play09:22

based on that all the people of all

play09:25

these movies are very you know closely

play09:27

connected they have a very short

play09:29

distance to each other the Milgram

play09:31

experiment was a similar thing well not

play09:34

similar I mean they defined their edge

play09:36

they defined it in a different way

play09:38

the idea there was someone you know on a

play09:40

first-name basis so what I want you to

play09:42

think about right now is if we're gonna

play09:44

think about the network of movies you

play09:50

would change the bacon number if you

play09:52

just simply added the directors of

play09:55

movies for example so if it included

play09:58

cast member and director it would might

play10:01

be a different number if you included

play10:02

cast members and producers and not

play10:04

directors so there's a lot of

play10:06

variability in here on how you might

play10:08

come up with your quote small world

play10:11

number meaning your average distance

play10:15

similarly the Milgram experiment this

play10:18

idea of someone you know on a first-name

play10:20

basis and I know this as an American

play10:22

what that means knowing someone on a

play10:25

first name basis I actually know people

play10:28

that would you know consider themselves

play10:29

very prominently that I can address on a

play10:32

first name basis but I know in other

play10:34

countries this is someone I would never

play10:36

address by their first name you know for

play10:39

example I know one of the city council

play10:42

members of our city here and I address

play10:45

her by her first name which is you know

play10:48

that's just what I do

play10:50

and so we think about you know sort of

play10:52

my connection my degrees of separation

play10:54

from you know just some random not

play10:56

random if I pick to say Barack Obama and

play10:59

how many steps would it take me to get

play11:01

to Barack Obama is probably via this

play11:04

woman who I'm quote on a first-name

play11:05

basis it's a very short distance where

play11:10

you know if I wasn't comfortable

play11:13

addressing her on a short name basis all

play11:15

of a sudden you know the distance

play11:16

expands by probably 10 so defining an

play11:21

edge is a very very important thing and

play11:23

you as a researcher are the one who is

play11:26

going to define the edge and it's based

play11:28

on what's useful to you so I like to

play11:33

think of this as flows some people don't

play11:35

like the idea of flows but to me we're

play11:38

activating an edge so usually

play11:41

information is going to flow so we

play11:45

talked about nodes being agents so

play11:48

having something they want to do and

play11:50

that means the nodes are activating the

play11:54

edges sort of consciously so if we went

play11:59

back to some of the examples in the book

play12:01

one of them is about instant messaging

play12:03

who talked to whom with that sort of

play12:06

model we're going to say you know

play12:08

knowing who talked to whom what we're

play12:10

trying to do is say what's gonna happen

play12:13

later what happens next who is going to

play12:15

talk to whom and so we might be trying

play12:17

to predict you know when a very and when

play12:19

a video goes viral where does it go who

play12:22

is going to engage those edges it's the

play12:26

same with these will get to these as we

play12:29

go later in the book that's you know as

play12:31

you make transactions who's trading with

play12:34

whom this is the idea of you know who

play12:36

traded in the past with whom can be

play12:38

modeled who is going to transact with

play12:41

whom this also and I want you to think

play12:43

about this one too if you think about

play12:45

web pages so we have a web pages that

play12:46

have links to other pages you can use

play12:50

those to predict what links that user

play12:52

will follow but we understand that just

play12:54

because a link appears on the page it

play12:55

doesn't mean that they're actually ever

play12:57

going to be used and I think that's the

play12:59

most profound example here about an edge

play13:03

so an edge here a link here is a an

play13:07

availability or a likelihood of a future

play13:10

connection okay so our edges when we as

play13:13

researchers model these edges what we're

play13:15

trying to say is this is something that

play13:17

might happen that might be used and we

play13:20

need to be conscious of that that

play13:21

there's sort of a statistical

play13:23

probability that going on here

play13:25

so hopefully that'll get you kind of the

play13:29

interesting things going on here we'll

play13:32

do the exercise and hopefully that will

play13:35

illuminate things further

Rate This

5.0 / 5 (0 votes)

Ähnliche Tags
Graph TheoryNetwork AnalysisBreadth-First SearchConnectivityPivotal NodesGatekeepersLocal GatekeepersSmall WorldSix DegreesEdge DefinitionSocial Networks
Benötigen Sie eine Zusammenfassung auf Englisch?