Datasets: Analysing Using Networkx
Summary
TLDR: This tutorial video script guides viewers on analyzing various network datasets using the Python package NetworkX. It covers reading different network formats (gexf, edgelist, .net, gml, pajek, and graphml) into NetworkX graph objects. The script demonstrates basic network analysis, including obtaining network information, visualizing networks with matplotlib, and exploring properties like degree distribution, density, clustering coefficient, and diameter. The goal is to give a working overview of network analysis techniques with NetworkX.
Takeaways
- 😀 The video demonstrates how to analyze various network datasets using the Python package 'networkx'.
- 📁 It covers six network datasets in formats such as gexf, edgelist, .net (equivalent to pajek), gml, pajek (.paj), and graphml.
- 🛠️ The tutorial starts by showing how to import the 'networkx' package along with 'matplotlib' for network visualization.
- 📚 The first dataset analyzed is a Facebook network in edgelist format, using the 'read_edgelist' function from 'networkx'.
- 🔍 Basic network information like the graph type, the number of nodes and edges, and the average degree is obtained using the 'info' function; 'is_directed' reports whether the network is directed.
- 📈 The script explains how to read different network formats into a 'networkx' graph object using specific functions like 'read_pajek' for the .net format.
- 🌐 Visualization of networks is showcased using the 'draw' function from 'networkx' and 'show' from 'matplotlib'.
- 📊 The video describes how to plot the degree distribution of a network, indicating the number of nodes with a particular degree.
- 📉 A log-log plot of the degree distribution is introduced to identify if the network follows a power law distribution.
- 🔢 The concept of network density is explained, which helps determine if a network is sparse or dense based on the ratio of actual to possible edges.
- 💡 The clustering coefficient is discussed, illustrating how to calculate it for nodes and find the average clustering coefficient for the entire network.
- 📏 The diameter of a network, which is the longest shortest path between any two nodes, is calculated to understand network connectivity.
Q & A
What is the main topic of the video?
-The main topic of the video is analyzing different network datasets using the NetworkX package in Python.
How many network datasets were downloaded in the previous video?
-Six network datasets were downloaded in the previous video.
What are the different formats of the network datasets mentioned in the video?
-The different formats mentioned are gexf, edgelist, .net (equivalent to pajek), gml, pajek, and graphml.
Which software is used for editing the Python file in the video?
-Sublime Text is used for editing the Python file in the video.
What function from the NetworkX package is used to read an edgelist format network?
-The function used to read an edgelist format network is `nx.read_edgelist`.
What basic information does the 'info' function from NetworkX provide about a network?
-The 'info' function provides basic details such as the number of nodes, number of edges, and the average degree of the network.
How can one determine if a network is directed or not using NetworkX?
-One can determine if a network is directed by using the function `nx.is_directed` and passing the graph object as a parameter.
What is the purpose of the 'read_pajek' function in NetworkX?
-The 'read_pajek' function is used to read networks stored in .net or .paj files, both of which are Pajek format, into a NetworkX graph object.
How can the 'draw' function visualize a network in NetworkX?
-The 'draw' function visualizes a network by plotting it. It requires the graph object as a parameter and uses matplotlib's 'show' function to display the graph.
What is a degree distribution in the context of network analysis?
-Degree distribution refers to the measure that shows the number of nodes in a network that have a particular degree, providing insight into how connectivity is distributed among the nodes.
How can one plot the degree distribution of a network in NetworkX?
-One can plot the degree distribution by first obtaining the degrees of all nodes, then counting the occurrences of each degree, and finally using matplotlib to create a plot with unique degrees on the x-axis and the count of nodes for each degree on the y-axis.
What does a log-log plot of degree distribution indicate about a network?
-A log-log plot of degree distribution can indicate if a network follows a power law distribution. If the plot forms a straight line, it suggests that the network has a power law degree distribution, meaning a few nodes have very high degrees while most have very low degrees.
What is the significance of the clustering coefficient in network analysis?
-The clustering coefficient indicates the degree to which nodes in a network cluster together. It measures the likelihood that two nodes connected to a common node are also connected to each other, reflecting the presence of cliques or groups within the network.
How is the diameter of a network calculated?
-The diameter of a network is calculated as the length of the longest shortest path between any two nodes in the network, essentially representing the greatest distance one must travel to reach any node from another.
What does the density value of a graph represent?
-The density value of a graph represents the ratio of the actual number of edges to the maximum number of possible edges in the graph, indicating how sparse or dense the graph is.
Outlines
📚 Introduction to Network Analysis with NetworkX
The speaker introduces the topic of network analysis using the Python package NetworkX. They list the six network datasets downloaded in the previous video, in formats including gexf, edgelist, .net (equivalent to pajek), gml, pajek, and graphml. The speaker outlines the process of analyzing these datasets using NetworkX, with matplotlib for visualization, starting with the Facebook combined edgelist network, and demonstrates the 'read_edgelist' function, which reads the network into a graph object. Basic network information such as the graph type, the number of nodes and edges, and the average degree is extracted using the 'info' function, and 'is_directed' is used to check whether the graph is directed.
🔍 Exploring Different Network Formats with NetworkX
This paragraph delves into the analysis of different network formats using NetworkX. The speaker explains how to read networks in .net (pajek) format using the 'read_pajek' function, and discusses the properties of the football network, which is loaded as a multi-digraph with directed edges. The same function is used for the karate network in .paj format, which is loaded as an undirected multigraph with 34 nodes and 78 edges. The paragraph concludes with the graphml and gexf formats, highlighting the directed nature of the Wikipedia network and the use of the 'read_graphml' and 'read_gexf' functions to convert these networks into graph objects for further analysis.
🖼️ Visualizing Networks in NetworkX
The speaker introduces the concept of visualizing networks using NetworkX. They demonstrate how to use the 'draw' function along with matplotlib's 'show' function to visualize the karate network in gml format. The paragraph showcases the interactive features of the visualization, such as moving and zooming into the graph. The speaker also discusses different layout options available in NetworkX, such as circular, spectral, and spring layouts, and how they can be applied to visualize the network differently. The paragraph concludes with a brief mention of saving the visualized graph as an image file.
📊 Analyzing Degree Distribution in Networks
The speaker discusses the concept of degree distribution in networks, explaining how it represents the number of nodes with a specific degree. They provide a step-by-step explanation of how to calculate the degree distribution for the karate network using NetworkX. This includes using the 'degree' function to obtain the degree of every node, taking the values, converting them into a set to find the unique degrees, and then counting the occurrences of each degree to create a distribution list. The speaker also describes how to plot this distribution using matplotlib and notes that real-world networks often exhibit a power-law degree distribution, which the karate network roughly follows apart from one exception.
📉 Examining Network Density and Properties
The speaker explores the concept of network density, which indicates whether a network is sparse or dense by comparing the actual number of edges to the maximum possible number of edges. They explain how to calculate the density using the 'density' function in NetworkX and provide examples, including a complete graph and an empty graph. The speaker also touches on additional network properties such as the clustering coefficient, which measures what fraction of the possible links among a node's neighbours are actually present, and the diameter, which is the longest shortest path between any two nodes in the network. The paragraph concludes with a brief discussion on how to calculate these properties using NetworkX.
🔬 Further Network Analysis Techniques
In this paragraph, the speaker continues the discussion on network analysis, focusing on the calculation of the clustering coefficient for individual nodes and the average clustering coefficient for the entire network. They demonstrate how to use the 'clustering' function in NetworkX to obtain these values and emphasize the significance of the clustering coefficient in understanding the interconnectedness within a network. The speaker also revisits the concept of network diameter, explaining its importance in gauging the overall connectivity of a network. The paragraph concludes with a practical example of how to determine the diameter of a network using NetworkX, highlighting the efficiency of real-world networks in reducing the distance between nodes.
Keywords
💡NetworkX
💡Datasets
💡Edgelist Format
💡Graph Object
💡Visualization
💡Degree Distribution
💡Density
💡Clustering Coefficient
💡Diameter
💡GraphML
💡GEXF
Highlights
Introduction to analyzing network datasets using the networkx package in Python.
Overview of six different network datasets in various formats: gexf, edgelist, .net, gml, pajek, and graphml.
Demonstration of creating a Python file for network analysis and importing necessary libraries.
Using the read_edgelist function from networkx to analyze the Facebook network in edgelist format.
Explanation of the nx.info function to provide basic details about the network, such as number of nodes and edges.
Methods to determine if a network is directed or not using nx.is_directed function.
Conversion of .net format networks, equivalent to Pajek format, into networkx objects using the read_pajek function.
Reading .paj (Pajek) formatted networks with the same read_pajek function; GML networks are read with read_gml.
Utilizing read_graphml function to handle and analyze Wikipedia network in graphml format.
Reading gexf formatted networks into a graph object using the read_gexf function.
Visualization of networks using nx.draw and plt.show functions from networkx and matplotlib.
Introduction of different network layouts in networkx, such as circular, spectral, and spring layouts.
Analysis of degree distribution in networks to understand the connectivity of nodes.
Explanation of how to plot degree distribution and the significance of power law degree distribution in real-world networks.
Investigation of network density to determine if a graph is sparse or dense.
Calculation of clustering coefficients to measure the degree of clustering or community structure within a network.
Determination of network diameter to understand the longest shortest path between any two nodes.
Practical applications and flexibility of networkx for various network analysis tasks.
Transcripts
Hey everyone! In the previous video we downloaded a number of network datasets in different formats. In this video we are going to see how we can analyze them using the networkx package of Python. Let's take a look at the datasets that we downloaded. We had six networks: a network in gexf format, a Facebook network in edgelist format, a football network in .net format, which is equivalent to Pajek format, the karate club network in gml format, the karate network in Pajek format as well, and a Wikipedia network in graphml format.
So these were the six network datasets that we had. Now we are going to see how we can analyze them. I have all the datasets in this folder. I am going to create a new Python file here where I will be writing all the code, so datasets.py. I am going to open it in an editor; I am using Sublime Text here, but you can use any editor for that matter. Since we are going to make use of the networkx package, I am going to import it. We are also going to visualize the networks, so I am going to import matplotlib as well. Now let's take the first network in this folder. Let's make use of this facebook combined .txt network, which is in edgelist format. Let me copy this name.
So now, since the Facebook network is in edgelist format, the function that we are going to make use of is read_edgelist. I am going to write g = nx.read_edgelist, and here I am going to give the name of the network. Now our datasets are kept in a folder... I am sorry, I think that's the name. OK, so in that folder, this is the name of the network. So you see, the function that we are using is the read_edgelist function, which is present in the networkx package, and as a parameter we are giving the name of the network. This function basically takes the network in edgelist format and returns a graph object, and then we can apply any function on this graph object.
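As a minimal sketch of the step just described, the snippet below writes a tiny edge list to a temporary file and reads it back with `nx.read_edgelist`. The file name `toy_edgelist.txt` and its four edges are made up for illustration; they stand in for the Facebook file and are not its contents.

```python
import os
import tempfile

import networkx as nx

# Hypothetical stand-in for the Facebook edge list: one "u v" pair per line.
edge_lines = ["0 1", "0 2", "1 2", "2 3"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_edgelist.txt")
    with open(path, "w") as f:
        f.write("\n".join(edge_lines) + "\n")

    # By default read_edgelist returns an undirected Graph and keeps
    # node labels as strings ("0", "1", ...).
    g = nx.read_edgelist(path)

print(g.number_of_nodes(), g.number_of_edges())
```

If integer node labels are preferred, `nx.read_edgelist(path, nodetype=int)` converts them while reading.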
For example, if you want to look at the basic information of this network, we can write nx.info and give g as a parameter. So info is a function which provides basic details such as the number of nodes, number of edges, etcetera about the graph. Let's save this file, and I am going to open my terminal here and run this file. All right, so here you see, firstly it tells us the type. The type is Graph, as in it basically tells us whether it is a digraph, that is a directed graph, or a multigraph. It tells us the number of nodes, and it also tells us the number of edges and the average degree. So these are just the basic details about the graph.
Now let's go back to our file and add some more things. So these were the basic things. If you just want to get the number of nodes, nx.number_of_nodes is a function which returns the number of nodes, so you can use this function. Similarly, if you want the number of edges and you don't want all the other information, you can use this one. And in case you want to know whether the graph is directed or not, you can make use of this function: print nx.is_directed, and as a parameter you pass the graph.
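The individual calls mentioned above can be sketched as follows. This is an illustration under two assumptions: it uses NetworkX's built-in `nx.karate_club_graph()` rather than one of the downloaded files, and it computes the average degree by hand, because `nx.info` is, as far as I know, no longer available in NetworkX 3.0 and later.

```python
import networkx as nx

# Assumption: the built-in karate club graph stands in for a downloaded dataset.
g = nx.karate_club_graph()

n = nx.number_of_nodes(g)
m = nx.number_of_edges(g)

# For an undirected graph every edge contributes to two node degrees.
average_degree = 2 * m / n

print("Type:", type(g).__name__)
print("Number of nodes:", n)
print("Number of edges:", m)
print("Average degree:", round(average_degree, 4))
print("Directed:", nx.is_directed(g))
```

These few lines report the same quantities the transcript reads off the `info` output, without depending on that function.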
So this should tell us whether the graph is directed or not. Let's go back and try to run this. Here you see, after the basic statistics it told us the number of nodes and edges, and it's False, that is, it is undirected. This Facebook network was a friendship network, so it's basically an undirected network. So that was about how we can read an edgelist network into a networkx object. Now let us see the other kinds of networks that we have. We also have the .net format here. As I told you in the previous video, the .net format is basically the Pajek format, so in order to read this network into a networkx object we will use the Pajek functions.
So let me show you how to do it. I am going to use this function read_pajek, which is a function that is used to read a .net or .paj file. Let me check the name of the network: football.net, so I will change it. OK, so I am reading this .net network through this function read_pajek into a graph object g and then applying all these operations. Let's run this. Here you see that the type of graph is MultiDiGraph. That means there can be multiple edges between the nodes, and it is also a directed graph. It's telling us the number of nodes and the number of edges, and since it is a directed graph it's telling us the average in-degree and average out-degree. After that it's again giving the result of the functions, number of nodes and number of edges, and it's True, which means it is a directed graph.
So these are just the basic functions. Let's see what other kinds of networks we have. We have a gml network, and we have one in Pajek format as well. Let me show you that for reading the Pajek files as well you use the same function, that is read_pajek. The network name is karate.paj, so I will replace this. When I run this, I am getting that this is a MultiGraph, the number of nodes is thirty-four, the number of edges is seventy-eight, and this is the average degree. It's not a directed graph, so I am getting False here.
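To make the MultiDiGraph versus MultiGraph outcome concrete, here is a small sketch with two hand-written Pajek files. The file contents, the names `toy.net` and `toy.paj`, and the helper `read_pajek_text` are all hypothetical; they are not the football or karate data. The only NetworkX call that matters is `nx.read_pajek`.

```python
import os
import tempfile

import networkx as nx

# Hypothetical toy networks in Pajek format.
DIRECTED_NET = """*Vertices 3
1 "a"
2 "b"
3 "c"
*Arcs
1 2 1
2 3 1
"""

UNDIRECTED_PAJ = """*Vertices 3
1 "a"
2 "b"
3 "c"
*Edges
1 2 1
2 3 1
"""


def read_pajek_text(text, filename):
    """Write Pajek text to a temporary file and load it with nx.read_pajek."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, filename)
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        return nx.read_pajek(path)


g_net = read_pajek_text(DIRECTED_NET, "toy.net")    # *Arcs  -> directed
g_paj = read_pajek_text(UNDIRECTED_PAJ, "toy.paj")  # *Edges -> undirected

print(type(g_net).__name__, nx.is_directed(g_net))
print(type(g_paj).__name__, nx.is_directed(g_paj))

# If a simple graph is wanted for later analysis, parallel edges can be collapsed.
simple = nx.Graph(g_paj)
```

The conversion on the last line is a design choice rather than something the video does: some NetworkX functions are not defined for multigraphs, so collapsing parallel edges can be convenient when the multiplicity carries no meaning.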
Now two more network formats that we have are graphml and gexf. So let me quickly show you how we can read them as well. OK, so graphml: how do we handle it? The function that we use is read_graphml and the file name is wikipedia.graphml, so this is how we will use it. Since this is redundant information, I am going to comment this. Now let me run this. All right, so what we are getting here is that it's a DiGraph. In the Wikipedia network the nodes are the articles, and an edge basically tells us whether an article is referred to in another article or not, so it's basically a directed graph. Nine twenty one is the number of nodes, the number of edges is given, and since it's directed we are getting the average in-degree and out-degree, and it's True, which means it is a directed graph. So let's go back here. The only format that is left is gexf; let me quickly show you how to convert it into a graph object as well. I am copying its name, and let's go back. Since this is gexf format, we will make use of the function read_gexf. Let me rename this so that it doesn't create any problem, and let me rename this as well. So it should work, let's see.
So this is how we can read various networks in different formats and convert them into a graph object. Once they are converted into a graph object, we can apply various functions on them and play around with them. Now let me also show you how we can visualize a network in the networkx package.
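As a hedged sketch of the graphml and gexf reading steps described above, the snippet below builds a three-node directed graph, writes it in both formats, and reads it back. The toy graph and the file names `toy.graphml` and `toy.gexf` are assumptions for illustration; the Wikipedia and gexf datasets from the video are not used.

```python
import os
import tempfile

import networkx as nx

# Hypothetical link graph: an edge A -> B means article A refers to article B.
links = nx.DiGraph()
links.add_edges_from([("A", "B"), ("B", "C"), ("A", "C")])

with tempfile.TemporaryDirectory() as tmp:
    graphml_path = os.path.join(tmp, "toy.graphml")
    gexf_path = os.path.join(tmp, "toy.gexf")

    # Write the toy graph in both formats so there is something to read.
    nx.write_graphml(links, graphml_path)
    nx.write_gexf(links, gexf_path)

    g_graphml = nx.read_graphml(graphml_path)
    g_gexf = nx.read_gexf(gexf_path)

for name, g in [("graphml", g_graphml), ("gexf", g_gexf)]:
    print(name, type(g).__name__, g.number_of_nodes(), g.number_of_edges(),
          nx.is_directed(g))
```

Both readers preserve the direction of the edges, which matches the DiGraph result reported for the Wikipedia network.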
So let me take a small network. Let me take the karate network itself; we had it in gml format. I think we have not read that one yet, so we can quickly add it. The function is read_gml: if you have a graph which is in gml format, the function that you will use is nx.read_gml. All right, so we read the karate network, which was in gml format, and we made use of the function nx.read_gml. Let's execute this program. OK, so this is a simple graph, this is the number of nodes and edges, and it is an undirected graph. Now let me show you how we can visualize this graph. I am going to comment this, and I am going to comment this as well.
Now the function that we use to visualize the graph is nx.draw, and the parameter is the graph that we have to draw. In order to see that graph we have to use the function plt.show, basically the show function which is available in matplotlib, so that is how we will be able to see this graph. Now let me run this. All right, so this is our graph, the karate network, and the labels are given here. Let me also show you a few features that this interface provides. For example, when you click this, you can just move the graph the way you want. The next option is zoom to rectangle: when you click that, if you want to carefully observe some part of the graph, you can just zoom in and see, and you can zoom further the way you want, and then you can go back step by step. So these are a few features that you can make use of. This one is configure subplots; it is used when we plot something, and since this is just a graph, I will show you its functionality later. Then this is how you can save the figure. Let me close this window.
Now let me go back to the program. Let me also show you how a directed graph looks in networkx. Since this karate network is an undirected graph, let me comment this, and if I am not wrong the football network was a directed network, so I am doing the same analysis on football.net. Let me execute it again. You see, this is how a directed graph in networkx looks; the arrows are represented like this. If you want to zoom in, you can again make use of this feature.
If you want to zoom into this area, you can zoom further as well, so this is how you can closely analyze whatever you want in the graph. You can then go back, or you can just press home here, and you can save the graph as well. Let me close this now. I am going to continue the rest of the analysis on the karate network, so I am going to uncomment that object. Let me also show you that there are different layouts available in networkx. As of now we have used the function nx.draw; we can use another function if we want a different layout, for example nx.draw_circular. So this is one of the layouts; there are various layouts available. Let me show you the output that we get in this case.
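The drawing calls discussed so far can be sketched side by side as below. Several things here are assumptions rather than what the video does: the built-in `nx.karate_club_graph()` replaces the gml file, the non-interactive `Agg` backend is selected so the snippet runs without a display, and the figure is saved to a hypothetical file `karate_layouts.png` instead of being opened with `plt.show()`.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import networkx as nx

g = nx.karate_club_graph()  # built-in copy of the karate club network

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

# nx.draw falls back to a spring layout when no positions are given.
nx.draw(g, ax=axes[0], with_labels=True, node_size=150, font_size=6)
axes[0].set_title("nx.draw (spring layout)")

# The other helpers place the nodes with a fixed layout.
nx.draw_circular(g, ax=axes[1], node_size=60)
axes[1].set_title("nx.draw_circular")

nx.draw_spectral(g, ax=axes[2], node_size=60)
axes[2].set_title("nx.draw_spectral")

# Save to a file; in an interactive session plt.show() would be used instead.
out_path = os.path.join(tempfile.gettempdir(), "karate_layouts.png")
fig.savefig(out_path)
plt.close(fig)
```

Because the spring layout starts from random positions, the first panel can look different on every run, while the circular and spectral panels stay the same.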
So here all the nodes are arranged along a circle and the edges between them are shown like this, so this is called the circular layout. There are a number of other layouts as well, for example spectral layout and spring layout, so you can just read the documentation about them. So we have visualized the network. Let's close this and go back to our program, and we can do some basic analysis on it.
One of the things that we can check on this network is the degree distribution. Now what is degree distribution? Degree distribution basically tells us how many nodes there are in the network that have a particular degree, and this is done for all possible degrees that a node can have in the graph. Let's take this example graph. Here, how many nodes have degree one? Node number four and node number nine have degree one, so corresponding to one we will have two. Similarly we can check how many nodes have degree two, and how many have degree three, four and five. These are all the possible degrees that nodes can have in this graph, and corresponding to each degree we have the number of nodes that have that particular degree. This is basically called the degree distribution of the nodes in a graph. It is always a nice idea to plot the degree distribution to get a better idea of the graph. When we plot this we get this kind of distribution: on the x axis we have the degree and on the y axis the number of nodes that have that particular degree. So this is the kind of plot that we get for this example graph.
We can check what kind of degree distribution our datasets exhibit. So let's go back to our program and try to check what kind of degree distribution this karate network has. For that purpose I am going to create a function to plot the degree distribution of this graph g. Before we implement the function I want to show you a few things on the IPython console, so let me copy this and open the IPython console here. Basically, I just copied those first two statements here, and let me also copy this statement here.
So we have the karate network in the object g. Now if I want to see the degree of each node in this graph, I can make use of the function nx.degree(g). What this degree function does is basically return a dictionary where the key is the number of the node and the value is the degree of that node. So here we get the dictionary for all thirty-four nodes, with the degrees of these nodes. Now let's go one step back and recall what our aim is here. Our aim is to have the degree distribution of nodes in the graph.
So basically we want to know, for a particular degree, how many nodes there are in the network that have that degree. First of all we want to get the possible degrees that the nodes have in this network. We are getting the dictionary here, which has all the degree values that the nodes have, and what we are interested in is basically the values. So what I can write is nx.degree(g), which is going to give me a dictionary; all I am interested in is the values, so I am going to write .values() here. It should return me a list which has all the values. So what I get here is basically the degrees of all the nodes. Now my aim is to get the possible degrees that the nodes can have. Here you see there is a lot of repetition, and I want the unique degrees. In that case I can just write all this inside the function set. What it will do is convert the output into a set, and we know that in a set there cannot be repetitions, so what we get is the unique values, basically the unique degrees that the nodes have in this particular network.
Now a list is a more flexible data structure as compared to a set because we can perform a lot of operations on it, so I can further convert this into a list if I want. It is basically up to you how you want to handle the data. So I convert this into a list, and what I finally get is a list of all the unique values of degrees that the nodes have in this network. I showed you this here so that we can see what is going on in the function.
Now let's get back and use these calls in the function that we are creating. Firstly I want all the degrees, so I will write nx.degree(g). I want only the values, I don't want the dictionary, so I will write .values(). Here I get all the degrees, so let me add a comment: all the degrees. Now I am also interested in getting all the unique degrees so that I can see what the possible degree values are. I am going to do the same thing that I did on the console and pass all degrees here. So here I will get all the unique degrees. Now to get the degree distribution, what I basically want is, out of all the values that are there in this list unique degrees, to see how many nodes have that particular degree. So basically I will fetch one element out of this list, unique degrees, and see how many occurrences of that value there are in all degrees.
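The console steps above can be sketched as follows, with one adjustment that is an assumption about the reader's environment: in current NetworkX releases `nx.degree(g)` returns a view object rather than a plain dictionary, so the sketch wraps it in `dict(...)`, which works in both cases. The variable names `all_degrees` and `unique_degrees` are my spelling of the names spoken in the video, and the built-in karate graph replaces the gml file.

```python
import networkx as nx

g = nx.karate_club_graph()  # assumption: built-in copy of the karate network

# dict(...) gives a {node: degree} mapping whether nx.degree returns a
# dictionary (old releases) or a degree view (newer releases).
degree_of = dict(nx.degree(g))

# One entry per node, so repeated values are expected.
all_degrees = list(degree_of.values())

# set() drops the repeats; sorting makes the list easier to read and plot.
unique_degrees = sorted(set(all_degrees))

print(len(all_degrees), "nodes")
print("unique degrees:", unique_degrees)
```

Sorting is an addition to what the transcript describes; a set has no guaranteed order, and a sorted x axis avoids a zig-zag line when the values are plotted later.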
So probably I can start a for loop here: for i in unique degrees. What I want to check is how many occurrences of i are available in all degrees, so I can set a variable, x = all degrees .count of i. We have this function count in a list, which tells us the number of occurrences of a particular element in that list, so x will be telling us the number of occurrences of the degree i in the list all degrees. We can keep track of all these values by creating another list, count of degrees. I have started it as an empty list, and I am going to append the x values to this list. So basically the occurrences of the first degree are now stored in the list count of degrees.
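A minimal sketch of the counting loop just described is given below. The function name `degree_distribution` and the underscore spellings of the variable names are assumptions; the loop itself follows the transcript, and `collections.Counter` is shown only as an alternative that yields the same counts.

```python
from collections import Counter

import networkx as nx


def degree_distribution(g):
    """Return (unique_degrees, count_of_degrees) for the graph g."""
    all_degrees = list(dict(nx.degree(g)).values())
    unique_degrees = sorted(set(all_degrees))

    count_of_degrees = []
    for i in unique_degrees:
        x = all_degrees.count(i)  # number of nodes whose degree equals i
        count_of_degrees.append(x)
    return unique_degrees, count_of_degrees


g = nx.karate_club_graph()  # assumption: built-in copy of the karate network
unique_degrees, count_of_degrees = degree_distribution(g)

# Alternative: Counter tallies every degree in a single pass.
counter = Counter(dict(nx.degree(g)).values())

for degree, count in zip(unique_degrees, count_of_degrees):
    print(degree, count)
```

The `list.count` loop rescans the whole list for every unique degree, which is fine for 34 nodes; for large networks the `Counter` version scales better.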
So after this for loop is finished, we will have the whole degree distribution in this list called count of degrees, and then we can plot it. So let's try plotting it with plt.plot. As you might know, there are two parameters that have to be passed here: on the x axis we want all the unique degrees, and on the y axis we want how many nodes have that degree, which we have stored in count of degrees, so we will pass that here. Let's try plotting it, so I will write plt.show. We haven't called this function, so let me call it here and comment this. I will call this function, plot degree distribution, and pass g here. OK, let's try running this. So this is the degree distribution that we are getting for the karate club network. This is the x axis, where the possible degrees are, and on the y axis is the number of nodes having that degree, and you can again play around with this plot. As I told you, you can use this to move the plot, you can use this to zoom into a particular part, and you can go back as well. And this is a feature that you can use here: basically you can increase or decrease the x and y margins, so it is up to you whichever way you want it, and you can always reset it as well. Let me close it.
There were no x and y axis labels, as you saw, so we can add them. Before that, you can also change the way this plot appears. If I put these dots, you will get the plot in the form of dots, so you get yellow dots here, and you can also put a line here along with the dots, so we will get both. So this is the plot that you are getting. One thing to observe here is that most real-world networks exhibit a power-law degree distribution, which means that there are very few nodes which have very high degree and a lot of nodes which have very low degree. The same is being followed in this small network as well, apart from the one exception of this node. That might happen in some cases, but in general real-world networks have this power-law degree distribution.
So let's go back and add a few more things. Let's add the x label, so let's add degrees here, and then we can add the y label as well, maybe number of nodes, and maybe we can add the title as well: degree distribution of karate network. Let's run this. OK, you get the title and the x and y labels as well, so you can use more features and functions and decorate this plot the way you want. There's one more thing that is usually done in the case of a power-law degree distribution: we can also check the log-log plot. How can we do that? We can just replace this plot by loglog. In that case it will give us a log-log plot, which basically means it takes the log of the x axis and the log of the y axis, and if a network is following a perfect power law, the log-log plot should be a straight line. So let's see what kind of plot we get in this case.
So here, firstly, we had an exception here, and secondly, it wasn't a perfect power law, so we have this kind of line which is not exactly straight. So this is the kind of log-log plot that we are getting in this case. That was about the degree distribution. Now I am closing it; let's go back here and see some more properties that we can analyze on this network. We are done with degree distribution, so let's go ahead.
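Putting the plotting steps together, a sketch of the function might look like the one below. It is not the code from the video: the `out_path` and `loglog` parameters, the `Agg` backend, saving to a file instead of calling `plt.show()`, and the output file names are all assumptions made so the example runs without a display. The `"yo-"` format string asks matplotlib for yellow dots joined by a line.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import networkx as nx


def plot_degree_distribution(g, out_path, loglog=False):
    """Plot how many nodes have each degree and save the figure to out_path."""
    all_degrees = list(dict(nx.degree(g)).values())
    unique_degrees = sorted(set(all_degrees))
    count_of_degrees = [all_degrees.count(d) for d in unique_degrees]

    fig, ax = plt.subplots()
    # loglog puts both axes on a logarithmic scale; plot keeps them linear.
    draw = ax.loglog if loglog else ax.plot
    draw(unique_degrees, count_of_degrees, "yo-")
    ax.set_xlabel("Degrees")
    ax.set_ylabel("Number of nodes")
    ax.set_title("Degree distribution of karate network")
    fig.savefig(out_path)
    plt.close(fig)
    return unique_degrees, count_of_degrees


g = nx.karate_club_graph()  # assumption: built-in copy of the karate network
tmp = tempfile.gettempdir()
linear_path = os.path.join(tmp, "karate_degree_linear.png")  # hypothetical names
loglog_path = os.path.join(tmp, "karate_degree_loglog.png")

degrees, counts = plot_degree_distribution(g, linear_path)
plot_degree_distribution(g, loglog_path, loglog=True)
```

A roughly straight line in the log-log figure would be consistent with a power law, but with only 34 nodes it should be read as a visual hint rather than a statistical test.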
So the next thing that we can check is density. The density value of a graph basically tells us whether it
is a sparse graph or a dense graph with respect to the number of edges present. So if
there are n nodes in a network, the total possible edges that the network can have will be n choose
2, that is n(n-1)/2. Out of these n choose 2 edges, what fraction of the edges is present in the
graph is basically what is captured by the density value. So if it is a simple graph, the density
value will be between 0 and 1: if it is an empty graph the density will be 0, and if it
is a complete graph the density will be 1. However, if it is a multigraph, where more than one edge is
allowed between two nodes, in that case the density value can be more than 1. For
example, in this diagram you see a simple graph with 9 nodes, so the total possible edges that
can be there in this network will be 9 choose 2, which is 9 times 8 divided by
2, that is 36. Now the number of edges that are actually present in this graph is 11,
so the density is going to be 11 divided by 36, and that is approximately 0.31. So
the graph is not very dense. So this is the kind of indication that the density value gives us.
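The arithmetic above can be checked with a short sketch. The edges of the diagram are not given in the transcript, so a random graph with the same counts (9 nodes, 11 edges) is used as a stand-in; the helper density_by_hand is a hypothetical name introduced only for illustration.

```python
import networkx as nx

def density_by_hand(n_nodes, n_edges):
    """Fraction of possible edges present in a simple undirected graph."""
    possible = n_nodes * (n_nodes - 1) // 2  # n choose 2
    return n_edges / possible

# Assumption: a random 9-node, 11-edge graph stands in for the diagram in the video.
g = nx.gnm_random_graph(9, 11, seed=1)

by_hand = density_by_hand(g.number_of_nodes(), g.number_of_edges())
print(round(by_hand, 2))        # 11 / 36, prints 0.31
print(round(nx.density(g), 2))  # same value from networkx, prints 0.31
```

Density depends only on the number of nodes and edges, so any simple graph with 9 nodes and 11 edges gives the same value.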
So we can go back to the console and see the density value for a few networks. For example, let me go here
and let's create a complete graph. So I am going to write g is equal to nx.complete_graph
of, say, 100 nodes. OK, so since it is a complete graph, what should be the density
value? Let's check nx.density. So this is the function density, which is available in networkx,
which will give us the density value. Obviously, in the case of a complete graph it is going to be 1. Let
me now create another graph; let me use nx.Graph. So I haven't added any edges into
it. Let me add a few nodes here, so I am going to pass a list; I am passing only four nodes.
Let's check the density value of this network. Since we have not added any
edges, obviously the density value is going to be 0. And let's go back here and see what is
going to be the density value for our network, the karate network. So what I will write is print
"density is" and then nx.density, and I am going to pass g here. Let's go back and check here: the density
is 0.139, so basically it is sort of a sparse graph, so that we can check. So
that is about the density. Let's go back and see a few more things that we can check on these networks.
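The console steps above can be sketched as follows. This is a minimal sketch, assuming the karate network comes from the built-in nx.karate_club_graph() (the video loads it from a file); the variable names and the small multigraph at the end are illustrative additions to show a density above 1.

```python
import networkx as nx

# Complete graph on 100 nodes: every possible edge is present.
complete = nx.complete_graph(100)
print(nx.density(complete))

# Graph with four nodes and no edges.
empty = nx.Graph()
empty.add_nodes_from([1, 2, 3, 4])
print(nx.density(empty))

# Assumption: built-in karate club graph stands in for the file used in the video.
karate = nx.karate_club_graph()
print("density is", round(nx.density(karate), 3))

# Illustrative multigraph: three parallel edges between two nodes push density above 1.
multi = nx.MultiGraph()
multi.add_edges_from([(1, 2), (1, 2), (1, 2)])
print(nx.density(multi))
```

The complete graph gives 1, the edgeless graph gives 0, and the karate network lands near 0.139, which matches the value read out in the video.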
Next you can see the clustering coefficient. For a given node, the clustering coefficient basically
tells us the number of links that are present amongst the neighbours of this node with respect
to the total number of links that are possible. I will show you using an example. Let's
take this network and let's try to find out the clustering coefficient for node 6. So
as you can see, this node has five neighbours, right? These are 1, 2, 3, 5 and 8. So there
are five neighbours, and what we have to check here is the number of links that are present
amongst these neighbours, that is, the number of links amongst 1, 2, 3, 5 and 8.
As you can see, we see only one link, the one between 2 and 3. There is no other
link present amongst these neighbours. So in the numerator we will put 1, and in the denominator
we will put the total possible links that can be there amongst these five nodes. Amongst five
nodes the total possible links can be 5 choose 2, which is 5 times 4 divided by 2,
that is 10. So, in this case, the clustering coefficient of node 6 will be 1 divided
by 10. In the case of a friendship network, this clustering coefficient basically tells us how
closely knit the friends of a particular node are. So we can calculate the clustering coefficient value
for every node and then we can find the average as well. The average clustering coefficient
tells us the amount of clustering present amongst the nodes in the graph.
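The worked example for node 6 can be reproduced with a small sketch. Only the part of the diagram described in the transcript is built here (node 6, its five neighbours, and the single link between 2 and 3); the rest of the diagram is not known, so this toy graph is an assumption.

```python
import networkx as nx

# Assumption: only the neighbourhood of node 6 described in the transcript is modelled.
g = nx.Graph()
g.add_edges_from([(6, n) for n in (1, 2, 3, 5, 8)])
g.add_edge(2, 3)  # the only link amongst the neighbours

neighbours = list(g.neighbors(6))
k = len(neighbours)

# Links that actually exist amongst the neighbours of node 6.
links = g.subgraph(neighbours).number_of_edges()

# Links that could exist amongst k neighbours: k choose 2.
possible = k * (k - 1) // 2

by_hand = links / possible
print(by_hand, nx.clustering(g, 6))  # both are 1 / 10
```

The count done by hand and nx.clustering agree, since the coefficient of a node only depends on its neighbours and the links amongst them.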
So let's go back and try to check the same for our network. I am going to comment this out. The function
that we can use for finding the clustering is nx.clustering; however, this function basically
returns a dictionary which gives the clustering coefficient value for every node, so you can always
iterate over this dictionary. So what I will do is: for i in nx.clustering(g), and since I am interested
in all the items, I will write .items(), and I want to print i. So I am going to comment this out.
So what we are doing here is this: nx.clustering is returning a dictionary which
contains the clustering coefficient values for all the nodes, and we are just going to print
them. So let's run this. We are getting this dictionary where, for every node, we are
getting the clustering coefficient value. If you want the average clustering coefficient, we
can either average these values ourselves or we can directly make use of another function, which is
nx.average_clustering. This should tell us the average clustering
present in the network. So you see, 0.57 is the average clustering. The higher this
value, the more the clustering and the more tightly knit the people in the friendship network are.
That was about the clustering coefficient. Let's go back and check a few more properties.
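The two ways of getting the average that were just described can be sketched like this. As before, the built-in nx.karate_club_graph() is assumed as a stand-in for the network loaded in the video, and the variable names are illustrative.

```python
import networkx as nx

# Assumption: built-in karate club graph stands in for the file used in the video.
g = nx.karate_club_graph()

# nx.clustering returns a dictionary: node -> clustering coefficient.
clustering = nx.clustering(g)
for node, value in clustering.items():
    print(node, round(value, 3))

# Option 1: average the per-node values ourselves.
manual_average = sum(clustering.values()) / len(clustering)

# Option 2: let networkx compute the average directly.
library_average = nx.average_clustering(g)

print(round(manual_average, 2), round(library_average, 2))
```

Both options give the same number, which rounds to the 0.57 mentioned in the video.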
So what is the diameter of a network? The diameter is basically the maximum shortest path that
we have to travel to go from one node to another. For example, if you know about the all-pairs
shortest path algorithm, it basically returns a matrix where the values
are the lengths of the shortest paths between two nodes, and it does that for every pair
of nodes. So whatever is the maximum value in that matrix will be the diameter of the network.
In other words, it is the shortest path between the two most distant nodes in the network. So for
example, if you see node 1 and node 9: if you have to go from 1 to 9, you would
have to traverse this path, 1 to 6 to 5 to 9. There is no shorter path between them,
so the length of this path is 3, and we don't see any other shortest path which is
longer than 3. If you want to go from 1 to 4, again the length of the path is 3. If
you want to go from 3 to 9, the length is 3. So we don't see any other shortest
path which is more than 3, and that is why the diameter of this network will be 3. We can
check the diameter of our network here as well. So I am going to comment this out and let's check the
diameter. So I will write "diameter is", and the function that we can use is nx.diameter(g).
So it should give us the diameter. The diameter is 5. So there are 34 nodes here and
the diameter is 5. It is generally observed that in real-world networks the diameter is
quite small, because the nodes are well connected to each other and that makes the
distance between them very small, and that is how the diameter reduces. So these were just a few points of
analysis that we performed on the networks that we downloaded. The main thing to notice here
is that once you get the network into a networkx graph object, you can just play around with it.
You can apply all the functions that are available; you can just read the documentation and apply the
functions which are relevant for that network, and you can go ahead with your analysis.
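The diameter check described above, together with the all-pairs shortest path view of it, can be sketched as follows. This is a sketch under the same assumption as before: the built-in nx.karate_club_graph() stands in for the karate network loaded in the video, and the variable names are illustrative.

```python
import networkx as nx

# Assumption: built-in karate club graph stands in for the file used in the video.
g = nx.karate_club_graph()

# All-pairs shortest path lengths (in hops): node -> {other node -> length}.
lengths = dict(nx.all_pairs_shortest_path_length(g))

# The diameter is the largest of all these shortest path lengths.
longest = max(d for targets in lengths.values() for d in targets.values())

print("diameter is", longest)
print("diameter is", nx.diameter(g))  # same value computed by networkx
```

Note that nx.diameter expects a connected graph; for a disconnected network some pairs have no path between them, so the check would have to be done per connected component.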