SNA Chapter 1 Lecture 3
Summary
TLDR: This lecture introduces five types of real-world networks: social, biological, information, technological, and language. It explores examples like Twitter and Facebook for social networks, protein interactions for biological networks, and the World Wide Web for information networks. The talk also covers network analysis levels from microscopic to macroscopic, including node properties, community detection, and network motifs. The course aims to teach network properties, dynamics, and applications like node classification, link prediction, and anomaly detection in social media.
Takeaways
- 🌐 There are five main types of real-world networks: social, biological, information, technological, and language networks.
- 🤔 Social networks can be exemplified by platforms like Twitter and Facebook, where users and their relationships form the nodes and links.
- 🧬 Biological networks include protein-protein interaction networks and neural interactions, representing complex biological systems.
- 🌐 Information networks encompass the World Wide Web and citation networks, where nodes represent web pages or documents and links represent hyperlinks or citations.
- 🔌 Technological networks involve infrastructure like power grids, airline, and railway networks, where nodes represent components and links represent connections.
- 💬 Language networks are based on word co-occurrences or keyword relationships within texts, useful for natural language processing tasks.
- 🔍 Network analysis involves studying networks at three levels: microscopic (individual nodes and edges), mesoscopic (clusters or communities), and macroscopic (entire network properties).
- 🔄 Network dynamics include the study of network formation, evolution, and the spread of information or influence within networks.
- 📈 The course will cover applications of network analysis such as node classification, link prediction, and anomaly detection, which are crucial for understanding complex systems.
- 📚 Prerequisites for the course include knowledge of Python programming, probability and statistics, linear algebra, algorithm design, and basics of machine learning and deep learning.
- 🚀 The course aims to provide a comprehensive understanding of network analysis, equipping students with skills to analyze and interpret complex network structures and their applications.
Q & A
What are the five types of real-world networks discussed in the script?
-The five types of real-world networks discussed are social networks, biological networks, information networks, technological networks, and language networks.
Can you provide an example of a social network mentioned in the script?
-An example of a social network mentioned is the Twitter follower-following network, where nodes are users and links represent the follower-following relationships.
What is a biological network and what are some examples provided in the script?
-A biological network refers to interactions within biological systems. Examples include protein-protein interaction networks, neural networks, and food networks.
How is the World Wide Web categorized in the context of network types?
-The World Wide Web is categorized as an information network, where nodes can represent web pages and links represent hyperlinks between them.
What are the two levels of node interaction discussed at the microscopic level of network analysis?
-The two levels discussed are the dyadic level, which involves interactions between two nodes, and the triadic level, which involves interactions among three nodes.
What does the term 'degree distribution' refer to in network analysis?
-Degree distribution refers to the frequency of different degrees (number of connections) present in a network, which is a property analyzed at the macroscopic level.
What is the 'diameter' of a network and how is it determined?
-The diameter of a network is the longest shortest path between any pair of nodes in the network. It is determined by measuring the shortest paths between all pairs of nodes and identifying the longest one.
What is a mesoscopic view of a network and what does it involve?
-A mesoscopic view of a network involves looking at specific regions or structures within the network, such as clusters or communities where nodes are densely connected, and network motifs which are recurrent sub-structures.
What is the 'small-world property' in networks and how does it relate to the '6 degrees of separation'?
-The 'small-world property' refers to the phenomenon where most nodes in a network can be reached from every other node through a small number of steps. The '6 degrees of separation' is a specific example of this property, suggesting that any two individuals are six or fewer acquaintances apart.
How does the script relate social networks to the real world and societal behaviors?
-The script suggests that social networks act as a proxy for our society, reflecting public opinion, information consumption patterns, and enabling participation in polls and decision-making processes.
What applications of network analysis are mentioned in the script?
-Applications of network analysis mentioned include node classification, link prediction, growth and virality of messages, anomaly detection, and graph representation learning for various purposes like fraud detection and recommender systems.
Outlines
🌐 Introduction to Real World Networks
The paragraph introduces the concept of real-world networks, emphasizing five main types: social, biological, information, technological, and language networks. Social networks are exemplified by platforms like Twitter and Facebook, where users and their relationships form the nodes and edges. Biological networks include protein-protein interaction networks and neural interactions. Information networks encompass the World Wide Web and citation networks. Technological networks involve infrastructure like power grids and airline routes. Language networks are based on word co-occurrences. The speaker suggests that networks can be abstracted from nearly any dataset, highlighting the versatility of network analysis.
🔬 Exploring Network Types and Their Structures
This section delves deeper into the types of networks, providing specific examples and discussing their structural nuances. It covers communication networks like wireless mesh networks and the internet, scientific networks such as citation and co-authorship networks, and their directional properties. The importance of understanding clusters or communities within networks is highlighted, along with the concept of language networks, including word co-occurrence networks and keyword co-occurrence networks. The paragraph also introduces semantic networks, akin to knowledge graphs, and their attribute-based connections.
🏥 Networks in Epidemiology and Multi-Scale Analysis
The paragraph discusses the application of networks in tracking epidemics, like virus spread through patient interactions, and the strategic implications for controlling outbreaks. It introduces three levels of network analysis: microscopic, mesoscopic, and macroscopic. Microscopic analysis focuses on individual nodes and edges, mesoscopic on community structures and network motifs, and macroscopic on the network's global properties like diameter and edge density. The paragraph emphasizes the importance of analyzing networks at different scales to understand their complexity and dynamics.
🔍 Network Analysis: Properties and Theories
This section ponders whether common properties across different types of networks can lead to a general theory of network structure and dynamics. It mentions properties like the 'small world' effect, 'six degrees of separation', scale-free networks, clustering, robustness against attacks, and cascade effects. The discussion sets the stage for exploring how these properties manifest in various networks and what they reveal about the networks' behavior and resilience.
📈 Applications and Implications of Network Analysis
The paragraph outlines the applications of network analysis in social media, emphasizing its growth and impact on society. It discusses how social networks are used for public opinion polling and information dissemination. The lecture series will cover various topics including node classification, link prediction, growth and virality of messages, anomaly detection, and graph representation learning. The paragraph also mentions the importance of understanding network-based anomaly detection and the potential of applying network analysis to fraud detection and recommender systems.
🎓 Course Prerequisites and Conclusion
The final paragraph outlines the prerequisites for the course, recommending knowledge in Python programming, probability and statistics, linear algebra, algorithm design, and basics of machine learning and deep learning. It concludes by encouraging students to learn the skill of networking, indicating the interdisciplinary and applied nature of the course content.
Keywords
💡Social Network
💡Biological Network
💡Information Network
💡Technological Network
💡Language Network
💡Network Motifs
💡Microscopic Level
💡Macroscopic Level
💡Mesoscopic Level
💡Small World Property
💡Scale-Free Property
Highlights
Introduction to the concept of real-world networks and their significance.
Identification of five broad types of networks: social, biological, information, technological, and language.
Examples of social networks, such as Twitter and Facebook, where nodes represent users and links represent relationships.
Explanation of biological networks with protein-protein interaction networks and neural interactions as examples.
Information networks like the World Wide Web and citation networks, highlighting their directed nature.
Technological networks including power grids, airline, and railway networks, emphasizing their critical infrastructure role.
Language networks, including word co-occurrence networks, important for natural language processing.
The idea that networks can be abstracted from almost every complex system or dataset.
Discussion on the microscopic level of network analysis, focusing on nodes and edges.
Macroscopic level analysis, considering the network as a whole, including properties like diameter and edge density.
Mesoscopic view of networks, examining specific regions like communities and network motifs.
Common properties observed across different types of networks, such as small-world property and scale-free property.
Importance of network analysis in understanding the spread of information and misinformation.
Applications of network analysis in anomaly detection, identifying abnormal nodes or activities.
Introduction to graph representation learning (GRL) and its potential in network analysis.
Course content overview, covering network measurement, formation, link analysis, community detection, and more.
Prerequisites for the course, including knowledge of Python, probability, statistics, linear algebra, and basics of machine learning.
Motivational closing, emphasizing the relevance of network analysis in understanding social media and society.
Transcripts
We will discuss different types of real-world networks,
I mean the popular real-world networks that we generally talk about. So,
if you look at the literature, you can see that broadly there are five types
of networks that we talk about: social networks, biological networks, information networks,
technological networks and language networks. We will give examples of each of these one by one,
but in general, social networks you all know. For biological networks, you can think
of, say, protein-protein interaction networks, or neural networks,
I mean interactions between neurons, food networks and so on. Information networks
include the World Wide Web and citation networks. Technological networks include
the power grid, airline networks and railway networks. Language networks include, say,
word co-occurrence networks and so on. We will discuss each of these one by one.
In fact, I tell my students that you can
think of a network from almost every data set, from almost every application.
A network is just an abstraction of a complex system, as I mentioned earlier. So,
if you are given a simple problem where a common person cannot see
any notion of a network, you can think of a network out of it, and that is
the beauty of this particular course. So, let us look at social networks.
We have discussed the Twitter follower-following network multiple times:
nodes are users and links are follower-following relationships.
Similarly, we have Facebook friendship networks, where nodes are
users and links are friendship relationships. This is very obvious.
We have biological networks, for example the protein-protein interaction network, where
nodes are proteins. Two proteins interact during different
metabolic processes in our body, and you can connect proteins accordingly. Similarly,
we have the metabolic network, which describes the relationships between
small metabolites and the enzymes or proteins that interact
with them during different biochemical reactions.
This is a metabolic network, and there are tons of papers on protein-protein interaction networks,
mostly in the bioinformatics and computational biology domain.
You can also think of communication networks, like a wireless mesh network,
where, say, within an organization you have different routers and computers
and communications between routers. In fact, you can also think of communications
through satellites; that can also be a network. Of course, you have the big internet,
where the whole internet is broadly considered a network.
We have scientific networks like the citation network we discussed earlier, where
papers are nodes and citations are the links; this is a directed network. And
we have the co-authorship network, where nodes are authors,
and if two authors work together, you can think of them as co-authors and
connect them. So, interestingly, if you look at the citation network, this is
a small example of a citation network. Look at this node,
look at this node; they have high citations. In fact, since this is
a directed network, you have the notion of in-degree: you have inward edges and outward edges.
So, think of a paper which has a lot of inward edges. Inward edges indicate citations;
outward edges indicate references. So, if a paper has a lot of citations, it basically means
the paper is very important, and therefore people are citing it. But if a paper
has a lot of outgoing edges, indicating a lot of references, that paper is also important:
that paper might be a book, a survey paper or a literature review paper.
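To make the distinction between inward edges (citations) and outward edges (references) concrete, here is a minimal sketch in plain Python. The paper names and the edge list are made up for illustration; they are not from the lecture slide.

```python
from collections import defaultdict

# Hypothetical toy data: each pair is (citing_paper, cited_paper).
citations = [
    ("survey", "paper_a"),
    ("survey", "paper_b"),
    ("survey", "paper_c"),
    ("paper_b", "paper_a"),
    ("paper_c", "paper_a"),
]


def degree_counts(edges):
    """Return (in_degree, out_degree) dicts for a directed edge list."""
    in_degree = defaultdict(int)
    out_degree = defaultdict(int)
    for source, target in edges:
        out_degree[source] += 1  # source lists target as a reference
        in_degree[target] += 1   # target receives a citation
    return dict(in_degree), dict(out_degree)


in_degree, out_degree = degree_counts(citations)
most_cited = max(in_degree, key=in_degree.get)
most_references = max(out_degree, key=out_degree.get)
print("most cited:", most_cited, in_degree[most_cited])
print("most references:", most_references, out_degree[most_references])
```

In this toy data the heavily cited paper and the reference-heavy "survey" are different nodes, which is the contrast the lecture draws.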
We will discuss how the notions of inward edge and outward edge
interplay with each other, and you can think of interesting metrics out of
this notion of the directionality of an edge. Coming to the co-authorship network,
if you see the network, you see that there is a closed group, and there is another closed
group; here nodes are densely connected. Here you see yellow nodes densely connected.
Red nodes are also densely connected, and every such group has its own
identity. For example, this red group indicates researchers working on agent-based models.
The green group indicates researchers working in mathematical ecology,
there is this blue group working on statistical physics, and so on.
So, you see that multiple such clusters emerge from a network, which might
be interesting to study, and we will discuss in a separate chapter how
we can detect such clusters or communities from a network. Now, language networks:
one simple example can be a co-occurrence network of words, where
nodes are words, and if two words co-occur in a sentence, for example, or very close
by in a sentence, say within a window of 3 or 4 words,
you connect them in the network. So, nodes are words, and if two words co-occur
multiple times, you can connect them. For example, you see that words like
teacher, principal and student co-occur very frequently, and
therefore they are connected. This kind of network is very important to study in order
to automatically detect synonyms or antonyms, or, say, holonyms,
homonyms and so on, a whole bunch of things in natural language processing.
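The windowed co-occurrence rule described above can be sketched as follows. This is only an illustration under stated assumptions: the sentences are invented, tokenization is a plain whitespace split, and the names `window` and `min_count` are hypothetical parameters rather than anything fixed by the lecture.

```python
from collections import Counter

# Hypothetical toy corpus.
sentences = [
    "teacher helps student",
    "principal thanks teacher",
    "student asks teacher",
    "principal meets student",
]


def cooccurrence_edges(sentences, window=3, min_count=2):
    """Count word pairs that fall inside a sliding window of `window` words.

    Only pairs seen at least `min_count` times become edges.
    """
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            # Pair each word with the words after it that share its window.
            for other in words[i + 1 : i + window]:
                if other != word:
                    counts[tuple(sorted((word, other)))] += 1
    return {pair: c for pair, c in counts.items() if c >= min_count}


edges = cooccurrence_edges(sentences)
print(edges)
```

Raising `min_count` keeps only the pairs that co-occur repeatedly, which matches the idea of connecting words that appear together multiple times.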
Another example is the keyword co-occurrence network, where nodes are keywords,
and if two keywords again co-occur multiple times, you can connect them. In
scientific papers, you may have seen that you need to specify keywords in the paper.
So, now you see that keywords like sentiment analysis and
opinion mining are very close, and you see that these keywords
appear together very frequently, meaning that these phrases are
actually very closely linked. You see here
in this particular network that sentiment analysis and opinion mining are close by.
So, in this network nodes are keywords, and two keywords are connected if they co-occur
multiple times in different papers. There is another network called the semantic network.
A semantic network is basically a knowledge graph; you may have heard the
term knowledge graph, or knowledge base, where nodes are different
entities indicating different granularities of knowledge. For example, you see that a cat is a
mammal. So, cat is an entity, this is a node; mammal is a node, whale is a
node, animal is a node and so on. A cat is a mammal, so there is a
link from cat to mammal, and the relationship is "is a". You can think of this
as an attributed network, where edges are associated with some sort of attributes:
"is a", "has", "lives in" and so on.
These are different attributes of edges. So, this is called a semantic network.
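One common way to hold such an attributed network in code is a list of (head, relation, tail) triples. The sketch below assumes that representation; apart from "cat is a mammal", the specific triples and the helper names `build_graph` and `ancestors` are illustrative choices, not part of the lecture.

```python
from collections import defaultdict

# Hypothetical triples (head, relation, tail) in the spirit of the example.
triples = [
    ("cat", "is_a", "mammal"),
    ("whale", "is_a", "mammal"),
    ("mammal", "is_a", "animal"),
    ("cat", "has", "fur"),
    ("whale", "lives_in", "water"),
]


def build_graph(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def ancestors(graph, entity, relation="is_a"):
    """Follow edges carrying one relation label transitively from `entity`."""
    found = []
    stack = [entity]
    while stack:
        current = stack.pop()
        for rel, tail in graph.get(current, []):
            if rel == relation and tail not in found:
                found.append(tail)
                stack.append(tail)
    return found


graph = build_graph(triples)
print(ancestors(graph, "cat"))
```

Because each edge carries its relation label, a query can follow only "is_a" links and ignore "has" or "lives_in" links.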
Similarly, we can think of many other interesting networks, like the terrorist
network I mentioned last time. In this particular network, nodes can be
terrorists, and if two terrorists went together on a similar mission, or if two terrorists
were arrested together, you can connect them through links.
In fact, there are very interesting studies where people keep track of how
such movement happens. I am not very aware of this, but if you look at the studies,
there are multiple books on how you can model such activities using networks.
There is another type of network called the patient network, and this is very important,
again in the computational biology domain, where you want to study how a particular
epidemic or a virus spreads. Here nodes are patients, and if two patients
are interacting, if two patients come close together, you can connect them.
So, let us assume that in a hospital you have this kind of
network, and you have suddenly seen that a virus has started spreading.
You will immediately understand through which path this virus has
spread, because you know that these people have been infected,
and you also know that these people have been frequently interacting with other people.
So, essentially, you may want to protect those people who have
been interacting with the already infected patients, and you want to protect them
either through vaccinations or some other ways.
So, now we will look at a network from
different angles. We will inspect the network as a whole, then zoom in
and look at a part of the network. We further zoom in and look at even
more fine-grained entities in a network.
So, we will discuss three levels of granularity
of a network: one is called the microscopic level, the other is called the macroscopic level,
and in between you have
something called the mesoscopic level, the mesoscopic view of a network.
So, let us start with the microscopic level. When you talk about the microscopic view of a
network, we analyse nodes and edges. We look at node
properties, we look at edge properties, we look at how two nodes interact, how three
nodes interact and so on. We do not go further. So, we look at different properties of
nodes, for example degree and centrality; we will discuss what centrality is later.
We look at how two nodes interact; this is called the dyadic level of interaction.
We look at how three nodes interact; this is called the triadic level of interaction.
For example, this is a triadic level of interaction, and this is another type of triadic
level of interaction. This is another type of triadic level of interaction, and so on.
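The slide with the different triadic patterns is not part of the transcript, so the following is only a sketch of one plausible reading: in an undirected network, a group of three nodes can be closed (all three pairs linked) or open (exactly two pairs linked). The edge list and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical undirected toy network.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]


def adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triad_census(adj):
    """Count closed triads (3 links) and open triads (2 links)."""
    closed = 0
    open_triads = 0
    for trio in combinations(sorted(adj), 3):
        links = sum(1 for u, v in combinations(trio, 2) if v in adj[u])
        if links == 3:
            closed += 1
        elif links == 2:
            open_triads += 1
    return closed, open_triads


adj = adjacency(edges)
print(triad_census(adj))
```

Checking every trio of nodes is cubic in the number of nodes, so this brute-force census is only suitable for small examples.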
So, we look at the dyadic level of interaction, the triadic level of interaction,
and we also look at egocentric circles. We have already discussed what an ego
network is. Let us say this is an ego network.
Let us say this is a structure, and this is the ego, and these are the alters. So,
you see that here there is a circle, meaning closely connected nodes. This is also
a closely connected group. These are called egocentric circles. This is also a
kind of microscopic-level view, because you are only looking at a particular node and
its surrounding neighbours, and that is all. Now, let us look at the macroscopic level, the entire
network as a whole. We can think of the number of nodes and the degree distribution. The
term degree distribution may not be very familiar to you, but we will discuss it in the next chapter.
There is something called the diameter of a
network, which is the longest shortest path. You may have heard about the shortest path:
you take a pair of nodes and look at the shortest path between them; you do this for all pairs of nodes,
and you see which one is the longest. That is called the diameter
of a network; it is a network property. Similarly, you can think of the edge density of a network.
How many edges are possible in a network of n nodes?
That is n choose 2, that is, n(n-1)/2. And how many edges are actually there in the network?
You take the ratio of the actual number
of edges to the possible number of edges, and that gives you the edge density.
So, this is looking at the network as a whole, at the macroscopic level.
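Both macroscopic quantities just described can be computed directly from their definitions. The sketch below assumes an undirected, unweighted and connected network, uses breadth-first search for shortest paths, and runs on a made-up four-node path.

```python
from collections import deque

# Hypothetical toy network: a simple path a - b - c - d.
adj = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c"},
}


def bfs_distances(adj, source):
    """Shortest hop counts from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def diameter(adj):
    """Longest shortest path over all pairs (assumes a connected network)."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)


def edge_density(adj):
    """Actual edges divided by the n(n-1)/2 possible undirected edges."""
    n = len(adj)
    actual = sum(len(neighbours) for neighbours in adj.values()) // 2
    possible = n * (n - 1) // 2
    return actual / possible if possible else 0.0


print(diameter(adj), edge_density(adj))
```

For the path of four nodes, the two end nodes are three hops apart, and 3 of the 6 possible edges are present.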
In between microscopic and macroscopic, there is something called the mesoscopic view of a network.
And what is this? In the mesoscopic view, we look at specific regions of a network. For example,
I already mentioned clusters or communities,
where multiple nodes interact frequently and form a dense group; you can think of
this as a mesoscopic structure of a network. You have multiple such communities, and
these communities then form the entire network. You can also think of
network motifs, and this motif structure is very useful in biological networks particularly.
Now, what is a motif? A motif is a recurrent substructure
which appears in a network. For example, this is a star network,
and say this star appears very frequently in a network;
that gives you a separate indication about the network. If you see a chain, this is a chain;
this is called a chain motif. Now, if you see a lot of such chain motifs
present in a network, that network can be different from a network having
a lot of star motifs, for example. So, motif analysis is a different
direction of network analysis in general, which we are
not covering because it is more related to biological networks. But of course,
in social networks also we see plenty of cases where motifs are useful.
Now, if you look at the science of network analysis, the question that we ask
is: are these properties common across networks? Let us say you take a Facebook network
and you take a protein-protein interaction network, and you see that some of the
microscopic, mesoscopic and macroscopic properties are common across the networks.
Then what do you conclude? Would you be able to conclude that these two
networks are the same, or have similar properties? What are the different properties?
So, the question that we ask is: can we formulate a general theory of the
structure, the evolution and the dynamics of a network?
The commonly observed properties that we often see
in a network include something called the small-world property. We will discuss in the next chapter
what the small-world property is. It basically says that the world is very small, meaning that
if you want to move from one node to another node, you do not need to traverse a lot.
And there is a very interesting property called six degrees of separation. It basically says
that there are six hops on average between any pair of nodes
in the network. We will also discuss something called the scale-free property. We will discuss
clustering and community structure, and we will discuss the robustness of a network
to different attacks, different adversarial attacks.
We will discuss something called the cascade effect,
the vulnerability to cascading failures. For example, say there is an
electric power grid, and suddenly you see that one node has got damaged.
If one node gets damaged, it may happen that this damage gets propagated through
the entire network, the entire power grid, and that will create devastation.
So, we need to stop such power failures, such failures of nodes.
What would be the strategies through which we can stop the spread of such failures?
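The lecture does not specify a failure model at this point, so the sketch below swaps in a simple threshold rule purely as an assumption: a node fails once at least a given fraction of its neighbours have failed. The toy grid, the function name `simulate_cascade` and the `threshold` parameter are all hypothetical.

```python
# Hypothetical toy power grid: a line of four stations a - b - c - d.
grid = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c"},
}


def simulate_cascade(adj, initially_failed, threshold=0.5):
    """Spread failures until no further node crosses the threshold.

    Assumed rule: a working node fails when the fraction of its failed
    neighbours is at least `threshold`.
    """
    failed = set(initially_failed)
    changed = True
    while changed:
        changed = False
        for node, neighbours in adj.items():
            if node in failed or not neighbours:
                continue
            failed_fraction = sum(nb in failed for nb in neighbours) / len(neighbours)
            if failed_fraction >= threshold:
                failed.add(node)
                changed = True
    return failed


print(sorted(simulate_cascade(grid, {"a"}, threshold=0.5)))
print(sorted(simulate_cascade(grid, {"a"}, threshold=0.6)))
```

With the lower threshold the single damaged station takes down the whole line, while the slightly higher threshold contains the failure, which hints at why making nodes more tolerant is one possible strategy.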
So, this course will cover all these things in different chapters, and we will learn
different mathematical formulations and properties which can define these characteristics
of a network. Now, this is a motivational slide. I am pretty sure that I
do not need to motivate you on why social media has become part and parcel of our life.
If you see the user engagement over time, right from the year 2000 to 2021 or 2022,
there is a massive, exponential growth of usage, particularly during the lockdown,
the pandemic time, and a lot of content is being generated every minute.
So, people sometimes say that the social network is a proxy of our society.
You see many cases where people take public opinion polls. You
may have been invited to vote in certain polls, on certain
decisions, on online social media. Since
you are observing the patterns, you are observing the different opinions,
and you are consuming different information in different ways,
you are one of the stakeholders who can participate in this polling. So,
this is very important. These are the applications that we will also cover
in this lecture series. We will cover something called node classification, where we
classify nodes depending upon their properties. We will predict links; this is also called
link recommendation, and link recommendation has a lot of applications in
product recommendation, friendship recommendation, ad recommendation and so on. We will discuss
the growth and virality of messages, and of networks in general.
We will discuss node-centric and network-centric properties, and personalized properties of nodes.
We will discuss how misinformation spreads over social networks, and we will discuss how
you can identify nodes which are abnormal, for example fraud nodes or fraud users.
There is something called anomaly detection, through which we look at how we can identify
such fraud nodes or outlier nodes. Now, this is the tentative course content
that we are going to cover in this lecture series. We will start
by measuring a network. In the next lecture, we will try to quantify
a network. Then we will discuss something called network formation. We
will discuss many models through which you can mimic the way a network is formed,
the way a network evolves over time. We will discuss the random growth
model, preferential attachment models and many such models. Those models are
mostly borrowed from physics, but they are highly applicable here in social networks as well. We will
discuss link analysis, where you specifically look at how to characterize a link or an edge, and
we will see how different social theories are useful to characterize a link.
We will discuss community detection. A community or cluster is a very important
property of a network, and we will see how we can detect communities efficiently.
We will discuss something called link prediction, and we have already mentioned its application in
recommendation systems. But the problem is that when we need to predict something for the future,
given that we have the network only at the current time stamp, this is difficult, because
you do not know what is going to happen. So, link prediction is very challenging, and
we will discuss some algorithms which have become very popular because of
the way they work. We will discuss network effects and cascade behaviour, particularly how
an epidemic spreads over a network, online and offline. We will discuss
network-based anomaly detection. Anomaly detection, or abnormality detection,
is also studied in data mining. But when it comes to network structure,
the approaches are very different, and we will see how we identify anomalous nodes and other anomalous
entities; they can be anomalous edges or
anomalous subgraphs in the network.
Then we will discuss a very interesting topic; in fact, this is a recent trend in network
analysis, something called graph representation learning or network representation learning, GRL
in short. We will see how a network is mapped to an embedding space,
to a Euclidean space for example. When you map a network to a vector space,
say a Euclidean space, things become very easy. For example, now a node is represented by a
vector. So, you can do a whole bunch of vector operations, matrix multiplication and
so on, to solve different applications. But to understand this chapter,
you need to know the basics of deep learning and machine learning. We do not need to go
into the details, but the basics of machine learning and deep learning might be useful.
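To illustrate why a vector per node makes things easy, the sketch below skips the learning step entirely and assumes the embeddings already exist. The two-dimensional vectors and user names are hand-written, hypothetical values; a real GRL method would learn them from the network.

```python
import math

# Hypothetical, hand-written node embeddings (a real method would learn these).
embeddings = {
    "alice": [0.9, 0.1],
    "bob": [0.8, 0.2],
    "carol": [0.1, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(node, embeddings):
    """Return the other node whose vector is closest in direction to `node`."""
    return max(
        (other for other in embeddings if other != node),
        key=lambda other: cosine_similarity(embeddings[node], embeddings[other]),
    )


print(most_similar("alice", embeddings))
```

Once nodes are vectors, a task such as suggesting a similar user reduces to a nearest-neighbour lookup in the embedding space.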
And then we will conclude this lecture series by giving you ideas about some
applications, for example fraud detection. We will look at fraud detection in particular;
we will look at something called collusion, black-market-driven activities in online social
networks, and how you can detect such activities. We will also discuss very briefly
recommender systems, particularly friendship recommendation systems and so on.
So, these are the prerequisites. It is highly recommended that you learn
Python programming. I am assuming that you have a fair bit of an idea about
probability and statistics; probability and statistics 101 is good enough. Again, I am assuming
that you have basic ideas about linear algebra, particularly matrix operations,
and about basic algorithm design, and it would be great if you also learn
the basics of machine learning and deep learning. With this, I would like all of you to
learn together the skill of networking. Thank you.