Strong and Weak Ties

Computational Analysis of Social Complexity
Fall 2025, Spencer Lyon

Prerequisites

Introduction to Graphs

Outcomes

Recognize open and closed triangles in a graph
Understand concept of triadic closure
Be able to identify global and local bridges in a network: visually and programatically
Understand the “strength of weak ties”
Be familiar with betweenness centrality and its use in graph partitioning

References

Easley and Kleinberg chapter 3

Granovetter’s question¶

In the late 1960s a PhD student Mark Granovetter wanted to understand how people get a new job
He interviewed many people who recently changed firms and asked how they got their job
Unsurprisingly, people said they got referrals or learned about the job from personal contacts
Surprisingly, most people said the contacts were made via their “acquaintances” and not “friends”
Why?

Graphs for Information Flow¶

To answer Granovetter’s question, we’ll turn to graphs
As we’ll see, this is one of many possible examples of how we can use a graph to study the flow of information
- What other examples are there?
- From what sources do you get your information?
- Do you get different types of information from different types of source? Why?
One interesting point to make in this regard is that by studying information flow via graphs we are taking something inherently interpersonal or emotional (friendships and sharing information) and analyzing it from a structural perspective (as a graph)
Keep these themes in mind as we study the theoretical/technical tools
We’ll return back to Granovetter’s question after building up some tools

Triangles¶

When studying graphs, the smallest possible structure involving more than two nodes is a triangle
Triangles are made up of three nodes and either two or three edges that connect them
- A triangle with only two edges is said to be open
- A triangle with three edges is closed
In the example below, there are 3 triangles, all of which are open

using Graphs, GraphPlot

g1 = star_graph(4)
add_vertex!(g1)
gplot(g1, nodelabel='A':'E', layout=shell_layout)

Suppose our graph is of students arriving for first year of college
Each node is a student
An edge represents a friendship or connection between students
Story...
- A went to and out of school state, but happened to know B, C, and D from various summer camps or family-friend relationships
- None of B, C and D know one another
Question: Given only this information, is it more likely that B and E become friends, or B and C? Why?

Closing triangles¶

In our example graph the only edges are between A and another node
Consider a triangle formed by nodes A, B, C
To close this triangle, there would need to be an edge between B-C as follows

g2 = copy(g1)
add_edge!(g2, 2, 3)
gplot(g2, nodelabel='A':'E', layout=shell_layout)

Triadic Closure¶

There is overwhelming empirical evidence to support the intuition that when there are edges A-B and A-C, it is likely that an edge B-C will form
This is known as triadic closure (closing the final edge of a triangle)
For this reason, when analyzing social network data it is very common to see triangles

Triangles with Graphs.jl¶

There is good support in Graphs.jl for helping us count triangles in a graph
The two key functions are triangles(g) and local_clustering(g)
triangles(g) will return an array of integers, where values correspond to how many closed triangles there are in the graph

@show triangles(g1)
@show triangles(g2);

triangles(g1) = [0, 0, 0, 0, 0]
triangles(g2) = [1, 1, 1, 0, 0]

The local_clustering(g) function will return a tuple of two things:
1. The number of closed triangles for each node (same as triangles(g))
2. The number of possible triangles for each node in g

@show local_clustering(g1)
@show local_clustering(g2);

local_clustering(g1) = ([0, 0, 0, 0, 0], [3, 0, 0, 0, 0])
local_clustering(g2) = ([1, 1, 1, 0, 0], [3, 1, 1, 0, 0])

Clustering Coefficients¶

It is helpful to have a single number summary of how “closed” a graphs triangles are
The clustering coefficient tells us the fraction of triangles that are closed
- “The clustering coefficient of a node A is defined as the probability that two randomly selected friends of A are friends with each other. In other words, it is the fraction of pairs of A’s friends that are connected to each other by edges” (E&K pg. 49)
There is a local flavor, where we consider all triangles for a specific node
There is also a global flavor where we consider all triangles for the entire graph
Graphs.jl has functions local_clustering_coefficient and global_clustering_coefficient to compute these quantities respectively

local_clustering_coefficient(g2)

5-element Vector{Float64}:
 0.3333333333333333
 1.0
 1.0
 0.0
 0.0

global_clustering_coefficient(g2)

0.6

Question: In each of the graphs below, how many open triangles are there? How many closed?

#graph 1
gplot(g2, nodelabel='A':'E', layout=shell_layout)

#graph 2
gplot(path_graph(5), nodelabel='A':'E', layout=shell_layout)

# graph 3
gplot(dorogovtsev_mendes(7), nodelabel=1:7)

# graph 4
gplot(smallgraph(:bull), nodelabel=1:5)

Bridges¶

Consider the graph below

g3 = barbell_graph(4, 4)
gplot(g3, nodelabel='A':'H', layout=spring_layout)

Now consider D
Notice that connection between D and any of A B C is somehow different from connection to E
D-E is known as a bridge
- A bridge is an edge that, if removed, would cause the nodes involved to be in different components of the graph

g4 = copy(g3)
rem_edge!(g4, 4, 5)
gplot(g4, nodelabel='A':'H', layout=spring_layout)

Frequency of bridges¶

Given our discussion on triadic closure, bridges are likely to be rare in real social networks
It is very likely that an edge will form between E and one of A``B``C
Even if that isn’t the case, consider the possibility that the graph we have been looking at is actually a smaller subset of a larger graph:

using Downloads
if !isfile("local_bridge.lg")
    Downloads.download("https://ucf-cap-6318-resources.s3.amazonaws.com/data/local_bridge.lg", "local_bridge.lg")
end
g5 = loadgraph("local_bridge.lg")
gplot(g5, nodelabel='A':'M', layout=spring_layout)

Local Bridges

In the graph above even if D-E were broken, there would only be one component in our graph
- In other words there is another path from D to E (here D-I-K-M-E)
Because true bridges are so rare, a looser definition of bridge was created called a local bridge
An edge is a local bridge if A and B have no neighbors in common

Detecting Local Bridges¶

So far we have dealt with example graphs that we can visually inspect
Most real world graphs are far too large for this
To analyze larger graphs we need computational tools
Let’s build up some code that will allow us to find local bridges

Exercise

Populate the num_shared_neighbors function below

"""
    num_shared_neighbors(g, n1, n2)

Given a graph `g` and node indexes `n1` and `n2`, compute
how many neighbors `n1` and `n2` have in common
"""
function num_shared_neighbors(g, n1, n2)
    # HINT: `neighbors(g, n1)` will give you an array of neighbors
    #        of `n1`


end

num_shared_neighbors

# test code here
using Test

function test_num_shared_neighbors()
    @testset "num_shared_neighbors" begin
        vals = [
            num_shared_neighbors(g5, 6, 7),
            num_shared_neighbors(g5, 4, 5)
        ]
        @test !isnothing(vals[1])
        @test vals[1] isa Integer
        @test vals == [2, 0]
    end
end

test_num_shared_neighbors (generic function with 1 method)

# uncomment and run the code below when you are ready to test your code
# test_num_shared_neighbors()

Exercise

Now that we have a function for computing the number of shared neighbors, we can use it to build a routine for finding a local bridge
We’ll do that now
Your task is to fill in the missing logic for the local_bridges(g) function below

"""
    local_bridges(g)

Given a graph `g`, find all local bridges in the graph
"""
function local_bridges(g)
    out = []
    for n1 in 1:nv(g)

        # TODO determine what to loop over for n2

        # TODO: call num_shared_neighbors(g, n1, n2)

        # TODO: if n1 and n2 have no shared neighbors, **AND** (n1 < n2)
        #       call push!(out, (n1, n2))
    end
    return out
end

local_bridges

function test_local_bridges()
    @testset "local_bridges" begin
        vals = [
            local_bridges(g5),
            local_bridges(g4)
        ]
        @test vals[1] isa Array
        @test length(vals[1]) == 1
        @test length(vals[2]) == 0
        @test vals[1][1] == (4, 5)
    end
end

test_local_bridges (generic function with 1 method)

# uncomment and run the code below when you are ready to test your code
# test_local_bridges()

Example: Twitter connections¶

Let’s now consider an example using real social network data
Below we’ll load up a graph called tw that is a graph of connections between twitter users
Each node is a different twitter account
There is an edge between nodes if either one of the accounts follows the other

using Downloads
if !isfile("twitter.lg")
    Downloads.download("https://ucf-cap-6318-resources.s3.amazonaws.com/data/twitter.lg", "twitter.lg")
end
tw = loadgraph("twitter.lg")

{81306, 1342310} undirected simple UInt32 graph

Notice that there are 81,306 nodes and 1,342,310 edges
This network is far too big to analyze visually
Let’s use a few of our empirical metrics to study the properties of this graph

global_clustering_coefficient(tw)

0.17057451593439363

tw_bridges = bridges(tw)

4996-element Vector{Graphs.SimpleGraphs.SimpleEdge{UInt32}}:
 Edge 1252 => 31559
 Edge 631 => 63632
 Edge 385 => 1996
 Edge 753 => 9774
 Edge 1739 => 3295
 Edge 848 => 29509
 Edge 2193 => 56235
 Edge 346 => 58652
 Edge 31 => 8302
 Edge 937 => 6948
 ⋮
 Edge 30135 => 81305
 Edge 2973 => 65794
 Edge 2973 => 75450
 Edge 2973 => 80720
 Edge 509 => 41064
 Edge 465 => 14891
 Edge 1721 => 22505
 Edge 1721 => 58936
 Edge 1721 => 80725

ratio_bridges = length(tw_bridges) / ne(tw)

0.0037219420253145697

Notice how only 0.372% of edges are bridges (true bridges, not local bridges)!

Edge Strength¶

We have so far considered only whether or not two nodes are connected
We have not discussed the strength of these connections
We will now extend our analysis to the notion of an edge representing a strong or a weak tie
In our friendship example, the strong ties would represent friends and the weak ties would represent acquaintances

Strong and Weak ties¶

In the figure below, we have a representation of a graph similar to our g5 where all edges have been annotated with a S or a W
A S edge represents a strong edge, or friendship
A W edge represents a weak edge or, or acquaintance

Triadic Closure: Strong Vs Weak¶

Let’s extend the intuition behind triadic closure to our strong/weak setting
Our argument was that because we have edges A-B and A-C, it is likely that an edge B-C will form
Now we’ll state that if A-B and A-C are both strong ties, then it is more likely that B-C will form than if either A-B or A-C were weak
More formally... we have the Strong Triadic Closure Property

We say that a node A violates the Strong Triadic Closure Property if it has strong ties to two other nodes B and C, and there is no edge at all (either a strong or weak tie) between B and C. We say that a node A satisfies the Strong Triadic Closure Property if it does not violate it.

Local Bridges and Weak Ties¶

Given this definition of the Strong Triadic Closure Property, we can make the following claim (see section 3.2 of E&K for proof, intuition in figure below):

Claim: If a node A in a network satisfies the Strong Triadic Closure Property and is involved in at least two strong ties, then any local bridge it is involved in must be a weak tie.

Back to Granovetter¶

Recall our original question: why do people report finding jobs through acquaintances (weak ties) more often than close friends (strong ties)?
The Strong Triadic Closure Property gives the answer...
Suppose A lost its job
- By asking for a referral from any of C, D, or E; A is likely to get a similar set of information as they are strongly connected to one other and likely have access to the same set of information
- Instead by talking to the weak tie B, A is likely to get new information from people in B’s social circle (including H and the three unlabeled nodes)
tl;dr: local bridges are weak ties, so it is weak ties that get you access to “new” parts of the network

Graph Partitioning¶

Social networks often consist of tightly knit regions and weak ties that connect them
One algorithmic problem that has been studied and applied in many settings is that of graph partitioning
To partition a graph is to break it down into the tightly-knit components
When a graph is partitioned, it is broken down into components called regions

g6 = let
    edges = Edge.([
        (1,2), (1,3), (2,3), (3,7), (7,6), (6,4), (6,5), (4,5), (7,8),
        (8,9), (9,10), (10,11), (9,11), (8,12), (12,13), (13,14), (12,14)
    ])
    Graph(edges)
end

{14, 17} undirected simple Int64 graph

gplot(g6, nodelabel=1:14, layout=spring_layout)

Two Approaches¶

There are two classes of algorithms that can be used to partition a graph:
1. Divisive: partition a graph by removing local bridges (“spanning links”) and breaking down the network into large chunks
2. Agglomerative: start with a single node and construct regions “bottom-up” by iteratively finding nodes highly connected to existing nodes in the region
We’ll focus on divisive methods here

Betweenness Centrality¶

In order to build a divisive partitioning algorithm, we’ll first define a key metric for analyzing how “central” a node is in a network
We’ll provide a brief introduction here, and refer you to section 3.6 of E&K for more detail
Let $V$ represent set of all nodes and $s, t, v \in V$
Let $\sigma_{st}$ represent the number of shortest paths between $s$ and $t$
Let $\sigma_{st}(v)$ represent the number of shortest paths between $s$ and $t$ that pass through $v$ .
Then, the betweenness centrality for node $v$ ( $C_{B}(v)$ ) is defined as:
$C_{B}(v) = \sum_{s \ne v \ne t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$
(1)
Conceptually, $C_B(v)$ captures how much information “flows” across node $v$ on average

Computing $C_B(v)$ ¶

There are various algorithms we could use to compute $C_B(v)$
For now we will let Graphs.jl handle it for us ;)

betweenness_centrality(g6)

14-element Vector{Float64}:
 0.0
 0.0
 0.28205128205128205
 0.0
 0.0
 0.28205128205128205
 0.6538461538461539
 0.6538461538461539
 0.28205128205128205
 0.0
 0.0
 0.28205128205128205
 0.0
 0.0

Note that 7 and 8 are a local bridge
Also note that they carry the highest value of $C_B$ ...🤔

Algorithm: Girvan-Newman¶

We will not present the algorithm in detail, but will describe the overall steps
Refer to Section 3.6 of E&K for details
The Girvan-Newman algorithm for graph partitioning:
- Inputs: node
- Outputs: list of edges to delete at each step
- Algorithm:
  1. Find nodes with highest betweenness centrality -- remove them from the network (and add edges connecting them to the network to the list of deleted edges for first step)
  2. Re-compute betweenness centrality for all subgraphs that resulted from deletion in step 1. Remove all nodes with highest betweenness centrality and record list of deleted edges
  3. Continue until all edges have been removed

Summary: Key ideas¶

Triangles are key network structures
Closed triangles are common in social networks due to triadic closure
Social networks are often composed of tightly knit regions bound together with weak ties
Weak ties often form local bridges and are therefore valuable for referrals and information flow
Betweenness centrality captures idea of how essential a node is in connecting regions of a graph (how much information flows across a node)