Skip to article frontmatterSkip to article content

Strong and Weak Ties

University of Central Florida
Valorum Data

Computational Analysis of Social Complexity

Fall 2025, Spencer Lyon

Prerequisites

  • Introduction to Graphs

Outcomes

  • Recognize open and closed triangles in a graph
  • Understand concept of triadic closure
  • Be able to identify global and local bridges in a network: visually and programatically
  • Understand the “strength of weak ties”
  • Be familiar with betweenness centrality and its use in graph partitioning

References

Granovetter’s question

  • In the late 1960s a PhD student Mark Granovetter wanted to understand how people get a new job
  • He interviewed many people who recently changed firms and asked how they got their job
  • Unsurprisingly, people said they got referrals or learned about the job from personal contacts
  • Surprisingly, most people said the contacts were made via their “acquaintances” and not “friends”
  • Why?

Graphs for Information Flow

  • To answer Granovetter’s question, we’ll turn to graphs
  • As we’ll see, this is one of many possible examples of how we can use a graph to study the flow of information
    • What other examples are there?
    • From what sources do you get your information?
    • Do you get different types of information from different types of source? Why?
  • One interesting point to make in this regard is that by studying information flow via graphs we are taking something inherently interpersonal or emotional (friendships and sharing information) and analyzing it from a structural perspective (as a graph)
  • Keep these themes in mind as we study the theoretical/technical tools
  • We’ll return back to Granovetter’s question after building up some tools

Triangles

  • When studying graphs, the smallest possible structure involving more than two nodes is a triangle
  • Triangles are made up of three nodes and either two or three edges that connect them
    • A triangle with only two edges is said to be open
    • A triangle with three edges is closed
  • In the example below, there are 3 triangles, all of which are open
using Graphs, GraphPlot
g1 = star_graph(4)
add_vertex!(g1)
gplot(g1, nodelabel='A':'E', layout=shell_layout)
Loading...

Social Triangles

  • Suppose our graph is of students arriving for first year of college

  • Each node is a student

  • An edge represents a friendship or connection between students

  • Story...

    • A went to and out of school state, but happened to know B, C, and D from various summer camps or family-friend relationships
    • None of B, C and D know one another
  • Question: Given only this information, is it more likely that B and E become friends, or B and C? Why?

Closing triangles

  • In our example graph the only edges are between A and another node
  • Consider a triangle formed by nodes A, B, C
  • To close this triangle, there would need to be an edge between B-C as follows
g2 = copy(g1)
add_edge!(g2, 2, 3)
gplot(g2, nodelabel='A':'E', layout=shell_layout)
Loading...

Triadic Closure

  • There is overwhelming empirical evidence to support the intuition that when there are edges A-B and A-C, it is likely that an edge B-C will form
  • This is known as triadic closure (closing the final edge of a triangle)
  • For this reason, when analyzing social network data it is very common to see triangles

Triangles with Graphs.jl

  • There is good support in Graphs.jl for helping us count triangles in a graph
  • The two key functions are triangles(g) and local_clustering(g)
  • triangles(g) will return an array of integers, where values correspond to how many closed triangles there are in the graph
@show triangles(g1)
@show triangles(g2);
triangles(g1) = [0, 0, 0, 0, 0]
triangles(g2) = [1, 1, 1, 0, 0]
  • The local_clustering(g) function will return a tuple of two things:
    1. The number of closed triangles for each node (same as triangles(g))
    2. The number of possible triangles for each node in g
@show local_clustering(g1)
@show local_clustering(g2);
local_clustering(g1) = ([0, 0, 0, 0, 0], [3, 0, 0, 0, 0])
local_clustering(g2) = ([1, 1, 1, 0, 0], [3, 1, 1, 0, 0])

Clustering Coefficients

  • It is helpful to have a single number summary of how “closed” a graphs triangles are
  • The clustering coefficient tells us the fraction of triangles that are closed
    • “The clustering coefficient of a node A is defined as the probability that two randomly selected friends of A are friends with each other. In other words, it is the fraction of pairs of A’s friends that are connected to each other by edges” (E&K pg. 49)
  • There is a local flavor, where we consider all triangles for a specific node
  • There is also a global flavor where we consider all triangles for the entire graph
  • Graphs.jl has functions local_clustering_coefficient and global_clustering_coefficient to compute these quantities respectively
local_clustering_coefficient(g2)
5-element Vector{Float64}: 0.3333333333333333 1.0 1.0 0.0 0.0
global_clustering_coefficient(g2)
0.6
  • Question: In each of the graphs below, how many open triangles are there? How many closed?
#graph 1
gplot(g2, nodelabel='A':'E', layout=shell_layout)
Loading...
#graph 2
gplot(path_graph(5), nodelabel='A':'E', layout=shell_layout)
Loading...
# graph 3
gplot(dorogovtsev_mendes(7), nodelabel=1:7)
Loading...
# graph 4
gplot(smallgraph(:bull), nodelabel=1:5)
Loading...

Bridges

  • Consider the graph below
g3 = barbell_graph(4, 4)
gplot(g3, nodelabel='A':'H', layout=spring_layout)
Loading...
  • Now consider D
  • Notice that connection between D and any of A B C is somehow different from connection to E
  • D-E is known as a bridge
    • A bridge is an edge that, if removed, would cause the nodes involved to be in different components of the graph
g4 = copy(g3)
rem_edge!(g4, 4, 5)
gplot(g4, nodelabel='A':'H', layout=spring_layout)
Loading...

Frequency of bridges

  • Given our discussion on triadic closure, bridges are likely to be rare in real social networks
  • It is very likely that an edge will form between E and one of A``B``C
  • Even if that isn’t the case, consider the possibility that the graph we have been looking at is actually a smaller subset of a larger graph:
using Downloads
if !isfile("local_bridge.lg")
    Downloads.download("https://ucf-cap-6318-resources.s3.amazonaws.com/data/local_bridge.lg", "local_bridge.lg")
end
g5 = loadgraph("local_bridge.lg")
gplot(g5, nodelabel='A':'M', layout=spring_layout)
Loading...

Local Bridges

  • In the graph above even if D-E were broken, there would only be one component in our graph
    • In other words there is another path from D to E (here D-I-K-M-E)
  • Because true bridges are so rare, a looser definition of bridge was created called a local bridge
  • An edge is a local bridge if A and B have no neighbors in common

Detecting Local Bridges

  • So far we have dealt with example graphs that we can visually inspect
  • Most real world graphs are far too large for this
  • To analyze larger graphs we need computational tools
  • Let’s build up some code that will allow us to find local bridges

Exercise

Populate the num_shared_neighbors function below

"""
    num_shared_neighbors(g, n1, n2)

Given a graph `g` and node indexes `n1` and `n2`, compute
how many neighbors `n1` and `n2` have in common
"""
function num_shared_neighbors(g, n1, n2)
    # HINT: `neighbors(g, n1)` will give you an array of neighbors
    #        of `n1`


end
num_shared_neighbors
# test code here
using Test

function test_num_shared_neighbors()
    @testset "num_shared_neighbors" begin
        vals = [
            num_shared_neighbors(g5, 6, 7),
            num_shared_neighbors(g5, 4, 5)
        ]
        @test !isnothing(vals[1])
        @test vals[1] isa Integer
        @test vals == [2, 0]
    end
end
test_num_shared_neighbors (generic function with 1 method)
# uncomment and run the code below when you are ready to test your code
# test_num_shared_neighbors()

Exercise

  • Now that we have a function for computing the number of shared neighbors, we can use it to build a routine for finding a local bridge
  • We’ll do that now
  • Your task is to fill in the missing logic for the local_bridges(g) function below
"""
    local_bridges(g)

Given a graph `g`, find all local bridges in the graph
"""
function local_bridges(g)
    out = []
    for n1 in 1:nv(g)

        # TODO determine what to loop over for n2

        # TODO: call num_shared_neighbors(g, n1, n2)

        # TODO: if n1 and n2 have no shared neighbors, **AND** (n1 < n2)
        #       call push!(out, (n1, n2))
    end
    return out
end
local_bridges
function test_local_bridges()
    @testset "local_bridges" begin
        vals = [
            local_bridges(g5),
            local_bridges(g4)
        ]
        @test vals[1] isa Array
        @test length(vals[1]) == 1
        @test length(vals[2]) == 0
        @test vals[1][1] == (4, 5)
    end
end
test_local_bridges (generic function with 1 method)
# uncomment and run the code below when you are ready to test your code
# test_local_bridges()

Example: Twitter connections

  • Let’s now consider an example using real social network data
  • Below we’ll load up a graph called tw that is a graph of connections between twitter users
  • Each node is a different twitter account
  • There is an edge between nodes if either one of the accounts follows the other
using Downloads
if !isfile("twitter.lg")
    Downloads.download("https://ucf-cap-6318-resources.s3.amazonaws.com/data/twitter.lg", "twitter.lg")
end
tw = loadgraph("twitter.lg")
{81306, 1342310} undirected simple UInt32 graph
  • Notice that there are 81,306 nodes and 1,342,310 edges
  • This network is far too big to analyze visually
  • Let’s use a few of our empirical metrics to study the properties of this graph
global_clustering_coefficient(tw)
0.17057451593439363
tw_bridges = bridges(tw)
4996-element Vector{Graphs.SimpleGraphs.SimpleEdge{UInt32}}: Edge 1252 => 31559 Edge 631 => 63632 Edge 385 => 1996 Edge 753 => 9774 Edge 1739 => 3295 Edge 848 => 29509 Edge 2193 => 56235 Edge 346 => 58652 Edge 31 => 8302 Edge 937 => 6948 ⋮ Edge 30135 => 81305 Edge 2973 => 65794 Edge 2973 => 75450 Edge 2973 => 80720 Edge 509 => 41064 Edge 465 => 14891 Edge 1721 => 22505 Edge 1721 => 58936 Edge 1721 => 80725
ratio_bridges = length(tw_bridges) / ne(tw)
0.0037219420253145697

Notice how only 0.372% of edges are bridges (true bridges, not local bridges)!

Edge Strength

  • We have so far considered only whether or not two nodes are connected
  • We have not discussed the strength of these connections
  • We will now extend our analysis to the notion of an edge representing a strong or a weak tie
  • In our friendship example, the strong ties would represent friends and the weak ties would represent acquaintances

Strong and Weak ties

  • In the figure below, we have a representation of a graph similar to our g5 where all edges have been annotated with a S or a W
  • A S edge represents a strong edge, or friendship
  • A W edge represents a weak edge or, or acquaintance
strong_weak_ties.png

Triadic Closure: Strong Vs Weak

  • Let’s extend the intuition behind triadic closure to our strong/weak setting
  • Our argument was that because we have edges A-B and A-C, it is likely that an edge B-C will form
  • Now we’ll state that if A-B and A-C are both strong ties, then it is more likely that B-C will form than if either A-B or A-C were weak
  • More formally... we have the Strong Triadic Closure Property

We say that a node A violates the Strong Triadic Closure Property if it has strong ties to two other nodes B and C, and there is no edge at all (either a strong or weak tie) between B and C. We say that a node A satisfies the Strong Triadic Closure Property if it does not violate it.

Local Bridges and Weak Ties

  • Given this definition of the Strong Triadic Closure Property, we can make the following claim (see section 3.2 of E&K for proof, intuition in figure below):

Claim: If a node A in a network satisfies the Strong Triadic Closure Property and is involved in at least two strong ties, then any local bridge it is involved in must be a weak tie.

strong_triadic_closure.png

Back to Granovetter

  • Recall our original question: why do people report finding jobs through acquaintances (weak ties) more often than close friends (strong ties)?
  • The Strong Triadic Closure Property gives the answer...
  • Suppose A lost its job
    • By asking for a referral from any of C, D, or E; A is likely to get a similar set of information as they are strongly connected to one other and likely have access to the same set of information
    • Instead by talking to the weak tie B, A is likely to get new information from people in B’s social circle (including H and the three unlabeled nodes)
  • tl;dr: local bridges are weak ties, so it is weak ties that get you access to “new” parts of the network
strong_weak_ties.png

Graph Partitioning

  • Social networks often consist of tightly knit regions and weak ties that connect them
  • One algorithmic problem that has been studied and applied in many settings is that of graph partitioning
  • To partition a graph is to break it down into the tightly-knit components
  • When a graph is partitioned, it is broken down into components called regions
partitioning.png
g6 = let
    edges = Edge.([
        (1,2), (1,3), (2,3), (3,7), (7,6), (6,4), (6,5), (4,5), (7,8),
        (8,9), (9,10), (10,11), (9,11), (8,12), (12,13), (13,14), (12,14)
    ])
    Graph(edges)
end
{14, 17} undirected simple Int64 graph
gplot(g6, nodelabel=1:14, layout=spring_layout)
Loading...

Two Approaches

  • There are two classes of algorithms that can be used to partition a graph:
    1. Divisive: partition a graph by removing local bridges (“spanning links”) and breaking down the network into large chunks
    2. Agglomerative: start with a single node and construct regions “bottom-up” by iteratively finding nodes highly connected to existing nodes in the region
  • We’ll focus on divisive methods here

Betweenness Centrality

  • In order to build a divisive partitioning algorithm, we’ll first define a key metric for analyzing how “central” a node is in a network
  • We’ll provide a brief introduction here, and refer you to section 3.6 of E&K for more detail
  • Let VV represent set of all nodes and s,t,vVs, t, v \in V
  • Let σst\sigma_{st} represent the number of shortest paths between ss and tt
  • Let σst(v)\sigma_{st}(v) represent the number of shortest paths between ss and tt that pass through vv.
  • Then, the betweenness centrality for node vv (CB(v)C_{B}(v)) is defined as:
    CB(v)=svtVσst(v)σstC_{B}(v) = \sum_{s \ne v \ne t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}
  • Conceptually, CB(v)C_B(v) captures how much information “flows” across node vv on average

Computing CB(v)C_B(v)

  • There are various algorithms we could use to compute CB(v)C_B(v)
  • For now we will let Graphs.jl handle it for us ;)
betweenness_centrality(g6)
14-element Vector{Float64}: 0.0 0.0 0.28205128205128205 0.0 0.0 0.28205128205128205 0.6538461538461539 0.6538461538461539 0.28205128205128205 0.0 0.0 0.28205128205128205 0.0 0.0
  • Note that 7 and 8 are a local bridge
  • Also note that they carry the highest value of CBC_B...🤔

Algorithm: Girvan-Newman

  • We will not present the algorithm in detail, but will describe the overall steps
  • Refer to Section 3.6 of E&K for details
  • The Girvan-Newman algorithm for graph partitioning:
    • Inputs: node
    • Outputs: list of edges to delete at each step
    • Algorithm:
      1. Find nodes with highest betweenness centrality -- remove them from the network (and add edges connecting them to the network to the list of deleted edges for first step)
      2. Re-compute betweenness centrality for all subgraphs that resulted from deletion in step 1. Remove all nodes with highest betweenness centrality and record list of deleted edges
      3. Continue until all edges have been removed

Summary: Key ideas

  • Triangles are key network structures
  • Closed triangles are common in social networks due to triadic closure
  • Social networks are often composed of tightly knit regions bound together with weak ties
  • Weak ties often form local bridges and are therefore valuable for referrals and information flow
  • Betweenness centrality captures idea of how essential a node is in connecting regions of a graph (how much information flows across a node)