Computational Analysis of Social Complexity
Fall 2025, Spencer Lyon
Prerequisites
- Mixed Strategies (L09.01)
- Auctions (L09.03)
Outcomes
- Apply mixed strategy Nash equilibrium to cybersecurity scenarios
- Understand auction mechanisms in digital advertising
- Analyze bidding strategies in cloud computing spot markets
- Practice computing equilibria with real-world data science examples
Cybersecurity: Defender vs Attacker Game¶
Scenario¶
You are the head of security for a data analytics company. Your infrastructure has two critical components:
- Database: Contains customer data and analytics results
- API: Processes real-time data requests
Due to resource constraints, you can only deploy intensive monitoring and intrusion detection on ONE system at a time.
Meanwhile, a threat actor is targeting your company and must choose which system to attack.
Payoffs (from Defender’s perspective, in terms of avoided damages, measured in $10,000s):
- If Defender monitors the system being attacked: +8 (attack detected and blocked)
- If Defender monitors the wrong system: -10 (successful breach)
- If Attacker succeeds in breaching Database: Defender loses 10, Attacker gains 10
- If Attacker succeeds in breaching API: Defender loses 10, Attacker gains 10
- If Attacker is blocked: Defender gains 8 (reputation boost), Attacker loses 8 (wasted resources)
Payoff Matrix¶
| Attacker: Database | Attacker: API | |
|---|---|---|
| Defender: Database | (8, -8) | (-10, 10) |
| Defender: API | (-10, 10) | (8, -8) |
Note: Payoffs are (Defender, Attacker)
Exercise 1.1: Pure Strategy Analysis¶
- Does this game have a Nash Equilibrium in pure strategies? Why or why not?
- What type of game is this (zero-sum, positive-sum, negative-sum)?
- Why might randomization (mixed strategies) be beneficial for both players?
Exercise 1.2: Computing Mixed Strategy Equilibrium¶
Let:
- = probability Defender monitors Database
- = probability Attacker targets Database
Part A: Write out the Attacker’s expected payoff for each pure strategy as a function of :
- Expected payoff from targeting Database: = ?
- Expected payoff from targeting API: = ?
Part B: Use the indifference condition to solve for (the equilibrium probability)
Part C: By symmetry (or by computing Defender’s expected payoffs), what is ?
Part D: What is the expected payoff for each player at equilibrium?
# Use this cell to compute the equilibrium probabilities
Exercise 1.3: Interpretation¶
- What does the equilibrium strategy tell you about security practices?
- If the Defender could credibly commit to always monitoring the Database (e.g., via automated systems), would that change the outcome?
- How does this relate to the concept of “security through obscurity” vs “security through randomization”?
Digital Advertising Auctions¶
Background: How Google Ads Works¶
When you search on Google, advertisers compete in real-time auctions for ad placement. For many years, Google used a Vickrey (second-price) auction mechanism:
- Each advertiser submits a maximum bid (how much they’re willing to pay per click)
- The highest bidder wins the top ad slot
- The winner pays the second-highest bid (plus $0.01)
Why this design?
- Incentivizes truthful bidding (bidding your true value)
- Simplifies strategy for advertisers
- Reduces gaming and complexity
Note: Modern Google Ads uses a modified version called “Generalized Second Price” with quality scores, but the core principle remains.
Scenario: Three Advertisers¶
Three data analytics companies are bidding for the keyword “business intelligence software”:
- Company A: Estimates that each click is worth $15 to them (based on conversion rate × customer lifetime value)
- Company B: Estimates each click is worth $12
- Company C: Estimates each click is worth $8
Exercise 2.1: Second-Price Auction with Truthful Bidding¶
Assume all companies bid their true valuation.
Questions:
- Who wins the auction?
- How much does the winner pay per click?
- What is the winner’s profit per click?
- What is the seller’s (Google’s) revenue per click?
# Calculate outcomes for truthful bidding
valuations = Dict("A" => 15, "B" => 12, "C" => 8)
bids_truthful = Dict("A" => 15, "B" => 12, "C" => 8) # truthful bidding
# TODO: compute winner, price paid, profit
Exercise 2.2: What if Company A Tries to “Game” the System?¶
Suppose Company A knows (or guesses) that Company B will bid $12.
Company A considers bidding lower to reduce what they pay.
Scenario 1: Company A bids 15
- Who wins?
- How much do they pay?
- What is Company A’s profit per click?
Scenario 2: Company A bids 15
- Who wins?
- How much do they pay?
- What is Company A’s profit per click?
Conclusion: Can Company A improve their outcome by bidding untruthfully?
# Scenario 1: A bids 13
bids_scenario1 = Dict("A" => 13, "B" => 12, "C" => 8)
# TODO: compute outcome
# Scenario 2: A bids 11
bids_scenario2 = Dict("A" => 11, "B" => 12, "C" => 8)
# TODO: compute outcome
Exercise 2.3: First-Price Auction Comparison¶
Now suppose Google switched to a first-price sealed-bid auction (winner pays their own bid).
Questions:
- If all companies bid their true valuations (12, 8), what would be each company’s profit?
- Why would Company A want to “shade” their bid downward?
- If you were Company A, what would you bid? (Hint: you want to bid just above $12 to win, but how do you know what B will bid?)
- Which auction format do you think is easier for advertisers to participate in? Why?
Exercise 2.4: Extension - Multiple Ad Slots¶
In reality, Google shows multiple ads (positions 1, 2, 3). The top position gets more clicks.
Suppose:
- Position 1: 100 clicks per day
- Position 2: 60 clicks per day
- Position 3: 30 clicks per day
With valuations: A=12, C=$8
In a Generalized Second-Price (GSP) auction:
- Position 1 pays the bid of Position 2
- Position 2 pays the bid of Position 3
- Position 3 pays $0 (or a minimum bid)
Questions:
- If everyone bids truthfully, what is the daily cost for each advertiser?
- What is the daily profit for each advertiser?
- What is Google’s total daily revenue?
# Multi-slot auction analysis
clicks_per_position = [100, 60, 30]
valuations = Dict("A" => 15, "B" => 12, "C" => 8)
# TODO: compute costs, profits, and revenue
AWS Spot Instances: Cloud Computing Auctions¶
Background¶
AWS sells unused computing capacity through Spot Instances at discounted prices (up to 90% off on-demand pricing).
How it works:
- You specify the maximum price you’re willing to pay per hour
- If the spot price ≤ your bid, you get the instance
- You pay the current spot price (not your bid) - similar to second-price auction!
- If spot price rises above your bid, your instance is terminated (with 2-minute warning)
Use cases:
- Training ML models (can tolerate interruptions)
- Batch data processing
- Rendering and simulations
- Web crawling and data collection
Scenario: Training a Machine Learning Model¶
You’re a data scientist at a startup. You need to train a deep learning model and have three options:
- On-Demand Instance: $3.06/hour for a p3.2xlarge (1 GPU), guaranteed availability
- Spot Instance: Same machine, but price fluctuates between 2.50 per hour
- No GPU: Use your laptop, takes 10x longer, costs you time
Your training job will take:
- 10 hours on GPU
- Can checkpoint and resume if interrupted
- Each interruption costs ~20 minutes of wasted computation + time to restart
Your valuation: You value completing the job at $40 total value
- If it costs you less than $40, you profit
- If it costs more than $40, you should use your laptop instead
Exercise 3.1: Bidding Strategy Basics¶
Questions:
- What is your maximum value per hour? (Hint: $40 total value / 10 hours)
- If the current spot price is $1.20/hour, should you use spot instances?
- What should you bid in the spot instance auction? Your true value per hour, or something different?
- Why is bidding your true value optimal in this second-price auction format?
# Calculate maximum value per hour
total_value = 40
hours_needed = 10
max_value_per_hour = total_value / hours_needed
println("Maximum value per hour: \$$(round(max_value_per_hour, digits=2))")
# Compare to on-demand pricing
on_demand_price = 3.06
println("On-demand price: \$$(round(on_demand_price, digits=2))/hour")
println("Total on-demand cost: \$$(round(on_demand_price * hours_needed, digits=2))")
Maximum value per hour: $4.0
On-demand price: $3.06/hour
Total on-demand cost: $30.6
Exercise 3.2: Analyzing Spot Price History¶
Suppose you have historical spot price data for the past 24 hours:
| Hour | Spot Price |
|---|---|
| 0 | $0.92 |
| 1 | $0.88 |
| 2 | $0.85 |
| 3 | $0.90 |
| 4 | $0.95 |
| 5 | $1.10 |
| 6 | $1.45 |
| 7 | $1.80 |
| 8 | $2.20 |
| 9 | $2.10 |
| 10 | $1.95 |
| 11 | $1.75 |
| 12 | $1.60 |
| 13 | $1.55 |
| 14 | $1.50 |
| 15 | $1.45 |
| 16 | $1.40 |
| 17 | $1.50 |
| 18 | $1.65 |
| 19 | $1.55 |
| 20 | $1.40 |
| 21 | $1.20 |
| 22 | $1.05 |
| 23 | $0.95 |
using Plots
# Spot price data
hours = 0:23
spot_prices = [0.92, 0.88, 0.85, 0.90, 0.95, 1.10, 1.45, 1.80,
2.20, 2.10, 1.95, 1.75, 1.60, 1.55, 1.50, 1.45,
1.40, 1.50, 1.65, 1.55, 1.40, 1.20, 1.05, 0.95]
# TODO: Plot the spot price over time
# Add a horizontal line at your maximum value per hour
# Calculate: what percentage of time is spot price below your max?
Questions:
- What percentage of the time is the spot price below your maximum value per hour ($4.00)?
- What is the average spot price when it’s below $4.00?
- If you bid $4.00 and could run during any of these hours, what would be your expected cost?
- Based on the pattern, what time of day seems best for running spot instances?
Exercise 3.3: Risk vs Reward¶
Now consider interruption risk. Historical data shows:
- If you bid $2.00/hour: 30% chance of interruption during your 10-hour job
- If you bid $3.00/hour: 5% chance of interruption
- If you bid $4.00/hour: 1% chance of interruption
Each interruption:
- Wastes 20 minutes of computation (cost = current spot price × 0.33 hours)
- Requires time to restart and resume (assume this is negligible due to checkpointing)
Calculate expected costs:
# Expected cost calculation
function expected_cost(bid_price, avg_spot_price, interruption_prob; hours=10)
"""
Calculate expected cost including interruption costs
Parameters:
-----------
bid_price : Float64
Your maximum bid
avg_spot_price : Float64
Expected spot price given your bid
interruption_prob : Float64
Probability of interruption during job
hours : Float64
Hours needed to complete job
"""
# TODO: Calculate base cost (hours × avg spot price)
# TODO: Calculate expected interruption cost
# TODO: Return total expected cost
return 0.0 # placeholder
end
# Scenario 1: Bid $2.00
# Scenario 2: Bid $3.00
# Scenario 3: Bid $4.00
# Scenario 4: On-demand at $3.06
# TODO: Compare expected costs
Exercise 3.4: Strategic Implications¶
Discussion Questions:
Truth-telling: Why is bidding your true value ($4.00/hour) the optimal strategy in AWS spot instances?
Price Discovery: How does the spot price auction help AWS discover the true market value of computing resources?
Market Efficiency: How does this auction mechanism help AWS:
- Maximize revenue from unused capacity?
- Allocate resources to users who value them most?
- Provide price signals about demand patterns?
Comparison to First-Price: How would your bidding strategy change if AWS used a first-price auction (you pay your bid)?
Real-world Complexity: In practice, AWS spot prices are determined by supply/demand, not just auctions. How does this affect:
- Predictability of costs?
- Strategic behavior by users?
- Fairness of resource allocation?
Synthesis: Comparing the Three Applications¶
Exercise 4.1: Common Themes¶
Reflect on the three scenarios you’ve analyzed:
Cybersecurity Game:
- Zero-sum game
- Mixed strategy equilibrium
- Randomization as a strategic tool
Digital Advertising:
- Second-price auction
- Truth-telling as dominant strategy
- Multiple bidders with different valuations
AWS Spot Instances:
- Dynamic pricing auction
- Risk/reward tradeoffs
- Resource allocation mechanism
Questions:
- In which scenarios is “truth-telling” (revealing your true value) optimal? Why?
- In which scenarios is randomization important? Why?
- How do auction mechanisms help solve information asymmetry problems?
- What role does “commitment” play in each scenario?
- How might smart contracts or blockchain change any of these scenarios?
Exercise 4.2: Design Your Own Auction¶
Choose one of the following scenarios and design an auction mechanism:
Option 1: Data Marketplace
- Multiple data providers want to sell datasets
- Multiple companies want to buy data for ML training
- Data has different quality levels
- Design: What auction type? How do you handle quality? How do you price?
Option 2: Freelance Data Science Platform
- Companies post projects with unknown complexity
- Freelancers bid on projects with their rates
- Quality of work varies by freelancer
- Design: Sealed bid? Reverse auction? How to handle quality signals?
Option 3: ML Model Marketplace
- Data scientists train models and want to sell predictions
- Companies want to buy access to best models
- Model quality is uncertain until deployed
- Design: How to auction model access? How to price queries?
For your chosen scenario:
- Define the players and their valuations
- Choose an auction type (first-price, second-price, etc.)
- Explain why this auction type is appropriate
- Describe the equilibrium bidding strategy
- Discuss potential issues or limitations
Conclusion¶
In this lab, you’ve applied game theory concepts to real-world data science scenarios:
- Mixed strategies help solve security problems where randomization is crucial
- Second-price auctions simplify strategic decisions in advertising and cloud computing
- Equilibrium analysis helps predict behavior in competitive environments
- Auction design affects market efficiency and resource allocation
These concepts are fundamental to understanding:
- How tech platforms operate (Google, AWS, Meta)
- How to design data marketplaces and APIs
- How to think strategically about security and competition
- How incentives shape behavior in complex systems
As you continue in data science, you’ll encounter these game-theoretic situations frequently in:
- Pricing and monetization strategies
- Platform design and marketplace mechanisms
- Security and adversarial ML
- Resource allocation in cloud/distributed systems
- Algorithmic decision-making with strategic agents