Game Theory Lab: Data Science Applications

Computational Analysis of Social Complexity
Fall 2025, Spencer Lyon

Prerequisites

Mixed Strategies (L09.01)
Auctions (L09.03)

Outcomes

Apply mixed strategy Nash equilibrium to cybersecurity scenarios
Understand auction mechanisms in digital advertising
Analyze bidding strategies in cloud computing spot markets
Practice computing equilibria with real-world data science examples

Cybersecurity: Defender vs Attacker Game¶

Scenario¶

You are the head of security for a data analytics company. Your infrastructure has two critical components:

Database: Contains customer data and analytics results
API: Processes real-time data requests

Due to resource constraints, you can only deploy intensive monitoring and intrusion detection on ONE system at a time.

Meanwhile, a threat actor is targeting your company and must choose which system to attack.

Payoffs (from Defender’s perspective, in terms of avoided damages, measured in $10,000s):

If Defender monitors the system being attacked: +8 (attack detected and blocked)
If Defender monitors the wrong system: -10 (successful breach)
If Attacker succeeds in breaching Database: Defender loses 10, Attacker gains 10
If Attacker succeeds in breaching API: Defender loses 10, Attacker gains 10
If Attacker is blocked: Defender gains 8 (reputation boost), Attacker loses 8 (wasted resources)

Payoff Matrix¶

	Attacker: Database	Attacker: API
Defender: Database	(8, -8)	(-10, 10)
Defender: API	(-10, 10)	(8, -8)

Note: Payoffs are (Defender, Attacker)

Exercise 1.1: Pure Strategy Analysis¶

Does this game have a Nash Equilibrium in pure strategies? Why or why not?
What type of game is this (zero-sum, positive-sum, negative-sum)?
Why might randomization (mixed strategies) be beneficial for both players?

Exercise 1.2: Computing Mixed Strategy Equilibrium¶

Let:

$p$ = probability Defender monitors Database
$q$ = probability Attacker targets Database

Part A: Write out the Attacker’s expected payoff for each pure strategy as a function of $p$ :

Expected payoff from targeting Database: $E_A(\text{Database} | p)$ = ?
Expected payoff from targeting API: $E_A(\text{API} | p)$ = ?

Part B: Use the indifference condition to solve for $p^*$ (the equilibrium probability)

Part C: By symmetry (or by computing Defender’s expected payoffs), what is $q^*$ ?

Part D: What is the expected payoff for each player at equilibrium?

# Use this cell to compute the equilibrium probabilities

Exercise 1.3: Interpretation¶

What does the equilibrium strategy tell you about security practices?
If the Defender could credibly commit to always monitoring the Database (e.g., via automated systems), would that change the outcome?
How does this relate to the concept of “security through obscurity” vs “security through randomization”?

Digital Advertising Auctions¶

Background: How Google Ads Works¶

When you search on Google, advertisers compete in real-time auctions for ad placement. For many years, Google used a Vickrey (second-price) auction mechanism:

Each advertiser submits a maximum bid (how much they’re willing to pay per click)
The highest bidder wins the top ad slot
The winner pays the second-highest bid (plus $0.01)

Why this design?

Incentivizes truthful bidding (bidding your true value)
Simplifies strategy for advertisers
Reduces gaming and complexity

Note: Modern Google Ads uses a modified version called “Generalized Second Price” with quality scores, but the core principle remains.

Scenario: Three Advertisers¶

Three data analytics companies are bidding for the keyword “business intelligence software”:

Company A: Estimates that each click is worth $15 to them (based on conversion rate × customer lifetime value)
Company B: Estimates each click is worth $12
Company C: Estimates each click is worth $8

Exercise 2.1: Second-Price Auction with Truthful Bidding¶

Assume all companies bid their true valuation.

Questions:

Who wins the auction?
How much does the winner pay per click?
What is the winner’s profit per click?
What is the seller’s (Google’s) revenue per click?

# Calculate outcomes for truthful bidding
valuations = Dict("A" => 15, "B" => 12, "C" => 8)
bids_truthful = Dict("A" => 15, "B" => 12, "C" => 8)  # truthful bidding

# TODO: compute winner, price paid, profit

Exercise 2.2: What if Company A Tries to “Game” the System?¶

Suppose Company A knows (or guesses) that Company B will bid $12.

Company A considers bidding lower to reduce what they pay.

Scenario 1: Company A bids $13 instead of$ 15

Who wins?
How much do they pay?
What is Company A’s profit per click?

Scenario 2: Company A bids $11 instead of$ 15

Who wins?
How much do they pay?
What is Company A’s profit per click?

Conclusion: Can Company A improve their outcome by bidding untruthfully?

# Scenario 1: A bids 13
bids_scenario1 = Dict("A" => 13, "B" => 12, "C" => 8)

# TODO: compute outcome

# Scenario 2: A bids 11
bids_scenario2 = Dict("A" => 11, "B" => 12, "C" => 8)

# TODO: compute outcome

Exercise 2.3: First-Price Auction Comparison¶

Now suppose Google switched to a first-price sealed-bid auction (winner pays their own bid).

Questions:

If all companies bid their true valuations ( $15,$ 12, 8), what would be each company’s profit?
Why would Company A want to “shade” their bid downward?
If you were Company A, what would you bid? (Hint: you want to bid just above $12 to win, but how do you know what B will bid?)
Which auction format do you think is easier for advertisers to participate in? Why?

Exercise 2.4: Extension - Multiple Ad Slots¶

In reality, Google shows multiple ads (positions 1, 2, 3). The top position gets more clicks.

Suppose:

Position 1: 100 clicks per day
Position 2: 60 clicks per day
Position 3: 30 clicks per day

With valuations: A= $15, B=$ 12, C=$8

In a Generalized Second-Price (GSP) auction:

Position 1 pays the bid of Position 2
Position 2 pays the bid of Position 3
Position 3 pays $0 (or a minimum bid)

Questions:

If everyone bids truthfully, what is the daily cost for each advertiser?
What is the daily profit for each advertiser?
What is Google’s total daily revenue?

# Multi-slot auction analysis
clicks_per_position = [100, 60, 30]
valuations = Dict("A" => 15, "B" => 12, "C" => 8)

# TODO: compute costs, profits, and revenue

AWS Spot Instances: Cloud Computing Auctions¶

Background¶

AWS sells unused computing capacity through Spot Instances at discounted prices (up to 90% off on-demand pricing).

How it works:

You specify the maximum price you’re willing to pay per hour
If the spot price ≤ your bid, you get the instance
You pay the current spot price (not your bid) - similar to second-price auction!
If spot price rises above your bid, your instance is terminated (with 2-minute warning)

Use cases:

Training ML models (can tolerate interruptions)
Batch data processing
Rendering and simulations
Web crawling and data collection

Scenario: Training a Machine Learning Model¶

You’re a data scientist at a startup. You need to train a deep learning model and have three options:

On-Demand Instance: $3.06/hour for a p3.2xlarge (1 GPU), guaranteed availability
Spot Instance: Same machine, but price fluctuates between $0.50 and$ 2.50 per hour
No GPU: Use your laptop, takes 10x longer, costs you time

Your training job will take:

10 hours on GPU
Can checkpoint and resume if interrupted
Each interruption costs ~20 minutes of wasted computation + time to restart

Your valuation: You value completing the job at $40 total value

If it costs you less than $40, you profit
If it costs more than $40, you should use your laptop instead

Exercise 3.1: Bidding Strategy Basics¶

Questions:

What is your maximum value per hour? (Hint: $40 total value / 10 hours)
If the current spot price is $1.20/hour, should you use spot instances?
What should you bid in the spot instance auction? Your true value per hour, or something different?
Why is bidding your true value optimal in this second-price auction format?

# Calculate maximum value per hour
total_value = 40
hours_needed = 10

max_value_per_hour = total_value / hours_needed
println("Maximum value per hour: \$$(round(max_value_per_hour, digits=2))")

# Compare to on-demand pricing
on_demand_price = 3.06
println("On-demand price: \$$(round(on_demand_price, digits=2))/hour")
println("Total on-demand cost: \$$(round(on_demand_price * hours_needed, digits=2))")

Maximum value per hour: $4.0
On-demand price: $3.06/hour
Total on-demand cost: $30.6

Exercise 3.2: Analyzing Spot Price History¶

Suppose you have historical spot price data for the past 24 hours:

Hour	Spot Price
0	$0.92
1	$0.88
2	$0.85
3	$0.90
4	$0.95
5	$1.10
6	$1.45
7	$1.80
8	$2.20
9	$2.10
10	$1.95
11	$1.75
12	$1.60
13	$1.55
14	$1.50
15	$1.45
16	$1.40
17	$1.50
18	$1.65
19	$1.55
20	$1.40
21	$1.20
22	$1.05
23	$0.95

using Plots

# Spot price data
hours = 0:23
spot_prices = [0.92, 0.88, 0.85, 0.90, 0.95, 1.10, 1.45, 1.80,
               2.20, 2.10, 1.95, 1.75, 1.60, 1.55, 1.50, 1.45,
               1.40, 1.50, 1.65, 1.55, 1.40, 1.20, 1.05, 0.95]

# TODO: Plot the spot price over time
# Add a horizontal line at your maximum value per hour
# Calculate: what percentage of time is spot price below your max?

Questions:

What percentage of the time is the spot price below your maximum value per hour ($4.00)?
What is the average spot price when it’s below $4.00?
If you bid $4.00 and could run during any of these hours, what would be your expected cost?
Based on the pattern, what time of day seems best for running spot instances?

Exercise 3.3: Risk vs Reward¶

Now consider interruption risk. Historical data shows:

If you bid $2.00/hour: 30% chance of interruption during your 10-hour job
If you bid $3.00/hour: 5% chance of interruption
If you bid $4.00/hour: 1% chance of interruption

Each interruption:

Wastes 20 minutes of computation (cost = current spot price × 0.33 hours)
Requires time to restart and resume (assume this is negligible due to checkpointing)

Calculate expected costs:

# Expected cost calculation
function expected_cost(bid_price, avg_spot_price, interruption_prob; hours=10)
    """
    Calculate expected cost including interruption costs

    Parameters:
    -----------
    bid_price : Float64
        Your maximum bid
    avg_spot_price : Float64
        Expected spot price given your bid
    interruption_prob : Float64
        Probability of interruption during job
    hours : Float64
        Hours needed to complete job
    """
    # TODO: Calculate base cost (hours × avg spot price)
    # TODO: Calculate expected interruption cost
    # TODO: Return total expected cost
    return 0.0  # placeholder
end

# Scenario 1: Bid $2.00
# Scenario 2: Bid $3.00
# Scenario 3: Bid $4.00
# Scenario 4: On-demand at $3.06

# TODO: Compare expected costs

Exercise 3.4: Strategic Implications¶

Discussion Questions:

Truth-telling: Why is bidding your true value ($4.00/hour) the optimal strategy in AWS spot instances?
Price Discovery: How does the spot price auction help AWS discover the true market value of computing resources?
Market Efficiency: How does this auction mechanism help AWS:
- Maximize revenue from unused capacity?
- Allocate resources to users who value them most?
- Provide price signals about demand patterns?
Comparison to First-Price: How would your bidding strategy change if AWS used a first-price auction (you pay your bid)?
Real-world Complexity: In practice, AWS spot prices are determined by supply/demand, not just auctions. How does this affect:
- Predictability of costs?
- Strategic behavior by users?
- Fairness of resource allocation?

Synthesis: Comparing the Three Applications¶

Exercise 4.1: Common Themes¶

Reflect on the three scenarios you’ve analyzed:

Cybersecurity Game:
- Zero-sum game
- Mixed strategy equilibrium
- Randomization as a strategic tool
Digital Advertising:
- Second-price auction
- Truth-telling as dominant strategy
- Multiple bidders with different valuations
AWS Spot Instances:
- Dynamic pricing auction
- Risk/reward tradeoffs
- Resource allocation mechanism

Questions:

In which scenarios is “truth-telling” (revealing your true value) optimal? Why?
In which scenarios is randomization important? Why?
How do auction mechanisms help solve information asymmetry problems?
What role does “commitment” play in each scenario?
How might smart contracts or blockchain change any of these scenarios?

Exercise 4.2: Design Your Own Auction¶

Choose one of the following scenarios and design an auction mechanism:

Option 1: Data Marketplace

Multiple data providers want to sell datasets
Multiple companies want to buy data for ML training
Data has different quality levels
Design: What auction type? How do you handle quality? How do you price?

Option 2: Freelance Data Science Platform

Companies post projects with unknown complexity
Freelancers bid on projects with their rates
Quality of work varies by freelancer
Design: Sealed bid? Reverse auction? How to handle quality signals?

Option 3: ML Model Marketplace

Data scientists train models and want to sell predictions
Companies want to buy access to best models
Model quality is uncertain until deployed
Design: How to auction model access? How to price queries?

For your chosen scenario:

Define the players and their valuations
Choose an auction type (first-price, second-price, etc.)
Explain why this auction type is appropriate
Describe the equilibrium bidding strategy
Discuss potential issues or limitations

Conclusion¶

In this lab, you’ve applied game theory concepts to real-world data science scenarios:

Mixed strategies help solve security problems where randomization is crucial
Second-price auctions simplify strategic decisions in advertising and cloud computing
Equilibrium analysis helps predict behavior in competitive environments
Auction design affects market efficiency and resource allocation

These concepts are fundamental to understanding:

How tech platforms operate (Google, AWS, Meta)
How to design data marketplaces and APIs
How to think strategically about security and competition
How incentives shape behavior in complex systems

As you continue in data science, you’ll encounter these game-theoretic situations frequently in:

Pricing and monetization strategies
Platform design and marketplace mechanisms
Security and adversarial ML
Resource allocation in cloud/distributed systems
Algorithmic decision-making with strategic agents

Game Theory 2

Auctions

Agents 1

From Rule-Based to Learning Agents