Skip to article frontmatterSkip to article content

Game Theory Lab: Data Science Applications

University of Central Florida
Valorum Data

Computational Analysis of Social Complexity

Fall 2025, Spencer Lyon

Prerequisites

  • Mixed Strategies (L09.01)
  • Auctions (L09.03)

Outcomes

  • Apply mixed strategy Nash equilibrium to cybersecurity scenarios
  • Understand auction mechanisms in digital advertising
  • Analyze bidding strategies in cloud computing spot markets
  • Practice computing equilibria with real-world data science examples

Cybersecurity: Defender vs Attacker Game

Scenario

You are the head of security for a data analytics company. Your infrastructure has two critical components:

  • Database: Contains customer data and analytics results
  • API: Processes real-time data requests

Due to resource constraints, you can only deploy intensive monitoring and intrusion detection on ONE system at a time.

Meanwhile, a threat actor is targeting your company and must choose which system to attack.

Payoffs (from Defender’s perspective, in terms of avoided damages, measured in $10,000s):

  • If Defender monitors the system being attacked: +8 (attack detected and blocked)
  • If Defender monitors the wrong system: -10 (successful breach)
  • If Attacker succeeds in breaching Database: Defender loses 10, Attacker gains 10
  • If Attacker succeeds in breaching API: Defender loses 10, Attacker gains 10
  • If Attacker is blocked: Defender gains 8 (reputation boost), Attacker loses 8 (wasted resources)

Payoff Matrix

Attacker: DatabaseAttacker: API
Defender: Database(8, -8)(-10, 10)
Defender: API(-10, 10)(8, -8)

Note: Payoffs are (Defender, Attacker)

Exercise 1.1: Pure Strategy Analysis

  1. Does this game have a Nash Equilibrium in pure strategies? Why or why not?
  2. What type of game is this (zero-sum, positive-sum, negative-sum)?
  3. Why might randomization (mixed strategies) be beneficial for both players?

Exercise 1.2: Computing Mixed Strategy Equilibrium

Let:

  • pp = probability Defender monitors Database
  • qq = probability Attacker targets Database

Part A: Write out the Attacker’s expected payoff for each pure strategy as a function of pp:

  • Expected payoff from targeting Database: EA(Databasep)E_A(\text{Database} | p) = ?
  • Expected payoff from targeting API: EA(APIp)E_A(\text{API} | p) = ?

Part B: Use the indifference condition to solve for pp^* (the equilibrium probability)

Part C: By symmetry (or by computing Defender’s expected payoffs), what is qq^*?

Part D: What is the expected payoff for each player at equilibrium?

# Use this cell to compute the equilibrium probabilities

Exercise 1.3: Interpretation

  1. What does the equilibrium strategy tell you about security practices?
  2. If the Defender could credibly commit to always monitoring the Database (e.g., via automated systems), would that change the outcome?
  3. How does this relate to the concept of “security through obscurity” vs “security through randomization”?

Digital Advertising Auctions

Background: How Google Ads Works

When you search on Google, advertisers compete in real-time auctions for ad placement. For many years, Google used a Vickrey (second-price) auction mechanism:

  • Each advertiser submits a maximum bid (how much they’re willing to pay per click)
  • The highest bidder wins the top ad slot
  • The winner pays the second-highest bid (plus $0.01)

Why this design?

  • Incentivizes truthful bidding (bidding your true value)
  • Simplifies strategy for advertisers
  • Reduces gaming and complexity

Note: Modern Google Ads uses a modified version called “Generalized Second Price” with quality scores, but the core principle remains.

Scenario: Three Advertisers

Three data analytics companies are bidding for the keyword “business intelligence software”:

  • Company A: Estimates that each click is worth $15 to them (based on conversion rate × customer lifetime value)
  • Company B: Estimates each click is worth $12
  • Company C: Estimates each click is worth $8

Exercise 2.1: Second-Price Auction with Truthful Bidding

Assume all companies bid their true valuation.

Questions:

  1. Who wins the auction?
  2. How much does the winner pay per click?
  3. What is the winner’s profit per click?
  4. What is the seller’s (Google’s) revenue per click?
# Calculate outcomes for truthful bidding
valuations = Dict("A" => 15, "B" => 12, "C" => 8)
bids_truthful = Dict("A" => 15, "B" => 12, "C" => 8)  # truthful bidding

# TODO: compute winner, price paid, profit

Exercise 2.2: What if Company A Tries to “Game” the System?

Suppose Company A knows (or guesses) that Company B will bid $12.

Company A considers bidding lower to reduce what they pay.

Scenario 1: Company A bids 13insteadof13 instead of 15

  • Who wins?
  • How much do they pay?
  • What is Company A’s profit per click?

Scenario 2: Company A bids 11insteadof11 instead of 15

  • Who wins?
  • How much do they pay?
  • What is Company A’s profit per click?

Conclusion: Can Company A improve their outcome by bidding untruthfully?

# Scenario 1: A bids 13
bids_scenario1 = Dict("A" => 13, "B" => 12, "C" => 8)

# TODO: compute outcome

# Scenario 2: A bids 11
bids_scenario2 = Dict("A" => 11, "B" => 12, "C" => 8)

# TODO: compute outcome

Exercise 2.3: First-Price Auction Comparison

Now suppose Google switched to a first-price sealed-bid auction (winner pays their own bid).

Questions:

  1. If all companies bid their true valuations (15,15, 12, 8), what would be each company’s profit?
  2. Why would Company A want to “shade” their bid downward?
  3. If you were Company A, what would you bid? (Hint: you want to bid just above $12 to win, but how do you know what B will bid?)
  4. Which auction format do you think is easier for advertisers to participate in? Why?

Exercise 2.4: Extension - Multiple Ad Slots

In reality, Google shows multiple ads (positions 1, 2, 3). The top position gets more clicks.

Suppose:

  • Position 1: 100 clicks per day
  • Position 2: 60 clicks per day
  • Position 3: 30 clicks per day

With valuations: A=15,B=15, B=12, C=$8

In a Generalized Second-Price (GSP) auction:

  • Position 1 pays the bid of Position 2
  • Position 2 pays the bid of Position 3
  • Position 3 pays $0 (or a minimum bid)

Questions:

  1. If everyone bids truthfully, what is the daily cost for each advertiser?
  2. What is the daily profit for each advertiser?
  3. What is Google’s total daily revenue?
# Multi-slot auction analysis
clicks_per_position = [100, 60, 30]
valuations = Dict("A" => 15, "B" => 12, "C" => 8)

# TODO: compute costs, profits, and revenue

AWS Spot Instances: Cloud Computing Auctions

Background

AWS sells unused computing capacity through Spot Instances at discounted prices (up to 90% off on-demand pricing).

How it works:

  • You specify the maximum price you’re willing to pay per hour
  • If the spot price ≤ your bid, you get the instance
  • You pay the current spot price (not your bid) - similar to second-price auction!
  • If spot price rises above your bid, your instance is terminated (with 2-minute warning)

Use cases:

  • Training ML models (can tolerate interruptions)
  • Batch data processing
  • Rendering and simulations
  • Web crawling and data collection

Scenario: Training a Machine Learning Model

You’re a data scientist at a startup. You need to train a deep learning model and have three options:

  1. On-Demand Instance: $3.06/hour for a p3.2xlarge (1 GPU), guaranteed availability
  2. Spot Instance: Same machine, but price fluctuates between 0.50and0.50 and 2.50 per hour
  3. No GPU: Use your laptop, takes 10x longer, costs you time

Your training job will take:

  • 10 hours on GPU
  • Can checkpoint and resume if interrupted
  • Each interruption costs ~20 minutes of wasted computation + time to restart

Your valuation: You value completing the job at $40 total value

  • If it costs you less than $40, you profit
  • If it costs more than $40, you should use your laptop instead

Exercise 3.1: Bidding Strategy Basics

Questions:

  1. What is your maximum value per hour? (Hint: $40 total value / 10 hours)
  2. If the current spot price is $1.20/hour, should you use spot instances?
  3. What should you bid in the spot instance auction? Your true value per hour, or something different?
  4. Why is bidding your true value optimal in this second-price auction format?
# Calculate maximum value per hour
total_value = 40
hours_needed = 10

max_value_per_hour = total_value / hours_needed
println("Maximum value per hour: \$$(round(max_value_per_hour, digits=2))")

# Compare to on-demand pricing
on_demand_price = 3.06
println("On-demand price: \$$(round(on_demand_price, digits=2))/hour")
println("Total on-demand cost: \$$(round(on_demand_price * hours_needed, digits=2))")
Maximum value per hour: $4.0
On-demand price: $3.06/hour
Total on-demand cost: $30.6

Exercise 3.2: Analyzing Spot Price History

Suppose you have historical spot price data for the past 24 hours:

HourSpot Price
0$0.92
1$0.88
2$0.85
3$0.90
4$0.95
5$1.10
6$1.45
7$1.80
8$2.20
9$2.10
10$1.95
11$1.75
12$1.60
13$1.55
14$1.50
15$1.45
16$1.40
17$1.50
18$1.65
19$1.55
20$1.40
21$1.20
22$1.05
23$0.95
using Plots

# Spot price data
hours = 0:23
spot_prices = [0.92, 0.88, 0.85, 0.90, 0.95, 1.10, 1.45, 1.80,
               2.20, 2.10, 1.95, 1.75, 1.60, 1.55, 1.50, 1.45,
               1.40, 1.50, 1.65, 1.55, 1.40, 1.20, 1.05, 0.95]

# TODO: Plot the spot price over time
# Add a horizontal line at your maximum value per hour
# Calculate: what percentage of time is spot price below your max?

Questions:

  1. What percentage of the time is the spot price below your maximum value per hour ($4.00)?
  2. What is the average spot price when it’s below $4.00?
  3. If you bid $4.00 and could run during any of these hours, what would be your expected cost?
  4. Based on the pattern, what time of day seems best for running spot instances?

Exercise 3.3: Risk vs Reward

Now consider interruption risk. Historical data shows:

  • If you bid $2.00/hour: 30% chance of interruption during your 10-hour job
  • If you bid $3.00/hour: 5% chance of interruption
  • If you bid $4.00/hour: 1% chance of interruption

Each interruption:

  • Wastes 20 minutes of computation (cost = current spot price × 0.33 hours)
  • Requires time to restart and resume (assume this is negligible due to checkpointing)

Calculate expected costs:

# Expected cost calculation
function expected_cost(bid_price, avg_spot_price, interruption_prob; hours=10)
    """
    Calculate expected cost including interruption costs

    Parameters:
    -----------
    bid_price : Float64
        Your maximum bid
    avg_spot_price : Float64
        Expected spot price given your bid
    interruption_prob : Float64
        Probability of interruption during job
    hours : Float64
        Hours needed to complete job
    """
    # TODO: Calculate base cost (hours × avg spot price)
    # TODO: Calculate expected interruption cost
    # TODO: Return total expected cost
    return 0.0  # placeholder
end

# Scenario 1: Bid $2.00
# Scenario 2: Bid $3.00
# Scenario 3: Bid $4.00
# Scenario 4: On-demand at $3.06

# TODO: Compare expected costs

Exercise 3.4: Strategic Implications

Discussion Questions:

  1. Truth-telling: Why is bidding your true value ($4.00/hour) the optimal strategy in AWS spot instances?

  2. Price Discovery: How does the spot price auction help AWS discover the true market value of computing resources?

  3. Market Efficiency: How does this auction mechanism help AWS:

    • Maximize revenue from unused capacity?
    • Allocate resources to users who value them most?
    • Provide price signals about demand patterns?
  4. Comparison to First-Price: How would your bidding strategy change if AWS used a first-price auction (you pay your bid)?

  5. Real-world Complexity: In practice, AWS spot prices are determined by supply/demand, not just auctions. How does this affect:

    • Predictability of costs?
    • Strategic behavior by users?
    • Fairness of resource allocation?

Synthesis: Comparing the Three Applications

Exercise 4.1: Common Themes

Reflect on the three scenarios you’ve analyzed:

  1. Cybersecurity Game:

    • Zero-sum game
    • Mixed strategy equilibrium
    • Randomization as a strategic tool
  2. Digital Advertising:

    • Second-price auction
    • Truth-telling as dominant strategy
    • Multiple bidders with different valuations
  3. AWS Spot Instances:

    • Dynamic pricing auction
    • Risk/reward tradeoffs
    • Resource allocation mechanism

Questions:

  1. In which scenarios is “truth-telling” (revealing your true value) optimal? Why?
  2. In which scenarios is randomization important? Why?
  3. How do auction mechanisms help solve information asymmetry problems?
  4. What role does “commitment” play in each scenario?
  5. How might smart contracts or blockchain change any of these scenarios?

Exercise 4.2: Design Your Own Auction

Choose one of the following scenarios and design an auction mechanism:

Option 1: Data Marketplace

  • Multiple data providers want to sell datasets
  • Multiple companies want to buy data for ML training
  • Data has different quality levels
  • Design: What auction type? How do you handle quality? How do you price?

Option 2: Freelance Data Science Platform

  • Companies post projects with unknown complexity
  • Freelancers bid on projects with their rates
  • Quality of work varies by freelancer
  • Design: Sealed bid? Reverse auction? How to handle quality signals?

Option 3: ML Model Marketplace

  • Data scientists train models and want to sell predictions
  • Companies want to buy access to best models
  • Model quality is uncertain until deployed
  • Design: How to auction model access? How to price queries?

For your chosen scenario:

  1. Define the players and their valuations
  2. Choose an auction type (first-price, second-price, etc.)
  3. Explain why this auction type is appropriate
  4. Describe the equilibrium bidding strategy
  5. Discuss potential issues or limitations

Conclusion

In this lab, you’ve applied game theory concepts to real-world data science scenarios:

  • Mixed strategies help solve security problems where randomization is crucial
  • Second-price auctions simplify strategic decisions in advertising and cloud computing
  • Equilibrium analysis helps predict behavior in competitive environments
  • Auction design affects market efficiency and resource allocation

These concepts are fundamental to understanding:

  • How tech platforms operate (Google, AWS, Meta)
  • How to design data marketplaces and APIs
  • How to think strategically about security and competition
  • How incentives shape behavior in complex systems

As you continue in data science, you’ll encounter these game-theoretic situations frequently in:

  • Pricing and monetization strategies
  • Platform design and marketplace mechanisms
  • Security and adversarial ML
  • Resource allocation in cloud/distributed systems
  • Algorithmic decision-making with strategic agents