Computational Analysis of Social Complexity
Fall 2025, Spencer Lyon
Prerequisites
- Agent-Based Models (Weeks 6-7)
- Basic Julia programming
- Schelling segregation model
Outcomes
- Distinguish between rule-based agents (traditional ABMs) and learning agents
- Understand how LLMs function as general-purpose reasoning engines
- Implement basic LLM API calls in Julia
- Compare emergent behaviors of AI agents vs programmed agents
References
- Attention Is All You Need - The transformer paper
- Language Models are Few-Shot Learners - GPT-3 paper
- Anthropic Claude Documentation
- OpenAI API Documentation
- AI Agents Survey Paper
A Tale of Two Agents¶
- We’ve spent two weeks studying agent-based models
- Remember the Schelling segregation model?
- Agents with simple rules: “move if less than N neighbors are like me”
- Environment: 25x25 grid
- Result: stark segregation patterns emerge from mild preferences
- Those were rule-based agents - they follow explicit, programmed instructions
Now let me show you something different.
Suppose I asked you to write an agent that could:
- Analyze a news article and summarize it
- Answer questions about complex topics it’s never seen before
- Generate creative stories in the style of different authors
- Write code to solve novel problems
- Engage in strategic reasoning about social situations
How would you program those rules?
You probably can’t - at least not easily. The rule set would be impossibly complex.
This is where learning agents come in.
The Fundamental Shift¶
Over the past few years, we’ve witnessed a fundamental shift in how we build intelligent systems:
Rule-Based Approach (Traditional ABMs):
- Developer writes explicit rules
- Agent follows those rules deterministically (or with simple randomness)
- Behavior is predictable from the code
- Example:
if unhappy then move to random location
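In Julia, that rule is essentially a one-liner (the 0.3 threshold is illustrative):

```julia
# Rule-based Schelling decision: move iff too few neighbors share my type.
should_move(frac_same; threshold=0.3) = frac_same < threshold

should_move(0.25)   # true  -> unhappy, move
should_move(0.50)   # false -> happy, stay
```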
Learning Approach (Modern AI):
- Developer provides training data and objectives
- System learns patterns and relationships
- Behavior emerges from learned representations
- Example: “Here are 10 trillion words from the internet - learn to predict what comes next”
The second approach has given us Large Language Models (LLMs) - and they’re remarkably capable.
Connecting to What We Know¶
Before we dive into AI agents, let’s think about what made our ABMs interesting:
In the Schelling Model:
- Simple agent logic (check neighbors, decide to move)
- Local information only (can’t see whole grid)
- Emergent behavior (segregation without a segregationist)
- Aggregate outcomes differ from individual intentions
In the Money Model:
- Even simpler logic (give $1 to random agent)
- No intelligence at all, just random exchanges
- Power law wealth distribution emerges
- Inequality arises from equality plus randomness
These models taught us that complexity can emerge from simplicity.
Now consider this: what if we could build agents that aren’t simple?
What if individual agents could:
- Reason about their situation
- Learn from experience
- Communicate in natural language
- Adapt their strategies
- Form and revise beliefs
What kinds of collective behaviors might emerge then?
This is the frontier we’re exploring in this module: Agentic AI Systems.
Understanding Large Language Models¶
What is an LLM?¶
A Large Language Model is a neural network trained to predict the next word (or “token”) in a sequence.
That’s it. Really.
But this simple objective, when applied at massive scale, produces something remarkable: a general-purpose reasoning engine.
The Training Process:
- Collect enormous amounts of text (books, websites, code, papers, conversations)
- For each sequence, hide the next word and ask the model to predict it
- Adjust the model’s parameters (billions of them) to improve predictions
- Repeat trillions of times
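To see the inference-time loop in miniature, here is a toy Julia sketch in which a hand-built lookup table stands in for the model. Real LLMs output a probability distribution over a vocabulary of ~100K tokens; this toy hard-codes the single most likely successor:

```julia
# Toy "language model": each token maps to its most likely successor.
next_word = Dict("the" => "cat", "cat" => "sat", "sat" => "on", "on" => "the")

# Autoregressive generation: predict the next token, append it, repeat.
function generate(start; n_new=5)
    tokens = [start]
    for _ in 1:n_new
        push!(tokens, get(next_word, tokens[end], "<end>"))
    end
    return join(tokens, " ")
end

generate("the")   # "the cat sat on the cat"
```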
What the Model Learns:
- Grammar and syntax (obvious)
- Facts about the world (it saw them in training data)
- Reasoning patterns (it saw examples of reasoning)
- Causal relationships (they’re implicit in language)
- Social norms and conventions (they’re encoded in how people write)
- Task structures (it saw many examples of questions and answers)
From Next-Word Prediction to Intelligence¶
You might think: “How does predicting the next word lead to intelligence?”
Consider this sequence:
“The doctor told the patient that she would need to come back next week. The patient thanked...”
To predict the next word, the model needs to:
- Track that “she” refers to “the patient”
- Understand doctor-patient relationships
- Know social conventions about gratitude
- Predict “her” (the doctor) is the likely continuation
Or this one:
“If all mammals have hearts, and whales are mammals, then whales...”
To complete this, the model must:
- Recognize a logical argument structure
- Apply deductive reasoning
- Predict “have hearts” or similar
The key insight: To predict text well, you need to model the world.
Text is the trace of human thought, and to predict it accurately, you need to simulate the thinking that produced it.
Transformer Architecture (High-Level Intuition)¶
We won’t dive deep into the math, but here’s the essential idea:
The Challenge: When predicting the next word, which previous words matter?
In “The doctor told the patient that she would need to come back”, the word “doctor” is relevant for predicting what comes after “the patient thanked”.
The Solution: Attention Mechanism
The transformer architecture uses “attention” to let each word look at all previous words and decide which ones are relevant:
"The patient thanked" <-- looks back at entire sequence
<-- pays attention to "doctor" (high weight)
<-- pays less attention to "told" (medium weight)
<-- ignores "the" (low weight)This attention mechanism is:
- Learned, not programmed (the model figures out what to attend to)
- Parallel (can process all words simultaneously)
- Multi-headed (multiple attention patterns can operate in parallel)
- Stackable (many layers of attention build complex representations)
Network Analogy:
If you squint, attention looks like a weighted graph where:
- Nodes are words/tokens
- Edges are attention weights
- Information flows along high-weight edges
The model learns to construct this graph dynamically based on the input!
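For the mathematically curious, the core computation is compact. Below is a toy Julia sketch of scaled dot-product attention for a single head, with no learned projection matrices and no causal mask (real decoder-style transformers add both):

```julia
using LinearAlgebra

# Toy single-head attention: rows of Q, K, V are token vectors.
# W is the dynamically constructed "weighted graph" described above.
function attention(Q, K, V)
    d = size(K, 2)                    # embedding dimension
    scores = (Q * K') ./ sqrt(d)      # pairwise relevance of tokens
    W = exp.(scores)
    W ./= sum(W, dims=2)              # row-wise softmax -> attention weights
    return W * V                      # each output row: weighted mix of values
end

X = randn(4, 3)            # 4 tokens, 3-dimensional toy embeddings
out = attention(X, X, X)   # self-attention over the toy sequence
```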
Scale is All You Need (Almost)¶
A remarkable empirical finding: model capabilities scale predictably with:
- Model size (number of parameters)
- Data size (amount of training text)
- Compute (training time × hardware)
Scaling Laws show that training loss falls predictably, as a power law in compute, so each doubling of compute buys a roughly constant improvement.
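Schematically, the compute scaling law takes a power-law form:

$$
L(C) \approx \left(\frac{C_0}{C}\right)^{\alpha}
$$

where $L$ is training loss, $C$ is training compute, and $\alpha$ is a small positive exponent (on the order of 0.05 in the original scaling-law fits), so each doubling of $C$ removes a roughly constant fraction of the remaining loss.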
Modern LLMs (as of 2025):
- GPT-5: ~2+ trillion parameters (estimated)
- Claude Sonnet 4.5: ~500 billion parameters (estimated)
- Claude Haiku 4.5: ~100 billion parameters (estimated)
- Llama 3: 70 billion parameters (open source)
For comparison:
- Human brain: ~86 billion neurons, ~100 trillion synapses
- Each parameter is roughly analogous to a synaptic weight
By parameter count, frontier models now exceed the brain's neuron count, though they remain one to two orders of magnitude short of its ~100 trillion synapses.
Prompting as Programming¶
A New Interface for Computation¶
With traditional programming:
- You write explicit instructions in a formal language (Julia, Python, etc.)
- The computer executes them precisely
- You get exactly what you specified (bugs included)
With LLMs:
- You write instructions in natural language
- The model interprets your intent
- You get (hopefully) what you meant
This is prompting - the art and science of instructing LLMs.
Anatomy of a Prompt¶
A good prompt typically includes:
Role/Context: Who should the model be?
- “You are an expert economist...”
- “You are a helpful teaching assistant...”
Task: What should it do?
- “Analyze the following dataset...”
- “Summarize this paper in 3 bullet points...”
Constraints: How should it do it?
- “Use only information from the provided text”
- “Respond in JSON format”
- “Keep your answer under 100 words”
Examples (optional): Show, don’t just tell
- Few-shot learning: provide input-output examples
- The model learns the pattern and applies it
Output Format: Structure the response
- “Provide your answer as a numbered list”
- “Format your response as: Analysis: ... Conclusion: ...”
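Since a prompt is ultimately just a string, these components compose naturally in code. A small illustrative sketch (the specific role, task, and constraints are made up for demonstration):

```julia
# Assemble the prompt components described above into one string.
role        = "You are an expert economist."
task        = "Summarize the Schelling segregation model."
constraints = "Use only plain language. Keep your answer under 100 words."
format_spec = "Provide your answer as a numbered list."

prompt = join([role, task, constraints, format_spec], "\n\n")
println(prompt)
```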
In-Context Learning¶
One of the most striking capabilities of LLMs is in-context learning:
You can teach the model a new task just by showing it examples in the prompt - no retraining needed!
Example:
Translate English to Pig Latin:
English: hello
Pig Latin: ellohay
English: world
Pig Latin: orldway
English: agent
Pig Latin:

The model will likely output: agentyay or agentay
It learned the pattern from just two examples!
This is impossible with traditional rule-based agents.
Hands-On: Your First AI Agent¶
Setting Up API Access¶
To use LLMs, we need to call their APIs. The two major providers are:
Anthropic - We’ll use their Claude models:
- Claude Haiku 4.5 (default) - Fast, cost-effective, great for real-time tasks
- Claude Sonnet 4.5 - Most intelligent, best for complex reasoning and coding
- Claude Opus 4 - Premium model with extended context (200K tokens)
OpenAI - We’ll demonstrate their GPT models using the Responses API:
- GPT-5 - Flagship model with advanced reasoning
- GPT-5-mini - Cost-effective, 5x cheaper than GPT-5
- o3 and o3-mini - Specialized reasoning models
Note: We use OpenAI’s modern Responses API (/v1/responses), which provides a cleaner interface with separate instructions and input parameters. This is recommended for new projects. See the Responses API Quickstart for more details.
Both require API keys. Send me a message on webcourses and I’ll Venmo you $10 to get started.
- OpenAI: https://platform.openai.com/signup
- Anthropic: https://console.anthropic.com/
Important: Never commit API keys to git! Use environment variables.
We’ll use the HTTP.jl package to make API calls (it’s already in our Project.toml).
# Load packages
using HTTP
using JSON3
using DataFrames
using DotEnv # For loading .env files
# Load environment variables from .env file
# This will read the .env file in your project directory
DotEnv.load!()

Getting API Keys¶
Cross-Platform Approach: Using .env Files
The best practice is to store your API keys in a .env file that you never commit to git.
- Create a .env file in your project root (same directory as your notebook):

# .env file
ANTHROPIC_API_KEY=your-anthropic-key-here
OPENAI_API_KEY=your-openai-key-here

- Add .env to your .gitignore to keep keys secure:

# Add this line to .gitignore
.env

- Install DotEnv.jl to load environment variables from the .env file:

using Pkg
Pkg.add("DotEnv")

This approach works on Windows, macOS, and Linux without needing to set system environment variables!
Alternative: System Environment Variables (if you prefer)
You can also set environment variables at the system level:
macOS/Linux: Add to ~/.bashrc or ~/.zshrc:

export ANTHROPIC_API_KEY="your-key-here"
export OPENAI_API_KEY="your-key-here"

Windows (PowerShell):

$env:ANTHROPIC_API_KEY="your-key-here"
$env:OPENAI_API_KEY="your-key-here"

Windows (Command Prompt):

set ANTHROPIC_API_KEY=your-key-here
set OPENAI_API_KEY=your-key-here
For this lecture, we’ll use the .env file approach since it’s cross-platform and project-specific.
# Get API key from environment variables
# After running DotEnv.load!() above, your .env file is loaded into ENV
# Check if the key exists and provide a helpful message if not
if haskey(ENV, "ANTHROPIC_API_KEY")
ANTHROPIC_API_KEY = ENV["ANTHROPIC_API_KEY"]
println("✓ Anthropic API key loaded successfully")
else
@warn "No ANTHROPIC_API_KEY found! Please create a .env file with your key."
ANTHROPIC_API_KEY = "your-key-here" # Fallback for demo purposes
end
# Similarly for OpenAI (optional)
if haskey(ENV, "OPENAI_API_KEY")
OPENAI_API_KEY = ENV["OPENAI_API_KEY"]
println("✓ OpenAI API key loaded successfully")
else
println("ℹ No OPENAI_API_KEY found (optional)")
end

✓ Anthropic API key loaded successfully
✓ OpenAI API key loaded successfully
Making Our First API Call¶
Let’s create a simple function to call Claude’s API:
Check the official documentation for the exact current model names:
"""
call_claude(prompt; model="claude-haiku-4-5", max_tokens=1024)
Call the Anthropic Claude API with a prompt.
# Arguments
- `prompt`: The text prompt to send to Claude
- `model`: Which Claude model to use. Options:
- "claude-haiku-4-5" (default) - Fast and cost-effective
- "claude-sonnet-4-5" - Most intelligent, best for complex tasks
- "claude-opus-4" - Premium model with 200K context window
- `max_tokens`: Maximum length of response (default: 1024)
# Returns
- String containing Claude's response
# Note
Check https://docs.anthropic.com/en/docs/models-overview for exact model identifiers
as they may include version dates (e.g., "claude-haiku-4-5-20250101")
"""
function call_claude(prompt; model="claude-haiku-4-5", max_tokens=1024)
url = "https://api.anthropic.com/v1/messages"
headers = [
"x-api-key" => ANTHROPIC_API_KEY,
"anthropic-version" => "2023-06-01",
"content-type" => "application/json"
]
body = JSON3.write(Dict(
"model" => model,
"max_tokens" => max_tokens,
"messages" => [
Dict("role" => "user", "content" => prompt)
]
))
response = HTTP.post(url, headers, body)
result = JSON3.read(String(response.body))
return result.content[1].text
end

call_claude

Testing Our AI Agent¶
Let’s start simple and ask Claude to explain something:
# Simple test
response = call_claude("""
Explain the Schelling segregation model in exactly 3 sentences.
Make it accessible to someone who hasn't studied it before.
Respond with markdown.
""")
println(response)

# Schelling Segregation Model
The Schelling segregation model is a simple computer simulation where people of different groups (like different races or ethnicities) are randomly placed on a grid and given the freedom to move. Each person has a preference for living near others like themselves—for example, they're content only if at least a certain percentage of their neighbors are the same group. Even though each person's preference for same-group neighbors is relatively mild, the model shows that the entire neighborhood becomes highly segregated over time, with groups clustering into separate areas.
Comparing Rule-Based and AI Agents¶
Now let’s do something interesting - ask Claude to role-play as both types of agents:
prompt = """
You are analyzing the difference between rule-based and learning agents.
Scenario: An agent needs to decide whether to move to a new location in a neighborhood.
Respond in two parts:
1. RULE-BASED AGENT: Show how a simple rule-based agent (like in Schelling model)
would make this decision. Use pseudocode.
2. LEARNING AGENT: Describe how an AI agent with language capabilities might reason
about the same decision, considering multiple factors and uncertainty.
Keep your entire response under 200 words.
"""
comparison = call_claude(prompt, max_tokens=500)
println(comparison)

# Rule-Based vs Learning Agents: Moving Decision
## 1. RULE-BASED AGENT (Schelling Model)
```
IF similarity_to_neighbors < threshold THEN
move_to_random_new_location()
ELSE
stay()
END IF
// Example: Move if <30% neighbors share my type
similarity = count_same_type_neighbors / total_neighbors
threshold = 0.30
```
**Simple, deterministic, fast.** No learning occurs.
---
## 2. LEARNING AGENT
A language-capable AI would:
- **Gather data**: Analyze neighborhood demographics, school ratings, crime trends, housing prices, community reviews
- **Reason with uncertainty**: "Safe neighborhood *probably* means lower crime, but data is incomplete"
- **Consider trade-offs**: "Proximity to work vs. affordable housing vs. community diversity"
- **Update beliefs**: If similar agents report positive/negative experiences, adjust estimates
- **Explain decisions**: "I recommend staying because job accessibility outweighs current neighborhood diversity concerns"
**Key difference**: Rule-based agents apply fixed rules mechanically. Learning agents use evidence to reason probabilistically, handle nuance, and improve predictions over time through experience or feedback.
Structured Output: Using AI Agents for Data Analysis¶
One powerful pattern: use LLMs to generate structured data that we can analyze computationally.
Let’s ask Claude to generate synthetic agent data:
prompt = """
Generate data for 5 agents in a social network simulation.
For each agent, provide:
- id (integer 1-5)
- personality (one word: cooperative, competitive, or neutral)
- risk_tolerance (float 0.0 to 1.0)
- initial_wealth (integer 50-150)
Format as JSON array of objects. IMPORTANT: Respond ONLY with valid JSON, no other text or formatting (including "```json").
"""
json_response = call_claude(prompt, max_tokens=500)
println(json_response)

[
{
"id": 1,
"personality": "cooperative",
"risk_tolerance": 0.3,
"initial_wealth": 85
},
{
"id": 2,
"personality": "competitive",
"risk_tolerance": 0.8,
"initial_wealth": 120
},
{
"id": 3,
"personality": "neutral",
"risk_tolerance": 0.5,
"initial_wealth": 95
},
{
"id": 4,
"personality": "cooperative",
"risk_tolerance": 0.4,
"initial_wealth": 72
},
{
"id": 5,
"personality": "competitive",
"risk_tolerance": 0.7,
"initial_wealth": 135
}
]
# Parse the JSON response
agents_data = JSON3.read(json_response)
# Convert to DataFrame for analysis
df_agents = DataFrame(agents_data)
df_agents

Now we have structured data generated by an AI that we can analyze with traditional methods!
This pattern - LLM generation + computational analysis - is extremely powerful.
We can combine the flexibility of natural language with the rigor of computation.
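For example, once the response is parsed, we can programmatically check that the generated agents respect every constraint we stated in the prompt (a small sketch reusing df_agents from above):

```julia
# Validate the LLM-generated data against the rules from our prompt.
@assert sort(df_agents.id) == 1:5
@assert all(in(["cooperative", "competitive", "neutral"]), df_agents.personality)
@assert all(0.0 .<= df_agents.risk_tolerance .<= 1.0)
@assert all(50 .<= df_agents.initial_wealth .<= 150)
println("✓ All generated agents satisfy the prompt's constraints")
```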
Alternative: Using OpenAI’s Responses API¶
For completeness, here’s how to call OpenAI’s GPT models using their modern Responses API.
Note: OpenAI introduced the Responses API (/v1/responses) as the recommended interface for new projects. It provides a cleaner structure than the older Chat Completions API, with separate instructions for system-level guidance and input for user queries.
"""
call_gpt(input; instructions="", model="gpt-5-mini", reasoning_effort="low")
Call the OpenAI Responses API with a user input and optional instructions.
# Arguments
- `input`: The user input/query to send to the model
- `instructions`: Optional system-level instructions for the model's behavior
- `model`: Which OpenAI model to use. Options:
- "gpt-5-mini" (default) - Cost-effective, 5x cheaper than GPT-5
- "gpt-5" - Flagship model with advanced reasoning
- "o3-mini" - Specialized reasoning model (cost-effective)
- "o3" - Advanced reasoning model
- `reasoning_effort`: Reasoning effort level for faster responses. Options:
- "low" (default) - Faster responses
- "medium" - Balanced
- "high" - Maximum reasoning quality
# Returns
- String containing the model's response (using the convenience `output_text` field)
# Notes
- Uses the Responses API endpoint: /v1/responses
- The response has an `output` array, but we use the convenience `output_text` property
- The `output` array can contain multiple items (not just text), so always check structure
- For system instructions, use the `instructions` parameter (replaces the "system" role)
- Check https://platform.openai.com/docs/models for the latest model names
- API documentation: https://platform.openai.com/docs/api-reference/responses
# Example
```julia
# Simple query
response = call_gpt("What is emergence?")
# With instructions
response = call_gpt(
"Explain the Schelling model",
instructions="You are an expert in agent-based modeling"
)
# Different model and reasoning effort
response = call_gpt(
"Solve this complex problem: ...",
model="gpt-5",
reasoning_effort="high"
)
```
"""
function call_gpt(input::String; instructions::String="", model::String="gpt-5-mini", reasoning_effort::String="low")
url = "https://api.openai.com/v1/responses"
# OPENAI_API_KEY should be loaded from ENV via DotEnv
headers = [
"Authorization" => "Bearer $(ENV["OPENAI_API_KEY"])",
"Content-Type" => "application/json"
]
# Build request body with Responses API format
request_body = Dict(
"model" => model,
"input" => input,
"reasoning" => Dict("effort" => reasoning_effort)
)
# Add instructions if provided
if !isempty(instructions)
request_body["instructions"] = instructions
end
body = JSON3.write(request_body)
response = HTTP.post(url, headers, body)
result = JSON3.read(String(response.body))
# NOTE: result.output is an array that can have multiple items. We are assuming a simple case here...
return result.output[end].content[1].text
end
# Usage examples:
# response = call_gpt("Explain transformers in one sentence") # Uses gpt-5-mini by default
# response = call_gpt("Complex reasoning task", model="gpt-5", reasoning_effort="high") # Use full GPT-5 for harder tasks
# response = call_gpt("Math problem", model="o3-mini") # Use reasoning model
println(call_gpt("What is game theory?", instructions="You are a helpful economics professor")) # With system instructionsGame theory is the study of strategic interaction. It analyzes situations where the outcome for each participant depends not only on their own actions but also on the actions of others.
Key ideas:
- Players: the decision-makers in the situation.
- Strategies: the possible actions a player can take.
- Payoffs: the outcomes (often represented as utilities or profits) that players receive from a combination of actions.
- Information: what players know when they choose their actions (perfect vs. imperfect information).
- Rationality: the assumption that players aim to maximize their own payoffs.
- Equilibrium: a stable outcome where no player wants to change their strategy given what others are doing (Nash equilibrium is the most common concept).
Common types of games:
- Normal-form (static) games: players choose simultaneously; payoffs depend on the combination of chosen strategies.
- Extensive-form (dynamic) games: players move sequentially; the game is represented as a tree.
- Cooperative vs. noncooperative: whether players can form binding agreements.
- Zero-sum vs. general-sum: in zero-sum, one player’s gain is another’s loss; in general-sum, all can benefit or all can lose.
- Complete vs. incomplete information: whether players know all relevant circumstances.
Famous examples:
- Prisoner's Dilemma: shows how rational actors might choose worse outcomes due to self-interest, unless they can commit to cooperation.
- Chicken: highlights how brinkmanship can lead to disaster if both sides refuse to back down.
- Matching Pennies: a simple mixed-strategy example illustrating the idea of unpredictability to prevent being exploited.
Applications:
- Economics and business (pricing, auctions, bargaining)
- Politics and law (coalitions, treaties, auctions)
- Biology (evolutionary stable strategies)
- Computer science (algorithm design, AI decision-making)
Limitations:
- Realistically, people may not be perfectly rational or have complete information.
- Models rely on assumptions that may oversimplify complex human behavior.
- Predictive power depends on how well the game is specified and the payoff structure.
If you’d like, I can walk through a concrete example (like the Prisoner’s Dilemma) step by step or tailor a simple game to a real-world scenario you care about.
Emergent Capabilities of AI Agents¶
From Scale to Surprise¶
Remember how emergence worked in our ABMs?
- Simple local rules
- Complex global patterns
- Outcomes not explicitly programmed
LLMs exhibit a similar phenomenon, but at the model level:
Training Objective: Predict the next word
Emergent Capabilities (things we didn’t explicitly train for):
- Multi-step reasoning
- Code generation and debugging
- Translation between languages
- Mathematical problem solving
- Theory of mind (reasoning about others’ beliefs)
- Tool use
- Planning and strategy
These capabilities often appear suddenly at certain scale thresholds - we call them emergent abilities.
Chain-of-Thought Reasoning¶
One of the most important emergent capabilities is chain-of-thought (CoT) reasoning.
If you ask an LLM to “think step by step”, it performs dramatically better on complex tasks.
Let’s demonstrate:
# Without chain-of-thought
simple_prompt = """
In a network, node A is connected to B, B to C, C to D, and D to A.
Is there a path from A to C that doesn't go through B?
Answer with just YES or NO.
"""
println("Without CoT:")
println(call_claude(simple_prompt, max_tokens=50))

Without CoT:
YES
# With chain-of-thought
cot_prompt = """
In a network, node A is connected to B, B to C, C to D, and D to A.
Is there a path from A to C that doesn't go through B?
Let's think step by step:
1. List all direct connections
2. Find all possible paths from A to C
3. Identify which paths avoid B
4. Give your final answer
"""
println("With CoT:")
println(call_claude(cot_prompt, max_tokens=300))

With CoT:
# Finding a Path from A to C that Avoids B
## 1. List all direct connections
- A ↔ B
- B ↔ C
- C ↔ D
- D ↔ A
This forms a cycle: A — B — C — D — A
## 2. Find all possible paths from A to C
Starting from A and moving to C:
- **Path 1:** A → B → C
- **Path 2:** A → D → C
## 3. Identify which paths avoid B
- Path 1 (A → B → C): Goes through B ✗
- Path 2 (A → D → C): Does NOT go through B ✓
## Final Answer
**Yes**, there is a path from A to C that doesn't go through B: **A → D → C**
The second version should be more reliable because we asked the model to show its reasoning.
Why does this work?
The model generates reasoning steps as text, and each step becomes part of the context for the next step.
It’s essentially using its own output as a working memory!
This is analogous to how humans often solve problems by writing things down.
Tool Use: Extending Agent Capabilities¶
Modern LLMs can learn to use external tools:
- Calculators (for precise math)
- Search engines (for current information)
- Code interpreters (for computation)
- Database queries (for data access)
- APIs (for external services)
This is called function calling or tool use.
The pattern:
- Tell the model what tools are available (in the prompt)
- Model decides which tool to use
- Model outputs a structured request (e.g., JSON)
- Your code executes the tool
- Return results to the model
- Model incorporates results and continues
Here’s a simple example:
# Define a "tool" the agent can use
function get_network_density(n_nodes, n_edges)
max_edges = n_nodes * (n_nodes - 1) / 2
return n_edges / max_edges
end
# Ask the agent to use it
tool_prompt = """
You have access to a function: get_network_density(n_nodes, n_edges)
that calculates the density of an undirected network.
A researcher has a social network with 50 people and 245 connections.
They want to know if this network is dense (density > 0.3) or sparse.
First, tell me what function call you would make (in Julia syntax).
Then, I'll tell you the result, and you can interpret it.
"""
response = call_claude(tool_prompt, max_tokens=200)
println(response)

# Function Call
To find the network density for a social network with 50 people and 245 connections, I would make this function call in Julia:
```julia
get_network_density(50, 245)
```
This will calculate the density of the undirected network with:
- **n_nodes** = 50 (people)
- **n_edges** = 245 (connections)
Once you provide the result, I can tell you whether this network is dense (density > 0.3) or sparse, and interpret what that means for the social network structure.
The model should have identified the correct function call. Let’s execute it:
# Execute the tool
density = get_network_density(50, 245)
println("Network density: ", round(density, digits=3))
# Return result to agent
followup_prompt = """
The function returned: $(round(density, digits=3))
Now interpret this result: is the network dense or sparse?
Explain in one sentence.
"""
interpretation = call_claude(followup_prompt, max_tokens=100)
println("\nAgent's interpretation:")
println(interpretation)

Network density: 0.2
Agent's interpretation:
# Network Density Interpretation
With a density of 0.2, the network is **sparse** because only 20% of all possible connections are present, meaning the majority of potential links between nodes do not exist.
This simple pattern - prompt → tool identification → execution → interpretation - is the foundation of agentic AI.
We’ll explore this much more deeply in later lectures.
Limitations and Considerations¶
What LLMs Can’t Do (Well)¶
Despite their impressive capabilities, LLMs have important limitations:
1. Precise Arithmetic
- They approximate calculations, don’t compute them
- Solution: Use tools (code execution, calculators)
2. Current Information
- Training data has a cutoff date
- They don’t know recent events
- Solution: Retrieval Augmented Generation (RAG) - next lecture!
3. Consistency Across Calls
- Same prompt can give different answers (they’re stochastic)
- Solution: Use temperature=0 for near-deterministic outputs, or run multiple times and aggregate (see the sketch after this list)
4. Following Complex Constraints
- May violate constraints you specify in prompts
- Solution: Validate outputs programmatically
5. Long-Term Memory
- Context window is finite (though growing - now 200K+ tokens)
- Can’t remember everything from earlier in long conversations
- Solution: External memory systems
6. True Understanding
- Philosophical debate: do they “understand” or just pattern-match?
- Practically: they can fail in surprising ways
- Solution: Don’t anthropomorphize, verify critical outputs
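On point 3 above: the Anthropic Messages API accepts an optional temperature parameter (0 to 1). Here is a sketch of the one-line change to the request body we build in call_claude; note the API still does not guarantee bit-identical outputs:

```julia
# Request body as in call_claude, with temperature pinned at 0 so that
# sampling is (nearly) greedy and repeated calls agree far more often.
body = JSON3.write(Dict(
    "model" => "claude-haiku-4-5",
    "max_tokens" => 256,
    "temperature" => 0,
    "messages" => [Dict("role" => "user", "content" => "Define emergence in one sentence.")]
))
```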
Comparing Agent Paradigms¶
Let’s synthesize what we’ve learned:
| Aspect | Rule-Based Agents | Learning Agents (LLMs) |
|---|---|---|
| How they work | Execute programmed rules | Generate text predictions |
| Behavior source | Explicit code | Learned patterns |
| Flexibility | Fixed to programmed scenarios | Adapt to novel situations |
| Predictability | Fully deterministic | Stochastic, sometimes surprising |
| Speed | Very fast (microseconds) | Slower (seconds) |
| Cost | Minimal (CPU only) | Expensive (GPU inference) |
| Accuracy | Perfect (within rules) | Approximate, needs validation |
| Development | Code each behavior | Prompt desired behavior |
| Debugging | Read the code | Prompt engineering trial-and-error |
| Best for | Well-defined, repetitive tasks | Open-ended, language-heavy tasks |
The key insight: These aren’t competing approaches - they’re complementary!
The most powerful systems combine both:
- LLMs for reasoning, planning, and natural language
- Traditional code for precise computation, data storage, and execution
This hybrid approach is what we’ll explore in the rest of this module.
Connecting to Complex Systems¶
Emergence at Multiple Scales¶
Throughout this course, we’ve studied emergence:
Networks:
- Local: Individual connections
- Global: Small-world structure, clustering
Game Theory:
- Local: Individual strategy choices
- Global: Nash equilibria, social welfare
ABMs:
- Local: Simple agent rules
- Global: Segregation, wealth inequality
LLMs Add a New Dimension:
- Training: Next-word prediction objective
- Emergent: Reasoning, tool use, theory of mind
But here’s what’s really interesting for our course:
What happens when we build multi-agent systems with AI agents?
If simple agents (Schelling model) produce complex outcomes...
What do complex agents produce?
Preview: Agentic AI Systems¶
In the coming lectures, we’ll explore:
Lecture A1.02: RAG and Knowledge Augmentation
- How to give agents access to external knowledge
- Building agents that can retrieve and reason over documents
- Connection to information networks
Lecture A1.03: Multi-Agent Conversations
- AI agents communicating with each other
- Emergent coordination and conflict
- Connection to game theory and strategic interaction
Why This Matters¶
AI agents aren’t just a cool technology - they’re becoming a fundamental part of social complexity:
Economic Systems:
- AI trading agents in financial markets
- Algorithmic pricing and competition
- Automated negotiation
Social Systems:
- AI content creators and curators
- Chatbots as social actors
- Synthetic data and simulation
Scientific Systems:
- AI research assistants
- Automated hypothesis generation
- Tool-augmented discovery
Understanding how these systems work, and how to build them, is increasingly essential for computational social science.
Exercises¶
Exercise 1: Prompt Engineering Fundamentals¶
Part A: Compare prompting styles
Write three different prompts that ask an LLM to explain the Schelling model:
- Minimal prompt (one sentence)
- Structured prompt (with role, task, and constraints)
- Few-shot prompt (with examples of explanations)
Call the API with each prompt and compare the responses. Which works best? Why?
Part B: Constraint following
Design prompts to get an LLM to:
- Respond in exactly 50 words (±5 words)
- Respond only with valid JSON
- Respond as a numbered list with exactly 3 items
Test each prompt. Do the constraints hold? What happens when they fail? Do different models have more consistent results?
# TODO: Your code here
Exercise 2: Agent Comparison - Schelling Reimagined¶
Scenario: We want to model neighborhood preferences, but with more nuanced agents.
Part A: Traditional ABM approach
Write pseudocode for a rule-based agent that:
- Has attributes: location, income, family_size, preferences
- Decides whether to move based on:
- Neighborhood similarity
- Housing costs
- School quality
Part B: AI agent approach
Create a prompt that asks an LLM to role-play as a household deciding whether to move.
Give it:
- Current neighborhood statistics
- Alternative neighborhood statistics
- Household characteristics
Ask it to reason about the decision and output a choice + explanation.
Part C: Comparison
Run the AI agent on 5 different scenarios. Analyze:
- Does it consider factors you didn’t explicitly mention?
- Is it consistent across similar scenarios?
- How does its reasoning differ from rule-based logic?
# TODO: Your code here
Exercise 3: Chain-of-Thought for Network Analysis¶
Task: Use an LLM with chain-of-thought reasoning to analyze network structure.
Setup: Define a small network in text:
Nodes: A, B, C, D, E
Edges: A-B, B-C, C-D, D-E, E-A, A-C

Questions to ask the LLM (using CoT prompting):
- What is the diameter of this network?
- Which node has the highest degree centrality?
- Are there any triangles (3-cycles)?
- If we remove edge A-C, does the network remain connected?
Requirements:
- Use “let’s think step by step” prompting
- Verify the LLM’s answers by computing them yourself in Julia
- Document where the LLM is correct vs. incorrect
Extension: Try with a larger network (8-10 nodes). Does accuracy degrade?
# TODO: Your code here
Exercise 4: Structured Data Generation¶
Objective: Use an LLM to generate synthetic agent data for an ABM simulation.
Task: Create a prompt that generates data for 20 agents in a “social media network” simulation.
Each agent needs:
- id (integer 1-20)
- personality (one of: influencer, lurker, engager, contrarian)
- posting_frequency (float 0.0-1.0, posts per day)
- topics (array of 1-3 topics from: politics, sports, tech, entertainment, science)
- follower_count (integer, should correlate somewhat with personality)
Requirements:
- Get the LLM to output valid JSON
- Parse the JSON into a Julia DataFrame
- Validate the data (check types, ranges, correlations)
- Visualize the distribution of personalities and topics
Bonus: Ask the LLM to also generate a “friendship network” as an edge list - who follows whom?
Reflection:
- What advantages does LLM generation have over random generation?
- What disadvantages or risks?
# TODO: Your code here
Exercise 5: Tool Use - Network Statistics Calculator¶
Objective: Build a simple agent that uses Julia functions as tools.
Part A: Define tools
Create Julia functions for:
calculate_clustering_coefficient(edges, node)
find_shortest_path(edges, source, target)
calculate_betweenness_centrality(edges, node)

You can use simple implementations or leverage Graphs.jl.
Part B: Create tool-use prompt
Write a prompt that:
- Describes what each function does (the “tool documentation”)
- Gives the agent a network analysis question
- Asks it to identify which tool(s) to use and with what arguments
Part C: Execute and iterate
Create a loop that:
- Sends prompt to LLM
- Parses LLM’s tool call request
- Executes the tool in Julia
- Returns result to LLM
- Gets final answer from LLM
Test cases:
- “Which node is most central in this network?” (needs betweenness)
- “How clustered is node A?” (needs clustering coefficient)
- “What’s the shortest path from B to E?” (needs shortest path)
Challenge: Can the agent chain multiple tool calls? (“Find the shortest path from A to C, then calculate the clustering coefficient of each node along that path”)
Exercise 6: Package Organization - Moving API Functions to CAP6318¶
Objective: Practice good code organization by moving reusable functions to your course package.
Throughout this course, you’ve been building a CAP6318 package to organize your code. Now it’s time to add AI agent capabilities to it!
Part A: Create an AI module in your package
In your CAP6318 package directory, create a new file src/AI.jl with the following structure:
module AI
using HTTP
using JSON3
# Export the functions so they can be used outside the module
export call_claude, call_gpt
# Include your call_claude and call_gpt functions here
# (copy from the lecture, but consider adding improvements!)
end # module

Part B: Integrate into your package

- Add DotEnv to your CAP6318 package’s Project.toml dependencies
- In your main src/CAP6318.jl file, add:

include("AI.jl")
using .AI

- Re-export the AI functions:

export call_claude, call_gpt
Part C: Test your package
Back in this notebook (or a new one), test that your package works:
using CAP6318
# Test with a simple prompt
response = call_claude("What is computational social science? Answer in 2 sentences.")
println(response)

Part D: Enhancements (Optional)
Consider adding these improvements to your AI module:
- Error Handling: Wrap API calls in try-catch blocks with informative error messages
- Response Caching: Store responses in a dictionary to avoid repeated API calls with the same prompt
- Token Counting: Add a function to estimate token usage before making calls
- Logging: Track all API calls (prompts, responses, models used) to a file
- Rate Limiting: Add a small delay between calls to avoid hitting rate limits
- Multiple Providers: Create a unified call_llm() function that works with both Anthropic and OpenAI
Example Enhanced Function:
function call_claude(prompt; model="claude-haiku-4-5", max_tokens=1024, verbose=false)
if verbose
println("Calling $model with prompt of length $(length(prompt)) characters...")
end
try
# API call code here
response = ""  # ... your API call here (placeholder so the skeleton parses) ...
if verbose
println("✓ Received response of $(length(response)) characters")
end
return response
catch e
@error "Failed to call Claude API" exception=e
rethrow(e)
end
end

Reflection:
- Why is it better to put these functions in a package rather than copying them into each notebook?
- What other AI-related functionality might you add to this module?
- How does this relate to software engineering best practices?
Deliverable: Show that you can successfully import and use call_claude or call_gpt from your CAP6318 package.
# TODO: Your code here for Exercise 6
# Hint: After setting up your package, you should be able to do:
# using CAP6318
# response = call_claude("Hello!")
# println(response)
Reflection Questions¶
Consider these questions as you work through the exercises:
On Emergence: We saw emergent behavior in both ABMs (segregation from simple rules) and LLMs (reasoning from next-word prediction). Are these the same kind of emergence? How do they differ?
On Agency: What does it mean for an LLM to be an “agent”? Does it have goals? Preferences? Beliefs? Or are these anthropomorphizations?
On Predictability: Rule-based ABMs are deterministic but produce unpredictable aggregate outcomes. AI agents are stochastic but often produce predictable responses. What are the implications for simulation and modeling?
On Control: With rule-based agents, we control them by writing code. With AI agents, we control them by writing prompts. Which gives us more control? Which is more brittle?
On Social Complexity: If AI agents become widespread in social and economic systems, how might that change the dynamics? Will they amplify existing patterns or create new ones?
We’ll explore these questions more deeply throughout the module.
Summary¶
In this lecture, we’ve:
✓ Distinguished between rule-based agents (traditional ABMs) and learning agents (LLMs)
✓ Understood how LLMs work at a high level (next-word prediction → general reasoning)
✓ Learned about transformer architecture and attention mechanisms
✓ Implemented basic LLM API calls in Julia (Anthropic and OpenAI)
✓ Explored emergent capabilities: chain-of-thought reasoning and tool use
✓ Compared the strengths and limitations of both agent paradigms
✓ Connected AI agents to broader themes in complex systems
Key Takeaways:
LLMs are trained on next-word prediction but develop general capabilities - an example of emergence
Prompting is programming - but in natural language with probabilistic outcomes
Chain-of-thought and tool use extend what agents can do beyond their training
Rule-based and learning agents are complementary - hybrid systems combine their strengths
Multi-agent AI systems represent a new frontier in computational social science
Next Lecture: We’ll explore how to augment AI agents with external knowledge using Retrieval Augmented Generation (RAG), connecting to information networks and knowledge graphs.