Function Calling and Tool Use: From Talk to Action

University of Central Florida
Valorum Data

Computational Analysis of Social Complexity

Fall 2025, Spencer Lyon

Prerequisites

  • L.A1.01 (LLMs and API calls)
  • L.A1.02 (RAG systems)
  • Graph theory/Network Science (Weeks 3-5)

Outcomes

  • Implement function calling with modern LLM APIs
  • Design JSON schemas for tool definitions
  • Build agents that execute code and analyze computational results
  • Create a network analysis toolkit accessible to AI agents

From Conversation to Computation

The Limitations of Text-Only Agents

In Week A1, we learned how to build AI agents that can:

  • Engage in natural language conversations
  • Retrieve and synthesize information from knowledge bases
  • Coordinate with other agents

But there’s a fundamental limitation: these agents can only talk.

Suppose you ask an LLM:

“I have a social network with 1000 nodes. Can you calculate the average clustering coefficient?”

The LLM might respond:

“I’d be happy to help calculate the clustering coefficient! Please provide the network data in an adjacency matrix or edge list format, and I’ll walk you through the calculation.”

But it can’t actually do the calculation. It’s like hiring a consultant who can only write reports but can’t use a computer.

What We Really Want

Imagine instead:

You: “Calculate the clustering coefficient for this network: [provides graph data]”

Agent:

  1. Parses the network data
  2. Calls calculate_clustering_coefficient(graph)
  3. Gets result: 0.342
  4. Responds: “The average clustering coefficient is 0.342, indicating moderate clustering. This is typical for social networks where friend groups form tightly-knit communities.”

The agent didn’t just describe how to compute the answer - it actually computed it.

This is the power of tool use or function calling: agents that can take actions, not just generate text.

Why This Matters for Computational Social Science

Our course focuses on computational analysis of complex systems:

  • Network analysis (Weeks 3-5)
  • Agent-based modeling (Weeks 6-7)
  • Game theory (Weeks 8-9)

All of these require computation, not just conversation.

AI agents with tool use can:

  • Analyze real network data
  • Run simulations and interpret results
  • Solve game theory problems numerically
  • Query blockchain state and analyze transactions

They become computational assistants, not just chatbots.

Understanding Function Calling

The Basic Pattern

Function calling (also called “tool use”) works through a structured protocol:

Step 1: Define Available Tools

  • Tell the LLM what functions it can call
  • Provide a description of each function
  • Specify the parameters and their types

Step 2: Agent Decides to Use a Tool

  • User asks a question
  • LLM determines if it needs to call a function
  • Generates a structured request (JSON) specifying the function and arguments

Step 3: Your Code Executes the Function

  • Parse the LLM’s request
  • Call the actual function in your own code (Python in this lecture)
  • Get the result

Step 4: Return Results to Agent

  • Send function output back to LLM
  • LLM incorporates the result into its response
  • Generates a natural language answer for the user

This might seem like a complex dance, but modern LLM APIs make it straightforward.
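
The four steps above can be sketched without any LLM at all. The snippet below is a minimal illustration, not a real API client: the JSON request is hard-coded to stand in for what a model might emit, and the names `TOOLS` and `execute_tool_call` are hypothetical helpers introduced only for this example.

```python
import json

# Step 1: the tools we are willing to expose, keyed by name (an allowlist).
def add_numbers(a: float, b: float) -> float:
    """Add two numbers together and return the sum."""
    return a + b

TOOLS = {"add_numbers": add_numbers}

def execute_tool_call(raw_request: str) -> dict:
    """Steps 3-4: parse the JSON request, run the function, package the result."""
    request = json.loads(raw_request)
    name = request["name"]
    if name not in TOOLS:
        # Only functions we registered can ever run.
        return {"tool": name, "error": f"Unknown tool: {name}"}
    result = TOOLS[name](**request["arguments"])
    return {"tool": name, "result": result}

# Step 2 (simulated): the kind of structured request an LLM might generate.
fake_llm_request = '{"name": "add_numbers", "arguments": {"a": 25, "b": 17}}'

tool_message = execute_tool_call(fake_llm_request)
print(json.dumps(tool_message))  # {"tool": "add_numbers", "result": 42}
```

In a real agent, `tool_message` would be sent back to the model, which then writes the natural language answer. The dictionary lookup is also where the safety property comes from: a request for a function that was never registered is rejected rather than executed.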

Why Not Just Put Code in the Prompt?

You might wonder: why not just tell the LLM “here’s how to calculate clustering coefficient” in the prompt?

Problems with code-in-prompt:

  1. Unreliable execution: LLM might make mistakes in calculation
  2. No actual computation: LLM simulates/approximates, doesn’t execute
  3. Verbose: Including full code implementations in prompts wastes tokens
  4. Can’t handle complexity: Real functions often require libraries, state, I/O

Function calling provides:

  1. Precise execution: Real code runs (Python in this lecture), no approximation
  2. Efficiency: Just describe what the function does, not how
  3. Power: Access to an entire software ecosystem (networkx here; Graphs.jl, Agents.jl, etc. if your tools are written in Julia)
  4. Safety: You control what code actually executes

JSON Schemas: Defining Tool Interfaces

The Language of Tools

To use function calling, we need a way to describe functions to the LLM. The standard format is JSON Schema.

JSON Schema is a vocabulary for annotating and validating JSON documents. For function calling, it describes:

  • Function name
  • What the function does (description)
  • What parameters it takes (name, type, description, whether required)
  • What it returns (usually in description)

Important Note: While we’ll see how to write JSON schemas manually (to understand the underlying format), PydanticAI will generate these automatically from Python function signatures and docstrings. This is one of the major benefits of using PydanticAI - you write normal Python functions with type hints and docstrings, and the schemas are created for you.

Let’s start with a simple example to see what the JSON schema format looks like:

import json

# Define a simple calculator function
def add_numbers(a: float, b: float) -> float:
    """Add two numbers together and return the sum."""
    return a + b

# Manual JSON Schema definition (what OpenAI API expects)
add_numbers_tool = {
    "type": "function",

    "function": {
        "name": "add_numbers",
        "description": "Add two numbers together and return the sum",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {
                    "type": "number",
                    "description": "The first number"
                },
                "b": {
                    "type": "number",
                    "description": "The second number"
                }
            },
            "required": ["a", "b"],
            "additionalProperties": False
        },
        "strict": True
    }
}

# Display the schema
print(json.dumps(add_numbers_tool, indent=2))
{
  "type": "function",
  "function": {
    "name": "add_numbers",
    "description": "Add two numbers together and return the sum",
    "parameters": {
      "type": "object",
      "properties": {
        "a": {
          "type": "number",
          "description": "The first number"
        },
        "b": {
          "type": "number",
          "description": "The second number"
        }
      },
      "required": [
        "a",
        "b"
      ],
      "additionalProperties": false
    },
    "strict": true
  }
}

Anatomy of a Tool Definition

Let’s break down the structure:

Top Level:

  • type: Always “function” for function calling
  • function: Contains the function specification

Function Object:

  • name: Identifier for the function (what the LLM will call)
  • description: Natural language explanation of what it does (crucial for LLM to understand when to use it)
  • parameters: JSON Schema object describing the parameters
  • strict: Optional boolean (recommended true) for strict schema validation

Parameters Object:

  • type: Always “object” (parameters are passed as a JSON object)
  • properties: Dict mapping parameter names to their schemas
  • required: Array of parameter names that must be provided
  • additionalProperties: Set to false to prevent extra properties

Each Parameter:

  • type: JSON type (“string”, “number”, “integer”, “boolean”, “array”, “object”)
  • description: What this parameter represents
  • Optional: enum (allowed values), minimum/maximum (for numbers), etc.

Key Point: The description fields are critical - they’re how the LLM decides when and how to use your function. Write clear, specific descriptions that explain:

  • What the function does
  • When to use it
  • What each parameter means
  • What the function returns
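
To make the optional keywords concrete, here is a sketch of a tool definition that uses `enum`, `minimum`, and `maximum`, together with a tiny hand-written checker. Everything in it is illustrative: the `digits` parameter and the `check_arguments` helper are assumptions made up for this example, the checker understands only the handful of keywords shown, and whether a given provider enforces these keywords is something to confirm in its documentation. A real project would more likely rely on a full JSON Schema validator.

```python
# Hypothetical tool definition: every name below is illustrative.
calculate_tool = {
    "type": "function",
    "function": {
        "name": "calculate",
        "description": (
            "Perform one arithmetic operation on two numbers and return the "
            "result rounded to `digits` decimal places."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "operation": {
                    "type": "string",
                    "enum": ["add", "subtract", "multiply", "divide"],
                    "description": "Which operation to perform",
                },
                "a": {"type": "number", "description": "The first operand"},
                "b": {"type": "number", "description": "The second operand"},
                "digits": {
                    "type": "integer",
                    "minimum": 0,
                    "maximum": 10,
                    "description": "Decimal places to keep in the result",
                },
            },
            "required": ["operation", "a", "b", "digits"],
            "additionalProperties": False,
        },
    },
}

JSON_TYPES = {"string": str, "number": (int, float), "integer": int, "boolean": bool}

def check_arguments(args: dict, parameters: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    props = parameters["properties"]
    for name in parameters.get("required", []):
        if name not in args:
            problems.append(f"missing required parameter: {name}")
    for name, value in args.items():
        if name not in props:
            if parameters.get("additionalProperties") is False:
                problems.append(f"unexpected parameter: {name}")
            continue
        spec = props[name]
        is_bool = isinstance(value, bool)
        # bool is a subclass of int in Python, so rule it out for numeric types.
        if (is_bool and spec["type"] != "boolean") or not isinstance(value, JSON_TYPES[spec["type"]]):
            problems.append(f"{name}: expected {spec['type']}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name}: {value!r} is not one of {spec['enum']}")
        if "minimum" in spec and value < spec["minimum"]:
            problems.append(f"{name}: {value} is below the minimum {spec['minimum']}")
        if "maximum" in spec and value > spec["maximum"]:
            problems.append(f"{name}: {value} is above the maximum {spec['maximum']}")
    return problems

params = calculate_tool["function"]["parameters"]
good = {"operation": "multiply", "a": 847, "b": 293, "digits": 2}
bad = {"operation": "power", "a": 2, "b": 3, "digits": 20}
print(check_arguments(good, params))  # []
print(check_arguments(bad, params))   # two problems: enum and maximum
```

Constraining `operation` with `enum` tells the model exactly which values are acceptable, which is usually more reliable than listing them only in the description text.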

The PydanticAI Way: Automatic Schema Generation

Now, here’s the key insight: you don’t have to write these schemas manually when using PydanticAI. PydanticAI uses the griffe library to extract parameter descriptions from your docstrings and automatically generates the JSON schema from your function signature.

Here’s how the same function looks with PydanticAI:

# one time setup code to load environment variables and set up async support in Jupyter
from dotenv import load_dotenv
import nest_asyncio

load_dotenv()
nest_asyncio.apply()
# PydanticAI Way: Automatic Schema Generation
from pydantic_ai import Agent

# Create an agent
agent = Agent('anthropic:claude-haiku-4-5')

# Register the tool with decorator - schema is generated automatically!
@agent.tool_plain
def add_numbers(a: float, b: float) -> float:
    """
    Add two numbers together and return the sum.

    Args:
        a: The first number
        b: The second number

    Returns:
        The sum of a and b
    """
    return a + b

# Test it!
result = agent.run_sync("What is 25 plus 17?")
print(result.output)
25 plus 17 equals **42**.
result.all_messages()
[ModelRequest(parts=[UserPromptPart(content='What is 25 plus 17?', timestamp=datetime.datetime(2025, 11, 10, 23, 33, 31, 690679, tzinfo=datetime.timezone.utc))]), ModelResponse(parts=[ToolCallPart(tool_name='add_numbers', args={'a': 25, 'b': 17}, tool_call_id='toolu_01JkbLv1Ldr3Fq2YTmCDh1e7')], usage=RequestUsage(input_tokens=630, output_tokens=71, details={'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'input_tokens': 630, 'output_tokens': 71}), model_name='claude-haiku-4-5-20251001', timestamp=datetime.datetime(2025, 11, 10, 23, 33, 33, 49382, tzinfo=datetime.timezone.utc), provider_name='anthropic', provider_details={'finish_reason': 'tool_use'}, provider_response_id='msg_01RoorVBmTydPcTBZyQpHx6w', finish_reason='tool_call'), ModelRequest(parts=[ToolReturnPart(tool_name='add_numbers', content=42.0, tool_call_id='toolu_01JkbLv1Ldr3Fq2YTmCDh1e7', timestamp=datetime.datetime(2025, 11, 10, 23, 33, 33, 50020, tzinfo=datetime.timezone.utc))]), ModelResponse(parts=[TextPart(content='25 plus 17 equals **42**.')], usage=RequestUsage(input_tokens=716, output_tokens=13, details={'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'input_tokens': 716, 'output_tokens': 13}), model_name='claude-haiku-4-5-20251001', timestamp=datetime.datetime(2025, 11, 10, 23, 33, 33, 786301, tzinfo=datetime.timezone.utc), provider_name='anthropic', provider_details={'finish_reason': 'end_turn'}, provider_response_id='msg_01H9BygMKHdNA6zuQxMFR7fQ', finish_reason='stop')]
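
To get a feel for what automatic schema generation involves, the sketch below builds a schema from a function signature using only the standard library. It is a deliberately simplified illustration and not how PydanticAI is implemented: the `sketch_schema` helper and the `PY_TO_JSON` table are assumptions invented for this example, only four basic types are handled, and per-parameter descriptions from the docstring are ignored.

```python
import inspect
from typing import get_type_hints

# Simplified mapping from Python types to JSON Schema type names.
PY_TO_JSON = {str: "string", float: "number", int: "integer", bool: "boolean"}

def sketch_schema(func) -> dict:
    """Build a minimal tool schema from a function's signature and docstring."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # the return type is not part of the parameters
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": PY_TO_JSON[hints[name]]}
        # Parameters without a default value must be supplied by the caller.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        "description": doc.strip().split("\n")[0],
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
            "additionalProperties": False,
        },
    }

def add_numbers(a: float, b: float) -> float:
    """Add two numbers together and return the sum."""
    return a + b

schema = sketch_schema(add_numbers)
print(schema["parameters"]["properties"])  # {'a': {'type': 'number'}, 'b': {'type': 'number'}}
print(schema["parameters"]["required"])    # ['a', 'b']
```

The result has the same shape as the hand-written `parameters` object from earlier, which is the point: type hints already carry most of the information a schema needs.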

Hands-On: Building Function-Calling Agents

Setup: API Access

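We’ll use PydanticAI to demonstrate function calling. The examples below run on the anthropic:claude-haiku-4-5 model; the setup code also checks for an OpenAI key.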

Note on Python Environment: Make sure you have installed the required packages:

pip install pydantic-ai pydantic

Make sure you have your API keys set as environment variables:

export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"
import os
from pydantic_ai import Agent, RunContext

# Get API keys from environment
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "")
ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY", "")

if not OPENAI_API_KEY or not ANTHROPIC_API_KEY:
    print("⚠️ Warning: API keys not set. Set OPENAI_API_KEY and ANTHROPIC_API_KEY environment variables.")

Function Calling with PydanticAI

Let’s implement a complete function-calling agent using PydanticAI. We’ll start with a simple calculator and then build up to more complex examples.

Example 1: Calculator Agent

Let’s build an agent that can perform arithmetic operations. This demonstrates the basic pattern clearly.

from pydantic_ai import Agent

# Create calculator agent
calculator_agent = Agent('anthropic:claude-haiku-4-5')

@calculator_agent.tool_plain
def calculate(operation: str, a: float, b: float) -> float:
    """
    Perform arithmetic operations on two numbers.

    Args:
        operation: The operation to perform ('add', 'subtract', 'multiply', 'divide')
        a: The first operand
        b: The second operand

    Returns:
        The result of the operation
    """
    if operation == "add":
        return a + b
    elif operation == "subtract":
        return a - b
    elif operation == "multiply":
        return a * b
    elif operation == "divide":
        if b == 0:
            raise ValueError("Division by zero")
        return a / b
    else:
        raise ValueError(f"Unknown operation: {operation}")

print("Calculator agent ready!")
Calculator agent ready!

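Before running the agent, let’s look at the tool description PydanticAI generated from our docstring, then wrap the agent in a small helper function:

# Note: _function_toolset is a private attribute and may change between PydanticAI versions
t = calculator_agent._function_toolset.tools["calculate"]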
print(t.tool_def.description)
<summary>Perform arithmetic operations on two numbers.</summary>
<returns>
<description>The result of the operation</description>
</returns>
# With PydanticAI, running the agent is simple!
def run_calculator_agent(user_query: str):
    """Run a calculator agent that can perform arithmetic and return the full run result."""
    print(f"User: {user_query}\n")

    # PydanticAI handles all the tool calling logic
    result = calculator_agent.run_sync(user_query)

    print(f"Agent: {result.output}")
    return result

Let’s test our calculator agent:

# Test with a calculation
mult_result = run_calculator_agent("What is 847 multiplied by 293?")
User: What is 847 multiplied by 293?

Agent: 847 multiplied by 293 is **248,171**.
# Test with a word problem
run_calculator_agent("I have 15 apples and buy 23 more. How many do I have?")
User: I have 15 apples and buy 23 more. How many do I have?

Agent: You have **38 apples**. (15 + 23 = 38)
AgentRunResult(output='You have **38 apples**. (15 + 23 = 38)')

What Just Happened?

Let’s trace through the execution with PydanticAI:

mult_result.all_messages()
[ModelRequest(parts=[UserPromptPart(content='What is 847 multiplied by 293?', timestamp=datetime.datetime(2025, 11, 10, 23, 40, 16, 249866, tzinfo=datetime.timezone.utc))]), ModelResponse(parts=[ToolCallPart(tool_name='calculate', args={'operation': 'multiply', 'a': 847, 'b': 293}, tool_call_id='toolu_01X3XFzS4TEtRxDbcuP6NPm8')], usage=RequestUsage(input_tokens=667, output_tokens=86, details={'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'input_tokens': 667, 'output_tokens': 86}), model_name='claude-haiku-4-5-20251001', timestamp=datetime.datetime(2025, 11, 10, 23, 40, 17, 501982, tzinfo=datetime.timezone.utc), provider_name='anthropic', provider_details={'finish_reason': 'tool_use'}, provider_response_id='msg_01ScVcXERKobRv9FcpSnVR5U', finish_reason='tool_call'), ModelRequest(parts=[ToolReturnPart(tool_name='calculate', content=248171.0, tool_call_id='toolu_01X3XFzS4TEtRxDbcuP6NPm8', timestamp=datetime.datetime(2025, 11, 10, 23, 40, 17, 502744, tzinfo=datetime.timezone.utc))]), ModelResponse(parts=[TextPart(content='847 multiplied by 293 is **248,171**.')], usage=RequestUsage(input_tokens=769, output_tokens=17, details={'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'input_tokens': 769, 'output_tokens': 17}), model_name='claude-haiku-4-5-20251001', timestamp=datetime.datetime(2025, 11, 10, 23, 40, 18, 488776, tzinfo=datetime.timezone.utc), provider_name='anthropic', provider_details={'finish_reason': 'end_turn'}, provider_response_id='msg_01YDsgeLjt3xKL84bUT42B5k', finish_reason='stop')]
  1. User asks a question (“What is 847 multiplied by 293?”)
  2. PydanticAI sends the request to the LLM along with the available tools
  3. LLM decides: “I need to multiply - I’ll use the calculate tool”
  4. LLM generates tool call: {"operation": "multiply", "a": 847, "b": 293}
  5. PydanticAI executes: Calls our calculate("multiply", 847, 293) → 248171.0
  6. PydanticAI sends result back to LLM: “The function returned: 248171.0”
  7. LLM generates final response: “847 multiplied by 293 is **248,171**.”

Understanding the PydanticAI Simplification:

  • No manual message management - PydanticAI handles the conversation flow
  • No manual tool dispatch - PydanticAI calls the right function automatically based on the ToolCallPart it receives from the LLM
  • No JSON schema writing - Generated from function signatures and docstrings
  • Type-safe execution - Arguments are validated against the Python type hints before the function runs

Key insights:

  • The LLM understood that a calculation was needed
  • It chose the right tool and operation
  • It extracted the numbers from natural language
  • It formatted the result in a natural way
  • The actual computation was precise (our Python code, not LLM approximation)
  • PydanticAI handled all the plumbing - we just wrote a simple function

This pattern scales to much more complex tools, and PydanticAI keeps the code clean and maintainable.

Building a Network Analysis Toolkit

Exposing NetworkX to AI Agents

Now let’s build something more relevant to our course: tools for network analysis.

We studied networks using Julia and Graphs.jl.

However, because we are using Python and PydanticAI, we need the corresponding network science library for Python.

The most widely used library is networkx.

We’ll create a set of functions that let an AI agent:

  • Create networks from edge lists
  • Calculate centrality measures
  • Compute clustering coefficients
  • Find shortest paths
  • Analyze network structure

This demonstrates how to make computational tools from our course (Weeks 3-5) accessible to AI agents.

!pip install networkx
import networkx as nx
from dataclasses import dataclass

# Define dependencies using dependency injection instead of global state
@dataclass
class NetworkDeps:
    graphs: dict[str, nx.Graph]

# We'll create an agent with these dependencies
print("Network dependencies defined")
Network dependencies defined
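
The idea behind passing dependencies explicitly, rather than keeping graphs in a global variable, can be sketched without any LLM or graph library. The `Deps` class and the two helper functions below are stand-ins invented for this illustration; they only mimic the role that `NetworkDeps` plays.

```python
from dataclasses import dataclass, field

@dataclass
class Deps:
    """Stand-in for NetworkDeps: maps graph_id to a set of undirected edges."""
    graphs: dict = field(default_factory=dict)

def create_network(deps: Deps, graph_id: str, edges: list) -> int:
    """Store the graph on the injected deps object and return its edge count."""
    # Sorting each pair makes [1, 2] and [2, 1] the same undirected edge.
    deps.graphs[graph_id] = {tuple(sorted(edge)) for edge in edges}
    return len(deps.graphs[graph_id])

def has_graph(deps: Deps, graph_id: str) -> bool:
    """Check whether a graph with this ID exists in the given deps object."""
    return graph_id in deps.graphs

# Two separate runs, each with its own dependencies: state does not leak.
run_a = Deps()
create_network(run_a, "social", [[1, 2], [2, 1], [2, 3]])
run_b = Deps()
print(has_graph(run_a, "social"))  # True
print(has_graph(run_b, "social"))  # False
```

Because every function receives its state as an argument, creating a fresh `Deps` object gives a clean slate, while reusing one object across calls would let later calls see earlier graphs. Which behavior you want is a design choice you make at the call site.
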
from pydantic_ai import RunContext

# Create network analysis agent with dependencies
network_agent = Agent('anthropic:claude-haiku-4-5', deps_type=NetworkDeps)

@network_agent.tool
def create_network(
    ctx: RunContext[NetworkDeps],
    graph_id: str,
    edges: list[list[int]]
) -> dict:
    """
    Create a network from an edge list and store it.

    Args:
        graph_id: Unique identifier for this graph (e.g., 'social_network', 'graph1')
        edges: List of edges where each edge is [source, target]. Example: [[1,2], [2,3], [1,3]]

    Returns:
        Dictionary with graph statistics (num_nodes, num_edges, density)
    """
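    print(f"Creating graph...\t edges: {edges}\t graph_id: {graph_id}")
    # Guard against an empty edge list, which would make max() raise ValueError
    if not edges:
        return {"error": "edges must contain at least one [source, target] pair"}
    # Find max node ID to determine number of nodes. Node IDs are assumed to be
    # 1..max_node, so isolated nodes with a larger ID are not created.
    max_node = max(max(e) for e in edges)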

    # Create graph
    g = nx.Graph()
    g.add_nodes_from(range(1, max_node + 1))
    g.add_edges_from(edges)

    # Store in context dependencies
    ctx.deps.graphs[graph_id] = g

    return {
        "graph_id": graph_id,
        "num_nodes": g.number_of_nodes(),
        "num_edges": g.number_of_edges(),
        "density": round(nx.density(g), 4)
    }

@network_agent.tool
def calculate_degree_centrality(
    ctx: RunContext[NetworkDeps],
    graph_id: str,
    node: int
) -> dict:
    """
    Calculate degree centrality for a node. Degree centrality measures how many connections a node has.

    Args:
        graph_id: ID of the graph to analyze
        node: The node ID to calculate centrality for

    Returns:
        Dictionary with degree and normalized centrality value
    """
    print(f"Calculating degree centrality...\t graph_id: {graph_id}\t node: {node}")
    g = ctx.deps.graphs[graph_id]
    # first check if node is in graph
    if node not in g:
        return {
            "error": f"Node {node} not found in graph {graph_id}"
        }
    deg = g.degree(node)
    max_possible = g.number_of_nodes() - 1
    normalized = deg / max_possible if max_possible > 0 else 0

    return {
        "node": node,
        "degree": deg,
        "normalized_centrality": round(normalized, 4)
    }

@network_agent.tool
def calculate_betweenness(
    ctx: RunContext[NetworkDeps],
    graph_id: str,
    node: int
) -> dict:
    """
    Calculate betweenness centrality for a node. High betweenness nodes are 'bridges' in the network.

    Args:
        graph_id: ID of the graph to analyze
        node: The node ID to calculate betweenness for

    Returns:
        Dictionary with betweenness centrality value
    """
    print(f"Calculating betweenness centrality...\t graph_id: {graph_id}\t node: {node}")
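    g = ctx.deps.graphs[graph_id]
    # Mirror calculate_degree_centrality: report a missing node instead of raising KeyError
    if node not in g:
        return {
            "error": f"Node {node} not found in graph {graph_id}"
        }
    # Note: this computes betweenness for every node, then picks out one value
    bc = nx.betweenness_centrality(g)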

    return {
        "node": node,
        "betweenness_centrality": round(bc[node], 4)
    }

@network_agent.tool
def calculate_clustering_coefficient(
    ctx: RunContext[NetworkDeps],
    graph_id: str
) -> dict:
    """
    Calculate the average clustering coefficient (the mean of each node's local clustering coefficient). Values close to 1 indicate high clustering.

    Args:
        graph_id: ID of the graph to analyze

    Returns:
        Dictionary with clustering coefficient
    """
    print(f"Calculating clustering coefficient...\t graph_id: {graph_id}")
    g = ctx.deps.graphs[graph_id]
    cc = nx.average_clustering(g)

    return {
        "clustering_coefficient": round(cc, 4)
    }

@network_agent.tool
def find_shortest_path(
    ctx: RunContext[NetworkDeps],
    graph_id: str,
    source: int,
    target: int
) -> dict:
    """
    Find shortest path between two nodes. Returns the path and its length.

    Args:
        graph_id: ID of the graph to search
        source: Starting node ID
        target: Destination node ID

    Returns:
        Dictionary with path information
    """
    print(f"Finding shortest path...\t graph_id: {graph_id}\t source: {source}\t target: {target}")
    g = ctx.deps.graphs[graph_id]

    try:
        path = nx.shortest_path(g, source, target)
        return {
            "found": True,
            "path": path,
            "length": len(path) - 1
        }
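    except nx.NodeNotFound:
        # Raised when source or target is not a node of the graph
        return {
            "found": False,
            "message": f"Node {source} or {target} is not in graph {graph_id}"
        }
    except nx.NetworkXNoPath:
        return {
            "found": False,
            "message": f"No path exists between nodes {source} and {target}"
        }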

print("Network analysis tools defined!")
Network analysis tools defined!

Network Analysis Agent

Now let’s create an agent that can use these network analysis tools. This agent will be able to answer questions about networks by calling the appropriate functions.

def run_network_agent(user_query: str):
    """
    Run a network analysis agent that can use multiple tools to answer questions.

    PydanticAI handles:
    - Multi-turn conversations
    - Tool call dispatch
    - Message history management
    - Result formatting
    """
    print(f"User: {user_query}\n")
    print("="*80)

    # Create fresh dependencies for this conversation
    deps = NetworkDeps(graphs={})

    # PydanticAI handles all the complexity!
    result = network_agent.run_sync(user_query, deps=deps)

    print(f"\nFinal Answer:\n{result.output}")
    return result

Testing the Network Analysis Agent

Let’s test our agent with progressively more complex questions:

import logfire

# Configure Logfire
logfire.configure(
    send_to_logfire='if-token-present',
)
logfire.instrument_pydantic_ai()
Logfire project URL: https://logfire-us.pydantic.dev/sglyon/cap-6318-example
# Test 1: Basic network analysis
query1 = """
I have a social network with the following friendships (edges):
- Person 1 is friends with persons 2, 3, and 4
- Person 2 is friends with persons 1 and 3
- Person 3 is friends with persons 1, 2, and 4
- Person 4 is friends with persons 1 and 3
- Person 5 is friends with nobody

Create this network (call it 'social') and tell me:
1. What is the average clustering coefficient?
2. Which person has the highest degree centrality?

think carefully, proceed step by step.
"""

network1_result = run_network_agent(query1)
User: 
I have a social network with the following friendships (edges):
- Person 1 is friends with persons 2, 3, and 4
- Person 2 is friends with persons 1 and 3
- Person 3 is friends with persons 1, 2, and 4
- Person 4 is friends with persons 1 and 3
- Person 5 is friends with nobody

Create this network (call it 'social') and tell me:
1. What is the average clustering coefficient?
2. Which person has the highest degree centrality?

think carefully, proceed step by step.


================================================================================
19:14:07.011 network_agent run
19:14:07.012   chat claude-haiku-4-5
19:14:10.734   running 7 tools
19:14:10.735     running tool: create_network
19:14:10.736     running tool: calculate_clustering_coefficient
19:14:10.736     running tool: calculate_degree_centrality
19:14:10.736     running tool: calculate_degree_centrality
19:14:10.737     running tool: calculate_degree_centrality
19:14:10.737     running tool: calculate_degree_centrality
19:14:10.737     running tool: calculate_degree_centrality
Creating graph...	 edges: [[1, 2], [1, 3], [1, 4], [2, 3], [3, 4]]	 graph_id: social
Calculating clustering coefficient...	 graph_id: social
Calculating degree centrality...	 graph_id: social	 node: 2
Calculating degree centrality...	 graph_id: social	 node: 3
Calculating degree centrality...	 graph_id: social	 node: 4
Calculating degree centrality...	 graph_id: social	 node: 5
Calculating degree centrality...	 graph_id: social	 node: 1
19:14:10.742   chat claude-haiku-4-5

Final Answer:
Perfect! Here are the results:

## Network Created: 'social'
- **Nodes**: 4 (Persons 1-4; Person 5 has no connections so wasn't included)
- **Edges**: 5 friendships
- **Network Density**: 0.8333 (very densely connected!)

## Analysis Results:

**1. Average Clustering Coefficient: 0.8333**
   - This is very high (close to 1), indicating that the network is highly clustered
   - Friends of each person tend to also be friends with each other, forming tight-knit groups

**2. Degree Centrality Rankings:**
   - **Persons 1 and 3 are tied for highest degree centrality** (normalized centrality = 1.0)
     - Person 1 has 3 friends (2, 3, 4)
     - Person 3 has 3 friends (1, 2, 4)
   - Person 2 has 2 friends (1, 3) - normalized centrality: 0.6667
   - Person 4 has 2 friends (1, 3) - normalized centrality: 0.6667
   - Person 5 is isolated (0 friends)

**Summary**: Persons 1 and 3 are the most connected individuals in this social network, making them the most central figures. The network is highly connected with friends of friends also being friends with each other.
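
The reported 0.8333 can be checked by hand. The run log above shows the graph was built from the five edges only, so it contains persons 1-4 and not the isolated person 5. The sketch below recomputes the average of the local clustering coefficients in plain Python. One assumption is labeled in the code: nodes with fewer than two neighbors are counted as having coefficient 0, so check how your graph library treats such nodes before comparing numbers.

```python
from itertools import combinations

def average_clustering(nodes, edges):
    """Mean of local clustering coefficients for an undirected graph."""
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    total = 0.0
    for n in nodes:
        k = len(neighbors[n])
        if k < 2:
            continue  # assumption: such nodes contribute a coefficient of 0
        # Count pairs of neighbors that are themselves connected.
        links = sum(1 for a, b in combinations(neighbors[n], 2) if b in neighbors[a])
        total += 2 * links / (k * (k - 1))
    return total / len(nodes)

edges = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4)]
print(round(average_clustering([1, 2, 3, 4], edges), 4))     # 0.8333
print(round(average_clustering([1, 2, 3, 4, 5], edges), 4))  # 0.6667
```

Persons 2 and 4 each have coefficient 1, persons 1 and 3 each have 2/3, and the mean over four nodes is 0.8333. Under the assumption above, including person 5 as an isolated node would lower the average to 0.6667, so it matters whether isolated people are part of the graph the tool builds.
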
# Test 2: Path finding
# Note: We need to recreate the network since each call gets fresh dependencies.
#       We could easily fix this by not recreating the NetworkDeps each time.
query2 = """
Create a network called 'social' with edges:
[[1,2], [1,3], [1,4], [2,1], [2,3], [3,1], [3,2], [3,4], [4,1], [4,3]]

Then find the shortest path from person 2 to person 4.
"""

network2_result = run_network_agent(query2)
User: 
Create a network called 'social' with edges:
[[1,2], [1,3], [1,4], [2,1], [2,3], [3,1], [3,2], [3,4], [4,1], [4,3]]

Then find the shortest path from person 2 to person 4.


================================================================================
19:16:01.732 network_agent run
19:16:01.734   chat claude-haiku-4-5
19:16:03.368   running 2 tools
19:16:03.368     running tool: create_network
19:16:03.368     running tool: find_shortest_path
Creating graph...	 edges: [[1, 2], [1, 3], [1, 4], [2, 1], [2, 3], [3, 1], [3, 2], [3, 4], [4, 1], [4, 3]]	 graph_id: social
Finding shortest path...	 graph_id: social	 source: 2	 target: 4
19:16:03.370   chat claude-haiku-4-5

Final Answer:
Perfect! Here are the results:

**Network Created: 'social'**
- Number of nodes: 4
- Number of edges: 5
- Density: 0.8333 (highly connected network)

**Shortest Path from Person 2 to Person 4:**
- **Path:** 2 → 1 → 4
- **Length:** 2 steps

The shortest route from person 2 to person 4 is through person 1, requiring 2 connections.
network2_result.all_messages()
[ModelRequest(parts=[UserPromptPart(content="\nCreate a network called 'social' with edges:\n[[1,2], [1,3], [1,4], [2,1], [2,3], [3,1], [3,2], [3,4], [4,1], [4,3]]\n\nThen find the shortest path from person 2 to person 4.\n", timestamp=datetime.datetime(2025, 11, 11, 0, 16, 1, 733573, tzinfo=datetime.timezone.utc))]), ModelResponse(parts=[TextPart(content="I'll create the network and find the shortest path for you."), ToolCallPart(tool_name='create_network', args={'graph_id': 'social', 'edges': [[1, 2], [1, 3], [1, 4], [2, 1], [2, 3], [3, 1], [3, 2], [3, 4], [4, 1], [4, 3]]}, tool_call_id='toolu_01TBdA5PQBH1i7JGN6mYYanu'), ToolCallPart(tool_name='find_shortest_path', args={'graph_id': 'social', 'source': 2, 'target': 4}, tool_call_id='toolu_01JhnkbsDZjo9d3rHt2Do87o')], usage=RequestUsage(input_tokens=1343, output_tokens=209, details={'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'input_tokens': 1343, 'output_tokens': 209}), model_name='claude-haiku-4-5-20251001', timestamp=datetime.datetime(2025, 11, 11, 0, 16, 3, 367984, tzinfo=datetime.timezone.utc), provider_name='anthropic', provider_details={'finish_reason': 'tool_use'}, provider_response_id='msg_01VyRhXvJVWagYaqdVgKeqRT', finish_reason='tool_call'), ModelRequest(parts=[ToolReturnPart(tool_name='create_network', content={'graph_id': 'social', 'num_nodes': 4, 'num_edges': 5, 'density': 0.8333}, tool_call_id='toolu_01TBdA5PQBH1i7JGN6mYYanu', timestamp=datetime.datetime(2025, 11, 11, 0, 16, 3, 369986, tzinfo=datetime.timezone.utc)), ToolReturnPart(tool_name='find_shortest_path', content={'found': True, 'path': [2, 1, 4], 'length': 2}, tool_call_id='toolu_01JhnkbsDZjo9d3rHt2Do87o', timestamp=datetime.datetime(2025, 11, 11, 0, 16, 3, 370074, tzinfo=datetime.timezone.utc))]), ModelResponse(parts=[TextPart(content="Perfect! 
Here are the results:\n\n**Network Created: 'social'**\n- Number of nodes: 4\n- Number of edges: 5\n- Density: 0.8333 (highly connected network)\n\n**Shortest Path from Person 2 to Person 4:**\n- **Path:** 2 → 1 → 4\n- **Length:** 2 steps\n\nThe shortest route from person 2 to person 4 is through person 1, requiring 2 connections.")], usage=RequestUsage(input_tokens=1670, output_tokens=117, details={'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'input_tokens': 1670, 'output_tokens': 117}), model_name='claude-haiku-4-5-20251001', timestamp=datetime.datetime(2025, 11, 11, 0, 16, 4, 938462, tzinfo=datetime.timezone.utc), provider_name='anthropic', provider_details={'finish_reason': 'end_turn'}, provider_response_id='msg_01WfHCN5LphvvAmE6FMdpKLa', finish_reason='stop')]
# Test 3: Complex multi-step analysis
query3 = """
Create a new network called 'bridge' with these edges:
[[1,2], [2,3], [3,4], [4,5], [1,3], [3,5]]

Then find which node has the highest betweenness centrality. That node should be
a 'bridge' connecting different parts of the network.
"""

network3_result = run_network_agent(query3)
User: 
Create a new network called 'bridge' with these edges:
[[1,2], [2,3], [3,4], [4,5], [1,3], [3,5]]

Then find which node has the highest betweenness centrality. That node should be
a 'bridge' connecting different parts of the network.


================================================================================
19:18:43.882 network_agent run
19:18:43.884   chat claude-haiku-4-5
19:18:45.211   running 1 tool
19:18:45.212     running tool: create_network
Creating graph...	 edges: [[1, 2], [2, 3], [3, 4], [4, 5], [1, 3], [3, 5]]	 graph_id: bridge
19:18:45.213   chat claude-haiku-4-5
19:18:47.446   running 5 tools
19:18:47.446     running tool: calculate_betweenness
19:18:47.446     running tool: calculate_betweenness
19:18:47.446     running tool: calculate_betweenness
19:18:47.446     running tool: calculate_betweenness
19:18:47.446     running tool: calculate_betweenness
Calculating betweenness centrality...	 graph_id: bridge	 node: 1
Calculating betweenness centrality...	 graph_id: bridge	 node: 2
Calculating betweenness centrality...	 graph_id: bridge	 node: 3
Calculating betweenness centrality...	 graph_id: bridge	 node: 4
Calculating betweenness centrality...	 graph_id: bridge	 node: 5
19:18:47.449   chat claude-haiku-4-5

Final Answer:
Perfect! Here are the results:

## Network 'bridge' Summary
- **Nodes**: 5
- **Edges**: 6
- **Density**: 0.6

## Betweenness Centrality Results
| Node | Betweenness Centrality |
|------|------------------------|
| 1    | 0.0                    |
| 2    | 0.0                    |
| **3** | **0.6667**             |
| 4    | 0.0                    |
| 5    | 0.0                    |

## Key Finding
**Node 3** is the bridge in this network with a betweenness centrality of **0.6667**. This node is crucial for connecting different parts of the network because it lies on the shortest paths between many pairs of nodes. 

Looking at the edge structure:
- Node 3 connects to nodes: 2, 1, 4, and 5
- It acts as a central hub that connects the {1, 2} cluster with the {4, 5} cluster
- Many shortest paths between different parts of the network must pass through node 3, making it a critical bridge in the network topology

What We’ve Accomplished

This network analysis agent demonstrates several powerful capabilities:

1. Multi-Step Reasoning

  • Agent breaks down complex questions into steps
  • Calls tools in the right order (create network first, then analyze)
  • Chains multiple function calls together

2. Natural Language Understanding

  • Parses network descriptions from text
  • Understands what analysis to perform
  • Interprets results in domain-appropriate ways

3. Computational Precision

  • Uses real networkx algorithms
  • Numeric results come from the library, not from the model's guesses (the model's interpretation of them can still be wrong)
  • Results are reproducible and verifiable

4. State Management

  • Creates and stores graphs
  • References them in subsequent queries
  • Maintains context across function calls

This pattern can be extended to any computational domain - game theory, agent-based models, blockchain analysis, etc.
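The "reproducible and verifiable" point can be checked by hand. The sketch below is a dependency-free cross-check of the betweenness result from Test 3, not the agent's actual tool (which, per the text above, relies on networkx). The function names `shortest_path_counts` and `betweenness` are illustrative, and the normalization by (n-1)(n-2)/2 is the usual convention for undirected graphs, assumed here to match what the tool reported.

```python
from collections import deque
from itertools import combinations


def shortest_path_counts(adj, source):
    """BFS from source: returns (distance, number of shortest paths) per node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:            # first time we reach v
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:   # u is a predecessor of v on a shortest path
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(edges):
    """Normalized betweenness centrality for an undirected, unweighted graph (n >= 3)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    info = {s: shortest_path_counts(adj, s) for s in adj}
    n = len(adj)
    scale = (n - 1) * (n - 2) / 2        # number of node pairs that exclude w
    scores = {}
    for w in adj:
        total = 0.0
        others = [x for x in adj if x != w]
        for s, t in combinations(others, 2):
            ds, ss = info[s]
            dt, st = info[t]
            # w lies on a shortest s-t path iff the distances add up exactly
            if w in ds and t in ds and ds[w] + dt[w] == ds[t]:
                total += ss[w] * st[w] / ss[t]
        scores[w] = total / scale
    return scores


bridge_edges = [[1, 2], [2, 3], [3, 4], [4, 5], [1, 3], [3, 5]]
scores = betweenness(bridge_edges)
print({node: round(score, 4) for node, score in scores.items()})
# {1: 0.0, 2: 0.0, 3: 0.6667, 4: 0.0, 5: 0.0}
```

The values agree with the table the agent produced: four of the six node pairs that exclude node 3 (1-4, 1-5, 2-4, 2-5) have their only shortest path through node 3, giving 4/6.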

Safety and Sandboxing

The Danger of Unrestricted Tool Use

Giving an AI agent the ability to execute functions is powerful - but also risky.

Consider if we gave an agent these tools:

function delete_file(path::String)
    rm(path)
end

function execute_shell_command(cmd::String)
    run(`bash -c $cmd`)
end

function send_email(to::String, subject::String, body::String)
    # Send email...
end

Now imagine:

  • User asks: “Clean up my files”
  • Agent interprets broadly: deletes everything
  • Or worse: malicious input (a prompt injection) tricks the agent into sending spam

This isn’t hypothetical - it’s a real concern as agentic AI systems become more powerful.

Safety Principles

1. Principle of Least Privilege

  • Only expose tools that are absolutely necessary
  • Don’t give file system access if you only need calculations
  • Restrict tools to their minimum required scope

2. Sandboxing

  • Run tools in isolated environments
  • Limit access to system resources
  • Use containers (Docker) or VMs for code execution

3. Read vs Write Separation

  • Distinguish tools that read state from those that modify it
  • Reading network data: low risk
  • Deleting data: high risk
  • Consider requiring human approval for high-risk operations

4. Input Validation

  • Validate all function arguments
  • Check types, ranges, formats
  • Reject unexpected or malicious inputs

5. Rate Limiting

  • Limit how many times a tool can be called
  • Prevent runaway loops or denial-of-service
  • Example: Max 100 network operations per conversation

6. Logging and Auditing

  • Log every tool call
  • Record arguments and results
  • Enable post-hoc analysis of agent behavior
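Several of these principles (input validation, rate limiting, logging) can be combined in a thin wrapper around each tool. The following is a minimal sketch using only the standard library. The names `guarded_tool`, `ToolCallLimitExceeded`, and `network_density` are hypothetical and not part of PydanticAI or any other framework, and the budget here is per tool rather than per conversation.

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("tool_audit")


class ToolCallLimitExceeded(RuntimeError):
    """Raised when a tool is called more often than its budget allows."""


def guarded_tool(max_calls: int):
    """Wrap a tool function with a call budget and an audit log."""
    def decorator(func):
        calls = {"count": 0}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Rate limiting: refuse once the budget is spent
            if calls["count"] >= max_calls:
                raise ToolCallLimitExceeded(
                    f"{func.__name__}: limit of {max_calls} calls reached"
                )
            calls["count"] += 1
            # Logging and auditing: record arguments and results
            logger.info("call %s args=%r kwargs=%r", func.__name__, args, kwargs)
            result = func(*args, **kwargs)
            logger.info("result %s -> %r", func.__name__, result)
            return result

        return wrapper
    return decorator


@guarded_tool(max_calls=100)
def network_density(num_nodes: int, num_edges: int) -> float:
    """Density of a simple undirected graph: 2E / (N (N - 1))."""
    # Input validation: check ranges before computing anything
    if num_nodes < 2:
        raise ValueError("num_nodes must be at least 2")
    max_edges = num_nodes * (num_nodes - 1) // 2
    if not 0 <= num_edges <= max_edges:
        raise ValueError(f"num_edges must be between 0 and {max_edges}")
    return 2 * num_edges / (num_nodes * (num_nodes - 1))


print(round(network_density(4, 5), 4))  # 0.8333
```

A real deployment would more likely keep the counter in per-conversation state than in a closure, so that one user's calls do not exhaust another's budget.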

Safe Tool Design Patterns

Pattern 1: Read-Only by Default

# Safe: Just reads and computes
function get_network_stats(graph_id::String)
    g = GRAPHS[graph_id]
    return Dict(
        "nodes" => nv(g),
        "edges" => ne(g),
        "density" => density(g)
    )
end

# Risky: Modifies state
function delete_network(graph_id::String)
    delete!(GRAPHS, graph_id)
end

Pattern 2: Explicit Boundaries

# Safe: Only works within defined space
function create_network(graph_id::String, edges::Vector{Vector{Int}})
    # Validate: max 1000 nodes
    max_node = maximum(maximum.(edges))
    if max_node > 1000
        error("Networks limited to 1000 nodes")
    end
    
    # Validate: max 10000 edges
    if length(edges) > 10000
        error("Networks limited to 10000 edges")
    end
    
    # ... create network
end

Pattern 3: Confirmation for Destructive Operations

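# High-risk operations return a confirmation token instead of acting
using Random

const PENDING_DELETIONS = Dict{String,String}()  # token => graph_id

function request_data_deletion(graph_id::String)
    token = randstring(RandomDevice(), 16)
    PENDING_DELETIONS[token] = graph_id
    return Dict(
        "message" => "Deleting $graph_id requires confirmation",
        "confirmation_token" => token
    )
end

function confirm_data_deletion(token::String)
    # Human must provide the token; unknown tokens are rejected
    haskey(PENDING_DELETIONS, token) || error("Unknown confirmation token")
    graph_id = pop!(PENDING_DELETIONS, token)
    delete!(GRAPHS, graph_id)
end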

Code Execution: The Ultimate Risk

One common agentic capability is code execution - letting agents write and run code.

This is incredibly powerful:

  • Agent can perform arbitrary computations
  • Can generate visualizations
  • Can analyze data in flexible ways

But also incredibly dangerous:

  • Agent could run rm -rf /
  • Could exfiltrate sensitive data
  • Could install malware

Safe Code Execution Strategies:

  1. Isolated Execution Environment

    • Docker containers with no network access
    • Limited CPU/memory/disk
    • No access to host filesystem
  2. Language Subset

    • Restrict to safe operations only
    • Parse and validate code before execution
    • Block dangerous functions (system calls, file I/O)
  3. Timeouts

    • Kill code that runs too long
    • Prevent infinite loops
  4. Review Before Execution

    • Show code to user first
    • Let them approve or reject
    • Only auto-execute for trusted, common operations

Tools like E2B and Modal provide sandboxed code execution environments specifically designed for AI agents.
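To make strategies 2 and 3 concrete, here is a minimal sketch that screens code with Python's `ast` module and then runs it in a separate interpreter process with a timeout. The names `check_code` and `run_untrusted` and the block list are illustrative assumptions. This is not a real sandbox: a denylist like this can be bypassed, and it provides no filesystem or network isolation, so it complements a container rather than replacing one.

```python
import ast
import subprocess
import sys

# Illustrative denylist; a production system would prefer an allowlist
BLOCKED_NAMES = {"open", "exec", "eval", "compile", "__import__", "input"}


def check_code(code: str) -> list[str]:
    """Return reasons to reject the code (an empty list means it passed)."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]
    problems = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append("import statements are not allowed")
        elif isinstance(node, ast.Name) and node.id in BLOCKED_NAMES:
            problems.append(f"use of '{node.id}' is not allowed")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            problems.append("dunder attribute access is not allowed")
    return problems


def run_untrusted(code: str, timeout_s: float = 2.0) -> dict:
    """Validate code, then run it in a child interpreter with a time limit."""
    problems = check_code(code)
    if problems:
        return {"status": "rejected", "detail": problems}
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "detail": f"exceeded {timeout_s} s"}
    status = "ok" if proc.returncode == 0 else "error"
    return {"status": status, "stdout": proc.stdout, "stderr": proc.stderr}


print(run_untrusted("print(sum(range(10)))")["status"])              # ok
print(run_untrusted("import os")["status"])                          # rejected
print(run_untrusted("while True:\n    pass", timeout_s=0.5)["status"])  # timeout
```

Strategy 4 (review before execution) would slot in between `check_code` and `subprocess.run`, for example by returning the code to the user and running it only after explicit approval.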

Our Network Tools: Safety Analysis

Let’s evaluate our network analysis tools:

✓ Safe:

  • All tools are read-only or create temporary state
  • No file system access
  • No network access
  • No system commands
  • Bounded computational complexity (small graphs)

⚠️ Could Improve:

  • Add max graph size limits
  • Add rate limiting (max N tool calls per session)
  • Add timeouts for expensive operations
  • Validate graph IDs (prevent path traversal attacks)

For Production:

  • Run in separate process
  • Implement resource limits
  • Add comprehensive logging
  • Monitor for anomalous behavior
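Two of the suggested improvements, graph size limits and graph ID validation, might look like the sketch below. The function name `validate_graph_request` and the ID pattern are assumptions for illustration. The limits of 1000 nodes and 10000 edges are borrowed from Pattern 2 above, although here the node limit counts distinct nodes rather than checking the largest node label.

```python
import re

MAX_NODES = 1000
MAX_EDGES = 10_000
# Letters, digits, underscore, hyphen only: no slashes or dots, so an ID
# can never be mistaken for a file path if graphs are later saved to disk
GRAPH_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def validate_graph_request(graph_id: str, edges: list[list[int]]) -> int:
    """Validate a create_network request; return the number of distinct nodes."""
    if not isinstance(graph_id, str) or not GRAPH_ID_PATTERN.fullmatch(graph_id):
        raise ValueError(f"invalid graph_id: {graph_id!r}")
    if len(edges) > MAX_EDGES:
        raise ValueError(f"networks are limited to {MAX_EDGES} edges")
    nodes = set()
    for edge in edges:
        is_pair = len(edge) == 2
        # bool is a subclass of int in Python, so exclude it explicitly
        all_ints = all(isinstance(x, int) and not isinstance(x, bool) for x in edge)
        if not (is_pair and all_ints):
            raise ValueError(f"each edge must be a pair of integers, got {edge!r}")
        nodes.update(edge)
    if len(nodes) > MAX_NODES:
        raise ValueError(f"networks are limited to {MAX_NODES} nodes")
    return len(nodes)


print(validate_graph_request("bridge", [[1, 2], [2, 3], [3, 4], [4, 5], [1, 3], [3, 5]]))
# 5
```

Calling this at the top of the network-creation tool means oversized or malformed requests fail fast with a message the agent can relay to the user.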

Tool Use with Claude (Anthropic)

Different Provider, Same Concept

We’ve been using OpenAI’s function calling API. Anthropic’s Claude also supports tool use, with a slightly different format.

With PydanticAI, the network analysis agent above already runs on Claude (the run logs show claude-haiku-4-5). The next section explains why no tool code had to change.

PydanticAI: Model-Agnostic Abstraction

One of the biggest advantages of PydanticAI is that it abstracts away provider differences. You write your tools once, and they work with any LLM provider.

Switching Models is Trivial:

# OpenAI
agent = Agent('openai:gpt-4o-mini')

# Anthropic
agent = Agent('anthropic:claude-3-5-sonnet-20241022')

# Google
agent = Agent('google-gla:gemini-1.5-flash')

# OpenAI with different model
agent = Agent('openai:gpt-5')

The same tools work with all of them! PydanticAI handles:

  • Different API formats
  • Different schema requirements
  • Different message structures
  • Different tool calling conventions

Why This Matters:

  1. No vendor lock-in: Switch providers based on performance, cost, or availability
  2. A/B testing: Compare models easily
  3. Fallbacks: If one provider is down, switch to another
  4. Future-proof: New models supported as they’re added to PydanticAI
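The fallback idea in point 3 can be sketched without any provider SDK. In the code below, `run_with_fallback` and `fake_run_model` are hypothetical names; `fake_run_model` stands in for whatever function builds an agent from a model string and runs the prompt, and it simulates an outage so the example runs offline.

```python
from typing import Callable, Sequence


def run_with_fallback(
    prompt: str,
    model_names: Sequence[str],
    run_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model in order; return (model_name, answer) from the first success."""
    errors = []
    for name in model_names:
        try:
            return name, run_model(name, prompt)
        except Exception as exc:  # in practice, catch provider-specific errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_run_model(model_name: str, prompt: str) -> str:
    """Stand-in for a real model call; pretends the OpenAI provider is down."""
    if model_name.startswith("openai:"):
        raise ConnectionError("provider unavailable")
    return f"[{model_name}] answer to: {prompt}"


used, answer = run_with_fallback(
    "What is the density of the 'bridge' network?",
    ["openai:gpt-4o-mini", "anthropic:claude-3-5-sonnet-20241022"],
    fake_run_model,
)
print(used)  # anthropic:claude-3-5-sonnet-20241022
```

Because the tools are defined once and the model is just a string, the same loop also serves for A/B testing: run every model in the list instead of stopping at the first success.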

Under the Hood: Different providers do have different APIs:

OpenAI:

  • Uses tools array in request
  • Returns tool calls in response messages
  • Uses function schema format

Anthropic:

  • Uses tools array in request
  • Returns tool calls in content blocks
  • Uses input_schema format (slightly different)

Google, Mistral, Others:

  • Each has own format and conventions

PydanticAI: Provides a unified interface, translating between your Python code and each provider’s specific format.
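To see what "slightly different" means in practice, the sketch below renders one tool definition in both shapes: the nested `function` object used by OpenAI's Chat Completions API and the flat object with `input_schema` used by Anthropic's Messages API. The helper names are illustrative, the tool description is an assumption, and the parameter names `graph_id` and `node` are taken from the run logs earlier in this lecture. Provider formats change over time, so treat this as a snapshot rather than a reference.

```python
import json


def to_openai_tool(name: str, description: str, parameters: dict) -> dict:
    """OpenAI Chat Completions style: schema nested under a 'function' key."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,
        },
    }


def to_anthropic_tool(name: str, description: str, parameters: dict) -> dict:
    """Anthropic Messages style: flat object, schema under 'input_schema'."""
    return {
        "name": name,
        "description": description,
        "input_schema": parameters,
    }


# The JSON Schema for the arguments is identical for both providers
betweenness_params = {
    "type": "object",
    "properties": {
        "graph_id": {"type": "string", "description": "ID of a stored graph"},
        "node": {"type": "integer", "description": "Node to analyze"},
    },
    "required": ["graph_id", "node"],
}

description = "Calculate the betweenness centrality of one node."
openai_tool = to_openai_tool("calculate_betweenness", description, betweenness_params)
anthropic_tool = to_anthropic_tool("calculate_betweenness", description, betweenness_params)

print(json.dumps(openai_tool, indent=2))
print(json.dumps(anthropic_tool, indent=2))
```

The argument schema itself is shared; only the envelope differs. This is the kind of translation a model-agnostic layer performs on your behalf.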

Exercises

Exercise 1: Game Theory Tools

Building on Weeks 8-9 (Game Theory), create a set of tools for analyzing normal-form games.

Part A: Implement these functions:

  1. create_game(game_id, payoff_matrices) - Create a normal-form game
  2. find_pure_nash_equilibria(game_id) - Find pure strategy Nash equilibria
  3. check_dominant_strategy(game_id, player, strategy) - Check if a strategy is dominant
  4. calculate_expected_payoff(game_id, player, strategy_profile) - Calculate payoffs

Part B: Define JSON schemas for each function

Part C: Create a game theory agent and test it with:

  • Prisoner’s Dilemma
  • Matching Pennies
  • A 3x3 game of your choice

Part D: Compare agent analysis to your own analysis from Week 8. Does the agent identify the same equilibria?

# TODO: Your code here

# Hint: Create an agent and use @agent.tool decorator
# from pydantic_ai import Agent, RunContext

# game_theory_agent = Agent('anthropic:claude-haiku-4-5')

# @game_theory_agent.tool
# def create_game(ctx: RunContext[None], game_id: str, ...):
#     """Create a normal-form game."""
#     pass

Exercise 2: Data Analysis Agent

Create an agent that can analyze datasets using statistical tools.

Part A: Implement these tools:

  1. load_dataset(dataset_id, data) - Load data from array/CSV format
  2. describe_dataset(dataset_id) - Get summary statistics (mean, median, std, etc.)
  3. filter_data(dataset_id, column, condition, value) - Filter rows
  4. aggregate_data(dataset_id, groupby_col, agg_col, operation) - Group and aggregate

Part B: Test with network data from Week 3-5:

  • Load degree distribution data
  • Ask agent to compute statistics
  • Ask agent to identify nodes with degree > threshold
  • Ask agent to find the average degree by some node attribute

Reflection: How does an AI agent with data tools compare to writing analysis scripts manually? What are the trade-offs?

# TODO: Your code here

# You'll want to use pandas
# import pandas as pd
# from pydantic_ai import Agent, RunContext
# from dataclasses import dataclass

# @dataclass
# class DataDeps:
#     datasets: dict[str, pd.DataFrame]

# data_agent = Agent('anthropic:claude-haiku-4-5', deps_type=DataDeps)

Exercise 3: Multi-Tool Reasoning

Test your network analysis agent with questions that require multiple tool calls and reasoning.

Questions to test:

  1. “Create two networks: A with edges [[1,2],[2,3],[3,1]] and B with edges [[1,2],[2,3],[3,4],[4,1]]. Which one has higher clustering?”

  2. “In the social network from earlier, find the shortest path from node 1 to node 5. Then calculate the betweenness centrality of each node on that path. Which node on the path is most ‘bridge-like’?”

  3. “Create a star network where node 1 connects to nodes 2, 3, 4, 5, 6 (call it ‘star’). Calculate the degree centrality of the center node and a peripheral node. What’s the ratio?”

Analysis:

  • How many tool calls did each question require?
  • Did the agent chain them correctly?
  • Were there any errors or surprising behaviors?
  • How did the agent interpret and synthesize results?

# TODO: Test your agent with the questions above

# Example:
# result = run_network_agent("Create two networks...")

Exercise 4: Safety Analysis

Consider the following tool definitions and analyze their safety:

# Tool 1
function run_julia_code(code::String)
    eval(Meta.parse(code))
end

# Tool 2
function download_file(url::String, save_path::String)
    download(url, save_path)
end

# Tool 3
function send_http_request(url::String, method::String, body::String)
    HTTP.request(method, url, body=body)
end

# Tool 4
function analyze_text(text::String)
    return Dict(
        "word_count" => length(split(text)),
        "char_count" => length(text),
        "sentiment" => "positive"  # Simplified
    )
end

For each tool, answer:

  1. What are the security risks?
  2. What attacks could a malicious user attempt?
  3. How would you make it safer?
  4. Should this tool be available to AI agents at all? Why or why not?

Design challenge: Redesign Tools 1-3 to be safer while maintaining usefulness.

Your Analysis:

Tool 1 - run_julia_code:

  • Risks: ...
  • Attacks: ...
  • Safer version: ...

(Continue for other tools)

Connecting to Course Themes

Computational Social Complexity and Tool Use

Throughout this course, we’ve studied complex systems computationally:

Networks (Weeks 3-5):

  • We analyzed network structure with Graphs.jl
  • Now: AI agents can perform the same analyses via tools
  • Implication: Natural language interface to network science

Agent-Based Models (Weeks 6-7):

  • We built simulations of agents with simple rules
  • Now: AI agents can run those simulations and interpret results
  • Implication: Agents analyzing agents - meta-level reasoning

Game Theory (Weeks 8-9):

  • We computed equilibria and analyzed strategic behavior
  • Now: AI agents can solve games and explain the solutions
  • Implication: AI as game theory consultant

Blockchains (Weeks 11-12):

  • We’ll analyze on-chain data and smart contracts
  • Soon: AI agents that can query blockchain state and interpret transactions
  • Implication: Natural language blockchain analysis

The Bigger Picture: Computational Assistants

What we’ve built in this lecture is a computational assistant:

  • Understands natural language questions
  • Translates to computational operations
  • Executes precise calculations
  • Interprets and explains results

This is qualitatively different from chatbots:

  • Chatbots: Generate plausible text
  • Computational assistants: Return results computed by real code, which can be checked

The key is the tool layer - it grounds the AI in actual computation.

Emergence Revisited

Remember our recurring theme of emergence:

  • Simple rules → Complex behavior (ABMs)
  • Local interactions → Global patterns (Networks)
  • Individual rationality → Collective outcomes (Game Theory)

Tool use adds another dimension:

  • Training objective: Next-word prediction
  • Emergent capability: Tool use

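Next-word prediction never explicitly taught LLMs to “use tools”. Much of the underlying ability comes from:

  • Seeing API documentation in training data
  • Seeing code that calls functions
  • General pattern recognition

This is emergence at the model capability level. Current production models are additionally fine-tuned on tool-calling formats, which makes the behavior reliable enough to build on.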

What This Enables for Research

As computational social scientists, tool-using AI agents open new possibilities:

1. Exploratory Data Analysis

  • “Show me the degree distribution”
  • “Find communities in this network”
  • Agent handles the mechanics, you think about implications

2. Hypothesis Testing

  • “Is there a correlation between centrality and outcome?”
  • Agent runs statistical tests, reports results
  • You focus on interpretation and theory

3. Simulation and Experimentation

  • “Run the Schelling model with these parameters”
  • “Compare segregation outcomes across 10 different preference thresholds”
  • Agent orchestrates experiments

4. Reproducible Research

  • Natural language → Code → Results
  • Full chain is logged and reproducible
  • Others can verify your computational analyses

5. Education and Dissemination

  • Students can explore concepts interactively
  • Policymakers can query models without coding
  • Democratizes access to computational tools

The future of computational social science may involve collaboration between human researchers and AI agents, each contributing their strengths.

Summary

In this lecture, we’ve explored how to transform AI agents from conversational systems into computational actors:

Implemented function calling with modern LLM APIs (OpenAI and Anthropic)

Designed JSON schemas to describe tool interfaces to AI agents

Built a network analysis toolkit exposing networkx functions to agents

Understood the Model Context Protocol as a standard for tool interoperability

Analyzed safety considerations for tool use and code execution

Created agents that compute, not just converse - executing real code

Key Takeaways:

  1. Function calling bridges language and computation - agents can DO things, not just describe them

  2. JSON schemas are the interface language - clear descriptions enable agents to use tools correctly

  3. MCP provides standardization - write tools once, use with any AI application

  4. Safety is paramount - unrestricted tool use is dangerous, design with security in mind

  5. Multi-step reasoning emerges - agents chain tool calls to solve complex problems

  6. Domain expertise encoded as tools - computational social science becomes accessible via natural language

  7. Precision matters - tool use returns computed results, not the model's estimates

Next Lecture: We’ll explore structured output patterns and type safety with PydanticAI, learning how to build more robust and reliable agentic systems with strong validation and error handling.

Further Reading

Function Calling and Tool Use:

  • Schick et al. (2023) “Toolformer: Language Models Can Teach Themselves to Use Tools” arXiv:2302.04761
  • Qin et al. (2023) “Tool Learning with Foundation Models” arXiv:2304.08354
  • Patil et al. (2023) “Gorilla: Large Language Model Connected with Massive APIs” arXiv:2305.15334

Agentic Systems:

  • Wang et al. (2024) “A Survey on Large Language Model Based Autonomous Agents” arXiv:2308.11432
  • Xi et al. (2023) “The Rise and Potential of Large Language Model Based Agents” arXiv:2309.07864