How to Build Type-Safe LLM Agents Using Pydantic AI: A Step-by-Step Guide


Introduction

Building agents powered by large language models (LLMs) often means dealing with unpredictable, unstructured text. Pydantic AI changes that by letting you define strict output schemas using Pydantic models, so every response is automatically validated and type-safe. If you're comfortable with FastAPI or Pydantic, you'll feel right at home—just declare schemas with Python type hints and let the framework handle the rest. This guide walks you through creating your own type-safe LLM agent from scratch, covering everything from setup to advanced features like dependency injection and validation retries.

Source: realpython.com

What You Need

Before you start, make sure you have the following:

- A recent version of Python (3.9+) with pip available
- An API key for your chosen LLM provider (for example, OpenAI)
- Basic familiarity with Pydantic models and Python type hints

Step 1: Install and Configure Pydantic AI

Start by installing the library along with your chosen LLM provider's SDK. For example, to use OpenAI:

pip install pydantic-ai openai

Then set your API key as an environment variable (OPENAI_API_KEY) or provide it directly when initializing the agent. In your Python script, import the necessary components:

from pydantic_ai import Agent, RunContext
from pydantic import BaseModel

Step 2: Define a Structured Output Model

Pydantic AI uses Pydantic’s BaseModel to enforce the shape of the LLM’s response. Create a class with typed fields—this becomes the contract your agent will fulfill:

class UserProfile(BaseModel):
    name: str
    age: int
    email: str
    is_active: bool = True

The agent will always return a validated instance of UserProfile, not a raw string. If the LLM returns invalid data, Pydantic AI can automatically retry the request (you'll configure that in Step 5).
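To see the contract in action before wiring up an agent, you can exercise the model directly with plain Pydantic (a standalone sketch, independent of Pydantic AI):

```python
from pydantic import BaseModel, ValidationError

class UserProfile(BaseModel):
    name: str
    age: int
    email: str
    is_active: bool = True

# Well-formed data validates cleanly and defaults are applied.
profile = UserProfile(name='Alice', age=30, email='alice@example.com')
print(profile.is_active)  # True

# Malformed data raises ValidationError; Pydantic AI catches exactly
# this kind of failure and can re-ask the model instead of handing
# you bad data.
try:
    UserProfile(name='Bob', age='not a number', email='bob@example.com')
except ValidationError as e:
    print(f'rejected: {len(e.errors())} error(s)')
```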

Step 3: Create an Agent with Tools

Instantiate the agent and use the @agent.tool decorator to define functions the LLM can call. Each tool must have a docstring that clearly describes its purpose—this helps the model decide when to use it:

agent = Agent('openai:gpt-4', result_type=UserProfile)

@agent.tool
def get_user_data(ctx: RunContext, user_id: int) -> dict:
    """Retrieve user information from the database by user ID."""
    # Simulated DB lookup
    return {'name': 'Alice', 'age': 30, 'email': 'alice@example.com'}

The ctx parameter gives access to the agent's context (including dependency injection). The LLM will read the tool's docstring and invoke it as needed during the conversation.
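Under the hood, the framework reads each tool's name, signature, and docstring to build the function-calling spec it sends to the model. A minimal stdlib sketch of that extraction (illustrative only, not Pydantic AI's actual implementation):

```python
import inspect

def get_user_data(ctx, user_id: int) -> dict:
    """Retrieve user information from the database by user ID."""
    return {'name': 'Alice', 'age': 30, 'email': 'alice@example.com'}

def tool_spec(fn):
    """Build a rough tool description like those sent to the LLM."""
    sig = inspect.signature(fn)
    params = {
        name: getattr(p.annotation, '__name__', str(p.annotation))
        for name, p in sig.parameters.items()
        if name != 'ctx'  # the context is injected, never exposed to the model
    }
    return {
        'name': fn.__name__,
        'description': inspect.getdoc(fn),
        'parameters': params,
    }

spec = tool_spec(get_user_data)
print(spec['name'])         # get_user_data
print(spec['description'])  # the docstring the model reads
```

This is why a clear, specific docstring matters: it is effectively the tool's documentation for the model.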

Step 4: Inject Dependencies for Runtime Context

To avoid global state (like database connections), use dependency injection via deps_type. Define a dependency class and pass an instance when you run the agent:

class MyDeps(BaseModel):
    db_connection: str  # Or an actual connection object

agent = Agent('openai:gpt-4', deps_type=MyDeps, result_type=UserProfile)

@agent.tool
def query_database(ctx: RunContext[MyDeps], query: str) -> list:
    """Execute a SQL query and return results."""
    connection = ctx.deps.db_connection
    # Use connection to run query...
    return [{'row': 1}]

# Later, when running the agent:
agent.run_sync('Fetch user 42', deps=MyDeps(db_connection='sqlite:///mydb.db'))

This keeps your tools clean, testable, and free of hidden dependencies.
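One payoff of this pattern is testability: because tools read everything from ctx.deps, you can hand them a fake dependency in tests. A stdlib-only sketch of the idea (the FakeContext class and names here are illustrative stand-ins, not Pydantic AI APIs):

```python
from dataclasses import dataclass

@dataclass
class MyDeps:
    db_connection: str

@dataclass
class FakeContext:
    """Stand-in for the context object Pydantic AI passes to tools."""
    deps: MyDeps

def query_database(ctx, query: str) -> list:
    """Execute a SQL query and return results."""
    connection = ctx.deps.db_connection
    # In a real tool, run the query over `connection`; here we echo it.
    return [{'connection': connection, 'query': query}]

# In a test, inject a throwaway in-memory database instead of production.
ctx = FakeContext(deps=MyDeps(db_connection='sqlite:///:memory:'))
rows = query_database(ctx, 'SELECT 1')
print(rows[0]['connection'])  # sqlite:///:memory:
```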


Step 5: Enable Automatic Validation Retries

Pydantic AI can automatically re-ask the LLM when the returned data doesn't match your schema. By default, the agent retries once if validation fails. You can control this with the retries parameter:

agent = Agent('openai:gpt-4', retries=3, result_type=UserProfile)

While retries improve robustness, keep in mind that each failed attempt still counts toward your API usage—and cost. Tune this value based on your tolerance for errors versus budget.
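Conceptually, the retry mechanism is a loop: call the model, validate, and on failure feed the error back and ask again. A simplified stdlib sketch of that control flow (a toy model function stands in for the LLM; this is not Pydantic AI's internal code):

```python
def flaky_model(prompt: str, attempt: int) -> str:
    """Toy stand-in for an LLM: returns bad output until the third try."""
    return '42' if attempt >= 3 else 'forty-two'

def run_with_retries(prompt: str, retries: int = 3) -> int:
    last_error = None
    for attempt in range(1, retries + 1):
        raw = flaky_model(prompt, attempt)
        try:
            return int(raw)  # stands in for schema validation
        except ValueError as e:
            # Real frameworks append the validation error to the
            # conversation so the model can correct itself next time.
            last_error = e
    raise RuntimeError(f'validation failed after {retries} attempts: {last_error}')

print(run_with_retries('What is 6 * 7?'))  # 42
```

Note that every attempt in this loop corresponds to a billed API call, which is why the retry budget is worth tuning.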

Step 6: Choose the Best LLM Provider for Structured Outputs

Not all models handle structured outputs equally well, so consult Pydantic AI's model documentation and community benchmarks when choosing.

For production, start with a model known for reliable tool calling and structured output. You can switch providers by simply changing the model string in the Agent constructor.

Wrapping Up

By following these steps, you can build reliable LLM agents that return clean, validated data, with no more parsing of messy strings. Revisit the steps above to reinforce the workflow, or experiment with more complex tools and dependencies.
