For the past few years, the dominant paradigm in Generative AI development has been "string-in, string-out." Developers have spent countless hours crafting intricate prompts, battling with regular expressions to parse JSON from Large Language Model (LLM) responses, and writing defensive code to handle hallucinations. While libraries like LangChain provided abstractions to manage this chaos, they often introduced their own layers of complexity and cognitive overhead. As we move from experimental scripts to production-grade orchestration, the ecosystem is demanding a more robust approach.

Enter PydanticAI. Built by the team behind the ubiquitous Pydantic validation library, this framework represents a fundamental shift in how we architect AI agents. It does not treat the LLM as a magical black box that emits text; rather, it treats the LLM as a component in a strictly typed pipeline. By leveraging Python's native type hinting system, PydanticAI brings the reliability of standard software engineering to the probabilistic world of AI. In this post, we will explore how to build type-safe agents, manage dependency injection, and enforce structured outputs without the bloat of legacy frameworks.

The Case for Type Safety in Agentic Workflows

The primary friction point in building AI agents is the mismatch between the probabilistic nature of LLMs and the deterministic nature of software systems. Your database expects a strict integer, but your LLM might return "The ID is 42." Traditional prompt engineering attempts to coerce the model into compliance via natural language instructions. However, this is fragile. A model update or a slight variation in input can break the parser.

PydanticAI solves this by treating schemas as first-class citizens. Instead of asking the model for JSON and hoping for the best, you define the desired output structure using Pydantic models. The framework handles the schema generation, the prompt injection, and—crucially—the validation loop. If the model returns data that doesn't match the type signature, PydanticAI can automatically feed the validation error back to the model, allowing it to self-correct. This creates a closed-loop system where the output is guaranteed to match your application's internal data structures.
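
To preview what that loop looks like in code (the Invoice model below is illustrative, and we cover result_type in detail later in this post), the only thing you write is the constrained schema; the correction behaviour belongs to the framework:

from pydantic import BaseModel, Field
from pydantic_ai import Agent

# Illustrative schema: these constraints become part of the contract.
class Invoice(BaseModel):
    invoice_id: int                              # "The ID is 42" would fail validation here
    total: float = Field(gt=0)                   # negative totals are rejected
    currency: str = Field(pattern="^[A-Z]{3}$")  # three-letter currency codes only

# If the model emits data that violates the schema, PydanticAI feeds the
# validation error back and asks it to try again; `retries` caps how many
# correction rounds are attempted before the error is raised to your code.
invoice_agent = Agent('openai:gpt-4o', result_type=Invoice, retries=2)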

Setting Up Your First Type-Safe Agent

Let us begin by establishing a foundational agent. Unlike other frameworks that require complex chain definitions or graph compilations, PydanticAI allows you to instantiate an agent as a simple Python object. The framework is model-agnostic, supporting OpenAI, Anthropic, Gemini, and others via a unified interface.

In the example below, we will create a simple agent. Note the use of the Agent class. We bind it to a specific model and provide a system prompt. The interaction feels native to Python developers because it relies on standard async/await patterns and type hints.

import asyncio
from pydantic_ai import Agent

# Define the agent with a specific model
# We are using GPT-4o here, but this could easily be Claude 3.5 Sonnet or Llama 3
agent = Agent(
    'openai:gpt-4o',
    system_prompt='You are a helpful assistant that answers questions concisely.',
)

async def main():
    # Run the agent with a simple string query
    result = await agent.run('What is the capital of France?')
    
    # The result object contains the raw data, usage stats, and message history
    print(result.data)
    # Output: The capital of France is Paris.

if __name__ == '__main__':
    asyncio.run(main())

While this looks simple, the power lies in what happens under the hood. The result object is not just a string; it is a rich object containing the conversation history and cost metrics. However, returning strings is rarely the end goal in production. We need structured data.
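
As a quick illustration (the accessor names below reflect recent pydantic-ai releases and may differ slightly between versions), the run result exposes the exchanged messages and token usage alongside the data:

async def inspect_result():
    result = await agent.run('What is the capital of France?')

    print(result.data)                 # the validated payload (a plain string here)
    print(result.usage())              # token usage for the run
    for message in result.all_messages():
        print(message)                 # full request/response history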

Enforcing Structured Outputs with Pydantic Models

The true power of PydanticAI emerges when we demand structured responses. In a traditional workflow, you might ask for a JSON object representing a user profile. In PydanticAI, you define a UserProfile class inheriting from BaseModel and pass it to the agent as its result_type, making that model the agent's return type.

This approach eliminates the need for output parsers. The framework instructs the LLM (often using the provider's native "tool calling" or "JSON mode" capabilities) to adhere to the schema. If the LLM generates a string for an integer field, Pydantic validation kicks in. If the validation fails, PydanticAI intercepts the exception and prompts the LLM to retry, referencing the specific validation error.

Here is how we extract complex entities from unstructured text:

from typing import List
from pydantic import BaseModel, Field
from pydantic_ai import Agent

# Define the schema for the output
class Ingredient(BaseModel):
    name: str
    quantity: str
    optional: bool = False

class Recipe(BaseModel):
    title: str
    difficulty: int = Field(description="1 to 5 scale")
    ingredients: List[Ingredient]
    steps: List[str]
    total_time_minutes: int

# Configure the agent to always return a Recipe object
recipe_agent = Agent(
    'openai:gpt-4o',
    result_type=Recipe,
    system_prompt="You are a master chef. Generate structured recipes based on user requests."
)

async def generate_recipe():
    # We ask for a recipe in natural language
    result = await recipe_agent.run('I want to make a spicy vegetarian taco dinner.')
    
    # The result.data is guaranteed to be an instance of Recipe
    recipe = result.data
    
    print(f"Title: {recipe.title} (Difficulty: {recipe.difficulty}/5)")
    for ing in recipe.ingredients:
        print(f" - {ing.quantity} {ing.name} {'(Optional)' if ing.optional else ''}")
        
    # Because it is a Pydantic model, we can easily dump it to JSON
    # print(recipe.model_dump_json(indent=2))

# Output will be a strictly typed Recipe object, not a dict or string.

This pattern transforms the LLM from a text generator into a data generator. The Field descriptions are passed to the model as metadata, helping guide the generation process. This creates a "contract" between your prompt and your code.
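
Because the output is an ordinary Pydantic model, downstream code can depend on it directly. The helper below is hypothetical, but it type-checks against Recipe like any other function:

def build_shopping_list(recipe: Recipe) -> List[str]:
    """Hypothetical downstream consumer working on the validated Recipe."""
    return [f"{ing.quantity} {ing.name}" for ing in recipe.ingredients if not ing.optional]

# result.data is a Recipe, so this call is verified by mypy/pyright and
# autocompleted by your IDE -- no isinstance checks or dict key lookups needed.
# shopping = build_shopping_list(result.data)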

Dependency Injection: The Architecture of Robust Agents

One of the most challenging aspects of building production agents is managing state and external dependencies. In many frameworks, developers resort to global variables or passing massive context dictionaries through every function. PydanticAI introduces a proper Dependency Injection (DI) system inspired by `FastAPI`.

The concept is simple: you define a generic type for your dependencies (e.g., a database connection, a user session object, or API keys). This dependency is injected into the agent at runtime. Any tool or validator attached to the agent can access these dependencies via the RunContext.

This is critical for testing. You can inject a mock database during unit tests and a real `AsyncPG` connection pool in production, without changing the agent's logic. It also ensures that your tools remain stateless and thread-safe.

Implementing Type-Safe Dependencies

Let's build an agent that requires access to a user's banking context to perform actions. We will define a BankContext class and inject it into a tool.

from dataclasses import dataclass
from pydantic_ai import Agent, RunContext

# 1. Define the Dependency Structure
@dataclass
class BankContext:
    user_id: str
    account_balance: float
    db_connection_str: str

# 2. Define the Agent with the expected dependency type
# We specify `deps_type=BankContext` so the static analyzer knows what to expect
banking_agent = Agent(
    'openai:gpt-4o',
    deps_type=BankContext,
    system_prompt="You are a banking assistant. Use tools to check balances and transfer funds."
)

# 3. Register a tool that uses the dependency
@banking_agent.tool
async def check_balance(ctx: RunContext[BankContext]) -> str:
    """Check the current account balance."""
    # Access dependencies type-safely via ctx.deps
    user = ctx.deps.user_id
    balance = ctx.deps.account_balance
    
    # In a real app, we would use ctx.deps.db_connection_str to query a DB
    return f"User {user} has a balance of ${balance:.2f}"

@banking_agent.tool
async def transfer_funds(ctx: RunContext[BankContext], amount: float, to_account: str) -> str:
    """Transfer funds to another account."""
    if amount > ctx.deps.account_balance:
        return "Transaction Failed: Insufficient funds."
    
    new_balance = ctx.deps.account_balance - amount
    # Simulate DB update
    return f"Success. Transferred ${amount} to {to_account}. Remaining: ${new_balance}"

async def run_banking_flow():
    # 4. Instantiate dependencies at runtime
    context = BankContext(
        user_id="user_123", 
        account_balance=5000.00, 
        db_connection_str="postgres://..."
    )
    
    # 5. Inject dependencies into the run call
    query = "Transfer $200 to account 'saving_01' and tell me what is left."
    result = await banking_agent.run(query, deps=context)
    
    print(result.data)

# The agent uses the tools, which use the injected context, to resolve the query.

In this example, the `check_balance` function does not need to know where the balance comes from globally; it only cares about the `ctx` passed to it. This isolation makes your agents modular and highly testable.
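
To make the testing claim concrete, here is a sketch of a unit test (using pytest with an async test plugin) that injects a fake BankContext and stubs out the LLM with pydantic-ai's TestModel via Agent.override; the final assertion is illustrative:

import pytest
from pydantic_ai.models.test import TestModel

@pytest.mark.anyio
async def test_banking_agent_with_fake_context():
    fake_context = BankContext(
        user_id="test_user",
        account_balance=10.00,
        db_connection_str="sqlite://:memory:",   # no real database required
    )

    # TestModel replaces the real LLM, so the test is fast and deterministic;
    # it exercises the registered tools with generated arguments.
    with banking_agent.override(model=TestModel()):
        result = await banking_agent.run("What is my balance?", deps=fake_context)

    assert result.data  # illustrative: TestModel produces synthetic responses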

Advanced Tool Usage and Schema Engineering

While we touched on tools in the previous section, it is worth diving deeper into how PydanticAI handles tool arguments. When an LLM decides to call a tool, it generates arguments based on the function signature. PydanticAI validates these arguments against the Python type hints of the function before the function is actually executed.

If you have a tool that takes a `datetime` object, PydanticAI will ensure the string provided by the LLM is parsed into a valid Python `datetime` object. If the LLM provides an ambiguous date, validation fails, and the agent asks the LLM to clarify, all without your core logic crashing.
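
For example, a tool can declare a plain datetime parameter and let the framework handle coercion. The tool below is illustrative and reuses the banking agent from the previous section:

from datetime import datetime

@banking_agent.tool
async def schedule_payment(ctx: RunContext[BankContext], amount: float, when: datetime) -> str:
    """Schedule a payment for a future date and time."""
    # By the time we get here, `when` is a real datetime object; an unparseable
    # value would have been bounced back to the LLM as a validation error.
    return f"Scheduled ${amount:.2f} from {ctx.deps.user_id} for {when.isoformat()}"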

Furthermore, you can use Pydantic models as arguments for your tools. This allows for complex, multi-parameter tool calls that are validated as a single unit.

from datetime import date
from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext

class HotelSearch(BaseModel):
    location: str
    check_in: date
    check_out: date
    guests: int = Field(ge=1, le=10)
    room_type: str = Field(pattern="^(standard|deluxe|suite)$")

travel_agent = Agent('openai:gpt-4o', system_prompt="Book hotels for clients.")

@travel_agent.tool
async def find_hotels(ctx: RunContext, search_criteria: HotelSearch) -> str:
    """Finds hotels based on complex search criteria."""
    # By the time code reaches here, search_criteria is a valid HotelSearch object.
    # logic to query API...
    return f"Found 3 hotels in {search_criteria.location} for {search_criteria.guests} guests."

# When the LLM calls this tool, it must construct a JSON matching HotelSearch.
# PydanticAI handles the parsing and validation logic automatically.

Streaming Structured Data

In modern UI/UX for AI, latency is a killer. Users expect to see the response appearing token by token. However, streaming structured data (like JSON) is notoriously difficult because a partial JSON string is usually invalid. You cannot parse the JSON until the stream finishes, which defeats the purpose of streaming.

PydanticAI provides mechanisms for handling streaming responses, even when validation is involved. Streaming a Pydantic model output directly is harder, so for structured outputs the framework validates the final payload; for standard chat interfaces, you can stream text as it is generated with the `run_stream` method while the agent maintains its internal context.

async def stream_response():
    async with agent.run_stream('Tell me a long story about a brave knight.') as result:
        # stream_text(delta=True) yields only the newly generated text on each iteration
        async for chunk in result.stream_text(delta=True):
            print(chunk, end='', flush=True)

For structured streaming, the library is evolving towards partial validation, where fields in a Pydantic model can be yielded as they become valid. This is the frontier of agentic UX, moving away from loading spinners to real-time data painting.

Integrating with the Python Ecosystem

The choice of Pydantic as the backbone of this framework is strategic. Pydantic is already the standard for data validation in the Python ecosystem, underpinning FastAPI, Hugging Face, and thousands of other libraries. By using PydanticAI, you are reducing the cognitive load on your team. There is no new Domain Specific Language (DSL) to learn, no complex graph definition syntax, and no proprietary data structures.

Everything is standard Python. The models are `pydantic.BaseModel`. The async patterns are `asyncio`. The typing is standard `typing`. This means your IDE's autocompletion works out of the box, static analysis tools like `mypy` or `pyright` can catch errors before you run code, and onboarding new developers is significantly faster.
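
As a small illustration, result.data from the recipe agent defined earlier is typed as Recipe, so a type checker flags mistakes before any LLM call ever happens (the commented lines are deliberate errors):

async def typed_usage_demo() -> int:
    result = await recipe_agent.run('A quick weeknight pasta, please.')

    total = result.data.total_time_minutes + 10   # fine: int + int
    # result.data.servings                        # mypy/pyright error: no such field
    # result.data.difficulty + "easy"             # mypy/pyright error: int + str
    return total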

Why This Matters for Production

When moving from a prototype in a Jupyter Notebook to a deployed microservice, the concerns shift from "does it work?" to "is it maintainable?" and "is it observable?". PydanticAI's structured approach inherently improves observability. Because every input and output is a typed schema, you can easily log these objects to structured logging platforms (like Datadog or Sentry) without sanitizing unstructured text.

Furthermore, the dependency injection system ensures that your agents fit neatly into modular architectures. You can define an agent in one module and inject different database adapters or API clients depending on whether the agent is running in a development, staging, or production environment.
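
A minimal sketch of that pattern (the environment variable names and the factory function are hypothetical):

import os

def build_bank_context() -> BankContext:
    """Hypothetical factory: pick real or local dependencies per environment."""
    if os.environ.get("APP_ENV") == "production":
        return BankContext(
            user_id=os.environ["BANK_USER_ID"],
            account_balance=0.0,                  # would be loaded from the real database
            db_connection_str=os.environ["DATABASE_URL"],
        )
    # Development and staging fall back to safe local values.
    return BankContext(
        user_id="dev_user",
        account_balance=100.00,
        db_connection_str="sqlite://:memory:",
    )

# The agent itself never changes; only the injected context does:
# result = await banking_agent.run("What is my balance?", deps=build_bank_context())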