Version Notice

This documentation is ahead of the last release by 18 commits. You may see documentation for features not yet supported in the latest release v0.0.21 2025-01-30.

Agents

Introduction

Agents are PydanticAI's primary interface for interacting with LLMs.

In some use cases a single Agent will control an entire application or component, but multiple agents can also interact to embody more complex workflows.

The Agent class has full API documentation, but conceptually you can think of an agent as a container for:

Component	Description
System prompt(s)	A set of instructions for the LLM written by the developer.
Function tool(s)	Functions that the LLM may call to get information while generating a response.
Structured result type	The structured datatype the LLM must return at the end of a run, if specified.
Dependency type constraint	System prompt functions, tools, and result validators may all use dependencies when they're run.
LLM model	Optional default LLM model associated with the agent. Can also be specified when running the agent.
Model Settings	Optional default model settings to help fine tune requests. Can also be specified when running the agent.

In typing terms, agents are generic in their dependency and result types: for example, an agent that requires dependencies of type Foobar and returns results of type list[str] has type Agent[Foobar, list[str]]. In practice, you shouldn't need to care about this; it just means your IDE can tell you when you have the right type, and static type checking, if you choose to use it, should work well with PydanticAI.

Here's a toy example of an agent that simulates a roulette wheel:

roulette_wheel.py


from pydantic_ai import Agent, RunContext

roulette_agent = Agent(  
    'openai:gpt-4o',
    deps_type=int,
    result_type=bool,
    system_prompt=(
        'Use the `roulette_wheel` function to see if the '
        'customer has won based on the number they provide.'
    ),
)


@roulette_agent.tool
async def roulette_wheel(ctx: RunContext[int], square: int) -> str:  
    """check if the square is a winner"""
    return 'winner' if square == ctx.deps else 'loser'


# Run the agent
success_number = 18  
result = roulette_agent.run_sync('Put my money on square eighteen', deps=success_number)
print(result.data)  
#> True

result = roulette_agent.run_sync('I bet five is the winner', deps=success_number)
print(result.data)
#> False
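To make the generic typing described earlier more concrete: the roulette agent above is declared with deps_type=int and result_type=bool, so in typing terms it is an Agent[int, bool]. The sketch below uses only the standard library to show how a class that is generic in two type parameters lets a type checker connect the dependency type to the result type. ToyAgent, check_square, and the canned "model" logic are hypothetical stand-ins for illustration, not PydanticAI's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

DepsT = TypeVar('DepsT')
ResultT = TypeVar('ResultT')


@dataclass
class ToyAgent(Generic[DepsT, ResultT]):
    """Hypothetical stand-in, generic in deps and result types like Agent."""

    handler: Callable[[DepsT, str], ResultT]

    def run_sync(self, prompt: str, deps: DepsT) -> ResultT:
        # The return type follows from the ResultT the instance was declared with.
        return self.handler(deps, prompt)


def check_square(winning_square: int, prompt: str) -> bool:
    # A canned "model": the bet wins if the prompt mentions the winning square.
    return str(winning_square) in prompt


roulette: ToyAgent[int, bool] = ToyAgent(check_square)
won = roulette.run_sync('Put my money on square 18', deps=18)
lost = roulette.run_sync('I bet 5 is the winner', deps=18)
print(won, lost)
```

With this annotation in place, a static type checker would be expected to flag a call such as roulette.run_sync('...', deps='18'), because the dependency is declared as int.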

Agents are designed for reuse, like FastAPI Apps

Agents are intended to be instantiated once (frequently as module globals) and reused throughout your application, similar to a small FastAPI app or an APIRouter.

Running Agents

There are three ways to run an agent:

agent.run() — a coroutine which returns a RunResult containing a completed response
agent.run_sync() — a plain, synchronous function which returns a RunResult containing a completed response (internally, this just calls loop.run_until_complete(self.run()))
agent.run_stream() — a coroutine which returns a StreamedRunResult, which contains methods to stream a response as an async iterable

Here's a simple example demonstrating all three:

run_agent.py


from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

result_sync = agent.run_sync('What is the capital of Italy?')
print(result_sync.data)
#> Rome


async def main():
    result = await agent.run('What is the capital of France?')
    print(result.data)
    #> Paris

    async with agent.run_stream('What is the capital of the UK?') as response:
        print(await response.get_data())
        #> London

(This example is complete and can be run "as is"; you'll need to add asyncio.run(main()) to run main.)
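The description of run_sync() as a thin wrapper that calls loop.run_until_complete(self.run()) can be illustrated with a standard-library sketch. ToyAgent and its echo behaviour are hypothetical; the point is only the relationship between a coroutine method and a synchronous wrapper around it, and the choice to create a private event loop here is an assumption made for the sketch.

```python
import asyncio


class ToyAgent:
    """Hypothetical stand-in for an agent; not PydanticAI's Agent class."""

    async def run(self, prompt: str) -> str:
        # Pretend to wait on a network call to a model.
        await asyncio.sleep(0)
        return f'echo: {prompt}'

    def run_sync(self, prompt: str) -> str:
        # Drive the coroutine to completion on a private event loop.
        loop = asyncio.new_event_loop()
        try:
            return loop.run_until_complete(self.run(prompt))
        finally:
            loop.close()


sync_answer = ToyAgent().run_sync('hello')
async_answer = asyncio.run(ToyAgent().run('hello'))
print(sync_answer)
```

A wrapper written this way cannot be called from code that is already running inside an event loop on the same thread, since asyncio refuses to run one loop while another is running. In async code, awaiting run() directly is the natural choice.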

You can also pass messages from previous runs to continue a conversation or provide context, as described in Messages and Chat History.

Additional Configuration

Usage Limits

PydanticAI offers a UsageLimits structure to help you limit your usage (tokens and/or requests) on model runs.

You can apply these settings by passing the usage_limits argument to the run{_sync,_stream} functions.

Consider the following example, where we limit the number of response tokens:


from pydantic_ai import Agent
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits

agent = Agent('anthropic:claude-3-5-sonnet-latest')

result_sync = agent.run_sync(
    'What is the capital of Italy? Answer with just the city.',
    usage_limits=UsageLimits(response_tokens_limit=10),
)
print(result_sync.data)
#> Rome
print(result_sync.usage())
"""
Usage(requests=1, request_tokens=62, response_tokens=1, total_tokens=63, details=None)
"""

try:
    result_sync = agent.run_sync(
        'What is the capital of Italy? Answer with a paragraph.',
        usage_limits=UsageLimits(response_tokens_limit=10),
    )
except UsageLimitExceeded as e:
    print(e)
    #> Exceeded the response_tokens_limit of 10 (response_tokens=32)

Restricting the number of requests can be useful in preventing infinite loops or excessive tool calling:


from typing_extensions import TypedDict

from pydantic_ai import Agent, ModelRetry
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits


class NeverResultType(TypedDict):
    """
    Never ever coerce data to this type.
    """

    never_use_this: str


agent = Agent(
    'anthropic:claude-3-5-sonnet-latest',
    retries=3,
    result_type=NeverResultType,
    system_prompt='Any time you get a response, call the `infinite_retry_tool` to produce another response.',
)


@agent.tool_plain(retries=5)  
def infinite_retry_tool() -> int:
    raise ModelRetry('Please try again.')


try:
    result_sync = agent.run_sync(
        'Begin infinite retry loop!', usage_limits=UsageLimits(request_limit=3)  
    )
except UsageLimitExceeded as e:
    print(e)
    #> The next request would exceed the request_limit of 3

Note

This is especially relevant if you've registered a lot of tools: request_limit can be used to prevent the model from choosing to make too many of these calls.
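As a rough illustration of why a request limit stops a runaway tool loop, the following standard-library sketch counts requests and refuses the one that would exceed the budget. RequestBudget and LimitExceeded are hypothetical names standing in for UsageLimits and UsageLimitExceeded; this is not how PydanticAI implements the check, only a minimal model of the idea.

```python
from dataclasses import dataclass


class LimitExceeded(Exception):
    """Hypothetical error, standing in for UsageLimitExceeded."""


@dataclass
class RequestBudget:
    request_limit: int
    requests: int = 0

    def check_before_request(self) -> None:
        # Refuse the request if making it would exceed the limit.
        if self.requests + 1 > self.request_limit:
            raise LimitExceeded(
                f'The next request would exceed the request_limit of {self.request_limit}'
            )
        self.requests += 1


budget = RequestBudget(request_limit=3)
completed = 0
message = ''
try:
    while True:  # a tool loop that never produces a final answer
        budget.check_before_request()
        completed += 1
except LimitExceeded as e:
    message = str(e)
    print(message)
```

Checking before each request, rather than after, means the loop ends without spending the request that would have gone over the limit.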

Model (Run) Settings

PydanticAI offers a settings.ModelSettings structure to help you fine tune your requests. This structure allows you to configure common parameters that influence the model's behavior, such as temperature, max_tokens, timeout, and more.

There are two ways to apply these settings:

1. Passing to run{_sync,_stream} functions via the model_settings argument. This allows for fine-tuning on a per-request basis.
2. Setting during Agent initialization via the model_settings argument. These settings will be applied by default to all subsequent run calls using said agent. However, model_settings provided during a specific run call will override the agent's default settings.

For example, if you'd like to set the temperature setting to 0.0 to ensure less random behavior, you can do the following:


from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

result_sync = agent.run_sync(
    'What is the capital of Italy?', model_settings={'temperature': 0.0}
)
print(result_sync.data)
#> Rome
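The rule that run-level model_settings override the agent's defaults can be pictured as a dictionary merge. The sketch below assumes the override is applied key by key, so that settings not mentioned in the run call keep their agent-level values; that per-key behaviour is an assumption of this sketch, and merge_settings is a hypothetical helper rather than a PydanticAI function.

```python
from typing import Optional


def merge_settings(agent_defaults: Optional[dict], run_settings: Optional[dict]) -> dict:
    """Combine two settings dicts; keys in run_settings win (assumed per-key merge)."""
    merged = dict(agent_defaults or {})
    merged.update(run_settings or {})
    return merged


agent_defaults = {'temperature': 0.7, 'max_tokens': 256}
effective = merge_settings(agent_defaults, {'temperature': 0.0})
print(effective)
```

Here the run call lowers temperature to 0.0 while max_tokens keeps the agent-level value, and the agent's own defaults are left unmodified for later runs.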

Model specific settings

If you wish to further customize model behavior, you can use a subclass of ModelSettings, like AnthropicModelSettings, associated with your model of choice.

For example:


from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModelSettings

agent = Agent('anthropic:claude-3-5-sonnet-latest')

result_sync = agent.run_sync(
    'What is the capital of Italy?',
    model_settings=AnthropicModelSettings(anthropic_metadata={'user_id': 'my_user_id'}),
)
print(result_sync.data)
#> Rome

Runs vs. Conversations

An agent run might represent an entire conversation — there's no limit to how many messages can be exchanged in a single run. However, a conversation might also be composed of multiple runs, especially if you need to maintain state between separate interactions or API calls.

Here's an example of a conversation composed of multiple runs:

conversation_example.py


from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

# First run
result1 = agent.run_sync('Who was Albert Einstein?')
print(result1.data)
#> Albert Einstein was a German-born theoretical physicist.

# Second run, passing previous messages
result2 = agent.run_sync(
    'What was his most famous equation?',
    message_history=result1.new_messages(),  
)
print(result2.data)
#> Albert Einstein's most famous equation is (E = mc^2).

(This example is complete and can be run "as is".)

Type safe by design

PydanticAI is designed to work well with static type checkers, like mypy and pyright.

Typing is (somewhat) optional

PydanticAI is designed to make type checking as useful as possible for you if you choose to use it, but you don't have to use types everywhere all the time.

That said, because PydanticAI uses Pydantic, and Pydantic uses type hints as the definition for schema and validation, some types (specifically type hints on parameters to tools, and the result_type argument to Agent) are used at runtime.

We (the library developers) have messed up if type hints are confusing you more than helping you. If you find this, please create an issue explaining what's annoying you!

In particular, agents are generic in both the type of their dependencies and the type of results they return, so you can use the type hints to ensure you're using the right types.

Consider the following script with type mistakes:

type_mistakes.py


from dataclasses import dataclass

from pydantic_ai import Agent, RunContext


@dataclass
class User:
    name: str


agent = Agent(
    'test',
    deps_type=User,  
    result_type=bool,
)


@agent.system_prompt
def add_user_name(ctx: RunContext[str]) -> str:  
    return f"The user's name is {ctx.deps}."


def foobar(x: bytes) -> None:
    pass


result = agent.run_sync('Does their name start with "A"?', deps=User('Anne'))
foobar(result.data)

Running mypy on this will give the following output:


➤ uv run mypy type_mistakes.py
type_mistakes.py:18: error: Argument 1 to "system_prompt" of "Agent" has incompatible type "Callable[[RunContext[str]], str]"; expected "Callable[[RunContext[User]], str]"  [arg-type]
type_mistakes.py:28: error: Argument 1 to "foobar" has incompatible type "bool"; expected "bytes"  [arg-type]
Found 2 errors in 1 file (checked 1 source file)

Running pyright would identify the same issues.

System Prompts

System prompts might seem simple at first glance since they're just strings (or sequences of strings that are concatenated), but crafting the right system prompt is key to getting the model to behave as you want.

Generally, system prompts fall into two categories:

Static system prompts: These are known when writing the code and can be defined via the system_prompt parameter of the Agent constructor.
Dynamic system prompts: These depend in some way on context that isn't known until runtime, and should be defined via functions decorated with @agent.system_prompt.

You can add both to a single agent; at runtime, they're appended in the order in which they were defined.

Here's an example using both types of system prompts:

system_prompts.py


from datetime import date

from pydantic_ai import Agent, RunContext

agent = Agent(
    'openai:gpt-4o',
    deps_type=str,  
    system_prompt="Use the customer's name while replying to them.",  
)


@agent.system_prompt  
def add_the_users_name(ctx: RunContext[str]) -> str:
    return f"The user's name is {ctx.deps}."


@agent.system_prompt
def add_the_date() -> str:  
    return f'The date is {date.today()}.'


result = agent.run_sync('What is the date?', deps='Frank')
print(result.data)
#> Hello Frank, the date today is 2032-01-02.

(This example is complete and can be run "as is".)
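The ordering of system prompts can be modelled with a small registry that records prompt providers in the order they are defined and evaluates them when a run starts. PromptCollector is a hypothetical standard-library stand-in, not PydanticAI's Agent, and a fixed date is used in place of date.today() so the output is reproducible.

```python
from datetime import date


class PromptCollector:
    """Hypothetical registry that keeps prompt providers in definition order."""

    def __init__(self, *static_prompts: str) -> None:
        # Static prompts are known up front; wrap each so every entry is callable.
        self._providers = [lambda deps, text=p: text for p in static_prompts]

    def system_prompt(self, func):
        # Decorator: register a dynamic prompt function, evaluated at run time.
        self._providers.append(func)
        return func

    def build(self, deps: str) -> list:
        return [provider(deps) for provider in self._providers]


collector = PromptCollector("Use the customer's name while replying to them.")


@collector.system_prompt
def add_the_users_name(deps: str) -> str:
    return f"The user's name is {deps}."


@collector.system_prompt
def add_the_date(deps: str) -> str:
    # Fixed date for a deterministic sketch.
    return f'The date is {date(2032, 1, 2)}.'


prompts = collector.build('Frank')
print(prompts)
```

Because the static prompt is supplied when the collector is constructed, it is registered before either decorated function, which is why it comes first in the resulting list.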

Reflection and self-correction

Validation errors from both function tool parameter validation and structured result validation can be passed back to the model with a request to retry.

You can also raise ModelRetry from within a tool or result validator function to tell the model it should retry generating a response.

The default retry count is 1 but can be altered for the entire agent, a specific tool, or a result validator.
You can access the current retry count from within a tool or result validator via ctx.retry.
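Before the library example, the retry mechanism itself can be sketched with the standard library: a tool raises a retry exception, the message is fed back, and the caller tries again until the retry budget is spent. Retry, RetriesExceeded, call_with_retries, and the list of candidate arguments (standing in for a model revising its tool call) are all hypothetical names for this sketch; it models the behaviour described above, not PydanticAI's internals.

```python
class Retry(Exception):
    """Hypothetical stand-in for ModelRetry."""


class RetriesExceeded(Exception):
    """Hypothetical stand-in for the error raised when retries run out."""


def call_with_retries(tool, candidate_args, max_retries=1):
    """Call tool with successive arguments, as if a model revised its call."""
    feedback = []
    for retry, arg in enumerate(candidate_args):
        try:
            return tool(arg), feedback
        except Retry as exc:
            if retry >= max_retries:
                raise RetriesExceeded(
                    f'Tool exceeded max retries count of {max_retries}'
                ) from exc
            # The retry message is what would be passed back to the model.
            feedback.append(str(exc))
    raise RetriesExceeded('No more candidate arguments')


USERS = {'John Doe': 123}


def get_user_by_name(name: str) -> int:
    if name not in USERS:
        raise Retry(f'No user found with name {name!r}, remember to provide their full name')
    return USERS[name]


user_id, feedback = call_with_retries(get_user_by_name, ['John', 'John Doe'], max_retries=2)
print(user_id, feedback)
```

With max_retries=1, one failed call is tolerated and a second failure raises the error, which mirrors a default retry count of 1.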

Here's an example:

tool_retry.py


from pydantic import BaseModel

from pydantic_ai import Agent, RunContext, ModelRetry

from fake_database import DatabaseConn


class ChatResult(BaseModel):
    user_id: int
    message: str


agent = Agent(
    'openai:gpt-4o',
    deps_type=DatabaseConn,
    result_type=ChatResult,
)


@agent.tool(retries=2)
def get_user_by_name(ctx: RunContext[DatabaseConn], name: str) -> int:
    """Get a user's ID from their full name."""
    print(name)
    #> John
    #> John Doe
    user_id = ctx.deps.users.get(name=name)
    if user_id is None:
        raise ModelRetry(
            f'No user found with name {name!r}, remember to provide their full name'
        )
    return user_id


result = agent.run_sync(
    'Send a message to John Doe asking for coffee next week', deps=DatabaseConn()
)
print(result.data)
"""
user_id=123 message='Hello John, would you be free for coffee sometime next week? Let me know what works for you!'
"""

Model errors

If models behave unexpectedly (e.g., the retry limit is exceeded, or their API returns 503), agent runs will raise UnexpectedModelBehavior.

In these cases, capture_run_messages can be used to access the messages exchanged during the run to help diagnose the issue.


from pydantic_ai import Agent, ModelRetry, UnexpectedModelBehavior, capture_run_messages

agent = Agent('openai:gpt-4o')


@agent.tool_plain
def calc_volume(size: int) -> int:  
    if size == 42:
        return size**3
    else:
        raise ModelRetry('Please try again.')


with capture_run_messages() as messages:  
    try:
        result = agent.run_sync('Please get me the volume of a box with size 6.')
    except UnexpectedModelBehavior as e:
        print('An error occurred:', e)
        #> An error occurred: Tool exceeded max retries count of 1
        print('cause:', repr(e.__cause__))
        #> cause: ModelRetry('Please try again.')
        print('messages:', messages)
        """
        messages:
        [
            ModelRequest(
                parts=[
                    UserPromptPart(
                        content='Please get me the volume of a box with size 6.',
                        timestamp=datetime.datetime(...),
                        part_kind='user-prompt',
                    )
                ],
                kind='request',
            ),
            ModelResponse(
                parts=[
                    ToolCallPart(
                        tool_name='calc_volume',
                        args={'size': 6},
                        tool_call_id=None,
                        part_kind='tool-call',
                    )
                ],
                model_name='function:model_logic',
                timestamp=datetime.datetime(...),
                kind='response',
            ),
            ModelRequest(
                parts=[
                    RetryPromptPart(
                        content='Please try again.',
                        tool_name='calc_volume',
                        tool_call_id=None,
                        timestamp=datetime.datetime(...),
                        part_kind='retry-prompt',
                    )
                ],
                kind='request',
            ),
            ModelResponse(
                parts=[
                    ToolCallPart(
                        tool_name='calc_volume',
                        args={'size': 6},
                        tool_call_id=None,
                        part_kind='tool-call',
                    )
                ],
                model_name='function:model_logic',
                timestamp=datetime.datetime(...),
                kind='response',
            ),
        ]
        """
    else:
        print(result.data)

(This example is complete and can be run "as is".)

Note

If you call run, run_sync, or run_stream more than once within a single capture_run_messages context, messages will represent the messages exchanged during the first call only.
