
Conceptual guide

This section contains introductions to key parts of LangChain.

Architecture

LangChain as a framework consists of a number of packages.

langchain-core

This package contains base abstractions of different components and ways to compose them together. The interfaces for core components like LLMs, vector stores, retrievers and more are defined here. No third party integrations are defined here. The dependencies are kept purposefully very lightweight.

Partner packages

While the long tail of integrations is in langchain-community, we split popular integrations into their own packages (e.g. langchain-openai, langchain-anthropic, etc). This was done in order to improve support for these important integrations.

langchain

The main langchain package contains chains, agents, and retrieval strategies that make up an application's cognitive architecture. These are NOT third party integrations. All chains, agents, and retrieval strategies here are NOT specific to any one integration, but rather generic across all integrations.

langchain-community

This package contains third party integrations that are maintained by the LangChain community. Key partner packages are separated out (see below). This contains all integrations for various components (LLMs, vector stores, retrievers). All dependencies in this package are optional to keep the package as lightweight as possible.

langgraph

langgraph is an extension of langchain aimed at building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.

LangGraph exposes high level interfaces for creating common types of agents, as well as a low-level API for composing custom flows.

langserve

A package to deploy LangChain chains as REST APIs. Makes it easy to get a production ready API up and running.

LangSmith

A developer platform that lets you debug, test, evaluate, and monitor LLM applications.

Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers.

LangChain Expression Language (LCEL)

LangChain Expression Language, or LCEL, is a declarative way to chain LangChain components. LCEL was designed from day 1 to support putting prototypes in production, with no code changes, from the simplest “prompt + LLM” chain to the most complex chains (we’ve seen folks successfully run LCEL chains with 100s of steps in production). To highlight a few of the reasons you might want to use LCEL:

First-class streaming support When you build your chains with LCEL you get the best possible time-to-first-token (time elapsed until the first chunk of output comes out). For some chains this means, for example, that we stream tokens straight from an LLM to a streaming output parser, and you get back parsed, incremental chunks of output at the same rate as the LLM provider outputs the raw tokens.

Async support Any chain built with LCEL can be called both with the synchronous API (eg. in your Jupyter notebook while prototyping) as well as with the asynchronous API (eg. in a LangServe server). This enables using the same code for prototypes and in production, with great performance, and the ability to handle many concurrent requests in the same server.

Optimized parallel execution Whenever your LCEL chains have steps that can be executed in parallel (eg if you fetch documents from multiple retrievers) we automatically do it, both in the sync and the async interfaces, for the smallest possible latency.

Retries and fallbacks Configure retries and fallbacks for any part of your LCEL chain. This is a great way to make your chains more reliable at scale. We’re currently working on adding streaming support for retries/fallbacks, so you can get the added reliability without any latency cost.

Access intermediate results For more complex chains it’s often very useful to access the results of intermediate steps even before the final output is produced. This can be used to let end-users know something is happening, or even just to debug your chain. You can stream intermediate results, and it’s available on every LangServe server.

Input and output schemas Input and output schemas give every LCEL chain Pydantic and JSONSchema schemas inferred from the structure of your chain. This can be used for validation of inputs and outputs, and is an integral part of LangServe.

Seamless LangSmith tracing As your chains get more and more complex, it becomes increasingly important to understand what exactly is happening at every step. With LCEL, all steps are automatically logged to LangSmith for maximum observability and debuggability.

LCEL aims to provide consistency around behavior and customization over legacy subclassed chains such as LLMChain and ConversationalRetrievalChain. Many of these legacy chains hide important details like prompts, and as a wider variety of viable models emerge, customization has become more and more important.

If you are currently using one of these legacy chains, please see this guide for guidance on how to migrate.

For guides on how to do specific tasks with LCEL, check out the relevant how-to guides.
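
As a minimal sketch of what LCEL composition looks like in practice (the model name and the langchain-anthropic dependency are illustrative choices; any chat model can be substituted):

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_anthropic import ChatAnthropic

# Each component is a Runnable; the | operator composes them into a single chain.
prompt = ChatPromptTemplate.from_template("Tell me a short joke about {topic}")
model = ChatAnthropic(model="claude-3-sonnet-20240229")
parser = StrOutputParser()

chain = prompt | model | parser

# The composed chain supports the same standard interface as its parts.
chain.invoke({"topic": "bears"})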

Runnable interface

To make it as easy as possible to create custom chains, we've implemented a "Runnable" protocol. Many LangChain components implement the Runnable protocol, including chat models, LLMs, output parsers, retrievers, prompt templates, and more. There are also several useful primitives for working with runnables, which you can read about below.

This is a standard interface, which makes it easy to define custom chains as well as invoke them in a standard way. The standard interface includes:

  • stream: stream back chunks of the response
  • invoke: call the chain on an input
  • batch: call the chain on a list of inputs

These also have corresponding async methods that should be used with asyncio await syntax for concurrency:

  • astream: stream back chunks of the response async
  • ainvoke: call the chain on an input async
  • abatch: call the chain on a list of inputs async
  • astream_log: stream back intermediate steps as they happen, in addition to the final response
  • astream_events: beta stream events as they happen in the chain (introduced in langchain-core 0.1.14)
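
For illustration, assuming chain is any runnable (such as the LCEL chain sketched above), these methods are used roughly as follows:

# `chain` is assumed to be any runnable, e.g. the prompt | model | parser chain above.
chain.invoke({"topic": "cats"})                      # one input -> one output
chain.batch([{"topic": "cats"}, {"topic": "dogs"}])  # list of inputs -> list of outputs

for chunk in chain.stream({"topic": "cats"}):        # iterate over chunks as they arrive
    print(chunk, end="", flush=True)

# The async variants are awaited inside an async function, e.g.:
# result = await chain.ainvoke({"topic": "cats"})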

The input type and output type varies by component:

Component    | Input Type                                             | Output Type
Prompt       | Dictionary                                             | PromptValue
ChatModel    | Single string, list of chat messages or a PromptValue  | ChatMessage
LLM          | Single string, list of chat messages or a PromptValue  | String
OutputParser | The output of an LLM or ChatModel                      | Depends on the parser
Retriever    | Single string                                          | List of Documents
Tool         | Single string or dictionary, depending on the tool     | Depends on the tool

All runnables expose input and output schemas to inspect the inputs and outputs:

  • input_schema: an input Pydantic model auto-generated from the structure of the Runnable
  • output_schema: an output Pydantic model auto-generated from the structure of the Runnable
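
For example, assuming the same hypothetical chain as above, the inferred schemas can be inspected like this (sketched with the Pydantic v1-style .schema() call that the 0.2 runnables expose):

# Pydantic models describing what the chain accepts and returns
print(chain.input_schema.schema())   # JSON Schema for the chain's input
print(chain.output_schema.schema())  # JSON Schema for the chain's output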

Components

LangChain provides standard, extendable interfaces and external integrations for various components useful for building with LLMs. Some components LangChain implements, some components we rely on third-party integrations for, and others are a mix.

Chat models

Language models that use a sequence of messages as inputs and return chat messages as outputs (as opposed to using plain text). These are traditionally newer models (older models are generally LLMs, see below). Chat models support the assignment of distinct roles to conversation messages, helping to distinguish messages from the AI, users, and instructions such as system messages.

Although the underlying models are messages in, message out, the LangChain wrappers also allow these models to take a string as input. This means you can easily use chat models in place of LLMs.

When a string is passed in as input, it is converted to a HumanMessage and then passed to the underlying model.

LangChain does not host any Chat Models, rather we rely on third party integrations.

We have some standardized parameters when constructing ChatModels:

  • model: the name of the model
  • temperature: the sampling temperature
  • timeout: request timeout
  • max_tokens: max tokens to generate
  • stop: default stop sequences
  • max_retries: max number of times to retry requests
  • api_key: API key for the model provider
  • base_url: endpoint to send requests to

Some important things to note:

  • standard params only apply to model providers that expose parameters with the intended functionality. For example, some providers do not expose a configuration for maximum output tokens, so max_tokens can't be supported on these.
  • standard params are currently only enforced on integrations that have their own integration packages (e.g. langchain-openai, langchain-anthropic, etc.), they're not enforced on models in langchain-community.

ChatModels also accept other parameters that are specific to that integration. To find all the parameters supported by a ChatModel, head to the API reference for that model.

info

Some chat models have been fine-tuned for tool calling and provide a dedicated API for it. Generally, such models are better at tool calling than non-fine-tuned models, and are recommended for use cases that require tool calling. Please see the tool calling section for more information.

For specifics on how to use chat models, see the relevant how-to guides here.

Multimodality

Some chat models are multimodal, accepting images, audio and even video as inputs. These are still less common, meaning model providers haven't standardized on the "best" way to define the API. Multimodal outputs are even less common. As such, we've kept our multimodal abstractions fairly light weight and plan to further solidify the multimodal APIs and interaction patterns as the field matures.

In LangChain, most chat models that support multimodal inputs also accept those values in OpenAI's content blocks format. So far this is restricted to image inputs. For models like Gemini which support video and other bytes input, the APIs also support the native, model-specific representations.

For specifics on how to use multimodal models, see the relevant how-to guides here.

For a full list of LangChain model providers with multimodal models, check out this table.

LLMs

caution

Pure text-in/text-out LLMs tend to be older or lower-level. Many popular models are best used as chat completion models, even for non-chat use cases.

You are probably looking for the section above instead.

Language models that take a string as input and return a string. These are traditionally older models (newer models generally are Chat Models, see above).

Although the underlying models are string in, string out, the LangChain wrappers also allow these models to take messages as input. This gives them the same interface as Chat Models. When messages are passed in as input, they will be formatted into a string under the hood before being passed to the underlying model.

LangChain does not host any LLMs, rather we rely on third party integrations.

For specifics on how to use LLMs, see the relevant how-to guides here.

Messages

Some language models take a list of messages as input and return a message. There are a few different types of messages. All messages have a role, content, and response_metadata property.

The role describes WHO is saying the message. LangChain has different message classes for different roles.

The content property describes the content of the message. This can be a few different things:

  • A string (most models deal with this type of content)
  • A List of dictionaries (this is used for multimodal input, where the dictionary contains information about that input type and that input location)
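
For example, a HumanMessage can carry either kind of content; the image URL below is a placeholder:

from langchain_core.messages import HumanMessage

# Plain string content
text_message = HumanMessage(content="Describe this image.")

# Multimodal content as a list of content blocks (OpenAI-style format)
multimodal_message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "https://example.com/picture.png"}},
    ]
)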

HumanMessage

This represents a message from the user.

AIMessage

This represents a message from the model. In addition to the content property, these messages also have:

response_metadata

The response_metadata property contains additional metadata about the response. The data here is often specific to each model provider. This is where information like log-probs and token usage may be stored.

tool_calls

These represent a decision from a language model to call a tool. They are included as part of an AIMessage output. They can be accessed from there with the .tool_calls property.

This property returns a list of ToolCalls. A ToolCall is a dictionary with the following arguments:

  • name: The name of the tool that should be called.
  • args: The arguments to that tool.
  • id: The id of that tool call.

SystemMessage

This represents a system message, which tells the model how to behave. Not every model provider supports this.

ToolMessage

This represents the result of a tool call. In addition to role and content, this message has:

  • a tool_call_id field which conveys the id of the call to the tool that was called to produce this result.
  • an artifact field which can be used to pass along arbitrary artifacts of the tool execution which are useful to track but which should not be sent to the model.

(Legacy) FunctionMessage

This is a legacy message type, corresponding to OpenAI's legacy function-calling API. ToolMessage should be used instead to correspond to the updated tool-calling API.

This represents the result of a function call. In addition to role and content, this message has a name parameter which conveys the name of the function that was called to produce this result.

Prompt templates

Prompt templates help to translate user input and parameters into instructions for a language model. This can be used to guide a model's response, helping it understand the context and generate relevant and coherent language-based output.

Prompt Templates take as input a dictionary, where each key represents a variable in the prompt template to fill in.

Prompt Templates output a PromptValue. This PromptValue can be passed to an LLM or a ChatModel, and can also be cast to a string or a list of messages. The reason this PromptValue exists is to make it easy to switch between strings and messages.

There are a few different types of prompt templates:

String PromptTemplates

These prompt templates are used to format a single string, and generally are used for simpler inputs. For example, a common way to construct and use a PromptTemplate is as follows:

from langchain_core.prompts import PromptTemplate

prompt_template = PromptTemplate.from_template("Tell me a joke about {topic}")

prompt_template.invoke({"topic": "cats"})
API Reference:PromptTemplate

ChatPromptTemplates

These prompt templates are used to format a list of messages. These "templates" consist of a list of templates themselves. For example, a common way to construct and use a ChatPromptTemplate is as follows:

from langchain_core.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    ("user", "Tell me a joke about {topic}")
])

prompt_template.invoke({"topic": "cats"})
API Reference:ChatPromptTemplate

In the above example, this ChatPromptTemplate will construct two messages when called. The first is a system message, that has no variables to format. The second is a HumanMessage, and will be formatted by the topic variable the user passes in.

MessagesPlaceholder

This prompt template is responsible for adding a list of messages in a particular place. In the above ChatPromptTemplate, we saw how we could format two messages, each one a string. But what if we wanted the user to pass in a list of messages that we would slot into a particular spot? This is how you use MessagesPlaceholder.

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage

prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    MessagesPlaceholder("msgs")
])

prompt_template.invoke({"msgs": [HumanMessage(content="hi!")]})

This will produce a list of two messages, the first one being a system message, and the second one being the HumanMessage we passed in. If we had passed in 5 messages, then it would have produced 6 messages in total (the system message plus the 5 passed in). This is useful for letting a list of messages be slotted into a particular spot.

An alternative way to accomplish the same thing without using the MessagesPlaceholder class explicitly is:

prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    ("placeholder", "{msgs}") # <-- This is the changed part
])

For specifics on how to use prompt templates, see the relevant how-to guides here.

Example selectors

One common prompting technique for achieving better performance is to include examples as part of the prompt. This gives the language model concrete examples of how it should behave. Sometimes these examples are hardcoded into the prompt, but for more advanced situations it may be nice to dynamically select them. Example Selectors are classes responsible for selecting and then formatting examples into prompts.

For specifics on how to use example selectors, see the relevant how-to guides here.

Output parsers

note

The information here refers to parsers that take a text output from a model and try to parse it into a more structured representation. More and more models are supporting function (or tool) calling, which handles this automatically. It is recommended to use function/tool calling rather than output parsing. See documentation for that here.

Responsible for taking the output of a model and transforming it to a more suitable format for downstream tasks. Useful when you are using LLMs to generate structured data, or to normalize output from chat models and LLMs.

LangChain has lots of different types of output parsers. This is a list of output parsers LangChain supports. The table below has various pieces of information:

Name: The name of the output parser

Supports Streaming: Whether the output parser supports streaming.

Has Format Instructions: Whether the output parser has format instructions. This is generally available except when (a) the desired schema is not specified in the prompt but rather in other parameters (like OpenAI function calling), or (b) when the OutputParser wraps another OutputParser.

Calls LLM: Whether this output parser itself calls an LLM. This is usually only done by output parsers that attempt to correct misformatted output.

Input Type: Expected input type. Most output parsers work on both strings and messages, but some (like OpenAI Functions) need a message with specific kwargs.

Output Type: The output type of the object returned by the parser.

Description: Our commentary on this output parser and when to use it.

Name            | Input Type     | Output Type        | Description
JSON            | str or Message | JSON object        | Returns a JSON object as specified. You can specify a Pydantic model and it will return JSON for that model. Probably the most reliable output parser for getting structured data that does NOT use function calling.
XML             | str or Message | dict               | Returns a dictionary of tags. Use when XML output is needed. Use with models that are good at writing XML (like Anthropic's).
CSV             | str or Message | List[str]          | Returns a list of comma separated values.
OutputFixing    | str or Message |                    | Wraps another output parser. If that output parser errors, then this will pass the error message and the bad output to an LLM and ask it to fix the output.
RetryWithError  | str or Message |                    | Wraps another output parser. If that output parser errors, then this will pass the original inputs, the bad output, and the error message to an LLM and ask it to fix it. Compared to OutputFixingParser, this one also sends the original instructions.
Pydantic        | str or Message | pydantic.BaseModel | Takes a user defined Pydantic model and returns data in that format.
YAML            | str or Message | pydantic.BaseModel | Takes a user defined Pydantic model and returns data in that format. Uses YAML to encode it.
PandasDataFrame | str or Message | dict               | Useful for doing operations with pandas DataFrames.
Enum            | str or Message | Enum               | Parses response into one of the provided enum values.
Datetime        | str or Message | datetime.datetime  | Parses response into a datetime string.
Structured      | str or Message | Dict[str, str]     | An output parser that returns structured information. It is less powerful than other output parsers since it only allows for fields to be strings. This can be useful when you are working with smaller LLMs.
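
As a small illustration of the JSON parser (the Joke schema and prompt wording here are just examples):

from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field

class Joke(BaseModel):
    setup: str = Field(description="the setup of the joke")
    punchline: str = Field(description="the punchline of the joke")

parser = JsonOutputParser(pydantic_object=Joke)

prompt = PromptTemplate.from_template(
    "Answer the user query.\n{format_instructions}\n{query}"
).partial(format_instructions=parser.get_format_instructions())

# chain = prompt | model | parser
# chain.invoke({"query": "Tell me a joke."})  # -> a dict matching the Joke schema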

For specifics on how to use output parsers, see the relevant how-to guides here.

Chat history

Most LLM applications have a conversational interface. An essential component of a conversation is being able to refer to information introduced earlier in the conversation. At bare minimum, a conversational system should be able to access some window of past messages directly.

The concept of ChatHistory refers to a class in LangChain which can be used to wrap an arbitrary chain. This ChatHistory will keep track of inputs and outputs of the underlying chain, and append them as messages to a message database. Future interactions will then load those messages and pass them into the chain as part of the input.
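
One common way to use this is the RunnableWithMessageHistory wrapper; the sketch below assumes chain is a runnable whose prompt has an "input" variable and a MessagesPlaceholder named "history":

from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}  # maps a session id to that session's message history

def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

chain_with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)

chain_with_history.invoke(
    {"input": "hi, I'm Bob"},
    config={"configurable": {"session_id": "session-1"}},
)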

Documents

A Document object in LangChain contains information about some data. It has two attributes:

  • page_content: str: The content of this document. Currently is only a string.
  • metadata: dict: Arbitrary metadata associated with this document. Can track the document id, file name, etc.
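
For example:

from langchain_core.documents import Document

doc = Document(
    page_content="LangChain is a framework for building applications with LLMs.",
    metadata={"source": "example.txt", "page": 1},
)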

Document loaders

These classes load Document objects. LangChain has hundreds of integrations with various data sources to load data from: Slack, Notion, Google Drive, etc.

Each DocumentLoader has its own specific parameters, but they can all be invoked in the same way with the .load method. An example use case is as follows:

from langchain_community.document_loaders.csv_loader import CSVLoader

loader = CSVLoader(
    ...  # <-- Integration specific parameters here
)
data = loader.load()
API Reference:CSVLoader

For specifics on how to use document loaders, see the relevant how-to guides here.

Text splitters

Once you've loaded documents, you'll often want to transform them to better suit your application. The simplest example is you may want to split a long document into smaller chunks that can fit into your model's context window. LangChain has a number of built-in document transformers that make it easy to split, combine, filter, and otherwise manipulate documents.

When you want to deal with long pieces of text, it is necessary to split up that text into chunks. As simple as this sounds, there is a lot of potential complexity here. Ideally, you want to keep the semantically related pieces of text together. What "semantically related" means could depend on the type of text. This notebook showcases several ways to do that.

At a high level, text splitters work as follows:

  1. Split the text up into small, semantically meaningful chunks (often sentences).
  2. Start combining these small chunks into a larger chunk until you reach a certain size (as measured by some function).
  3. Once you reach that size, make that chunk its own piece of text and then start creating a new chunk of text with some overlap (to keep context between chunks).

That means there are two different axes along which you can customize your text splitter:

  1. How the text is split
  2. How the chunk size is measured
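
As a sketch, splitting loaded documents with the recursive character splitter might look like this (chunk sizes are illustrative, and docs is assumed to come from a document loader):

from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # maximum chunk size, measured with len() by default
    chunk_overlap=200,  # overlap between consecutive chunks to keep context
)

chunks = text_splitter.split_documents(docs)  # docs: a list of Document objects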

For specifics on how to use text splitters, see the relevant how-to guides here.

Embedding models

Embedding models create a vector representation of a piece of text. You can think of a vector as an array of numbers that captures the semantic meaning of the text. By representing the text in this way, you can perform mathematical operations that allow you to do things like search for other pieces of text that are most similar in meaning. These natural language search capabilities underpin many types of context retrieval, where we provide an LLM with the relevant data it needs to effectively respond to a query.

The Embeddings class is a class designed for interfacing with text embedding models. There are many different embedding model providers (OpenAI, Cohere, Hugging Face, etc) and local models, and this class is designed to provide a standard interface for all of them.

The base Embeddings class in LangChain provides two methods: one for embedding documents and one for embedding a query. The former takes as input multiple texts, while the latter takes a single text. The reason for having these as two separate methods is that some embedding providers have different embedding methods for documents (to be searched over) vs queries (the search query itself).
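
For illustration, using one concrete provider (the langchain-openai package and the model name are assumptions; any Embeddings implementation exposes the same two methods):

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Embed several documents (to be searched over) and a single query.
document_vectors = embeddings.embed_documents(["hello world", "goodbye world"])
query_vector = embeddings.embed_query("hello")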

For specifics on how to use embedding models, see the relevant how-to guides here.

Vector stores

One of the most common ways to store and search over unstructured data is to embed it and store the resulting embedding vectors, and then at query time to embed the unstructured query and retrieve the embedding vectors that are 'most similar' to the embedded query. A vector store takes care of storing embedded data and performing vector search for you.

Most vector stores can also store metadata about embedded vectors and support filtering on that metadata before similarity search, allowing you more control over returned documents.

Vector stores can be converted to the retriever interface by doing:

vectorstore = MyVectorStore()
retriever = vectorstore.as_retriever()
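
A more concrete sketch, assuming the langchain-openai and faiss-cpu packages are installed (any vector store integration follows the same pattern):

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

vectorstore = FAISS.from_texts(
    ["LangChain helps build LLM applications", "FAISS is a similarity search library"],
    embedding=OpenAIEmbeddings(),
)

# Search the store directly...
docs = vectorstore.similarity_search("What is LangChain?", k=1)

# ...or go through the retriever interface described below.
retriever = vectorstore.as_retriever()
docs = retriever.invoke("What is LangChain?")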

For specifics on how to use vector stores, see the relevant how-to guides here.

Retrievers

A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store. A retriever does not need to be able to store documents, only to return (or retrieve) them. Retrievers can be created from vector stores, but are also broad enough to include Wikipedia search and Amazon Kendra.

Retrievers accept a string query as input and return a list of Document's as output.

For specifics on how to use retrievers, see the relevant how-to guides here.

Tools

Tools are utilities designed to be called by a model: their inputs are designed to be generated by models, and their outputs are designed to be passed back to models. Tools are needed whenever you want a model to control parts of your code or call out to external APIs.

A tool consists of the following (a minimal example appears after the list):

  1. The name of the tool.
  2. A description of what the tool does.
  3. A JSON schema defining the inputs to the tool.
  4. A function (and, optionally, an async variant of the function).
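
A minimal sketch using the @tool decorator (the multiply function is a made-up example):

from langchain_core.tools import tool

@tool
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

# The decorator infers the name, description, and input JSON schema from the function.
multiply.name         # "multiply"
multiply.description  # derived from the docstring
multiply.args         # schema for the inputs a and b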

When a tool is bound to a model, the name, description and JSON schema are provided as context to the model. Given a list of tools and a set of instructions, a model can request to call one or more tools with specific inputs. Typical usage may look like the following:

tools = [...] # Define a list of tools
llm_with_tools = llm.bind_tools(tools)
ai_msg = llm_with_tools.invoke("do xyz...") # AIMessage(tool_calls=[ToolCall(...), ...], ...)

The AIMessage returned from the model MAY have tool_calls associated with it. Read this guide for more information on what the response type may look like.

Once the chosen tools are invoked, the results can be passed back to the model so that it can complete whatever task it's performing. There are generally two different ways to invoke the tool and pass back the response:

Invoke with just the arguments

When you invoke a tool with just the arguments, you will get back the raw tool output (usually a string). This generally looks like:

# You will want to previously check that the LLM returned tool calls
tool_call = ai_msg.tool_calls[0] # ToolCall(args={...}, id=..., ...)
tool_output = tool.invoke(tool_call["args"])
tool_message = ToolMessage(content=tool_output, tool_call_id=tool_call["id"], name=tool_call["name"])

Note that the content field will generally be passed back to the model. If you do not want the raw tool response to be passed to the model, but you still want to keep it around, you can transform the tool output but also pass it as an artifact (read more about ToolMessage.artifact here)

... # Same code as above
response_for_llm = transform(response)
tool_message = ToolMessage(content=response_for_llm, tool_call_id=tool_call["id"], name=tool_call["name"], artifact=tool_output)

Invoke with ToolCall

The other way to invoke a tool is to call it with the full ToolCall that was generated by the model. When you do this, the tool will return a ToolMessage. The benefits of this are that you don't have to write the logic yourself to transform the tool output into a ToolMessage. This generally looks like:

tool_call = ai_msg.tool_calls[0]  # ToolCall(args={...}, id=..., ...)
tool_message = tool.invoke(tool_call)
# -> ToolMessage(content="tool result foobar...", tool_call_id=..., name="tool_name")

If you are invoking the tool this way and want to include an artifact for the ToolMessage, you will need to have the tool return two things. Read more about defining tools that return artifacts here.

Best practices

When designing tools to be used by a model, it is important to keep in mind that:

  • Chat models that have explicit tool-calling APIs will be better at tool calling than non-fine-tuned models.
  • Models will perform better if the tools have well-chosen names, descriptions, and JSON schemas. This is another form of prompt engineering.
  • Simple, narrowly scoped tools are easier for models to use than complex tools.

For specifics on how to use tools, see the tools how-to guides.

To use a pre-built tool, see the tool integration docs.

Toolkits

Toolkits are collections of tools that are designed to be used together for specific tasks. They have convenient loading methods.

All Toolkits expose a get_tools method which returns a list of tools. You can therefore do:

# Initialize a toolkit
toolkit = ExampleToolkit(...)

# Get list of tools
tools = toolkit.get_tools()

Agents

By themselves, language models can't take actions - they just output text. A big use case for LangChain is creating agents. Agents are systems that use an LLM as a reasoning engine to determine which actions to take and what the inputs to those actions should be. The results of those actions can then be fed back into the agent, and it determines whether more actions are needed, or whether it is okay to finish.

LangGraph is an extension of LangChain specifically aimed at creating highly controllable and customizable agents. Please check out that documentation for a more in depth overview of agent concepts.

There is a legacy agent concept in LangChain that we are moving towards deprecating: AgentExecutor. AgentExecutor was essentially a runtime for agents. It was a great place to get started, however, it was not flexible enough as you started to have more customized agents. In order to solve that we built LangGraph to be this flexible, highly-controllable runtime.

If you are still using AgentExecutor, do not fear: we still have a guide on how to use AgentExecutor. It is recommended, however, that you start to transition to LangGraph. In order to assist in this we have put together a transition guide on how to do so.

ReAct agents

One popular architecture for building agents is ReAct. ReAct combines reasoning and acting in an iterative process - in fact the name "ReAct" stands for "Reason" and "Act".

The general flow looks like this:

  • The model will "think" about what step to take in response to an input and any previous observations.
  • The model will then choose an action from available tools (or choose to respond to the user).
  • The model will generate arguments to that tool.
  • The agent runtime (executor) will parse out the chosen tool and call it with the generated arguments.
  • The executor will return the results of the tool call back to the model as an observation.
  • This process repeats until the agent chooses to respond.

There are general prompting based implementations that do not require any model-specific features, but the most reliable implementations use features like tool calling to reliably format outputs and reduce variance.

Please see the LangGraph documentation for more information, or this how-to guide for specific information on migrating to LangGraph.

Callbacks

LangChain provides a callbacks system that allows you to hook into the various stages of your LLM application. This is useful for logging, monitoring, streaming, and other tasks.

You can subscribe to these events by using the callbacks argument available throughout the API. This argument is a list of handler objects, which are expected to implement one or more of the methods described below in more detail.

Callback Events

Event            | Event Trigger                               | Associated Method
Chat model start | When a chat model starts                    | on_chat_model_start
LLM start        | When an llm starts                          | on_llm_start
LLM new token    | When an llm OR chat model emits a new token | on_llm_new_token
LLM ends         | When an llm OR chat model ends              | on_llm_end
LLM errors       | When an llm OR chat model errors            | on_llm_error
Chain start      | When a chain starts running                 | on_chain_start
Chain end        | When a chain ends                           | on_chain_end
Chain error      | When a chain errors                         | on_chain_error
Tool start       | When a tool starts running                  | on_tool_start
Tool end         | When a tool ends                            | on_tool_end
Tool error       | When a tool errors                          | on_tool_error
Agent action     | When an agent takes an action               | on_agent_action
Agent finish     | When an agent ends                          | on_agent_finish
Retriever start  | When a retriever starts                     | on_retriever_start
Retriever end    | When a retriever ends                       | on_retriever_end
Retriever error  | When a retriever errors                     | on_retriever_error
Text             | When arbitrary text is run                  | on_text
Retry            | When a retry event is run                   | on_retry

Callback handlers

Callback handlers can either be sync or async:

During run-time LangChain configures an appropriate callback manager (e.g., CallbackManager or AsyncCallbackManager) which will be responsible for calling the appropriate method on each "registered" callback handler when the event is triggered.

Passing callbacks

The callbacks property is available on most objects throughout the API (Models, Tools, Agents, etc.) in two different places:

  • Request time callbacks: Passed at the time of the request in addition to the input data. Available on all standard Runnable objects. These callbacks are INHERITED by all children of the object they are defined on. For example, chain.invoke({"number": 25}, {"callbacks": [handler]}).
  • Constructor callbacks: chain = TheNameOfSomeChain(callbacks=[handler]). These callbacks are passed as arguments to the constructor of the object. The callbacks are scoped only to the object they are defined on, and are not inherited by any children of the object.
danger

Constructor callbacks are scoped only to the object they are defined on. They are not inherited by children of the object.

If you're creating a custom chain or runnable, you need to remember to propagate request time callbacks to any child objects.

Async in Python<=3.10

Any RunnableLambda, a RunnableGenerator, or Tool that invokes other runnables and is running async in python<=3.10, will have to propagate callbacks to child objects manually. This is because LangChain cannot automatically propagate callbacks to child objects in this case.

This is a common reason why you may fail to see events being emitted from custom runnables or tools.

For specifics on how to use callbacks, see the relevant how-to guides here.

Techniques

Streaming 

Individual LLM calls often run for much longer than traditional resource requests. This compounds when you build more complex chains or agents that require multiple reasoning steps.

Fortunately, LLMs generate output iteratively, which means it's possible to show sensible intermediate results before the final response is ready. Consuming output as soon as it becomes available has therefore become a vital part of the UX around building apps with LLMs to help alleviate latency issues, and LangChain aims to have first-class support for streaming.

Below, we'll discuss some concepts and considerations around streaming in LangChain.

.stream() and .astream()

Most modules in LangChain include the .stream() method (and the equivalent .astream() method for async environments) as an ergonomic streaming interface. .stream() returns an iterator, which you can consume with a simple for loop. Here's an example with a chat model:

from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(model="claude-3-sonnet-20240229")

for chunk in model.stream("what color is the sky?"):
    print(chunk.content, end="|", flush=True)
API Reference:ChatAnthropic

For models (or other components) that don't support streaming natively, this iterator would just yield a single chunk, but you could still use the same general pattern when calling them. Using .stream() will also automatically call the model in streaming mode without the need to provide additional config.

The type of each outputted chunk depends on the type of component - for example, chat models yield AIMessageChunks. Because this method is part of LangChain Expression Language, you can handle formatting differences from different outputs using an output parser to transform each yielded chunk.

You can check out this guide for more detail on how to use .stream().

.astream_events()

While the .stream() method is intuitive, it can only return the final generated value of your chain. This is fine for single LLM calls, but as you build more complex chains of several LLM calls together, you may want to use the intermediate values of the chain alongside the final output - for example, returning sources alongside the final generation when building a chat over documents app.

There are ways to do this using callbacks, or by constructing your chain in such a way that it passes intermediate values to the end with something like chained .assign() calls, but LangChain also includes an .astream_events() method that combines the flexibility of callbacks with the ergonomics of .stream(). When called, it returns an iterator which yields various types of events that you can filter and process according to the needs of your project.

Here's one small example that prints just events containing streamed chat model output:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(model="claude-3-sonnet-20240229")

prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
parser = StrOutputParser()
chain = prompt | model | parser

async for event in chain.astream_events({"topic": "parrot"}, version="v2"):
    kind = event["event"]
    if kind == "on_chat_model_stream":
        print(event, end="|", flush=True)

You can roughly think of it as an iterator over callback events (though the format differs) - and you can use it on almost all LangChain components!

See this guide for more detailed information on how to use .astream_events(), including a table listing available events.

Callbacks

The lowest level way to stream outputs from LLMs in LangChain is via the callbacks system. You can pass a callback handler that handles the on_llm_new_token event into LangChain components. When that component is invoked, any LLM or chat model contained in the component calls the callback with the generated token. Within the callback, you could pipe the tokens into some other destination, e.g. a HTTP response. You can also handle the on_llm_end event to perform any necessary cleanup.
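
A minimal sketch of this pattern, assuming chain contains a chat model that streams its output:

from langchain_core.callbacks import BaseCallbackHandler

class PrintTokenHandler(BaseCallbackHandler):
    def on_llm_new_token(self, token: str, **kwargs) -> None:
        # Pipe each generated token somewhere; here we just print it.
        print(token, end="", flush=True)

# Pass the handler at request time; models inside the chain call it as tokens arrive.
chain.invoke({"topic": "parrot"}, config={"callbacks": [PrintTokenHandler()]})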

You can see this how-to section for more specifics on using callbacks.

Callbacks were the first technique for streaming introduced in LangChain. While powerful and generalizable, they can be unwieldy for developers. For example:

  • You need to explicitly initialize and manage some aggregator or other stream to collect results.
  • The execution order isn't explicitly guaranteed, and you could theoretically have a callback run after the .invoke() method finishes.
  • Providers would often make you pass an additional parameter to stream outputs instead of returning them all at once.
  • You would often ignore the result of the actual model call in favor of callback results.

Tokens

The unit that most model providers use to measure input and output is via a unit called a token. Tokens are the basic units that language models read and generate when processing or producing text. The exact definition of a token can vary depending on the specific way the model was trained - for instance, in English, a token could be a single word like "apple", or a part of a word like "app".

When you send a model a prompt, the words and characters in the prompt are encoded into tokens using a tokenizer. The model then streams back generated output tokens, which the tokenizer decodes into human-readable text. The below example shows how OpenAI models tokenize "LangChain is cool!":
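
The sketch below uses OpenAI's tiktoken tokenizer library (a separate dependency) to illustrate; the exact split depends on the encoding a given model uses:

import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # encoding used by many recent OpenAI models
token_ids = encoding.encode("LangChain is cool!")

len(token_ids)                             # -> 5 with this encoding
[encoding.decode([t]) for t in token_ids]  # e.g. ['Lang', 'Chain', ' is', ' cool', '!']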

You can see that it gets split into 5 different tokens, and that the boundaries between tokens are not exactly the same as word boundaries.

The reason language models use tokens rather than something more immediately intuitive like "characters" has to do with how they process and understand text. At a high-level, language models iteratively predict their next generated output based on the initial input and their previous generations. Using tokens allows language models to handle linguistic units (like words or subwords) that carry meaning, rather than individual characters, which makes it easier for the model to learn and understand the structure of the language, including grammar and context. Furthermore, using tokens can also improve efficiency, since the model processes fewer units of text compared to character-level processing.

Function/tool calling

info

We use the term tool calling interchangeably with function calling. Although function calling is sometimes meant to refer to invocations of a single function, we treat all models as though they can return multiple tool or function calls in each message.

Tool calling allows a chat model to respond to a given prompt by generating output that matches a user-defined schema.

While the name implies that the model is performing some action, this is actually not the case! The model only generates the arguments to a tool, and actually running the tool (or not) is up to the user. One common example where you wouldn't want to call a function with the generated arguments is if you want to extract structured output matching some schema from unstructured text. You would give the model an "extraction" tool that takes parameters matching the desired schema, then treat the generated output as your final result.

Diagram of a tool call by a chat model

Tool calling is not universal, but is supported by many popular LLM providers, including Anthropic, Cohere, Google, Mistral, OpenAI, and even for locally-running models via Ollama.

LangChain provides a standardized interface for tool calling that is consistent across different models.

The standard interface consists of:

  • ChatModel.bind_tools(): a method for specifying which tools are available for a model to call. This method accepts LangChain tools as well as Pydantic objects.
  • AIMessage.tool_calls: an attribute on the AIMessage returned from the model for accessing the tool calls requested by the model.

Tool usage

After the model generates tool calls, you can use a tool by invoking it with the generated arguments, then passing the result back to the model. LangChain provides the Tool abstraction to help you handle this.

The general flow is this:

  1. Generate tool calls with a chat model in response to a query.
  2. Invoke the appropriate tools using the generated tool call as arguments.
  3. Format the result of the tool invocations as ToolMessages.
  4. Pass the entire list of messages back to the model so that it can generate a final answer (or call more tools).

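Putting the four steps together, a minimal sketch of this loop might look like the following (a hypothetical get_weather tool and an OpenAI chat model are assumed; any tool-calling model would work):

from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"It is sunny in {city}."  # placeholder implementation

llm_with_tools = ChatOpenAI(model="gpt-4o").bind_tools([get_weather])

messages = [HumanMessage("What's the weather in Paris?")]

# 1. The model generates tool calls in response to the query.
ai_msg = llm_with_tools.invoke(messages)
messages.append(ai_msg)

# 2. & 3. Invoke each requested tool and format the result as a ToolMessage.
for tool_call in ai_msg.tool_calls:
    result = get_weather.invoke(tool_call["args"])
    messages.append(ToolMessage(content=result, tool_call_id=tool_call["id"]))

# 4. Pass the full message list back so the model can produce a final answer.
final_answer = llm_with_tools.invoke(messages)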
Diagram of a complete tool calling flow

This is how tool calling agents perform tasks and answer queries.

Check out some more focused guides below:

Structured output

LLMs are capable of generating arbitrary text. This enables the model to respond appropriately to a wide range of inputs, but for some use-cases, it can be useful to constrain the LLM's output to a specific format or structure. This is referred to as structured output.

For example, if the output is to be stored in a relational database, it is much easier if the model generates output that adheres to a defined schema or format. Extracting specific information from unstructured text is another case where this is particularly useful. Most commonly, the output format will be JSON, though other formats such as YAML can be useful too. Below, we'll discuss a few ways to get structured output from models in LangChain.

.with_structured_output()

For convenience, some LangChain chat models support a .with_structured_output() method. This method only requires a schema as input, and returns a dict or Pydantic object. Generally, this method is only present on models that support one of the more advanced methods described below, and will use one of them under the hood. It takes care of importing a suitable output parser and formatting the schema in the right format for the model.

Here's an example:

from typing import Optional

from langchain_core.pydantic_v1 import BaseModel, Field


class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(description="How funny the joke is, from 1 to 10")

structured_llm = llm.with_structured_output(Joke)

structured_llm.invoke("Tell me a joke about cats")
Joke(setup='Why was the cat sitting on the computer?', punchline='To keep an eye on the mouse!', rating=None)

We recommend this method as a starting point when working with structured output:

  • It uses other model-specific features under the hood, without the need to import an output parser.
  • For the models that use tool calling, no special prompting is needed.
  • If multiple underlying techniques are supported, you can supply a method parameter to toggle which one is used.

You may want or need to use other techniques if:

  • The chat model you are using does not support tool calling.
  • You are working with very complex schemas and the model is having trouble generating outputs that conform.

For more information, check out this how-to guide.

You can also check out this table for a list of models that support with_structured_output().

Raw prompting

The most intuitive way to get a model to structure output is to ask nicely. In addition to your query, you can give instructions describing what kind of output you'd like, then parse the output using an output parser to convert the raw model message or string output into something more easily manipulated.

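As a rough sketch of this approach (an OpenAI chat model is assumed here; any model with sufficient reasoning ability would do), you can describe the desired format in the prompt and parse the response with an output parser:

from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template(
    "Answer the user's question. Respond ONLY with a JSON object containing "
    'an "answer" key and a "followup_question" key, and nothing else.\n\n{question}'
)

# No special model features are used here - the format comes purely from the prompt,
# so the model may occasionally break it.
chain = prompt | ChatOpenAI(model="gpt-4o") | JsonOutputParser()

chain.invoke({"question": "What is the powerhouse of the cell?"})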
The biggest benefit to raw prompting is its flexibility:

  • Raw prompting does not require any special model features, only sufficient reasoning capability to understand the passed schema.
  • You can prompt for any format you'd like, not just JSON. This can be useful if the model you are using is more heavily trained on a certain type of data, such as XML or YAML.

However, there are some drawbacks too:

  • LLMs are non-deterministic, and prompting an LLM to consistently output data in exactly the correct format for smooth parsing can be surprisingly difficult and model-specific.
  • Individual models have quirks depending on the data they were trained on, and optimizing prompts can be quite difficult. Some may be better at interpreting JSON schema, others may be best with TypeScript definitions, and still others may prefer XML.

While features offered by model providers may increase reliability, prompting techniques remain important for tuning your results no matter which method you choose.

JSON mode

Some models, such as Mistral, OpenAI, Together AI and Ollama, support a feature called JSON mode, usually enabled via config.

When enabled, JSON mode will constrain the model's output to always be some sort of valid JSON. It often requires some custom prompting, but it's usually much less burdensome than completely raw prompting and more along the lines of "you must always return JSON". The output is also generally easier to parse.

It's also generally simpler to use and more widely available than tool calling, and can give you more flexibility around prompting and shaping results.

Here's an example:

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain.output_parsers.json import SimpleJsonOutputParser

model = ChatOpenAI(
    model="gpt-4o",
    model_kwargs={"response_format": {"type": "json_object"}},
)

prompt = ChatPromptTemplate.from_template(
    "Answer the user's question to the best of your ability."
    'You must always output a JSON object with an "answer" key and a "followup_question" key.'
    "{question}"
)

chain = prompt | model | SimpleJsonOutputParser()

chain.invoke({"question": "What is the powerhouse of the cell?"})
{'answer': 'The powerhouse of the cell is the mitochondrion. It is responsible for producing energy in the form of ATP through cellular respiration.',
 'followup_question': 'Would you like to know more about how mitochondria produce energy?'}

For a full list of model providers that support JSON mode, see this table.

Tool calling

For models that support it, tool calling can be very convenient for structured output. It removes the guesswork around how best to prompt schemas in favor of a built-in model feature.

It works by first binding the desired schema either directly or via a LangChain tool to a chat model using the .bind_tools() method. The model will then generate an AIMessage containing a tool_calls field containing args that match the desired shape.

There are several acceptable formats you can use to bind tools to a model in LangChain. Here's one example:

from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI

class ResponseFormatter(BaseModel):
    """Always use this tool to structure your response to the user."""

    answer: str = Field(description="The answer to the user's question")
    followup_question: str = Field(description="A followup question the user could ask")

model = ChatOpenAI(
    model="gpt-4o",
    temperature=0,
)

model_with_tools = model.bind_tools([ResponseFormatter])

ai_msg = model_with_tools.invoke("What is the powerhouse of the cell?")

ai_msg.tool_calls[0]["args"]
{'answer': "The powerhouse of the cell is the mitochondrion. It generates most of the cell's supply of adenosine triphosphate (ATP), which is used as a source of chemical energy.",
 'followup_question': 'How do mitochondria generate ATP?'}

API Reference: ChatOpenAI

Tool calling is a generally consistent way to get a model to generate structured output, and is the default technique used for the .with_structured_output() method when a model supports it.

The following how-to guides are good practical resources for using function/tool calling for structured output:

For a full list of model providers that support tool calling, see this table.

Retrieval

LLMs are trained on a large but fixed dataset, limiting their ability to reason over private or recent information. Fine-tuning an LLM with specific facts is one way to mitigate this, but is often poorly suited for factual recall and can be costly. Retrieval is the process of providing relevant information to an LLM to improve its response for a given input. Retrieval augmented generation (RAG) is the process of grounding the LLM generation (output) using the retrieved information.

tip

RAG is only as good as the retrieved documents’ relevance and quality. Fortunately, an emerging set of techniques can be employed to design and improve RAG systems. We've focused on taxonomizing and summarizing many of these techniques (see below figure) and will share some high-level strategic guidance in the following sections. You can and should experiment with using different pieces together. You might also find this LangSmith guide useful for showing how to evaluate different iterations of your app.

Query Translation

First, consider the user input(s) to your RAG system. Ideally, a RAG system can handle a wide range of inputs, from poorly worded questions to complex multi-part queries. Using an LLM to review and optionally modify the input is the central idea behind query translation. This serves as a general buffer, optimizing raw user inputs for your retrieval system. For example, this can be as simple as extracting keywords or as complex as generating multiple sub-questions for a complex query.

| Name | When to use | Description |
| --- | --- | --- |
| Multi-query | When you need to cover multiple perspectives of a question. | Rewrite the user question from multiple perspectives, retrieve documents for each rewritten question, return the unique documents for all queries. |
| Decomposition | When a question can be broken down into smaller subproblems. | Decompose a question into a set of subproblems / questions, which can either be solved sequentially (use the answer from first + retrieval to answer the second) or in parallel (consolidate each answer into final answer). |
| Step-back | When a higher-level conceptual understanding is required. | First prompt the LLM to ask a generic step-back question about higher-level concepts or principles, and retrieve relevant facts about them. Use this grounding to help answer the user question. |
| HyDE | If you have challenges retrieving relevant documents using the raw user inputs. | Use an LLM to convert questions into hypothetical documents that answer the question. Use the embedded hypothetical documents to retrieve real documents with the premise that doc-doc similarity search can produce more relevant matches. |
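For example, a minimal sketch of the multi-query idea using LangChain's MultiQueryRetriever (an existing vector store and an OpenAI chat model are assumed) might look like:

from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI

# `vectorstore` is assumed to be an already-populated vector store (e.g. Chroma, FAISS).
retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(),
    llm=ChatOpenAI(model="gpt-4o", temperature=0),
)

# The LLM rewrites the question from several perspectives, retrieves documents for
# each rewrite, and returns the unique union of all results.
docs = retriever.invoke("How can I improve the quality of RAG retrieval?")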
tip

See our RAG from Scratch videos for a few different specific approaches:

Routing

Second, consider the data sources available to your RAG system. You may want to query across more than one database, or across structured and unstructured data sources. Using an LLM to review the input and route it to the appropriate data source is a simple and effective approach for querying across sources.

| Name | When to use | Description |
| --- | --- | --- |
| Logical routing | When you can prompt an LLM with rules to decide where to route the input. | Logical routing can use an LLM to reason about the query and choose which datastore is most appropriate. |
| Semantic routing | When semantic similarity is an effective way to determine where to route the input. | Semantic routing embeds both the query and, typically, a set of prompts, then chooses the appropriate prompt based upon similarity. |
tip

See our RAG from Scratch video on routing.

Query Construction

Third, consider whether any of your data sources require specific query formats. Many structured databases use SQL. Vector stores often have specific syntax for applying keyword filters to document metadata. Using an LLM to convert a natural language query into a query syntax is a popular and powerful approach. In particular, text-to-SQL, text-to-Cypher, and query analysis for metadata filters are useful ways to interact with structured, graph, and vector databases respectively.

| Name | When to use | Description |
| --- | --- | --- |
| Text to SQL | If users are asking questions that require information housed in a relational database, accessible via SQL. | This uses an LLM to transform user input into a SQL query. |
| Text-to-Cypher | If users are asking questions that require information housed in a graph database, accessible via Cypher. | This uses an LLM to transform user input into a Cypher query. |
| Self Query | If users are asking questions that are better answered by fetching documents based on metadata rather than similarity with the text. | This uses an LLM to transform user input into two things: (1) a string to look up semantically, (2) a metadata filter to go along with it. This is useful because oftentimes questions are about the METADATA of documents (not the content itself). |
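For instance, a minimal text-to-SQL sketch using LangChain's create_sql_query_chain (a hypothetical SQLite database is assumed; point the connection string at your own database) could look like:

from langchain.chains import create_sql_query_chain
from langchain_community.utilities import SQLDatabase
from langchain_openai import ChatOpenAI

# Hypothetical connection string - swap in your own database.
db = SQLDatabase.from_uri("sqlite:///chinook.db")
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# The chain converts a natural language question into a SQL query string.
write_query = create_sql_query_chain(llm, db)
sql = write_query.invoke({"question": "How many employees are there?"})

# You can then execute the generated query against the database.
result = db.run(sql)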
tip

See our blog post overview and RAG from Scratch video on query construction, the process of text-to-DSL where DSL is a domain specific language required to interact with a given database. This converts user questions into structured queries.

Indexing

Fourth, consider the design of your document index. A simple and powerful idea is to decouple the documents that you index for retrieval from the documents that you pass to the LLM for generation. Indexing frequently uses embedding models with vector stores, which compress the semantic information in documents to fixed-size vectors.

Many RAG approaches focus on splitting documents into chunks and retrieving some number of them based on similarity to the input question for the LLM. But chunk size and chunk number can be difficult to set, and they affect results if they do not provide full context for the LLM to answer a question. Furthermore, LLMs are increasingly capable of processing millions of tokens.

Two approaches can address this tension: (1) Multi Vector retriever using an LLM to translate documents into any form (e.g., often into a summary) that is well-suited for indexing, but returns full documents to the LLM for generation. (2) ParentDocument retriever embeds document chunks, but also returns full documents. The idea is to get the best of both worlds: use concise representations (summaries or chunks) for retrieval, but use the full documents for answer generation.

| Name | Index Type | Uses an LLM | When to use | Description |
| --- | --- | --- | --- | --- |
| Vector store | Vector store | No | If you are just getting started and looking for something quick and easy. | This is the simplest method and the one that is easiest to get started with. It involves creating embeddings for each piece of text. |
| ParentDocument | Vector store + Document Store | No | If your pages have lots of smaller pieces of distinct information that are best indexed by themselves, but best retrieved all together. | This involves indexing multiple chunks for each document. Then you find the chunks that are most similar in embedding space, but you retrieve the whole parent document and return that (rather than individual chunks). |
| Multi Vector | Vector store + Document Store | Sometimes during indexing | If you are able to extract information from documents that you think is more relevant to index than the text itself. | This involves creating multiple vectors for each document. Each vector could be created in a myriad of ways - examples include summaries of the text and hypothetical questions. |
| Time-Weighted Vector store | Vector store | No | If you have timestamps associated with your documents, and you want to retrieve the most recent ones. | This fetches documents based on a combination of semantic similarity (as in normal vector retrieval) and recency (looking at timestamps of indexed documents). |
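A minimal sketch of the simplest option, a plain vector store index (OpenAI embeddings and FAISS are assumed; `docs` is a list of already-loaded Documents):

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split the loaded documents into chunks before embedding them.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)

# Embed each chunk and index it in the vector store.
vectorstore = FAISS.from_documents(chunks, OpenAIEmbeddings())
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})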
Fifth, consider ways to improve the quality of your similarity search itself. Embedding models compress text into fixed-length (vector) representations that capture the semantic content of the document. This compression is useful for search / retrieval, but puts a heavy burden on that single vector representation to capture the semantic nuance / detail of the document. In some cases, irrelevant or redundant content can dilute the semantic usefulness of the embedding.

ColBERT is an interesting approach to address this with a higher granularity embeddings: (1) produce a contextually influenced embedding for each token in the document and query, (2) score similarity between each query token and all document tokens, (3) take the max, (4) do this for all query tokens, and (5) take the sum of the max scores (in step 3) for all query tokens to get a query-document similarity score; this token-wise scoring can yield strong results.

There are some additional tricks to improve the quality of your retrieval. Embeddings excel at capturing semantic information, but may struggle with keyword-based queries. Many vector stores offer built-in hybrid-search to combine keyword and semantic similarity, which marries the benefits of both approaches. Furthermore, many vector stores have maximal marginal relevance, which attempts to diversify the results of a search to avoid returning similar and redundant documents.

| Name | When to use | Description |
| --- | --- | --- |
| ColBERT | When higher granularity embeddings are needed. | ColBERT uses contextually influenced embeddings for each token in the document and query to get a granular query-document similarity score. |
| Hybrid search | When combining keyword-based and semantic similarity. | Hybrid search combines keyword and semantic similarity, marrying the benefits of both approaches. |
| Maximal Marginal Relevance (MMR) | When needing to diversify search results. | MMR attempts to diversify the results of a search to avoid returning similar and redundant documents. |
tip

See our RAG from Scratch video on ColBERT.

Post-processing

Sixth, consider ways to filter or rank retrieved documents. This is very useful if you are combining documents returned from multiple sources, since it can down-rank less relevant documents and / or compress similar documents.

| Name | Index Type | Uses an LLM | When to use | Description |
| --- | --- | --- | --- | --- |
| Contextual Compression | Any | Sometimes | If you are finding that your retrieved documents contain too much irrelevant information and are distracting the LLM. | This puts a post-processing step on top of another retriever and extracts only the most relevant information from retrieved documents. This can be done with embeddings or an LLM. |
| Ensemble | Any | No | If you have multiple retrieval methods and want to try combining them. | This fetches documents from multiple retrievers and then combines them. |
| Re-ranking | Any | Yes | If you want to rank retrieved documents based upon relevance, especially if you want to combine results from multiple retrieval methods. | Given a query and a list of documents, Rerank indexes the documents from most to least semantically relevant to the query. |
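As an example of the contextual compression approach, a sketch using an LLM-based extractor on top of an existing base retriever (assumed to exist as `retriever`) might look like:

from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
from langchain_openai import ChatOpenAI

# The compressor uses an LLM to keep only the passages relevant to the query.
compressor = LLMChainExtractor.from_llm(ChatOpenAI(model="gpt-4o", temperature=0))

compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=retriever,  # any existing retriever, e.g. vectorstore.as_retriever()
)

docs = compression_retriever.invoke("What does the report say about retrieval quality?")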
tip

See our RAG from Scratch video on RAG-Fusion, an approach for post-processing across multiple queries: rewrite the user question from multiple perspectives, retrieve documents for each rewritten question, and combine the ranks of multiple search result lists to produce a single, unified ranking with Reciprocal Rank Fusion (RRF).

Generation 

Finally, consider ways to build self-correction into your RAG system. RAG systems can suffer from low quality retrieval (e.g., if a user question is out of the domain for the index) and / or hallucinations in generation. A naive retrieve-generate pipeline has no ability to detect or self-correct these kinds of errors. The concept of "flow engineering" has been introduced in the context of code generation: iteratively build an answer to a code question with unit tests to check and self-correct errors. Several works have applied this to RAG, such as Self-RAG and Corrective-RAG. In both cases, checks for document relevance, hallucinations, and / or answer quality are performed in the RAG answer generation flow.

We've found that graphs are a great way to reliably express logical flows and have implemented ideas from several of these papers using LangGraph, as shown in the figure below (red - routing, blue - fallback, green - self-correction):

  • Routing: Adaptive RAG (paper). Route questions to different retrieval approaches, as discussed above
  • Fallback: Corrective RAG (paper). Fallback to web search if docs are not relevant to query
  • Self-correction: Self-RAG (paper). Fix answers that contain hallucinations or that don't address the question

| Name | When to use | Description |
| --- | --- | --- |
| Self-RAG | When needing to fix answers with hallucinations or irrelevant content. | Self-RAG performs checks for document relevance, hallucinations, and answer quality during the RAG answer generation flow, iteratively building an answer and self-correcting errors. |
| Corrective-RAG | When needing a fallback mechanism for low relevance docs. | Corrective-RAG includes a fallback (e.g., to web search) if the retrieved documents are not relevant to the query, ensuring higher quality and more relevant retrieval. |
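A rough sketch of such a self-correcting flow in LangGraph (the retriever, rag_chain, and web_search node are assumed to be defined elsewhere) might look like:

from typing import List, TypedDict

from langgraph.graph import END, StateGraph

class RAGState(TypedDict):
    question: str
    documents: List[str]
    generation: str

def retrieve(state: RAGState) -> dict:
    # Fetch candidate documents for the question (a retriever is assumed to exist).
    return {"documents": retriever.invoke(state["question"])}

def generate(state: RAGState) -> dict:
    # Produce an answer grounded in the retrieved documents (rag_chain assumed).
    return {"generation": rag_chain.invoke(state)}

def decide_next_step(state: RAGState) -> str:
    # Self-correction hook: fall back to web search if nothing relevant was retrieved.
    return "generate" if state["documents"] else "web_search"

workflow = StateGraph(RAGState)
workflow.add_node("retrieve", retrieve)
workflow.add_node("generate", generate)
workflow.add_node("web_search", web_search)  # fallback node, assumed defined elsewhere
workflow.set_entry_point("retrieve")
workflow.add_conditional_edges("retrieve", decide_next_step)
workflow.add_edge("web_search", "generate")
workflow.add_edge("generate", END)

app = workflow.compile()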
tip

See several videos and cookbooks showcasing RAG with LangGraph:

See our LangGraph RAG recipes with partners:

Text splitting

LangChain offers many different types of text splitters. These all live in the langchain-text-splitters package.

Table columns:

  • Name: Name of the text splitter
  • Classes: Classes that implement this text splitter
  • Splits On: How this text splitter splits text
  • Adds Metadata: Whether or not this text splitter adds metadata about where each chunk came from.
  • Description: Description of the splitter, including recommendation on when to use it.
| Name | Classes | Splits On | Adds Metadata | Description |
| --- | --- | --- | --- | --- |
| Recursive | RecursiveCharacterTextSplitter, RecursiveJsonSplitter | A list of user defined characters | | Recursively splits text. This splitting is trying to keep related pieces of text next to each other. This is the recommended way to start splitting text. |
| HTML | HTMLHeaderTextSplitter, HTMLSectionSplitter | HTML specific characters | Yes | Splits text based on HTML-specific characters. Notably, this adds in relevant information about where that chunk came from (based on the HTML). |
| Markdown | MarkdownHeaderTextSplitter | Markdown specific characters | Yes | Splits text based on Markdown-specific characters. Notably, this adds in relevant information about where that chunk came from (based on the Markdown). |
| Code | many languages | Code (Python, JS) specific characters | | Splits text based on characters specific to coding languages. 15 different languages are available to choose from. |
| Token | many classes | Tokens | | Splits text on tokens. There exist a few different ways to measure tokens. |
| Character | CharacterTextSplitter | A user defined character | | Splits text based on a user defined character. One of the simpler methods. |
| Semantic Chunker (Experimental) | SemanticChunker | Sentences | | First splits on sentences. Then combines ones next to each other if they are semantically similar enough. Taken from Greg Kamradt. |
| Integration: AI21 Semantic | AI21SemanticTextSplitter | | | Identifies distinct topics that form coherent pieces of text and splits along those. |
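For example, a short sketch of header-based Markdown splitting (the header names below are arbitrary labels for the resulting metadata keys):

from langchain_text_splitters import MarkdownHeaderTextSplitter

markdown_text = "# Guide\n\nIntro paragraph.\n\n## Setup\n\nInstall the package."

splitter = MarkdownHeaderTextSplitter(
    headers_to_split_on=[("#", "Header 1"), ("##", "Header 2")],
)

# Each returned Document carries the header hierarchy it came from in its metadata.
docs = splitter.split_text(markdown_text)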

Evaluation

Evaluation is the process of assessing the performance and effectiveness of your LLM-powered applications. It involves testing the model's responses against a set of predefined criteria or benchmarks to ensure it meets the desired quality standards and fulfills the intended purpose. This process is vital for building reliable applications.

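As a toy illustration of the idea (LangSmith, discussed below, automates and scales this kind of loop), you might score a model's answers against a small set of expected results:

from langchain_openai import ChatOpenAI

# A tiny hand-rolled dataset of questions and expected keywords (hypothetical).
dataset = [
    {"question": "What is the powerhouse of the cell?", "expected": "mitochondrion"},
    {"question": "What organ pumps blood around the body?", "expected": "heart"},
]

llm = ChatOpenAI(model="gpt-4o", temperature=0)

scores = []
for example in dataset:
    answer = llm.invoke(example["question"]).content
    # Simplest possible criterion: does the expected keyword appear in the answer?
    scores.append(example["expected"].lower() in answer.lower())

print(f"accuracy: {sum(scores) / len(scores):.2f}")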
LangSmith helps with this process in a few ways:

  • It makes it easier to create and curate datasets via its tracing and annotation features
  • It provides an evaluation framework that helps you define metrics and run your app against your dataset
  • It allows you to track results over time and automatically run your evaluators on a schedule or as part of CI/CD

To learn more, check out this LangSmith guide.

Tracing

A trace is essentially a series of steps that your application takes to go from input to output. Traces contain individual steps called runs. These can be individual calls from a model, retriever, tool, or sub-chains. Tracing gives you observability inside your chains and agents, and is vital in diagnosing issues.

For a deeper dive, check out this LangSmith conceptual guide.

