Prompt engineering 提示工程

This guide shares strategies and tactics for getting better results from large language models (sometimes referred to as GPT models) like GPT-4o. The methods described here can sometimes be deployed in combination for greater effect. We encourage experimentation to find the methods that work best for you.
这个指南分享了从大型语言模型（有时被称为 GPT 模型）如 GPT-4o 获得更好结果的策略和策略。这里描述的方法有时可以组合使用以达到更好的效果。我们鼓励您进行实验以找到最适合您的方法。

You can also explore example prompts which showcase what our models are capable of:
你还可以探索示例提示,这些示例展示了我们的模型的能力:

Prompt examples 提示示例

Explore prompt examples to learn what GPT models can do
探索提示示例以了解 GPT 模型的功能

Six strategies for getting better results
获得更好结果的六大策略

Write clear instructions 写出清晰的说明

These models can’t read your mind. If outputs are too long, ask for brief replies. If outputs are too simple, ask for expert-level writing. If you dislike the format, demonstrate the format you’d like to see. The less the model has to guess at what you want, the more likely you’ll get it.
这些模型无法读懂你的想法。如果输出太长,请求简短回复。如果输出过于简单,要求专家级写作。如果你不喜欢格式,请示范你想要的格式。模型猜测你想要什么越少,你就越有可能得到想要的。

Tactics: 战术:

Provide reference text 提供参考文本

Language models can confidently invent fake answers, especially when asked about esoteric topics or for citations and URLs. In the same way that a sheet of notes can help a student do better on a test, providing reference text to these models can help in answering with fewer fabrications.
语言模型可以自信地编造虚假答案,尤其是在被询问关于深奥主题或需要引用和网址时。就像笔记可以帮助学生在考试中表现更好一样,为这些模型提供参考文本也可以帮助减少编造的内容。

Tactics: 战术:

Split complex tasks into simpler subtasks
将复杂任务分解为更简单的子任务

Just as it is good practice in software engineering to decompose a complex system into a set of modular components, the same is true of tasks submitted to a language model. Complex tasks tend to have higher error rates than simpler tasks. Furthermore, complex tasks can often be re-defined as a workflow of simpler tasks in which the outputs of earlier tasks are used to construct the inputs to later tasks.
正如在软件工程中将复杂系统分解为一组模块化组件是一种良好实践,同样适用于提交给语言模型的任务。复杂任务的错误率通常高于简单任务。此外,复杂任务通常可以重新定义为一系列简单任务的工作流,其中前一任务的输出用于构建后一任务的输入。

Tactics: 战术:

Give the model time to "think"
给模型一些时间"思考"

If asked to multiply 17 by 28, you might not know it instantly, but can still work it out with time. Similarly, models make more reasoning errors when trying to answer right away, rather than taking time to work out an answer. Asking for a "chain of thought" before an answer can help the model reason its way toward correct answers more reliably.
如果要求您乘以 17 和 28,您可能无法立即知道答案,但仍可以花时间计算出来。类似地,当模型试图立即回答而不是花时间计算答案时,它会做出更多的推理错误。在得出答案之前要求"思维过程"可以更可靠地帮助模型推演出正确的答案。

Tactics: 战术:

Use external tools 使用外部工具

Compensate for the weaknesses of the model by feeding it the outputs of other tools. For example, a text retrieval system (sometimes called RAG or retrieval augmented generation) can tell the model about relevant documents. A code execution engine like OpenAI's Code Interpreter can help the model do math and run code. If a task can be done more reliably or efficiently by a tool rather than by a language model, offload it to get the best of both.
通过为模型提供其他工具的输出来弥补模型的缺陷。例如,文本检索系统(有时称为 RAG 或检索增强型生成)可以告诉模型相关文档。像 OpenAI 的 Code Interpreter 这样的代码执行引擎可以帮助模型进行数学计算和运行代码。如果某项任务可以由工具更可靠或高效地完成,而不是由语言模型来完成,那就将其外包出去,以获得最佳效果。

Tactics: 战术:

Test changes systematically
系统地测试变更

Improving performance is easier if you can measure it. In some cases a modification to a prompt will achieve better performance on a few isolated examples but lead to worse overall performance on a more representative set of examples. Therefore to be sure that a change is net positive to performance it may be necessary to define a comprehensive test suite (also known an as an "eval").
如果您能够衡量性能,那么提高性能就更容易。在某些情况下,对提示的修改可以在少数孤立的示例中获得更好的性能,但会导致在更具代表性的示例集上出现更差的整体性能。因此,要确保变更对性能是整体积极的,可能需要定义一个全面的测试套件(也称为"评估")。

Tactic: 战术:

Evaluate model outputs with reference to gold-standard answers
参考标准答案评估模型输出

Tactics 战术

Each of the strategies listed above can be instantiated with specific tactics. These tactics are meant to provide ideas for things to try. They are by no means fully comprehensive, and you should feel free to try creative ideas not represented here.
上述列出的每一种策略都可以通过具体的策战来实施。这些策战旨在提供一些可尝试的想法。它们绝不是全面的,您也应该尝试在此没有体现的创意想法。

Strategy: Write clear instructions
战略：编写清晰的说明

Tactic: Include details in your query to get more relevant answers
策略:在您的查询中包含详细信息,以获得更相关的答案

In order to get a highly relevant response, make sure that requests provide any important details or context. Otherwise you are leaving it up to the model to guess what you mean.
为了获得高度相关的响应,请确保请求提供任何重要的细节或背景。否则,您将让模型去猜测您的意思。


Worse 更糟	Better 更好
How do I add numbers in Excel? 如何在 Excel 中添加数字?	How do I add up a row of dollar amounts in Excel? I want to do this automatically for a whole sheet of rows with all the totals ending up on the right in a column called "Total". 如何在 Excel 中自动求一行的美元金额总和?我希望能为整个表单的所有行自动完成此操作,所有总和将出现在右侧的一列中,列名为"总计"。
Who’s president? 谁是总统?	Who was the president of Mexico in 2021, and how frequently are elections held? 2021 年墨西哥的总统是谁,总统选举的频率是多少?
Write code to calculate the Fibonacci sequence. 编写代码计算斐波那契数列。	Write a TypeScript function to efficiently calculate the Fibonacci sequence. Comment the code liberally to explain what each piece does and why it's written that way. 写一个 TypeScript 函数高效地计算斐波那契数列。在代码中添加大量注释来解释每个部分的作用以及它们是如何编写的。
Summarize the meeting notes. 总结会议记录。	Summarize the meeting notes in a single paragraph. Then write a markdown list of the speakers and each of their key points. Finally, list the next steps or action items suggested by the speakers, if any. 会议纪要概括为一个段落。接着以 Markdown 格式列出演讲者及其关键观点。最后列出演讲者提出的后续步骤或行动项目(如有)。 - 演讲者 1： - 关键观点 1 - 关键观点 2 - 演讲者 2： - 关键观点 1 - 关键观点 2 - 后续步骤或行动项目： - 步骤 1 - 步骤 2

Tactic: Ask the model to adopt a persona
战术:要求模型采取某一形象

The system message can be used to specify the persona used by the model in its replies.
系统消息可用于指定模型在其回复中使用的角色。

SYSTEM 系统

When I ask for help to write something, you will reply with a document that contains at least one joke or playful comment in every paragraph.
当我要求帮助撰写某些内容时,你将回复一份文件,其中每个段落中至少包含一个笑话或调侃性评论。

USER 用户

Write a thank you note to my steel bolt vendor for getting the delivery in on time and in short notice. This made it possible for us to deliver an important order.
写一张感谢信给我的钢螺栓供应商,感谢他们能够及时交货,这使我们能够完成一个重要的订单。

Tactic: Use delimiters to clearly indicate distinct parts of the input
策略：使用分隔符来明确地指示输入的不同部分

Delimiters like triple quotation marks, XML tags, section titles, etc. can help demarcate sections of text to be treated differently.
像三重引号、XML 标签、章节标题等分隔符可以帮助划分需要被不同处理的文本区域。

USER 用户

Summarize the text delimited by triple quotes with a haiku. """insert text here"""
" This is a test text. This is a longer text that should be summarized with a haiku. The text is enclosed in triple quotes, which indicates that it should be summarized. " 云层下晨雾鸟鸣林间响春日缓缓来 """插入文本这里""" 在简体中文中: 这是一个很直白的翻译。没有任何附加文本

SYSTEM 系统

You will be provided with a pair of articles (delimited with XML tags) about the same topic. First summarize the arguments of each article. Then indicate which of them makes a better argument and explain why.
你将获得一对关于同一主题的文章(使用 XML 标签分隔)。首先总结每篇文章的论点。然后指出哪一篇论点更优秀,并解释原因。

USER 用户

<article> insert first article here </article> <article> insert second article here </article>

插入第一篇文章

在此插入第二篇文章

SYSTEM 系统

You will be provided with a thesis abstract and a suggested title for it. The thesis title should give the reader a good idea of the topic of the thesis but should also be eye-catching. If the title does not meet these criteria, suggest 5 alternatives.
你将获得一篇论文摘要和一个建议的论文标题。论文标题应该让读者对论文主题有所了解,同时也应该引人注目。如果标题不符合这些标准,建议 5 个替代方案。

USER 用户

Abstract: insert abstract here Title: insert title here
摘要:在此插入摘要标题：插入标题

For straightforward tasks such as these, using delimiters might not make a difference in the output quality. However, the more complex a task is the more important it is to disambiguate task details. Don’t make the model work to understand exactly what you are asking of them.
对于这些简单的任务,使用分隔符可能不会改变输出质量。但是,任务越复杂,明确任务细节就越重要。不要让模型努力理解你到底要求他们做什么。

Tactic: Specify the steps required to complete a task
策略:详细说明完成任务所需的步骤

Some tasks are best specified as a sequence of steps. Writing the steps out explicitly can make it easier for the model to follow them.
有些任务最好被指定为一系列步骤。明确地书写这些步骤可以使模型更容易遵循它们。

SYSTEM 系统

Use the following step-by-step instructions to respond to user inputs. Step 1 - The user will provide you with text in triple quotes. Summarize this text in one sentence with a prefix that says "Summary: ". Step 2 - Translate the summary from Step 1 into Spanish, with a prefix that says "Translation: ".
使用以下分步说明来回应用户输入。步骤 1 - 用户将在三个引号中向您提供文本。用一句话总结此文本,并加上前缀"总结: "。翻译：步骤 2 - 将步骤 1 中的总结翻译成西班牙语。

USER 用户

"""insert text here""" """插入文本这里""" 在简体中文中: 这是一个很直白的翻译。没有任何附加文本

Tactic: Provide examples 策略:提供示例

Providing general instructions that apply to all examples is generally more efficient than demonstrating all permutations of a task by example, but in some cases providing examples may be easier. For example, if you intend for the model to copy a particular style of responding to user queries which is difficult to describe explicitly. This is known as "few-shot" prompting.
对于所有示例提供通用说明通常比示范任务的所有排列更有效率,但在某些情况下,提供示例可能更容易。例如,如果您希望模型能复制回应用户查询的特定样式,而这种样式很难明确描述。这被称为"少量示例"提示。

SYSTEM 系统

Answer in a consistent style.
回答以一致的风格。

USER 用户

Teach me about patience. 耐心

ASSISTANT 助手

The river that carves the deepest valley flows from a modest spring; the grandest symphony originates from a single note; the most intricate tapestry begins with a solitary thread.
开凿最深谷地的河流源于一处谦逊的泉眼；最宏大的交响乐源于一个单一音符；最复杂精致的织锦由一根单线开始。

USER 用户

Teach me about the ocean.
教我关于海洋的知识。

Tactic: Specify the desired length of the output
战术:指定输出的期望长度

You can ask the model to produce outputs that are of a given target length. The targeted output length can be specified in terms of the count of words, sentences, paragraphs, bullet points, etc. Note however that instructing the model to generate a specific number of words does not work with high precision. The model can more reliably generate outputs with a specific number of paragraphs or bullet points.
您可以要求模型生成指定长度的输出。目标输出长度可以用字数、句子数、段落数、项目符号等来指定。但请注意,指示模型生成特定字数的输出无法做到高精度。模型可以更可靠地生成指定段落数或项目符号数的输出。

USER 用户

Summarize the text delimited by triple quotes in about 50 words. """insert text here"""
在 50 字左右概括以下加标三重引号的文本: """插入文本这里""" 在简体中文中: 这是一个很直白的翻译。没有任何附加文本

USER 用户

Summarize the text delimited by triple quotes in 2 paragraphs. """insert text here"""
总结由三引号分隔的文本,分两个段落。第一段 : xxx. xxx. 第二段 : xxx. xxx. """插入文本这里""" 在简体中文中: 这是一个很直白的翻译。没有任何附加文本

USER 用户

Summarize the text delimited by triple quotes in 3 bullet points. """insert text here"""
对于以下引用文本,总结成三个要点: """ This is the text that needs to be summarized. It discusses a variety of topics, including the weather, current events, and personal experiences. The text is fairly long and covers a wide range of subject matter. """ - 文本涉及天气、时事和个人经历等多个主题 - 文本内容丰富,覆盖范围广 - 文本篇幅较长 """插入文本这里""" 在简体中文中: 这是一个很直白的翻译。没有任何附加文本

Strategy: Provide reference text
战略:提供参考文本

Tactic: Instruct the model to answer using a reference text
战术:指示模型使用参考文本回答

If we can provide a model with trusted information that is relevant to the current query, then we can instruct the model to use the provided information to compose its answer.
如果我们可以为模型提供与当前查询相关的可靠信息,那么我们就可以指示模型使用提供的信息来撰写答案。

SYSTEM 系统

Use the provided articles delimited by triple quotes to answer questions. If the answer cannot be found in the articles, write "I could not find an answer."
使用由三重引号分隔的提供的文章来回答问题。如果找不到答案,请写"我找不到答案。"

USER 用户

<insert articles, each delimited by triple quotes> Question: <insert question here>
问题:

Given that all models have limited context windows, we need some way to dynamically lookup information that is relevant to the question being asked. Embeddings can be used to implement efficient knowledge retrieval. See the tactic "Use embeddings-based search to implement efficient knowledge retrieval" for more details on how to implement this.
鉴于所有模型都有有限的上下文窗口,我们需要某种方式动态查找与所提问题相关的信息。嵌入可用于实现高效的知识检索。详见战术"使用基于嵌入的搜索来实现高效的知识检索"。

Tactic: Instruct the model to answer with citations from a reference text
战术：指示模型从参考文本中引用回答

If the input has been supplemented with relevant knowledge, it's straightforward to request that the model add citations to its answers by referencing passages from provided documents. Note that citations in the output can then be verified programmatically by string matching within the provided documents.
如果输入已经补充了相关知识,那么只需简单请求该模型通过引用提供的文档中的段落来为其答案添加引用。请注意,输出中的引用可以通过对提供的文档进行字符串匹配来程序化地进行验证。

SYSTEM 系统

You will be provided with a document delimited by triple quotes and a question. Your task is to answer the question using only the provided document and to cite the passage(s) of the document used to answer the question. If the document does not contain the information needed to answer this question then simply write: "Insufficient information." If an answer to the question is provided, it must be annotated with a citation. Use the following format for to cite relevant passages ({"citation": …}).
您将收到一个由三重引号分隔的文档和一个问题。您的任务是仅使用所提供的文档来回答这个问题,并引用用于回答问题的文档片段。如果文档不包含回答此问题所需的信息,则只需写:"信息不足"。如果提供了问题的答案,它必须附有引用标注。使用以下格式引用相关段落({"citation": ...})。

USER 用户

"""<insert document here>""" Question: <insert question here>
问题:

Strategy: Split complex tasks into simpler subtasks
策略：将复杂任务拆分为更简单的子任务

Tactic: Use intent classification to identify the most relevant instructions for a user query
策略：使用意图分类来识别用户查询的最相关说明

For tasks in which lots of independent sets of instructions are needed to handle different cases, it can be beneficial to first classify the type of query and to use that classification to determine which instructions are needed. This can be achieved by defining fixed categories and hardcoding instructions that are relevant for handling tasks in a given category. This process can also be applied recursively to decompose a task into a sequence of stages. The advantage of this approach is that each query will contain only those instructions that are required to perform the next stage of a task which can result in lower error rates compared to using a single query to perform the whole task. This can also result in lower costs since larger prompts cost more to run (see pricing information).
对于需要大量独立指令集来处理不同情况的任务,先对查询类型进行分类,并使用该分类确定所需的指令可能会很有帮助。这可以通过定义固定的类别并编码与处理给定类别任务相关的指令来实现。该过程也可以递归应用,将任务分解为一系列阶段。这种方法的优点是,每个查询只包含执行任务下一阶段所需的指令,从而可以相比使用单个查询执行整个任务而获得更低的错误率。这也可能导致成本更低,因为更大的提示需要更多成本运行(参见价格信息)。

Suppose for example that for a customer service application, queries could be usefully classified as follows:
假设以客户服务应用程序为例,查询可以按以下方式进行有用的分类:

SYSTEM 系统

You will be provided with customer service queries. Classify each query into a primary category and a secondary category. Provide your output in json format with the keys: primary and secondary. Primary categories: Billing, Technical Support, Account Management, or General Inquiry. Billing secondary categories: - Unsubscribe or upgrade - Add a payment method - Explanation for charge - Dispute a charge Technical Support secondary categories: - Troubleshooting - Device compatibility - Software updates Account Management secondary categories: - Password reset - Update personal information - Close account - Account security General Inquiry secondary categories: - Product information - Pricing - Feedback - Speak to a human
{ "primary": "客户服务查询", "secondary": "分类" } 主要类别：账单、技术支持、账户管理或一般查询。账单次要类别: - 退订或升级添加支付方式 - 费用说明争议费用技术支持次级类别: 故障排除设备兼容性 - 软件更新账户管理次级类别: 重置密码 - 更新个人信息关闭账户账户安全一般询问次要类别: 产品信息价格反馈与人交谈

USER 用户

I need to get my internet working again.
我需要让我的互联网再次工作。

Based on the classification of the customer query, a set of more specific instructions can be provided to a model for it to handle next steps. For example, suppose the customer requires help with "troubleshooting".
根据客户查询的分类,可以为模型提供一组更具体的说明,以便它处理下一步。例如,假设客户需要"故障排除"方面的帮助。

SYSTEM 系统

You will be provided with customer service inquiries that require troubleshooting in a technical support context. Help the user by: - Ask them to check that all cables to/from the router are connected. Note that it is common for cables to come loose over time. - If all cables are connected and the issue persists, ask them which router model they are using - Now you will advise them how to restart their device: -- If the model number is MTD-327J, advise them to push the red button and hold it for 5 seconds, then wait 5 minutes before testing the connection. -- If the model number is MTD-327S, advise them to unplug and replug it, then wait 5 minutes before testing the connection. - If the customer's issue persists after restarting the device and waiting 5 minutes, connect them to IT support by outputting {"IT support requested"}. - If the user starts asking questions that are unrelated to this topic then confirm if they would like to end the current chat about troubleshooting and classify their request according to the following scheme: <insert primary/secondary classification scheme from above here>
您将收到需要在技术支持环境中进行故障排除的客户服务问询。请帮助用户: - 请他们检查所有连接到/从路由器的电缆是否连接牢固。请注意,随着时间的推移,电缆松动是很常见的情况。如果所有电缆都已连接且问题仍然存在,请询问他们使用的路由器型号现在您将指导他们如何重新启动他们的设备: 如果型号为 MTD-327J,建议他们按下红色按钮并按住 5 秒,然后等待 5 分钟后再测试连接。如果型号是 MTD-327S,建议他们先拔掉电源,然后等待 5 分钟后再重新插上电源,然后再测试连接。如果客户在重启设备并等待 5 分钟后问题仍然存在，通过输出{"需要 IT 支持"}来连接他们到 IT 支持。如果用户开始提出与此话题无关的问题,请确认他们是否想要结束当前的故障排除聊天,并根据以下方案对他们的请求进行分类:

USER 用户

I need to get my internet working again.
我需要让我的互联网再次工作。

Notice that the model has been instructed to emit special strings to indicate when the state of the conversation changes. This enables us to turn our system into a state machine where the state determines which instructions are injected. By keeping track of state, what instructions are relevant at that state, and also optionally what state transitions are allowed from that state, we can put guardrails around the user experience that would be hard to achieve with a less structured approach.
请注意,该模型已被指示发出特殊字符串以指示对话状态的变化。这使我们能够将系统转换为状态机,其中状态决定注入哪些指令。通过跟踪状态、在该状态下相关的指令以及可选的允许状态转换,我们可以为用户体验设置护栏,这在采取更不结构化的方法时难以实现。

Tactic: For dialogue applications that require very long conversations, summarize or filter previous dialogue
对话应用程序要求进行非常长的对话时,可以总结或过滤前面的对话

Since models have a fixed context length, dialogue between a user and an assistant in which the entire conversation is included in the context window cannot continue indefinitely.
由于模型具有固定的上下文长度,因此无法无限期地进行包含整个对话的用户和助理之间的对话。

There are various workarounds to this problem, one of which is to summarize previous turns in the conversation. Once the size of the input reaches a predetermined threshold length, this could trigger a query that summarizes part of the conversation and the summary of the prior conversation could be included as part of the system message. Alternatively, prior conversation could be summarized asynchronously in the background throughout the entire conversation.
有多种解决这个问题的方法,其中一种是总结之前的对话内容。一旦输入的内容达到预设的长度阈值,就可以触发一个查询,对对话的一部分进行总结,并将这个总结作为系统消息的一部分。另一种方式是在整个对话过程中,异步地对之前的对话进行总结。

An alternative solution is to dynamically select previous parts of the conversation that are most relevant to the current query. See the tactic "Use embeddings-based search to implement efficient knowledge retrieval".
一个可选的解决方案是动态选择与当前查询最相关的对话的先前部分。参见策略"使用基于嵌入的搜索实现高效的知识检索"。

Tactic: Summarize long documents piecewise and construct a full summary recursively
策略:逐段总结长文档,以递归方式构建完整的摘要

Since models have a fixed context length, they cannot be used to summarize a text longer than the context length minus the length of the generated summary in a single query.
由于模型具有固定的上下文长度,因此它们无法在单次查询中总结长度超过上下文长度减去生成摘要长度的文本。

To summarize a very long document such as a book we can use a sequence of queries to summarize each section of the document. Section summaries can be concatenated and summarized producing summaries of summaries. This process can proceed recursively until an entire document is summarized. If it’s necessary to use information about earlier sections in order to make sense of later sections, then a further trick that can be useful is to include a running summary of the text that precedes any given point in the book while summarizing content at that point. The effectiveness of this procedure for summarizing books has been studied in previous research by OpenAI using variants of GPT-3.
要总结一份非常长的文档,如一本书,我们可以使用一系列查询来总结文档的每个部分。部分总结可以连接起来并进行总结,产生总结的总结。这个过程可以递归进行,直到整个文档被总结。如果有必要使用早期部分的信息才能理解后续部分,那么一种有用的技巧是在总结某一点的内容时,包括文中前述内容的运行总结。这种总结图书的方法的有效性已在之前由 OpenAI 使用 GPT-3 的变体进行的研究中进行了研究。

Strategy: Give models time to "think"
策略:给模型时间"思考"

Tactic: Instruct the model to work out its own solution before rushing to a conclusion
策略:在急于得出结论之前,指示模型自行找出解决方案

Sometimes we get better results when we explicitly instruct the model to reason from first principles before coming to a conclusion. Suppose for example we want a model to evaluate a student’s solution to a math problem. The most obvious way to approach this is to simply ask the model if the student's solution is correct or not.
有时我们在做出结论之前先从基本原理推理,能得到更好的结果。例如,我们希望一个模型评估学生解决数学问题的答案。最直接的方法就是直接询问模型该解答是否正确。

SYSTEM 系统

Determine if the student's solution is correct or not.
判断学生的解决方案是否正确。

USER 用户

Problem Statement: I'm building a solar power installation and I need help working out the financials. - Land costs $100 / square foot - I can buy solar panels for $250 / square foot - I negotiated a contract for maintenance that will cost me a flat $100k per year, and an additional $10 / square foot What is the total cost for the first year of operations as a function of the number of square feet. Student's Solution: Let x be the size of the installation in square feet. 1. Land cost: 100x 2. Solar panel cost: 250x 3. Maintenance cost: 100,000 + 100x Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
问题陈述:我正在建立一个太阳能电力装置,需要帮助计算财务相关内容。土地每平方英尺成本 100 美元 - 我可以以每平方英尺 250 美元的价格购买太阳能电池板我谈判了一份每年固定成本为 10 万美元的维修合同,另外每平方英尺还需支付 10 美元作为平方英尺数量的函数,第一年运营的总成本是多少? 学生的解决方案:设 x 为安装面积的大小,单位为平方英尺。 1. 土地成本：100x 太阳能电池板成本：250x 3. 维护成本：100,000 + 100x 总成本:100x + 250x + 100,000 + 100x = 450x + 100,000

ASSISTANT 助手

The student's solution is correct.
学生的解决方案正确。

But the student's solution is actually not correct! We can get the model to successfully notice this by prompting the model to generate its own solution first.
但是学生的解决方案实际上并不正确！我们可以通过提示模型先生成自己的解决方案来让模型成功地注意到这一点。

SYSTEM 系统

First work out your own solution to the problem. Then compare your solution to the student's solution and evaluate if the student's solution is correct or not. Don't decide if the student's solution is correct until you have done the problem yourself.
先自己解决这个问题。然后将你的解决方案与学生的解决方案进行比较,评估学生的解决方案是否正确。在自己解决这个问题之前,不要决定学生的解决方案是否正确。

USER 用户

ASSISTANT 助手

Let x be the size of the installation in square feet. 1. Land cost: 100x 2. Solar panel cost: 250x 3. Maintenance cost: 100,000 + 10x Total cost: 100x + 250x + 100,000 + 10x = 360x + 100,000 The student's solution is incorrect. They made an error in the maintenance cost calculation, using 100x instead of 10x. The correct total cost for the first year of operations is 360x + 100,000.
设 x 为安装面积的平方英尺。 1. 土地成本：100x 太阳能电池板成本：250x 3. 维护成本：100,000 + 10x 总成本:100x + 250x + 100,000 + 10x = 360x + 100,000 学生的解决方案是不正确的。他们在维护成本计算中出错,使用了 100x 而不是 10x。第一年运营的正确总成本为 360x + 100,000。

Tactic: Use inner monologue or a sequence of queries to hide the model's reasoning process
战术:使用内部独白或一系列查询来隐藏模型的推理过程

The previous tactic demonstrates that it is sometimes important for the model to reason in detail about a problem before answering a specific question. For some applications, the reasoning process that a model uses to arrive at a final answer would be inappropriate to share with the user. For example, in tutoring applications we may want to encourage students to work out their own answers, but a model’s reasoning process about the student’s solution could reveal the answer to the student.
以前的策略表明,在回答特定问题之前,模型有时需要对问题进行详细推理。对于某些应用程序而言,模型用于得出最终答案的推理过程是不适合与用户共享的。例如,在辅导应用程序中,我们可能希望鼓励学生自己寻找答案,但模型对学生解决方案的推理过程可能会泄露答案给学生。

Inner monologue is a tactic that can be used to mitigate this. The idea of inner monologue is to instruct the model to put parts of the output that are meant to be hidden from the user into a structured format that makes parsing them easy. Then before presenting the output to the user, the output is parsed and only part of the output is made visible.
内心独白是一种可以用来缓解这种情况的策略。内心独白的概念是指指示模型将旨在对用户隐藏的输出部分置于易于解析的结构格式中。然后在将输出呈现给用户之前,对输出进行解析,只显示部分输出。

SYSTEM 系统

Follow these steps to answer the user queries. Step 1 - First work out your own solution to the problem. Don't rely on the student's solution since it may be incorrect. Enclose all your work for this step within triple quotes ("""). Step 2 - Compare your solution to the student's solution and evaluate if the student's solution is correct or not. Enclose all your work for this step within triple quotes ("""). Step 3 - If the student made a mistake, determine what hint you could give the student without giving away the answer. Enclose all your work for this step within triple quotes ("""). Step 4 - If the student made a mistake, provide the hint from the previous step to the student (outside of triple quotes). Instead of writing "Step 4 - ..." write "Hint:".
按照以下步骤回答用户提出的问题。步骤 1 - 首先解决问题并找出自己的解决方案。不要依赖学生的解决方案,因为可能不正确。将此步骤的所有工作都括在三重引号(""")中。步骤 2 - 将您的解决方案与学生的解决方案进行比较,并评估学生的解决方案是否正确。将此步骤的所有工作都放在三重引号内(""")。 """第 3 步 - 如果学生犯了错误,确定您可以向学生给出什么提示,而不透露答案。""" 提示：如果学生犯了错误,向学生提供上一步的提示(不在三重引号内)。不要写"步骤 4 - ..."而是写"提示："。

USER 用户

Problem Statement: <insert problem statement> Student Solution: <insert student solution>
问题声明: 学生解决方案:

Alternatively, this can be achieved with a sequence of queries in which all except the last have their output hidden from the end user.
或者,这可以通过一系列查询来实现,其中除最后一个外,其他查询的输出都对最终用户隐藏。

First, we can ask the model to solve the problem on its own. Since this initial query doesn't require the student’s solution, it can be omitted. This provides the additional advantage that there is no chance that the model’s solution will be biased by the student’s attempted solution.
首先，我们可以要求该模型自行解决该问题。由于这个初始查询不需要学生的解决方案，因此可以省略它。这提供了额外的优势,即模型的解决方案不会受到学生尝试解决方案的偏差影响。

USER 用户

Next, we can have the model use all available information to assess the correctness of the student’s solution.
接下来,我们可以让模型利用所有可用的信息来评估学生解决方案的正确性。

SYSTEM 系统

Compare your solution to the student's solution and evaluate if the student's solution is correct or not.
将您的解决方案与学生的解决方案进行比较,并评估学生的解决方案是否正确。

USER 用户

Problem statement: """<insert problem statement>""" Your solution: """<insert model generated solution>""" Student’s solution: """<insert student's solution>"""
问题陈述: """""" 您的解决方案: """"""

Finally, we can let the model use its own analysis to construct a reply in the persona of a helpful tutor.
最后,我们可以让模型使用自己的分析在有帮助的导师的人物角色中构建一个回复。

SYSTEM 系统

You are a math tutor. If the student made an error, offer a hint to the student in a way that does not reveal the answer. If the student did not make an error, simply offer them an encouraging comment.
你是一位数学家教。如果学生犯了错误,请以不透露答案的方式给予提示。如果学生没有犯错,只需给予鼓励性的评论。

USER 用户

Problem statement: """<insert problem statement>""" Your solution: """<insert model generated solution>""" Student’s solution: """<insert student's solution>""" Analysis: """<insert model generated analysis from previous step>"""
您的解决方案: """""" 学生的解决方案: """""" 分析: """"""

Tactic: Ask the model if it missed anything on previous passes
战略:询问该模型是否在先前的传递中遗漏了任何内容

Suppose that we are using a model to list excerpts from a source which are relevant to a particular question. After listing each excerpt the model needs to determine if it should start writing another or if it should stop. If the source document is large, it is common for a model to stop too early and fail to list all relevant excerpts. In that case, better performance can often be obtained by prompting the model with followup queries to find any excerpts it missed on previous passes.
假设我们正在使用一个模型来列出与特定问题相关的引文摘要。在列出每个引文摘要后,模型需要确定是否应该开始写另一个,还是应该停止。如果源文档很大,模型通常会过早停止,无法列出所有相关的引文摘要。在这种情况下,通过提示模型进行后续查询,以查找之前遗漏的任何引文摘要,通常可以获得更好的性能。

SYSTEM 系统

You will be provided with a document delimited by triple quotes. Your task is to select excerpts which pertain to the following question: "What significant paradigm shifts have occurred in the history of artificial intelligence." Ensure that excerpts contain all relevant context needed to interpret them - in other words don't extract small snippets that are missing important context. Provide output in JSON format as follows: [{"excerpt": "..."}, ... {"excerpt": "..."}]
人工智能历史上出现的重大范式转变包括: 1. 从专家系统到机器学习:从定制的知识库系统,转向利用大数据进行机器学习,自动发现规律并做出预测。 2. 从局限的任务执行到通用智能:从解决特定问题的系统,向拥有更广泛感知、理解和自主决策能力的通用人工智能发展。 3. 从基于规则的推理到深度学习:从基于逻辑规则的推理,转向通过深度神经网络自动学习特征并进行预测分类的方法。 4. 从窄 AI 到广 AI:从专注于单一任务的窄人工智能,向拥有更广泛认知能力的通用人工智能发展。确保摘录包含解释它们所需的所有相关背景。换句话说,不要提取缺少重要背景的小片段。以 JSON 格式提供输出: Translated Text: { "translated_text": "确保摘录包含解释它们所需的所有相关背景。换句话说,不要提取缺少重要背景的小片段。以 JSON 格式提供输出:" } {"摘录": "..."} ... {"excerpt":"..."}

USER 用户

"""<insert document here>"""

ASSISTANT 助手

[{"excerpt": "the model writes an excerpt here"}, ... {"excerpt": "the model writes another excerpt here"}]
{"excerpt": "这个模型在这里写了一个片段"} ... {"excerpt": "该模型在这里写另一段摘录"}]

USER 用户

Are there more relevant excerpts? Take care not to repeat excerpts. Also ensure that excerpts contain all relevant context needed to interpret them - in other words don't extract small snippets that are missing important context.
是否有更相关的摘录?请小心不要重复摘录。另外,请确保摘录包含解释它们所需的所有相关背景信息 - 换句话说,不要提取缺乏重要背景信息的小片段。

Strategy: Use external tools
战略：使用外部工具

Tactic: Use embeddings-based search to implement efficient knowledge retrieval
策略：使用基于嵌入的搜索实现高效的知识检索

A model can leverage external sources of information if provided as part of its input. This can help the model to generate more informed and up-to-date responses. For example, if a user asks a question about a specific movie, it may be useful to add high quality information about the movie (e.g. actors, director, etc…) to the model’s input. Embeddings can be used to implement efficient knowledge retrieval, so that relevant information can be added to the model input dynamically at run-time.
如果将外部信息源作为模型输入的一部分提供,模型就可以利用这些信息。这有助于模型生成更有见地和更新的响应。例如,如果用户询问关于某部电影的问题,添加关于该电影的高质量信息(如演员、导演等)到模型输入中可能会很有用。嵌入可用于实现高效的知识检索,使相关信息能够动态地添加到模型输入中。

A text embedding is a vector that can measure the relatedness between text strings. Similar or relevant strings will be closer together than unrelated strings. This fact, along with the existence of fast vector search algorithms means that embeddings can be used to implement efficient knowledge retrieval. In particular, a text corpus can be split up into chunks, and each chunk can be embedded and stored. Then a given query can be embedded and vector search can be performed to find the embedded chunks of text from the corpus that are most related to the query (i.e. closest together in the embedding space).
文本嵌入是一种向量,可以测量文本字符串之间的相关性。相似或相关的字符串会比不相关的字符串更接近。这一事实以及快速向量搜索算法的存在意味着嵌入可用于实现高效的知识检索。特别是,文本语料库可以被分割成块,每个块都可以被嵌入和存储。然后,给定的查询可以被嵌入,并且可以执行向量搜索以找到与该查询最相关(即在嵌入空间中最接近)的语料库中的嵌入文本块。

Example implementations can be found in the OpenAI Cookbook. See the tactic “Instruct the model to use retrieved knowledge to answer queries” for an example of how to use knowledge retrieval to minimize the likelihood that a model will make up incorrect facts.
示例实现可以在OpenAI Cookbook中找到。请参阅策略"指示模型使用检索到的知识来回答查询",了解如何使用知识检索来最小化模型制造错误事实的可能性。

Tactic: Use code execution to perform more accurate calculations or call external APIs
策略：使用代码执行来执行更准确的计算或调用外部 API

Language models cannot be relied upon to perform arithmetic or long calculations accurately on their own. In cases where this is needed, a model can be instructed to write and run code instead of making its own calculations. In particular, a model can be instructed to put code that is meant to be run into a designated format such as triple backtick. After an output is produced, the code can be extracted and run. Finally, if necessary, the output from the code execution engine (i.e. Python interpreter) can be provided as an input to the model for the next query.
语言模型不能依赖自身准确地执行算术或长计算。在需要此类功能的情况下,可以指示模型编写并运行代码,而不是自行做出计算。特别是,可以指示模型将需要运行的代码放入专门的格式,如三个反引号中。在生成输出后,可以提取并运行该代码。最后,如有必要,可将代码执行引擎(即 Python 解释器)的输出作为输入提供给模型以进行后续查询。

SYSTEM 系统

You can write and execute Python code by enclosing it in triple backticks, e.g. ```code goes here```. Use this to perform calculations.
你可以通过将代码包裹在三个重音符号（```）中来编写和执行 Python 代码，例如 ```代码位于此处```。使用这种方式进行计算。

USER 用户

Find all real-valued roots of the following polynomial: 3*x**5 - 5*x**4 - 3*x**3 - 7*x - 10.
求下列多项式的所有实根：3*x**5 - 5*x**4 - 3*x**3 - 7*x - 10。

Another good use case for code execution is calling external APIs. If a model is instructed in the proper use of an API, it can write code that makes use of it. A model can be instructed in how to use an API by providing it with documentation and/or code samples showing how to use the API.
另一个良好的代码执行用例是调用外部 API。如果一个模型被指示正确使用 API，它可以编写利用它的代码。通过提供文档和/或代码示例展示如何使用 API，可以指导模型如何使用 API。

SYSTEM 系统

You can write and execute Python code by enclosing it in triple backticks. Also note that you have access to the following module to help users send messages to their friends: ```python import message message.write(to="John", message="Hey, want to meetup after work?")```
你可以在三个反引号内写入并执行 Python 代码。此外,您还可以使用以下模块来帮助用户向朋友发送消息: 以下是源文本的简体中文翻译: ```python import random def roll_dice(num_dice, num_sides): """ Simulates rolling a set of dice. Args: num_dice (int): The number of dice to roll. num_sides (int): The number of sides on each die. Returns: list: A list of the results of each die roll. """ results = [] for _ in range(num_dice): results.append(random.randint(1, num_sides)) return results # Example usage print(roll_dice(3, 6)) # Rolls 3 six-sided dice ``` 导入信息 ``` message.write(收件人="John", 消息="嗨,下班后想见个面吗?") ```

WARNING: Executing code produced by a model is not inherently safe and precautions should be taken in any application that seeks to do this. In particular, a sandboxed code execution environment is needed to limit the harm that untrusted code could cause.
警告：执行由模型生成的代码并非 inherently 安全,在任何试图执行此操作的应用程序中都应采取预防措施。特别是需要一个沙盒代码执行环境来限制不受信任代码可能造成的伤害。

Tactic: Give the model access to specific functions
策略：让模型访问特定功能

The Chat Completions API allows passing a list of function descriptions in requests. This enables models to generate function arguments according to the provided schemas. Generated function arguments are returned by the API in JSON format and can be used to execute function calls. Output provided by function calls can then be fed back into a model in the following request to close the loop. This is the recommended way of using OpenAI models to call external functions. To learn more see the function calling section in our introductory text generation guide and more function calling examples in the OpenAI Cookbook.
聊天生成 API 允许在请求中传递函数描述列表。这使模型能够根据提供的模式生成函数参数。生成的函数参数以 JSON 格式返回给 API,可用于执行函数调用。函数调用的输出然后可以在后续请求中反馈到模型中,完成循环。这是使用 OpenAI 模型调用外部函数的推荐方式。要了解更多信息,请参见我们的文本生成指南中的函数调用部分,以及 OpenAI Cookbook 中的更多函数调用示例。

Strategy: Test changes systematically
战略: 系统地测试变更

Sometimes it can be hard to tell whether a change — e.g., a new instruction or a new design — makes your system better or worse. Looking at a few examples may hint at which is better, but with small sample sizes it can be hard to distinguish between a true improvement or random luck. Maybe the change helps performance on some inputs, but hurts performance on others.
有时很难判断一个变化(例如新指令或新设计)是否使您的系统更好还是更差。通过查看几个示例可能会暗示哪个更好,但由于样本量小,很难区分真正的改进和随机运气。也许这种变化有助于某些输入的性能,但损害了其他输入的性能。

Evaluation procedures (or "evals") are useful for optimizing system designs. Good evals are:
评估流程（或"评价"）对于优化系统设计很有用。好的评价应该是:

Representative of real-world usage (or at least diverse)
代表现实世界使用情况(或至少是多样化)
Contain many test cases for greater statistical power (see table below for guidelines)
包含许多测试用例以增加统计力度（请参见下表中的指南）
Easy to automate or repeat
易于自动化或重复

Difference to detect 检测差异	Sample size needed for 95% confidence 需要 95% 置信度的样本量
30%	~10
10%	~100
3%	~1,000
1%	~10,000

Evaluation of outputs can be done by computers, humans, or a mix. Computers can automate evals with objective criteria (e.g., questions with single correct answers) as well as some subjective or fuzzy criteria, in which model outputs are evaluated by other model queries. OpenAI Evals is an open-source software framework that provides tools for creating automated evals.
输出的评估可以由计算机、人类或混合方式进行。计算机可以使用客观标准（例如只有一个正确答案的问题）以及一些主观或模糊标准自动完成评估,其中模型输出由其他模型查询进行评估。OpenAI Evals是一个开源软件框架,提供创建自动化评估的工具。

Model-based evals can be useful when there exists a range of possible outputs that would be considered equally high in quality (e.g. for questions with long answers). The boundary between what can be realistically evaluated with a model-based eval and what requires a human to evaluate is fuzzy and is constantly shifting as models become more capable. We encourage experimentation to figure out how well model-based evals can work for your use case.
基于模型的评估在存在多种可能输出且质量均等高时可能很有用(例如长答案问题)。可用模型评估的范围和需要人工评估的范围界限模糊,随着模型能力的不断提升而不断变化。我们鼓励进行实验,以了解基于模型的评估在您的使用案例中的实用性。

Tactic: Evaluate model outputs with reference to gold-standard answers
战略：以参考标准答案评估模型输出

Suppose it is known that the correct answer to a question should make reference to a specific set of known facts. Then we can use a model query to count how many of the required facts are included in the answer.
假设已知一个问题的正确答案应该参考一组特定的已知事实。那么我们可以使用一个模型查询来统计答案中包含了多少个所需的事实。

For example, using the following system message:
例如，使用以下系统消息:

SYSTEM 系统

You will be provided with text delimited by triple quotes that is supposed to be the answer to a question. Check if the following pieces of information are directly contained in the answer: - Neil Armstrong was the first person to walk on the moon. - The date Neil Armstrong first walked on the moon was July 21, 1969. For each of these points perform the following steps: 1 - Restate the point. 2 - Provide a citation from the answer which is closest to this point. 3 - Consider if someone reading the citation who doesn't know the topic could directly infer the point. Explain why or why not before making up your mind. 4 - Write "yes" if the answer to 3 was yes, otherwise write "no". Finally, provide a count of how many "yes" answers there are. Provide this count as {"count": <insert count here>}.
您将获得用三重引号分隔的文本,这应该是对问题的回答。检查以下信息是否直接包含在答案中: - 尼尔·阿姆斯特朗是第一个踏上月球的人。尼尔·阿姆斯特朗首次登月的日期是 1969 年 7 月 21 日。对于每个这些点执行以下步骤: 1 - 重申这一观点。 2 - 提供最接近此论点的答案引用。 3 - 考虑一下如果不了解主题的人阅读这篇引文,他们是否能直接推断出观点。在下定决心之前,解释一下原因。 4 - 如果 3 的答案是"是"，则写"是"；否则写"否"。 {"count": }.

Here's an example input where both points are satisfied:
这里有一个示例输入,满足两个要点:

SYSTEM 系统

USER 用户

"""Neil Armstrong is famous for being the first human to set foot on the Moon. This historic event took place on July 21, 1969, during the Apollo 11 mission."""
尼尔·阿姆斯特朗因成为首位登陆月球的人类而闻名。这一历史性事件发生在 1969 年 7 月 21 日,期间是阿波罗 11 号航天任务期间。

Here's an example input where only one point is satisfied:
这里有一个只有一个点满足的例子:

SYSTEM 系统

USER 用户

"""Neil Armstrong made history when he stepped off the lunar module, becoming the first person to walk on the moon."""
尼尔·阿姆斯特朗创造历史,当他走下登月舱时成为了第一个登陆月球的人。

Here's an example input where none are satisfied:
这里有一个没有任何人满意的输入示例:

SYSTEM 系统

USER 用户

"""In the summer of '69, a voyage grand, Apollo 11, bold as legend's hand. Armstrong took a step, history unfurled, "One small step," he said, for a new world."""
1969 年的夏天,是一次盛大航程, 阿波罗 11 号,勇敢如传奇的手笔。阿姆斯特朗迈出了一步,历史展开了翅膀 "一小步,"他说,"通向新世界。"

There are many possible variants on this type of model-based eval. Consider the following variation which tracks the kind of overlap between the candidate answer and the gold-standard answer, and also tracks whether the candidate answer contradicts any part of the gold-standard answer.
这种基于模型的评估存在许多可能的变体。考虑以下这种变体,它跟踪候选答案与标准答案之间的重叠程度,同时也检测候选答案是否与标准答案的任何部分产生矛盾。

SYSTEM 系统

Use the following steps to respond to user inputs. Fully restate each step before proceeding. i.e. "Step 1: Reason...". Step 1: Reason step-by-step about whether the information in the submitted answer compared to the expert answer is either: disjoint, equal, a subset, a superset, or overlapping (i.e. some intersection but not subset/superset). Step 2: Reason step-by-step about whether the submitted answer contradicts any aspect of the expert answer. Step 3: Output a JSON object structured like: {"type_of_overlap": "disjoint" or "equal" or "subset" or "superset" or "overlapping", "contradiction": true or false}
使用以下步骤来响应用户输入。在继续之前，完全重述每一步。例如"步骤 1:推理..."。第 1 步：逐步合理地考虑提交答案中的信息与专家答案是否:互斥、相等、子集、超集还是重叠(即有一些交集但不是子集/超集)。第 2 步：逐步推理所提交的答案是否与专家答案的任何方面存在矛盾。 {"type_of_overlap": "disjoint" 或 "equal" 或 "subset" 或 "superset" 或 "overlapping", "contradiction": true 或 false}

Here's an example input with a substandard answer which nonetheless does not contradict the expert answer:
这是一个示例输入,尽管答案不太标准,但也不会与专家答案相矛盾:

SYSTEM 系统

USER 用户

Question: """What event is Neil Armstrong most famous for and on what date did it occur? Assume UTC time.""" Submitted Answer: """Didn't he walk on the moon or something?""" Expert Answer: """Neil Armstrong is most famous for being the first person to walk on the moon. This historic event occurred on July 21, 1969."""
尼尔·阿姆斯特朗最著名的事件是他于 1969 年 7 月 20 日(UTC 时间)登陆月球。他不是在月球上行走过吗? 尼尔·阿姆斯特朗最著名的是成为首位登陆月球的人。这一历史性事件发生在 1969 年 7 月 21 日。

Here's an example input with answer that directly contradicts the expert answer:
下面是一个输入示例,其答案直接与专家答案相矛盾:

SYSTEM 系统

USER 用户

Question: """What event is Neil Armstrong most famous for and on what date did it occur? Assume UTC time.""" Submitted Answer: """On the 21st of July 1969, Neil Armstrong became the second person to walk on the moon, following after Buzz Aldrin.""" Expert Answer: """Neil Armstrong is most famous for being the first person to walk on the moon. This historic event occurred on July 21, 1969."""
尼尔·阿姆斯特朗最著名的事件是于 1969 年 7 月 20 日(UTC 时间)成为第一个登陆月球的人。 1969 年 7 月 21 日,尼尔·阿姆斯特朗成为第二个登月的人,继布兹·奥尔德林之后。尼尔·阿姆斯特朗最著名的是成为第一个登陆月球的人。这一历史性事件发生在 1969 年 7 月 21 日。

Here's an example input with a correct answer that also provides a bit more detail than is necessary:
这里是一个带有正确答案的示例输入,它还提供了比必要更多的详细信息:

SYSTEM 系统

USER 用户

Question: """What event is Neil Armstrong most famous for and on what date did it occur? Assume UTC time.""" Submitted Answer: """At approximately 02:56 UTC on July 21st 1969, Neil Armstrong became the first human to set foot on the lunar surface, marking a monumental achievement in human history.""" Expert Answer: """Neil Armstrong is most famous for being the first person to walk on the moon. This historic event occurred on July 21, 1969."""
尼尔·阿姆斯特朗最著名的事件是在 1969 年 7 月 20 日(UTC 时间)登陆月球。 1969 年 7 月 21 日大约 02:56 UTC,尼尔·阿姆斯特朗成为人类历史上第一个登陆月球表面的人,这是人类历史上的一个重大成就。尼尔·阿姆斯特朗最著名的是成为首位登陆月球的人。这一历史性事件发生在 1969 年 7 月 21 日。

Other resources 其他资源

For more inspiration, visit the OpenAI Cookbook, which contains example code and also links to third-party resources such as:
为了获得更多灵感,请访问OpenAI Cookbook,其中包含示例代码,也有指向第三方资源的链接,例如:

Prompt engineering 提示工程

Six strategies for getting better results获得更好结果的六大策略

Write clear instructions 写出清晰的说明

Provide reference text 提供参考文本

Split complex tasks into simpler subtasks将复杂任务分解为更简单的子任务

Give the model time to "think"给模型一些时间"思考"

Use external tools 使用外部工具

Test changes systematically系统地测试变更

Tactics 战术

Strategy: Write clear instructions战略：编写清晰的说明

Tactic: Include details in your query to get more relevant answers策略:在您的查询中包含详细信息,以获得更相关的答案

Tactic: Ask the model to adopt a persona战术:要求模型采取某一形象

Tactic: Use delimiters to clearly indicate distinct parts of the input策略：使用分隔符来明确地指示输入的不同部分

Tactic: Specify the steps required to complete a task策略:详细说明完成任务所需的步骤

Tactic: Provide examples 策略:提供示例

Tactic: Specify the desired length of the output战术:指定输出的期望长度

Strategy: Provide reference text战略:提供参考文本

Tactic: Instruct the model to answer using a reference text战术:指示模型使用参考文本回答

Tactic: Instruct the model to answer with citations from a reference text战术：指示模型从参考文本中引用回答

Strategy: Split complex tasks into simpler subtasks策略：将复杂任务拆分为更简单的子任务

Tactic: Use intent classification to identify the most relevant instructions for a user query策略：使用意图分类来识别用户查询的最相关说明

Tactic: For dialogue applications that require very long conversations, summarize or filter previous dialogue对话应用程序要求进行非常长的对话时,可以总结或过滤前面的对话

Tactic: Summarize long documents piecewise and construct a full summary recursively策略:逐段总结长文档,以递归方式构建完整的摘要

Strategy: Give models time to "think"策略:给模型时间"思考"

Tactic: Instruct the model to work out its own solution before rushing to a conclusion策略:在急于得出结论之前,指示模型自行找出解决方案

Tactic: Use inner monologue or a sequence of queries to hide the model's reasoning process战术:使用内部独白或一系列查询来隐藏模型的推理过程

Tactic: Ask the model if it missed anything on previous passes战略:询问该模型是否在先前的传递中遗漏了任何内容

Strategy: Use external tools战略：使用外部工具

Tactic: Use embeddings-based search to implement efficient knowledge retrieval策略：使用基于嵌入的搜索实现高效的知识检索

Tactic: Use code execution to perform more accurate calculations or call external APIs策略：使用代码执行来执行更准确的计算或调用外部 API

Tactic: Give the model access to specific functions策略：让模型访问特定功能

Strategy: Test changes systematically战略: 系统地测试变更

Tactic: Evaluate model outputs with reference to gold-standard answers战略：以参考标准答案评估模型输出

Other resources 其他资源

Six strategies for getting better results
获得更好结果的六大策略

Split complex tasks into simpler subtasks
将复杂任务分解为更简单的子任务

Give the model time to "think"
给模型一些时间"思考"

Test changes systematically
系统地测试变更

Strategy: Write clear instructions
战略：编写清晰的说明

Tactic: Include details in your query to get more relevant answers
策略:在您的查询中包含详细信息,以获得更相关的答案

Tactic: Ask the model to adopt a persona
战术:要求模型采取某一形象

Tactic: Use delimiters to clearly indicate distinct parts of the input
策略：使用分隔符来明确地指示输入的不同部分

Tactic: Specify the steps required to complete a task
策略:详细说明完成任务所需的步骤

Tactic: Specify the desired length of the output
战术:指定输出的期望长度

Strategy: Provide reference text
战略:提供参考文本

Tactic: Instruct the model to answer using a reference text
战术:指示模型使用参考文本回答

Tactic: Instruct the model to answer with citations from a reference text
战术：指示模型从参考文本中引用回答

Strategy: Split complex tasks into simpler subtasks
策略：将复杂任务拆分为更简单的子任务

Tactic: Use intent classification to identify the most relevant instructions for a user query
策略：使用意图分类来识别用户查询的最相关说明

Tactic: For dialogue applications that require very long conversations, summarize or filter previous dialogue
对话应用程序要求进行非常长的对话时,可以总结或过滤前面的对话

Tactic: Summarize long documents piecewise and construct a full summary recursively
策略:逐段总结长文档,以递归方式构建完整的摘要

Strategy: Give models time to "think"
策略:给模型时间"思考"

Tactic: Instruct the model to work out its own solution before rushing to a conclusion
策略:在急于得出结论之前,指示模型自行找出解决方案

Tactic: Use inner monologue or a sequence of queries to hide the model's reasoning process
战术:使用内部独白或一系列查询来隐藏模型的推理过程

Tactic: Ask the model if it missed anything on previous passes
战略:询问该模型是否在先前的传递中遗漏了任何内容

Strategy: Use external tools
战略：使用外部工具

Tactic: Use embeddings-based search to implement efficient knowledge retrieval
策略：使用基于嵌入的搜索实现高效的知识检索

Tactic: Use code execution to perform more accurate calculations or call external APIs
策略：使用代码执行来执行更准确的计算或调用外部 API

Tactic: Give the model access to specific functions
策略：让模型访问特定功能

Strategy: Test changes systematically
战略: 系统地测试变更

Tactic: Evaluate model outputs with reference to gold-standard answers
战略：以参考标准答案评估模型输出