Chat Completion
POST https://api.deepseek.com/chat/completions
Given the input context, the model generates a completion for the conversation.
Request
- application/json
Body
required
messages
object[]
required
A list of messages comprising the conversation so far.
model
Possible values: [deepseek-chat, deepseek-reasoner]
The ID of the model to use. You can use deepseek-chat.
frequency_penalty
Possible values: >= -2 and <= 2
Default value: 0
A number between -2.0 and 2.0. If positive, new tokens are penalized according to their frequency in the text so far, decreasing the model's likelihood of repeating the same content verbatim.
max_tokens
Possible values: >= 1 and <= 8192
An integer between 1 and 8192 that limits the maximum number of tokens the model can generate for the completion in a single request. The total length of input tokens and output tokens is constrained by the model's context length.
If the max_tokens parameter is not specified, a default of 4096 is used.
presence_penalty
Possible values: >= -2 and <= 2
Default value: 0
A number between -2.0 and 2.0. If positive, new tokens are penalized based on whether they have already appeared in the text so far, increasing the model's likelihood of talking about new topics.
response_format
object
nullable
An object specifying the format that the model must output. Setting it to {"type": "json_object"} enables JSON output mode, which guarantees the message the model generates is valid JSON.
stop
object
nullable
A string, or a list of strings, at which the API will stop generating further tokens.
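As an illustration, here is a minimal Python sketch (using the OpenAI SDK, as in the request example at the end of this page) that enables JSON output mode together with a stop sequence; the prompt, the stop string, and the API key placeholder are assumptions for the example:

from openai import OpenAI

client = OpenAI(api_key="<your API key>", base_url="https://api.deepseek.com")

# Ask for guaranteed-valid JSON and stop generation at a blank line.
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "Reply in JSON with the keys 'city' and 'country'."},
        {"role": "user", "content": "Where is the Eiffel Tower?"},
    ],
    response_format={"type": "json_object"},  # JSON output mode
    stop=["\n\n"],  # illustrative stop sequence
)
print(response.choices[0].message.content)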
stream
If set to true, message deltas are sent as server-sent events (SSE). The stream terminates with a data: [DONE] message.
stream_options
object
nullable
Options for streaming output. This parameter can only be set when the stream parameter is true.
include_usage
If set to true, an additional chunk is sent before the final data: [DONE] message. The usage field of this chunk reports the token usage statistics for the entire request, and its choices field is always an empty array. All other chunks also include a usage field, but with a null value.
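For example, a minimal streaming sketch with the OpenAI SDK (the API key placeholder and prompt are illustrative):

from openai import OpenAI

client = OpenAI(api_key="<your API key>", base_url="https://api.deepseek.com")

stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in stream:
    # The final usage chunk carries an empty `choices` array.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")
    if chunk.usage is not None:
        print("\ntotal tokens:", chunk.usage.total_tokens)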
temperature
Possible values: >= 0 and <= 2
Default value: 1
The sampling temperature, between 0 and 2. Higher values such as 0.8 make the output more random, while lower values such as 0.2 make it more focused and deterministic. We generally recommend altering this or top_p, but not both.
top_p
Possible values: <= 1
Default value: 1
An alternative to sampling with temperature, where the model considers only the tokens comprising the top top_p probability mass. So 0.1 means only the tokens within the top 10% of probability mass are considered. We generally recommend altering this or temperature, but not both.
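For instance, a request aimed at deterministic output would lower temperature and leave top_p untouched (a sketch; the task and the value 0.2 are illustrative):

from openai import OpenAI

client = OpenAI(api_key="<your API key>", base_url="https://api.deepseek.com")

# Lower temperature for a focused, repeatable answer; top_p stays at its default.
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Extract the date from: 'Invoice issued 2024-03-05'."}],
    temperature=0.2,
)
print(response.choices[0].message.content)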
tools
object[]
nullable
A list of tools the model may call. Currently, only functions are supported as tools.
tool_choice
object
nullable
Controls which (if any) tool is called by the model. none means the model will not call any tool and instead generates a message; auto means the model can choose between generating a message or calling tools.
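Below is a sketch of the OpenAI-compatible function-tool format; the get_weather tool and its schema are hypothetical examples, not part of the API:

from openai import OpenAI

client = OpenAI(api_key="<your API key>", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "What's the weather in Hangzhou?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool name
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    tool_choice="auto",  # let the model decide whether to call the tool
)

tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)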
logprobs
Whether to return the log probabilities of the output tokens. If true, the log probability of each output token is returned in the content of message.
top_logprobs
Possible values: <= 20
An integer N between 0 and 20 specifying that, at each output position, the N most likely tokens are returned together with their log probabilities. logprobs must be set to true when this parameter is used.
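A short sketch requesting per-token log probabilities (the top_logprobs value is illustrative):

from openai import OpenAI

client = OpenAI(api_key="<your API key>", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello"}],
    logprobs=True,
    top_logprobs=2,  # also return the 2 most likely alternatives per position
)

for item in response.choices[0].logprobs.content:
    print(item.token, item.logprob)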
Responses
- 200 (No streaming)
- 200 (Streaming)
OK, returns a chat completion object.
- application/json
- Schema
- Example (from schema)
- Example
Schema
id
The unique identifier for this chat completion.
choices
object[]
required
The list of completion choices generated by the model.
created
The Unix timestamp (in seconds) when the chat completion was created.
model
The name of the model that generated the completion.
system_fingerprint
This fingerprint represents the backend configuration that the model runs with.
object
Possible values: [chat.completion]
The object type, which is always chat.completion.
usage
object
Usage statistics for this chat completion request.
completion_tokens
The number of tokens generated in the model completion.
prompt_tokens
The number of tokens in the user prompt. This equals prompt_cache_hit_tokens + prompt_cache_miss_tokens.
prompt_cache_hit_tokens
The number of tokens in the user prompt that hit the context cache.
prompt_cache_miss_tokens
The number of tokens in the user prompt that missed the context cache.
total_tokens
The total number of tokens used in the request (prompt + completion).
completion_tokens_details
object
reasoning_tokens
The number of reasoning tokens generated by the model.
{
"id": "string",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"content": "string",
"reasoning_content": "string",
"tool_calls": [
{
"id": "string",
"type": "function",
"function": {
"name": "string",
"arguments": "string"
}
}
],
"role": "assistant"
},
"logprobs": {
"content": [
{
"token": "string",
"logprob": 0,
"bytes": [
0
],
"top_logprobs": [
{
"token": "string",
"logprob": 0,
"bytes": [
0
]
}
]
}
]
}
}
],
"created": 0,
"model": "string",
"system_fingerprint": "string",
"object": "chat.completion",
"usage": {
"completion_tokens": 0,
"prompt_tokens": 0,
"prompt_cache_hit_tokens": 0,
"prompt_cache_miss_tokens": 0,
"total_tokens": 0,
"completion_tokens_details": {
"reasoning_tokens": 0
}
}
}
{
"id": "930c60df-bf64-41c9-a88e-3ec75f81e00e",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"content": "Hello! How can I help you today?",
"role": "assistant"
}
}
],
"created": 1705651092,
"model": "deepseek-chat",
"object": "chat.completion",
"usage": {
"completion_tokens": 10,
"prompt_tokens": 16,
"total_tokens": 26
}
}
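To pull these usage numbers out of a response body in Python, a small self-contained sketch (the body string mirrors the example above, with the context-cache fields filled in as an assumed 0/16 split):

import json

# Sketch: extract usage statistics from a chat completion response body.
body = '{"usage": {"completion_tokens": 10, "prompt_tokens": 16, "prompt_cache_hit_tokens": 0, "prompt_cache_miss_tokens": 16, "total_tokens": 26}}'

usage = json.loads(body)["usage"]
print("prompt tokens:", usage["prompt_tokens"])
print("  cache hits:", usage.get("prompt_cache_hit_tokens"))    # hit + miss
print("  cache misses:", usage.get("prompt_cache_miss_tokens")) #   = prompt_tokens
print("completion tokens:", usage["completion_tokens"])
print("total tokens:", usage["total_tokens"])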
OK, returns a streamed response consisting of a series of chat completion chunk objects.
- text/event-stream
- Schema
- Example
Schema
id
The unique identifier for this chat completion.
choices
object[]
required
The list of completion choices generated by the model.
delta
object
required
A completion delta produced by the streamed model output.
content
The content of the completion delta.
reasoning_content
Only applicable to the deepseek-reasoner model. The reasoning content the assistant message produces before the final answer.
role
Possible values: [assistant]
The role that produced this message.
finish_reason
Possible values: [stop, length, content_filter, tool_calls, insufficient_system_resource]
The reason the model stopped generating tokens.
stop: the model stopped naturally, or it encountered a string listed in the stop sequences.
length: the output reached the model's context length limit, or the max_tokens limit.
content_filter: the output was filtered because it triggered a filtering policy.
tool_calls: the model stopped to call one or more tools.
insufficient_system_resource: the request was interrupted because backend inference resources were constrained.
index
The index of this completion in the list of choices generated by the model.
created
The Unix timestamp (in seconds) when the chat completion was created. Every chunk in the stream has the same timestamp.
model
The name of the model that generated the completion.
system_fingerprint
This fingerprint represents the backend configuration that the model runs with.
object
Possible values: [chat.completion.chunk]
The object type, which is always chat.completion.chunk.
data: {"id": "1f633d8bfc032625086f14113c411638", "choices": [{"index": 0, "delta": {"content": "", "role": "assistant"}, "finish_reason": null, "logprobs": null}], "created": 1718345013, "model": "deepseek-chat", "system_fingerprint": "fp_a49d71b8a1", "object": "chat.completion.chunk", "usage": null}
data: {"choices": [{"delta": {"content": "Hello", "role": "assistant"}, "finish_reason": null, "index": 0, "logprobs": null}], "created": 1718345013, "id": "1f633d8bfc032625086f14113c411638", "model": "deepseek-chat", "object": "chat.completion.chunk", "system_fingerprint": "fp_a49d71b8a1"}
data: {"choices": [{"delta": {"content": "!", "role": "assistant"}, "finish_reason": null, "index": 0, "logprobs": null}], "created": 1718345013, "id": "1f633d8bfc032625086f14113c411638", "model": "deepseek-chat", "object": "chat.completion.chunk", "system_fingerprint": "fp_a49d71b8a1"}
data: {"choices": [{"delta": {"content": " How", "role": "assistant"}, "finish_reason": null, "index": 0, "logprobs": null}], "created": 1718345013, "id": "1f633d8bfc032625086f14113c411638", "model": "deepseek-chat", "object": "chat.completion.chunk", "system_fingerprint": "fp_a49d71b8a1"}
data: {"choices": [{"delta": {"content": " can", "role": "assistant"}, "finish_reason": null, "index": 0, "logprobs": null}], "created": 1718345013, "id": "1f633d8bfc032625086f14113c411638", "model": "deepseek-chat", "object": "chat.completion.chunk", "system_fingerprint": "fp_a49d71b8a1"}
data: {"choices": [{"delta": {"content": " I", "role": "assistant"}, "finish_reason": null, "index": 0, "logprobs": null}], "created": 1718345013, "id": "1f633d8bfc032625086f14113c411638", "model": "deepseek-chat", "object": "chat.completion.chunk", "system_fingerprint": "fp_a49d71b8a1"}
data: {"choices": [{"delta": {"content": " assist", "role": "assistant"}, "finish_reason": null, "index": 0, "logprobs": null}], "created": 1718345013, "id": "1f633d8bfc032625086f14113c411638", "model": "deepseek-chat", "object": "chat.completion.chunk", "system_fingerprint": "fp_a49d71b8a1"}
data: {"choices": [{"delta": {"content": " you", "role": "assistant"}, "finish_reason": null, "index": 0, "logprobs": null}], "created": 1718345013, "id": "1f633d8bfc032625086f14113c411638", "model": "deepseek-chat", "object": "chat.completion.chunk", "system_fingerprint": "fp_a49d71b8a1"}
data: {"choices": [{"delta": {"content": " today", "role": "assistant"}, "finish_reason": null, "index": 0, "logprobs": null}], "created": 1718345013, "id": "1f633d8bfc032625086f14113c411638", "model": "deepseek-chat", "object": "chat.completion.chunk", "system_fingerprint": "fp_a49d71b8a1"}
data: {"choices": [{"delta": {"content": "?", "role": "assistant"}, "finish_reason": null, "index": 0, "logprobs": null}], "created": 1718345013, "id": "1f633d8bfc032625086f14113c411638", "model": "deepseek-chat", "object": "chat.completion.chunk", "system_fingerprint": "fp_a49d71b8a1"}
data: {"choices": [{"delta": {"content": "", "role": null}, "finish_reason": "stop", "index": 0, "logprobs": null}], "created": 1718345013, "id": "1f633d8bfc032625086f14113c411638", "model": "deepseek-chat", "object": "chat.completion.chunk", "system_fingerprint": "fp_a49d71b8a1", "usage": {"completion_tokens": 9, "prompt_tokens": 17, "total_tokens": 26}}
data: [DONE]
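A minimal sketch of consuming this event stream directly with the requests library, accumulating the content deltas (the prompt and token placeholder are illustrative; production code would typically use an SSE helper or the OpenAI SDK):

import json
import requests

# Stream a chat completion and accumulate the delta contents.
response = requests.post(
    "https://api.deepseek.com/chat/completions",
    headers={"Authorization": "Bearer <TOKEN>", "Content-Type": "application/json"},
    json={
        "model": "deepseek-chat",
        "messages": [{"role": "user", "content": "Hi"}],
        "stream": True,
    },
    stream=True,
)

text = ""
for line in response.iter_lines():
    if not line or not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":  # stream terminator
        break
    chunk = json.loads(payload)
    for choice in chunk["choices"]:  # empty in the final usage chunk
        text += choice["delta"].get("content") or ""
print(text)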
- python (OpenAI SDK)
from openai import OpenAI
# for backward compatibility, you can still use `https://api.deepseek.com/v1` as `base_url`.
client = OpenAI(api_key="<your API key>", base_url="https://api.deepseek.com")
response = client.chat.completions.create(
model="deepseek-chat",
messages=[
{"role": "system", "content": "You are a helpful assistant"},
{"role": "user", "content": "Hello"},
],
max_tokens=1024,
temperature=0.7,
stream=False
)
print(response.choices[0].message.content)
- python (requests)
import requests
import json
url = "https://api.deepseek.com/chat/completions"
payload = json.dumps({
"messages": [
{
"content": "You are a helpful assistant",
"role": "system"
},
{
"content": "Hi",
"role": "user"
}
],
"model": "deepseek-chat",
"frequency_penalty": 0,
"max_tokens": 2048,
"presence_penalty": 0,
"response_format": {
"type": "text"
},
"stop": None,
"stream": False,
"stream_options": None,
"temperature": 1,
"top_p": 1,
"tools": None,
"tool_choice": "none",
"logprobs": False,
"top_logprobs": None
})
headers = {
'Content-Type': 'application/json',
'Accept': 'application/json',
'Authorization': 'Bearer <TOKEN>'
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)