Handling DeepSeek API Responses Truncated at the 8192-Token Limit

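Problem

When you call the DeepSeek Coder API (or DeepSeek Chat) to generate long content, the response comes back incomplete: the output is cut off partway through, and finish_reason is length instead of stop.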

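Analysis

DeepSeek models cap max_tokens at 8192 by default (some models have a lower cap). Tasks such as generating code or long documents reach this limit easily.

Meaning of the finish_reason values:

stop: the model finished its output normally
length: the output reached the max_tokens limit and was truncated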
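
The two values above can be turned into a simple truncation check. The following is a minimal sketch, not part of the original article: the helper name is_truncated is hypothetical, and the objects are stand-ins built with SimpleNamespace that only mimic the shape of response.choices[0]. Any value other than length is treated as not truncated.

```python
from types import SimpleNamespace


def is_truncated(choice) -> bool:
    """Return True when the choice ended because it hit max_tokens."""
    # "length" means the output was cut off; "stop" means a normal finish.
    return getattr(choice, "finish_reason", None) == "length"


# Stand-in objects (assumption: same attribute name as a real API choice)
finished = SimpleNamespace(finish_reason="stop")
cut_off = SimpleNamespace(finish_reason="length")

print(is_truncated(finished))  # False
print(is_truncated(cut_off))   # True
```

With a real response, the same check would be applied to response.choices[0].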

Solutions

Solution 1: Check finish_reason and request a continuation

python

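def get_complete_response(client, messages, model="deepseek-chat", max_rounds=5):
    """Call the model repeatedly until it stops on its own or max_rounds is reached."""
    full_content = ""
    round_messages = messages.copy()

    for _ in range(max_rounds):
        response = client.chat.completions.create(
            model=model,
            messages=round_messages,
            max_tokens=8192
        )
        choice = response.choices[0]
        # content can be None, so normalize it before reusing it below
        content = choice.message.content or ""
        full_content += content

        if choice.finish_reason == "length":
            # Truncated: feed the partial answer back and ask the model to resume
            round_messages.append({"role": "assistant", "content": content})
            round_messages.append({"role": "user", "content": "Please continue exactly where you left off"})
        else:
            # "stop" or any other finish_reason: nothing more to request
            break

    # If max_rounds ran out, the result may still be incomplete
    return full_content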
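
When continuations are concatenated, the model sometimes repeats the last few characters of the previous round before carrying on. This behaviour is an assumption, not something the source issue documents. The sketch below shows one way to drop such a repeated overlap when stitching two rounds together. The function name merge_continuation and its parameters max_overlap and min_overlap are hypothetical.

```python
def merge_continuation(previous: str, continuation: str,
                       max_overlap: int = 200, min_overlap: int = 4) -> str:
    """Append continuation to previous, dropping a repeated overlap if one exists."""
    limit = min(max_overlap, len(previous), len(continuation))
    # Try the longest candidate overlap first, down to min_overlap characters.
    for size in range(limit, min_overlap - 1, -1):
        if previous.endswith(continuation[:size]):
            return previous + continuation[size:]
    # No overlap of at least min_overlap characters: plain concatenation.
    return previous + continuation


first_round = "def add(a, b):\n    return a"
second_round = "return a + b"
print(merge_continuation(first_round, second_round))
```

The min_overlap threshold is there so that a coincidental match of one or two characters is not mistaken for a repetition. The right value depends on the content being generated.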

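Solution 2: Generate in segments

When the output has a predictable structure (for example, several functions), request the pieces one at a time: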

python

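# Avoid: one request that generates 10 functions
# Prefer: one or two functions per request, in a loop
# Assumes `client` is the same OpenAI-compatible client used in Solution 1;
# the names in function_list are illustrative placeholders.
function_list = ["parse_config", "load_data"]
implementations = {}
for func_name in function_list:
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "user", "content": f"Write only the implementation of the function {func_name}"}]
    )
    implementations[func_name] = response.choices[0].message.content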
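
The "one or two functions per request" idea can be sketched without touching the API at all, by grouping the names first and building one prompt per group. The helper names group_names and build_prompt, the example function names, and the prompt wording are all hypothetical.

```python
from typing import Iterator, List


def group_names(names: List[str], size: int = 2) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` names."""
    for start in range(0, len(names), size):
        yield names[start:start + size]


def build_prompt(group: List[str]) -> str:
    """Build one narrowly scoped prompt for a small group of functions."""
    return "Write only the implementation of these functions: " + ", ".join(group)


# Illustrative placeholder names
function_list = ["parse_config", "load_data", "validate", "save_report", "main"]
prompts = [build_prompt(group) for group in group_names(function_list, size=2)]
for prompt in prompts:
    print(prompt)
```

Each prompt would then be sent as its own request, so each response stays well below the output cap.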

Solution 3: Monitor with streaming output

With streaming, you can detect truncation as the response arrives:

python

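# Assumes `client` and `messages` are defined as in the earlier examples
stream = client.chat.completions.create(
    model="deepseek-chat", messages=messages, stream=True
)
finish_reason = None
for chunk in stream:
    # Guard against chunks that carry no choices
    if chunk.choices and chunk.choices[0].finish_reason:
        finish_reason = chunk.choices[0].finish_reason

if finish_reason == "length":
    print("⚠️ Response was truncated; consider continuing or splitting the request")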
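
The loop above records only finish_reason. A slightly fuller sketch collects the streamed text as well, so that a truncated result can be passed on to a continuation request. The helper names collect_stream and fake_chunk are hypothetical. The fake chunks are stand-ins that assume the OpenAI-compatible streaming shape (chunk.choices[0].delta.content), so the example runs without a network call.

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Return (text, finish_reason) accumulated from streamed chunks."""
    parts = []
    finish_reason = None
    for chunk in stream:
        if not chunk.choices:
            continue  # skip chunks that carry no choices
        choice = chunk.choices[0]
        delta_text = getattr(choice.delta, "content", None)
        if delta_text:
            parts.append(delta_text)
        if choice.finish_reason:
            finish_reason = choice.finish_reason
    return "".join(parts), finish_reason


def fake_chunk(text, finish_reason=None):
    """Build a stand-in chunk with the assumed streaming shape."""
    choice = SimpleNamespace(delta=SimpleNamespace(content=text),
                             finish_reason=finish_reason)
    return SimpleNamespace(choices=[choice])


fake_stream = [fake_chunk("def f("), fake_chunk("x):"), fake_chunk(None, "length")]
text, reason = collect_stream(fake_stream)
print(repr(text), reason)
```

In real use, the stream argument would be the object returned by a create call with stream=True.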

Source: Issue #881 - deepseek-ai/DeepSeek-V3
