DeepSeek multi-turn Function Calling returns content unrelated to the query

Problem

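In multi-turn Function Calling conversations, DeepSeek V3.1 sometimes returns a reply that has nothing to do with the current request, such as a walkthrough of an algorithm problem or a programming explanation, and the output starts with special tokens such as <|begin_of_sentence|>.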

Analysis

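Community reports link the problem to the following factors:

System prompt conflict: when an earlier system instruction tells the model to use a particular tool to fetch "supplementary results", the model may, on the next turn, confuse the instruction in the system prompt with the conversation context.
Special token leakage: <|begin_of_sentence|> appearing in the output suggests that token boundaries are being handled incorrectly, which may be a server-side bug.
Message format problems: the history contains tool_call / tool_result messages in a non-standard format.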
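
To spot the token leakage described above on the client side, a reply can be scanned for special-token-like markers before it is shown or appended to the history. The following is a minimal sketch, not an official fix: the function names are hypothetical, and the pattern is an assumption that matches any <|...|> marker written with either ASCII or full-width bars.

```python
import re

# Assumed pattern: "<|name|>" with ASCII "|" or full-width "｜" bars.
# The 64-character cap keeps ordinary text with stray brackets from matching.
SPECIAL_TOKEN_RE = re.compile(r"<[|｜][^<>|｜]{1,64}[|｜]>")


def find_leaked_tokens(text):
    """Return every special-token-like marker found in a model reply."""
    return SPECIAL_TOKEN_RE.findall(text or "")


def strip_leaked_tokens(text):
    """Remove the markers and any leading whitespace they leave behind."""
    return SPECIAL_TOKEN_RE.sub("", text or "").lstrip()
```

A non-empty result from find_leaked_tokens is a useful signal to log the request and retry it, since stripping the marker does not repair an off-topic reply.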

Solution

Check and repair the message format:

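# Every function-call round must contain, in order:
# 1. an assistant message (with tool_calls)
# 2. one tool message for each tool_call_id
# 3. the next user message

def build_valid_history(turns):
    """Rebuild a message list in which every tool call has a tool result."""
    messages = []
    for turn in turns:
        if turn.get('tool_calls'):
            # The assistant calls one or more tools
            messages.append({
                "role": "assistant",
                "content": None,
                "tool_calls": turn['tool_calls']
            })
            # Tool results: every call needs one, so a missing result
            # falls back to an empty string instead of being skipped
            results = turn.get('results') or {}
            for tc in turn['tool_calls']:
                messages.append({
                    "role": "tool",
                    "tool_call_id": tc['id'],
                    "content": results.get(tc['id'], "")
                })
        else:
            # Plain system / user / assistant message
            messages.append({"role": turn['role'], "content": turn['content']})
    return messages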
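
Before sending an existing history, it can also help to check it against the three rules listed above rather than rebuild it. The sketch below assumes OpenAI-style message dictionaries like the ones produced by build_valid_history; the function name and the issue labels are hypothetical.

```python
def find_tool_call_issues(messages):
    """Return (kind, tool_call_id) pairs for unpaired tool calls and results."""
    pending = []  # ids of tool calls still waiting for a tool message
    issues = []
    for msg in messages:
        role = msg.get("role")
        if role == "assistant" and msg.get("tool_calls"):
            # A new round starts: anything still pending was never answered
            issues.extend(("missing_result", i) for i in pending)
            pending = [tc["id"] for tc in msg["tool_calls"]]
        elif role == "tool":
            call_id = msg.get("tool_call_id")
            if call_id in pending:
                pending.remove(call_id)
            else:
                # A result that no preceding tool call asked for
                issues.append(("orphan_result", call_id))
        else:
            # Any other message ends the round
            issues.extend(("missing_result", i) for i in pending)
            pending = []
    issues.extend(("missing_result", i) for i in pending)
    return issues
```

An empty list means every tool call in the history is paired with a result; any other outcome points at the round that needs repairing.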

Avoid naming tools in the system prompt: let user messages drive tool calls, and do not put instructions such as "please call tool X" in the system prompt.
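
One way to enforce this advice is to check, before each request, whether any declared tool name appears in a system message. The sketch below assumes the OpenAI-compatible tools layout ({"type": "function", "function": {"name": ...}}); the function name and the sample tool name are hypothetical, and the plain substring match is a deliberate simplification.

```python
def tool_names_in_system_prompt(messages, tools):
    """Return the declared tool names that appear in any system message."""
    names = [
        t["function"]["name"]
        for t in tools
        if t.get("type") == "function" and "function" in t
    ]
    system_text = "\n".join(
        m["content"]
        for m in messages
        if m.get("role") == "system" and isinstance(m.get("content"), str)
    )
    # Substring match: simple, but may flag very short or generic tool names
    return [name for name in names if name in system_text]


# Hypothetical example: the system prompt names the tool directly
tools = [{"type": "function", "function": {"name": "search_docs"}}]
messages = [
    {"role": "system", "content": "Call search_docs to get supplementary results."},
    {"role": "user", "content": "What changed in the last release?"},
]
flagged = tool_names_in_system_prompt(messages, tools)
```

Here flagged contains "search_docs", which suggests moving that instruction out of the system prompt and leaving the choice of tool to the tool descriptions and the user message.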

Source: Issue #976 - deepseek-ai/DeepSeek-V3
