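OpenRouter Image Generation: Generating Images with Gemini, Flux, Sourceful, and Other Models Through a Unified API

OpenRouter supports image generation through /api/v1/chat/completions: add modalities: ["image", "text"] (or ["image"]) to the request. The message.images array in the response contains the images as base64-encoded data URLs. The image_config parameter controls the aspect ratio (10 options, from 1:1 to 21:9), the resolution (1K/2K/4K), font inputs (Sourceful only), and super-resolution reference images (Sourceful only). Streaming is also supported, with images arriving incrementally in delta.images.

OpenRouter supports image generation through both the Chat Completions and Responses endpoints. To see every supported model, filter the Models page by "image output".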

Discovering Image Generation Models

Querying via the API

# List only image generation models
curl "https://openrouter.ai/api/v1/models?output_modalities=image"

# List models that support both text and image output
curl "https://openrouter.ai/api/v1/models?output_modalities=text,image"
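
The same query can be issued from Python. The following is a minimal sketch using only the standard library. The helper names models_url and list_image_models are illustrative, and the response shape (a JSON object with a data list whose entries carry an id field) is an assumption that should be checked against a real response.

```python
import json
import urllib.parse
import urllib.request

MODELS_ENDPOINT = "https://openrouter.ai/api/v1/models"


def models_url(output_modalities):
    """Build the model-listing URL for the given output modalities."""
    # Keep the comma literal so the query matches the curl examples.
    query = urllib.parse.urlencode(
        {"output_modalities": ",".join(output_modalities)}, safe=","
    )
    return f"{MODELS_ENDPOINT}?{query}"


def list_image_models(output_modalities=("image",)):
    """Fetch matching model IDs.

    Assumption: the response looks like {"data": [{"id": "..."}, ...]}.
    """
    with urllib.request.urlopen(models_url(output_modalities), timeout=30) as resp:
        body = json.load(resp)
    return [model["id"] for model in body.get("data", [])]
```

Calling list_image_models(("text", "image")) would then correspond to the second curl command.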

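Generating Images

Use the modalities parameter to control the output types:

Models that output both text and images (such as Gemini): modalities: ["image", "text"]
Models that output only images (such as Sourceful, Flux): modalities: ["image"]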
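
The rule above can be captured in a small payload builder. This is a sketch only: build_image_request and its outputs_text flag are hypothetical names, and the caller is assumed to know whether the chosen model also returns text.

```python
def build_image_request(model, prompt, outputs_text):
    """Return a chat-completions payload for an image generation request.

    outputs_text: True for models that return text alongside images
    (such as Gemini), False for image-only models (such as Flux).
    """
    modalities = ["image", "text"] if outputs_text else ["image"]
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "modalities": modalities,
    }
```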

Basic Example

import { OpenRouter } from '@openrouter/sdk';

const openRouter = new OpenRouter({ apiKey: '<OPENROUTER_API_KEY>' });

const result = await openRouter.chat.send({
  model: 'google/gemini-2.5-flash-image',
  messages: [
    {
      role: 'user',
      content: 'Generate a beautiful sunset over mountains',
    },
  ],
  modalities: ['image', 'text'],
  stream: false,
});

if (result.choices) {
  const message = result.choices[0].message;
  if (message.images) {
    message.images.forEach((image, index) => {
      const imageUrl = image.imageUrl.url; // Base64 data URL
      console.log(`Generated image ${index + 1}: ${imageUrl.substring(0, 50)}...`);
    });
  }
}

Image Configuration Parameters (image_config)

Aspect Ratio (aspect_ratio)

payload = {
  "model": "google/gemini-3-pro-image-preview",
  "messages": [{"role": "user", "content": "Create a landscape photo"}],
  "modalities": ["image", "text"],
  "image_config": {
    "aspect_ratio": "16:9",
    "image_size": "4K"
  }
}

Supported aspect ratios:

Aspect ratio	Dimensions
`1:1`	1024×1024 (default)
`2:3`	832×1248
`3:2`	1248×832
`9:16`	768×1344
`16:9`	1344×768
`21:9`	1536×672

Extended aspect ratios (google/gemini-3.1-flash-image-preview only): 1:4, 4:1, 1:8, 8:1.
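
When a target size is known in pixels, it can help to pick the nearest listed ratio. The sketch below copies the dimensions from the table above; closest_aspect_ratio is a hypothetical helper, and comparing ratios on a logarithmic scale is a design choice made here, not something the API prescribes.

```python
import math

# Dimensions copied from the aspect ratio table.
ASPECT_RATIO_SIZES = {
    "1:1": (1024, 1024),
    "2:3": (832, 1248),
    "3:2": (1248, 832),
    "9:16": (768, 1344),
    "16:9": (1344, 768),
    "21:9": (1536, 672),
}


def _ratio(name):
    """Convert a label such as "16:9" to a float (width / height)."""
    width, height = name.split(":")
    return int(width) / int(height)


def closest_aspect_ratio(width, height):
    """Return the listed aspect ratio label nearest to width:height."""
    target = width / height
    # Log distance treats "twice as wide" and "twice as tall" symmetrically.
    return min(
        ASPECT_RATIO_SIZES,
        key=lambda name: abs(math.log(_ratio(name) / target)),
    )
```

For example, a 1920 by 1080 target maps to "16:9", whose output dimensions can then be read from ASPECT_RATIO_SIZES.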

Resolution (image_size)

Value	Description
`1K`	Standard resolution (default)
`2K`	High resolution
`4K`	Highest resolution
`0.5K`	Low resolution, uses fewer resources (`gemini-3.1-flash-image-preview` only)
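
Because 0.5K is limited to a single model, validating image_size before sending a request can catch mistakes early. This is a minimal sketch under the constraints listed in the table; build_image_config is a hypothetical helper and the API may enforce additional per-model rules not modeled here.

```python
STANDARD_SIZES = {"1K", "2K", "4K"}
# Per the table above, 0.5K is limited to this one model.
HALF_K_MODEL = "google/gemini-3.1-flash-image-preview"


def build_image_config(model, aspect_ratio="1:1", image_size="1K"):
    """Return an image_config dict, rejecting sizes the model cannot use."""
    if image_size == "0.5K":
        if model != HALF_K_MODEL:
            raise ValueError(f"0.5K is only available on {HALF_K_MODEL}")
    elif image_size not in STANDARD_SIZES:
        raise ValueError(f"unknown image_size: {image_size!r}")
    return {"aspect_ratio": aspect_ratio, "image_size": image_size}
```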

Font Rendering (font_inputs, Sourceful only)

Supported only by sourceful/riverflow-v2-fast and sourceful/riverflow-v2-pro. Renders text in a custom font in the generated image:

{
  "image_config": {
    "font_inputs": [
      {
        "font_url": "https://example.com/fonts/custom-font.ttf",
        "text": "Hello World"
      }
    ]
  }
}

At most 2 font inputs, $0.03 each
The text value should exactly match the text in the prompt
Best suited to short titles and subtitles
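
The constraints above can be checked before a request is sent. The following sketch assumes the limits and price listed here; build_font_inputs is a hypothetical helper, and the substring check is only an approximation of "exactly match the text in the prompt".

```python
from decimal import Decimal

MAX_FONT_INPUTS = 2
FONT_INPUT_PRICE_USD = Decimal("0.03")


def build_font_inputs(prompt, fonts):
    """Return (image_config, surcharge_usd) for a list of (font_url, text) pairs."""
    if len(fonts) > MAX_FONT_INPUTS:
        raise ValueError(f"at most {MAX_FONT_INPUTS} font inputs are allowed")
    for _, text in fonts:
        # The rendered text should appear verbatim in the prompt.
        if text not in prompt:
            raise ValueError(f"text {text!r} does not appear in the prompt")
    font_inputs = [{"font_url": url, "text": text} for url, text in fonts]
    surcharge = FONT_INPUT_PRICE_USD * len(fonts)
    return {"font_inputs": font_inputs}, surcharge
```

Decimal is used so that the surcharge for two fonts comes out as exactly 0.06 rather than a floating-point approximation.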

Super-Resolution Reference Images (super_resolution_references, Sourceful only)

Supported only by Sourceful models in image-to-image mode (that is, when messages include an image):

{
  "image_config": {
    "super_resolution_references": [
      "https://example.com/reference1.jpg",
      "https://example.com/reference2.jpg"
    ]
  }
}

At most 4 reference URLs, $0.20 each
The output image has the same dimensions as the input image, so large inputs are recommended
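
A complete image-to-image request combines an input image in messages with the reference list in image_config. The sketch below is an illustration under assumptions: build_super_resolution_request is a hypothetical helper, and the message content layout (a text part plus an image_url part) is assumed rather than taken from this page.

```python
from decimal import Decimal

MAX_REFERENCES = 4
REFERENCE_PRICE_USD = Decimal("0.20")


def build_super_resolution_request(model, prompt, input_image_url, reference_urls):
    """Return (payload, surcharge_usd) for an image-to-image request."""
    if not 1 <= len(reference_urls) <= MAX_REFERENCES:
        raise ValueError(f"expected 1 to {MAX_REFERENCES} reference URLs")
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                # Assumed content format: a text part plus an image_url part.
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": input_image_url}},
                ],
            }
        ],
        "modalities": ["image"],
        "image_config": {"super_resolution_references": list(reference_urls)},
    }
    return payload, REFERENCE_PRICE_USD * len(reference_urls)
```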

Streaming Mode

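import json

import requests

url = "https://openrouter.ai/api/v1/chat/completions"
headers = {
    "Authorization": "Bearer <OPENROUTER_API_KEY>",
    "Content-Type": "application/json",
}

payload = {
  "model": "google/gemini-2.5-flash-image",
  "messages": [{"role": "user", "content": "Create an image of a futuristic city"}],
  "modalities": ["image", "text"],
  "stream": True
}

response = requests.post(url, headers=headers, json=payload, stream=True)

# The stream is server-sent events: each payload line starts with "data: ".
for line in response.iter_lines():
    if line:
        line = line.decode('utf-8')
        if line.startswith('data: '):
            data = line[6:]
            # "[DONE]" marks the end of the stream and is not JSON.
            if data != '[DONE]':
                chunk = json.loads(data)
                if chunk.get("choices"):
                    delta = chunk["choices"][0].get("delta", {})
                    # Images arrive in delta.images as base64 data URLs.
                    if delta.get("images"):
                        for image in delta["images"]:
                            print(f"Generated image: {image['image_url']['url'][:50]}...")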
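
Separating the parsing from the HTTP call makes the stream handling easier to test offline. The sketch below follows the chunk layout used in the loop above; collect_stream_images is a hypothetical helper that accepts any iterable of lines, such as response.iter_lines().

```python
import json


def collect_stream_images(lines):
    """Collect image data URLs from an iterable of SSE lines (str or bytes)."""
    urls = []
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        # Skip blank keep-alive lines and anything that is not a data line.
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}
            for image in delta.get("images") or []:
                urls.append(image["image_url"]["url"])
    return urls
```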

Response Format

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "I've generated a beautiful sunset image for you.",
        "images": [
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..."
            }
          }
        ]
      }
    }
  ]
}

Images are returned as base64 data URLs, typically PNG (data:image/png;base64,).
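
Given a response shaped like the JSON above, the MIME type and base64 payload of each image can be pulled out with plain string handling. This is a sketch; extract_images is a hypothetical helper and it assumes every URL is a base64 data URL as described.

```python
def extract_images(response):
    """Return (mime_type, base64_payload) pairs from a non-streaming response."""
    results = []
    for choice in response.get("choices", []):
        for image in choice.get("message", {}).get("images") or []:
            url = image["image_url"]["url"]
            # A data URL has the form "data:<mime>;base64,<payload>".
            header, _, payload = url.partition(",")
            if not header.startswith("data:") or not header.endswith(";base64"):
                raise ValueError("expected a base64 data URL")
            mime_type = header[len("data:"):-len(";base64")]
            results.append((mime_type, payload))
    return results
```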

Models That Support Image Generation

google/gemini-3.1-flash-image-preview (supports extended aspect ratios and 0.5K resolution)
google/gemini-2.5-flash-image
black-forest-labs/flux.2-pro
black-forest-labs/flux.2-flex
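sourceful/riverflow-v2-standard-preview (supports font rendering and super-resolution reference images)

FAQ

Q: What if the response has no images field?

A: Check three things: (1) whether the model's output_modalities includes "image"; (2) whether the modalities parameter is set correctly (["image","text"] for text+image models, ["image"] for image-only models); (3) whether the prompt explicitly asks for an image.

Q: How do I save a generated image locally?

A: The url in the response is a base64 data URL (data:image/png;base64,...). Extract the base64 portion, decode it (in Node.js, with Buffer.from(b64, 'base64')), and write the result to a file.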
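
The same steps in Python might look like the sketch below, which uses only the standard library. save_data_url is a hypothetical helper, and the output path is chosen by the caller.

```python
import base64
from pathlib import Path


def save_data_url(data_url, path):
    """Decode a base64 data URL and write the bytes to path.

    Returns the number of bytes written.
    """
    header, sep, b64 = data_url.partition(",")
    if not sep or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    data = base64.b64decode(b64)
    Path(path).write_bytes(data)
    return len(data)
```

A call such as save_data_url(url, "sunset.png") writes the decoded image next to the script.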

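Q: Are the rate limits for image generation the same as for text generation?

A: Not necessarily. Image generation usually has its own rate limits and takes longer to complete. Check each model's detail page for the specific limits, or retry with exponential backoff when a request returns 429.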
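
Exponential backoff can be sketched as a small wrapper around whatever function sends the request. Everything named here is illustrative: RateLimited is a hypothetical exception that the caller's send function would raise on a 429 response, and the attempt count, base delay, and jitter are arbitrary starting values.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the caller-supplied function when the API answers 429."""


def with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send(), retrying on RateLimited with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            delay = base_delay * (2 ** attempt)
            # Up to 10% jitter keeps concurrent clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

The sleep parameter exists so that tests can substitute a recorder instead of actually waiting.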
