
Capability tags

🌐 Translation 🐳 Docker 💻 CLI 🔗 REST API 🧬 Embedding 🖼 Vision 🧠 Claude ✨ GPT ⌨️ Cursor 🖥 Local LLM

Agent workflow

Model Maestro (模型大师)

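Q: How do I install and start using model-maestro?

Visit the model-maestro GitHub repository and follow the steps in its README. The README's Quick Start requires Docker and Docker Compose.

Q: Is model-maestro free? What is its license?

The source code is publicly available on GitHub at no cost, but this listing shows the license as not published. Check the repository for license terms before modifying or redistributing the code.

Q: Who is model-maestro for?

model-maestro mainly targets users with a technical background, such as developers, data analysts and AI engineers.

Q: How active is the community, and how well is the project maintained?

model-maestro has 6 stars on GitHub. The listing describes it as a lightweight project that is updated as needed.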

Built with Python · Build complete AI automation flows without code

English name: model-maestro

⭐ 6 Stars 💻 Python 📄 License not published 🏷 AI score 7.5

Topics: ai, python, workflow


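✦ AI Skill Hub recommendation

AI Skill Hub recommends Model Maestro as a solid Agent workflow tool. It has an overall AI score of 7.5, a steady result among comparable tools, and is worth a closer look if you need a dependable Agent workflow solution.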

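📚 In-depth analysis

Model Maestro is a complete AI Agent automation workflow solution. As AI capabilities improve, Agent-based automated workflows are becoming a core way to raise individual and team productivity. Unlike traditional RPA automation, which simulates mouse and keyboard actions, an AI Agent workflow interprets the intent of a task and plans its execution path dynamically, so it can handle more complex, unstructured tasks.

The workflow follows a "minimal configuration, maximum reuse" principle: the core logic is already packaged, and users only need to supply their own API key and business parameters to get started. It includes built-in error handling and retries, so it keeps running through network instability or API rate limiting, which makes it suitable as automation infrastructure in production.

For real deployments, run the workflow 3 to 5 times in a test environment first and confirm that each stage produces the expected output before moving to production. AI Skill Hub rates it 7.5 and lists it as a featured pick among Agent workflows.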

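📋 Tool overview

Using visual node orchestration, Model Maestro breaks complex multi-step tasks into clear automated flows that run unattended end to end. It integrates with hundreds of external services and APIs, and suits data-processing pipelines, business automation and AI-assisted decision systems.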

GitHub Stars	⭐ 6
Language	Python
Supported platforms	Windows / macOS / Linux
Maintenance status	Lightweight project, updated as needed
License	Not published
Overall AI score	7.5
Tool type	Agent workflow
Forks	n/a

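📖 Documentation

The following content was compiled automatically by AI Skill Hub from the project information. For the complete original documentation, see "Original source" at the bottom of the page.

📌 Key features

Visual Agent workflow orchestration without writing complex code
Multi-step automated task chains that run unattended end to end
Integration with external APIs, databases and third-party services
Built-in error handling and automatic retries for stable operation
Reusable automation templates for fast rollout in similar scenarios

🎯 Main use cases

Automating repetitive daily work so that effort can go to creative tasks
Building complete collect → process → output data pipelines
Moving data and coordinating business processes across platforms and systems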

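The following installation commands were generated automatically from the project's language and type and have not been checked against the project. The README's own Quick Start uses Docker Compose, so treat the official README as authoritative.

Installation commands

# Option 1: install with pip (recommended)
pip install model-maestro

# Option 2: install in a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install model-maestro

# Option 3: install from source (latest features)
git clone https://github.com/skymoonsun/model-maestro
cd model-maestro
pip install -e .

# Verify the installation
python -c "import model_maestro; print('installed')"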

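📋 Installation steps

Get the workflow file from the GitHub repository
Find the "Import workflow" function on the target platform (Dify / Flowise / Make, etc.)
Upload the workflow file
Configure the required environment variables and API keys as prompted
Run a test to confirm the flow works before putting it into use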

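The following usage examples were compiled by AI Skill Hub to cover the most common scenarios. The README reproduced later on this page documents usage through its HTTP API.

Common commands / code examples

# Command-line usage
model-maestro --help

# Basic usage
model-maestro input_file -o output_file

# Calling from Python
import model_maestro

# Example
result = model_maestro.process("input")
print(result)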

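The following configuration example was generated from typical usage scenarios; adjust the parameters according to the official documentation.

Configuration example

# Example model-maestro configuration file (config.yml)
app:
  name: "model-maestro"
  debug: false
  log_level: "INFO"

# Point to the configuration file at run time
model-maestro --config config.yml

# Or configure through environment variables
export MODEL_MAESTRO_API_KEY="your-key"
export MODEL_MAESTRO_OUTPUT_DIR="./output"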

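📑 README analysis · Source documentation completeness 84/100

The following content was parsed directly from the GitHub README by the system, preserving code blocks, tables and list structure.

Introduction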

Config-driven Unified LLM Gateway

Route, load-balance and manage Ollama, OpenAI and other LLM providers through a single authenticated API. Model Maestro gives you user-based access control, model mapping, token usage tracking, health-checked node pooling and a modern Next.js admin dashboard — all wired to PostgreSQL + Redis.
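To make the idea of health-checked node pooling concrete, here is a minimal round-robin sketch. It is not Model Maestro's implementation: the `Node` and `RoundRobinPool` names and the `healthy` flag are hypothetical, and the real gateway also offers weighted and priority-based strategies that this sketch does not cover.

```python
from dataclasses import dataclass
from itertools import count
from typing import Iterable, List


@dataclass
class Node:
    """Hypothetical stand-in for a backend node known to the gateway."""
    code: str
    healthy: bool = True


class RoundRobinPool:
    """Cycle through nodes, skipping any that are marked unhealthy."""

    def __init__(self, nodes: Iterable[Node]) -> None:
        self._nodes: List[Node] = list(nodes)
        self._counter = count()

    def pick(self) -> Node:
        # Re-evaluate health on every pick so that a node flipped by a
        # health check is excluded or re-included immediately.
        healthy = [node for node in self._nodes if node.healthy]
        if not healthy:
            raise RuntimeError("no healthy nodes available")
        return healthy[next(self._counter) % len(healthy)]


if __name__ == "__main__":
    pool = RoundRobinPool([Node("a"), Node("b", healthy=False), Node("c")])
    print([pool.pick().code for _ in range(4)])
```

Filtering on each call keeps the sketch simple; a production pool would more likely cache the healthy set and update it from background health checks.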


---

Features

JWT Authentication — Bearer-token auth on every LLM request.
Admin Dashboard — Next.js 16 panel for visual management of users, nodes, models, groups and audit logs.
Model Mapping — Translate display names (gpt-oss:120b) to real names (gpt-oss:120b-cloud) via PostgreSQL with JSON-file caching.
Node-Scoped Model Mappings — Bind a mapping to a specific node so the same display name can resolve to different real names on different backends.
Node-Scoped Routing via Model Prefix — Force a request to a specific node by prefixing the model name: node:trmix:kimi-k2.6:latest routes directly to the node with code trmix.
Multi-Node Load Balancing — Round-robin, weighted and priority-based strategies across Ollama and vLLM nodes.
Antigravity Support — Google v1internal API proxy via OAuth 2.0. Access Gemini and Claude models through Google's infrastructure as a first-class provider alongside Ollama and vLLM.
AWS Bedrock Support — Native AWS Bedrock Converse API node type with automatic credential forwarding, image input and streaming support.
vLLM Support — Native vLLM (OpenAI-compatible) node type with automatic health checks, model discovery and Authorization: Bearer header forwarding.
Cursor AI Support — Connect Cursor API keys via the cursor node type. Automatically routes OpenAI-compatible requests to Cursor's backend (StandardAgents proxy or Cursor directly) with health checks and model discovery.
Public Tunnel — One-click Cloudflare (quick or named tunnel) and ngrok integration to expose your local API publicly without manual setup.
Model Groups — Group models into logical units with fallback chains. Requests dynamically resolve to the best member based on capability tags (vision, tools) and strategy.
Node Health Management — Automatic health checks, model discovery and availability tracking for Ollama, vLLM, Antigravity, Bedrock and Cursor nodes.
Per-Node Warmup Toggle — Enable or disable model warmup per node via admin UI.
Drag-and-Drop Node Priority — Reorder node cards in the admin panel to update fallback priority visually.
User-Level Access Control — Per-user model/node/node-model allowlists and rate limits (requests / tokens per day). Restrict a user to specific nodes or even specific models on specific nodes.
Token Usage Tracking — Background-batched activity logs with prompt / completion / total token breakdowns, plus request source identification (Cursor, Claude, OpenClaw, Grafana, etc.).
Tool Set Filtering — Restrict which tools a model is allowed to invoke via configurable tool sets.
Unified Models Page — Single tabbed view for both Ollama and vLLM models with live metadata (context length, capabilities, max model len) and one-click sync.
Sync Caps / Sync Meta — Pull capabilities from Ollama (/api/show) and max_model_len from vLLM (/v1/models) directly from the admin UI.
Context Length Config — Per-model context length stored in mappings (used by Cursor/Antigravity for usage bars).
Streaming — SSE-based streaming on /api/chat, /api/generate and /v1/chat/completions.
OpenAI Compatible — Drop-in /v1/chat/completions, /v1/completions, /v1/embeddings and /v1/models endpoints.
Full Ollama API — /api/generate, /api/chat, /api/embeddings, /api/tags, /api/show, /api/copy, /api/delete, /api/pull, /api/push, /api/create.
Grafana Assistant API — Full Grafana LLM Assistant compatibility endpoints (/grafana/assistant/*) for Grafana-native AI features.
DeepSeek Tool Call Parsing — Auto-detects and converts DeepSeek's raw XML tool call output (<tool_calls><invoke>, <CallMcpTool>, <tool_call name="...">) to OpenAI tool_calls format in streaming and non-streaming responses. Kimi/Moonshot <|tool_calls_section_begin|> format also supported.
Streaming-Aware Background Tasks — Health checks, model discovery and warmup defer when streams are active, preventing interruptions.
Node-Aware Model Warmup — Warmup requests target only models that exist on each node, eliminating 404 errors from stale model names.
Background Tasks — Redis-backed async queue for activity logging, node health checks, model discovery, model warmup and load cleanup.
Audit Logs — Every admin action is timestamped and queryable.
PostgreSQL + Alembic — Schema migrations run automatically on container startup.
Redis Cache — Hot-path caching for mappings, config and user usage data.
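The node-prefix routing and node-scoped mapping features above can be illustrated with a short sketch. It assumes only what the feature list states: a model name of the form `node:<code>:<model>` targets the node with that code, and a mapping bound to a node takes precedence over a global one. The function names, the in-memory `mappings` dictionary and the `gpt-oss:120b-local` real name are hypothetical; the project itself stores mappings in PostgreSQL.

```python
from typing import Dict, Optional, Tuple

NODE_PREFIX = "node:"

# Key: (node code or None for a global mapping, display name) -> real name.
Mappings = Dict[Tuple[Optional[str], str], str]


def split_node_prefix(model: str) -> Tuple[Optional[str], str]:
    """Split 'node:<code>:<model>' into (code, model).

    Names without the prefix are returned unchanged with code None.
    The model part may itself contain colons, e.g. 'kimi-k2.6:latest'.
    """
    if not model.startswith(NODE_PREFIX):
        return None, model
    parts = model.split(":", 2)
    if len(parts) < 3 or not parts[1] or not parts[2]:
        raise ValueError(f"malformed node prefix: {model!r}")
    return parts[1], parts[2]


def resolve_model(requested: str, mappings: Mappings) -> Tuple[Optional[str], str]:
    """Return (target node code, real model name) for a requested name."""
    node_code, display = split_node_prefix(requested)
    # A node-scoped mapping wins over a global one; unmapped names pass through.
    real = (
        mappings.get((node_code, display))
        or mappings.get((None, display))
        or display
    )
    return node_code, real


if __name__ == "__main__":
    mappings: Mappings = {
        (None, "gpt-oss:120b"): "gpt-oss:120b-cloud",
        ("trmix", "gpt-oss:120b"): "gpt-oss:120b-local",  # hypothetical
    }
    print(resolve_model("gpt-oss:120b", mappings))
    print(resolve_model("node:trmix:kimi-k2.6:latest", mappings))
```

Splitting with a maximum of two separators is what keeps tags such as `:latest` attached to the model name.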

---

Supported Features (Antigravity)

Chat completions, streaming, tool calls, image input
Thinking models (gemini-3-pro, claude-opus-4-6-thinking)
Automatic OAuth token refresh
Endpoint fallback (Sandbox → Daily → Prod)

---

Supported Features (AWS Bedrock)

Chat completions and streaming via Bedrock Converse API
Image input (base64)
Tool calls (via Converse API toolConfig)
Automatic AWS SigV4 signing

---

Setup (Antigravity)

1. Add the official Antigravity OAuth credentials to .env:

   GOOGLE_CLIENT_ID=your-google-client-id.apps.googleusercontent.com
   GOOGLE_CLIENT_SECRET=your-google-client-secret
   GOOGLE_REDIRECT_URI=http://localhost:3000/admin/oauth/callback

2. Restart the container: docker compose restart maestro
3. In the admin panel, create a node with Node Type = antigravity
4. Click Google Auth on the node detail page and sign in
5. Click Sync Models to fetch available models

Setup (AWS Bedrock)

1. Create a node with Node Type = bedrock
2. Set Base URL to your AWS region endpoint, e.g. https://bedrock-runtime.us-east-1.amazonaws.com
3. Add AWS credentials in the node detail page:
   - AWS_ACCESS_KEY_ID
   - AWS_SECRET_ACCESS_KEY
4. Click Sync Models to fetch available foundation models

Setup (Public Tunnel)

1. Go to Settings in the admin panel
2. Under Tunnel, select your provider:
   - Cloudflare (Quick): leave hostname empty; a random URL is generated automatically
   - Cloudflare (Named): fill hostname (e.g. api.example.com), api_token, account_id and optionally zone_id
   - ngrok: fill api_token with your ngrok auth token
3. Click Start
4. The public URL will appear in the public_url field once the tunnel is active
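The per-provider field rules in these tunnel steps can be captured in a small validation helper. This is a sketch under assumptions: the field names come from the steps above and the `"cloudflare"` provider value appears in the README's tunnel config example, but the `"ngrok"` provider value, the choice to omit empty fields and the function name are guesses rather than documented behaviour.

```python
from typing import Dict, Optional


def build_tunnel_config(
    provider: str,
    api_token: Optional[str] = None,
    account_id: Optional[str] = None,
    hostname: Optional[str] = None,
    zone_id: Optional[str] = None,
) -> Dict[str, str]:
    """Validate tunnel settings and return a JSON-ready dictionary."""
    if provider == "ngrok":  # provider value assumed, not confirmed
        if not api_token:
            raise ValueError("ngrok requires api_token")
        return {"provider": provider, "api_token": api_token}

    if provider != "cloudflare":
        raise ValueError(f"unknown provider: {provider!r}")

    if not hostname:
        # Cloudflare (Quick): no hostname, so a random URL is generated.
        return {"provider": provider}

    # Cloudflare (Named): hostname, api_token and account_id are required.
    if not api_token or not account_id:
        raise ValueError("named Cloudflare tunnels need api_token and account_id")
    config = {
        "provider": provider,
        "hostname": hostname,
        "api_token": api_token,
        "account_id": account_id,
    }
    if zone_id:
        config["zone_id"] = zone_id  # optional
    return config


if __name__ == "__main__":
    print(build_tunnel_config("cloudflare"))
    print(build_tunnel_config(
        "cloudflare",
        api_token="your-cloudflare-api-token",
        account_id="your-account-id",
        hostname="api.example.com",
    ))
```

Validating before sending keeps a half-filled named-tunnel form from silently falling back to a quick tunnel.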

Quick Setup (OpenClaw)

Add to ~/.openclaw/openclaw.json:

{
  "plugins": {
    "entries": {
      "brave": {
        "enabled": true,
        "config": {
          "webSearch": {
            "apiKey": "<maestro-jwt-token>"
          }
        }
      }
    }
  }
}

Use the patcher script to automatically redirect OpenClaw's Brave URL to your Maestro instance.

For the complete setup guide (including cron configuration, manual testing, and backend proxy options), see docs/OPENCLAW_BRAVE_SEARCH.md.

---

Quick Start

Requires Docker & Docker Compose.


Usage (Antigravity)

Force routing to Antigravity via node prefix:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "node:antigravity:gemini-3-flash", "messages": [{"role":"user","content":"Hello"}]}'

Or use model mappings to route transparently.

Usage (AWS Bedrock)

Force routing to Bedrock via node prefix:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "node:bedrock:us.anthropic.claude-3-5-sonnet-20241022-v2:0", "messages": [{"role":"user","content":"Hello"}]}'

Or use model mappings to route transparently.

2. Configure

cp .env.example .env

Configuration

Copy .env.example to .env and set:


Start tunnel (uses saved config)

curl -X POST http://localhost:8000/admin/tunnel/start \
  -H "Authorization: Bearer $ADMIN_TOKEN"

Add node (with optional code for prefix routing)

curl -X POST http://localhost:8000/admin/nodes \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "main",
    "base_url": "http://localhost:11434",
    "priority": 100,
    "code": "trmix",
    "node_type": "ollama"
  }'
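For scripted setups, the same node-creation call can be prepared from Python. The sketch below uses only the standard library and mirrors the URL, headers and JSON fields of the curl command above; the helper name `build_add_node_request` is hypothetical, and the request is built but not sent so that it can be inspected first.

```python
import json
import urllib.request
from typing import Any, Dict


def build_add_node_request(
    base_url: str, admin_token: str, node: Dict[str, Any]
) -> urllib.request.Request:
    """Prepare a POST /admin/nodes request without sending it."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/admin/nodes",
        data=json.dumps(node).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_add_node_request(
        "http://localhost:8000",
        "change-this-for-production",
        {
            "name": "main",
            "base_url": "http://localhost:11434",
            "priority": 100,
            "code": "trmix",  # optional code used for node-prefix routing
            "node_type": "ollama",
        },
    )
    print(request.get_method(), request.full_url)
    # To send it against a running instance:
    # with urllib.request.urlopen(request) as response:
    #     print(response.status)
```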

Get tunnel config

curl http://localhost:8000/admin/tunnel/config \
  -H "Authorization: Bearer $ADMIN_TOKEN"

Update tunnel config

curl -X PUT http://localhost:8000/admin/tunnel/config \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "cloudflare",
    "api_token": "your-cloudflare-api-token",
    "account_id": "your-account-id",
    "hostname": "api.example.com"
  }'


**Model Groups**


Get LLM config

curl http://localhost:8000/grafana/assistant/config \
  -H "Authorization: Bearer $TOKEN"

Update LLM config

curl -X POST http://localhost:8000/grafana/assistant/config \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-oss:120b", "temperature": 0.7}'

3. Launch full stack (PostgreSQL + Redis + FastAPI + Next.js)

docker compose -f docker-compose.dev.yml up --build -d

Admin Token (for /admin/* endpoints)

ADMIN_TOKEN=change-this-for-production
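Since the placeholder value must be replaced before production use, one way to produce a strong replacement is Python's standard `secrets` module. This is a minimal sketch: the helper names are hypothetical, and the 32-byte length is a reasonable default rather than a project requirement.

```python
import secrets


def generate_admin_token(nbytes: int = 32) -> str:
    """Return a URL-safe random token built from nbytes of entropy."""
    return secrets.token_urlsafe(nbytes)


def env_line(token: str) -> str:
    """Format the token as a line for the .env file."""
    return f"ADMIN_TOKEN={token}"


if __name__ == "__main__":
    # Paste the printed line into .env in place of the placeholder value.
    print(env_line(generate_admin_token()))
```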

API Endpoints


API Reference

For the complete API reference with all request/response examples, see docs/API.md.

LLM Endpoints

Method	Endpoint	Description
`POST`	`/api/chat`	Chat completions (Ollama format)
`POST`	`/api/generate`	Text generation
`POST`	`/api/embeddings`	Generate embeddings
`GET`	`/api/tags`	List available models
`POST`	`/api/show`	Show model info
`POST`	`/api/copy`	Copy model
`DELETE`	`/api/delete`	Delete model
`POST`	`/api/pull`	Pull model
`POST`	`/api/push`	Push model
`POST`	`/api/create`	Create model from Modelfile
`POST`	`/v1/completions`	OpenAI-compatible completions
`POST`	`/v1/embeddings`	OpenAI-compatible embeddings
`GET`	`/res/v1/web/search`	Brave Search-compatible web search

Example — Chat

curl -X POST http://localhost:8000/api/chat \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss:120b",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'

Example — Streaming Chat

curl -X POST http://localhost:8000/api/chat \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-oss:120b",
    "messages": [{"role": "user", "content": "Tell me a story"}],
    "stream": true
  }'

Admin Endpoints

Users


API only

docker compose -f docker-compose.dev.yml logs -f maestro

IDE Integration

Model Maestro is designed to be the backend for modern AI-powered IDEs and tools. See the full integration guide for step-by-step setup:

Claude Code — ANTHROPIC_BASE_URL override
Codex — Run ./scripts/codex-maestro for the Codex Desktop App (auto-configures Responses API streaming via Maestro; no manual env vars needed). For the VS Code extension or CLI, set OPENAI_BASE_URL + OPENAI_API_KEY. Use a Maestro model group as the default model for dynamic switching without restarting Codex.
OpenClaw — openclaw.json provider configuration
Cursor IDE — OpenAI API Key + custom base URL
Grafana Assistant — Grafana plugin with domain bypass script or reverse proxy

For complete configuration examples and troubleshooting, see docs/IDE_INTEGRATION.md.

Web Search Integration (Brave Search)

Model Maestro provides a Brave Search-compatible endpoint (/res/v1/web/search) that can be used by OpenClaw and other clients expecting Brave Search semantics.

Authentication: X-Subscription-Token or Authorization: Bearer <token>
Backend: Forwards to Ollama Web Search (or any custom search proxy)
Response Format: Full Brave Search API compatibility
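A client call to this endpoint could be prepared as in the sketch below. The path and the two authentication headers come from the description above. The `q` query parameter is the one the Brave Search API uses for the search term and is assumed to be accepted here because the endpoint is described as Brave-compatible; the function name is hypothetical.

```python
from typing import Dict, Tuple
from urllib.parse import urlencode


def build_search_request(
    base_url: str, token: str, query: str, use_bearer: bool = False
) -> Tuple[str, Dict[str, str]]:
    """Return the URL and headers for a GET /res/v1/web/search call."""
    url = f"{base_url.rstrip('/')}/res/v1/web/search?{urlencode({'q': query})}"
    if use_bearer:
        headers = {"Authorization": f"Bearer {token}"}
    else:
        # Header name used by Brave Search clients such as OpenClaw.
        headers = {"X-Subscription-Token": token}
    return url, headers


if __name__ == "__main__":
    url, headers = build_search_request(
        "http://localhost:8000", "<maestro-jwt-token>", "hello world"
    )
    print(url)
    print(headers)
```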

Troubleshooting

Restart the full stack

docker compose -f docker-compose.dev.yml down
docker compose -f docker-compose.dev.yml up --build -d

Run migrations manually

docker exec maestro alembic upgrade head

Re-run seeds

docker exec maestro python -m app.seeder --reset
docker exec maestro python -m app.seeder

Clear cache

docker exec maestro python scripts/clear_cache.py

Check PostgreSQL health

docker exec maestro-postgres pg_isready -U maestro_user -d maestro

Check Redis

docker exec maestro-redis redis-cli ping

View logs


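🇨🇳 Documentation mirror (AI-translated summary) 2026-05-26

Section-by-section summaries of the English README, produced by machine translation for quick reading. See the README analysis section above for the full original text.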

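📌 Introduction

Model Maestro is a config-driven unified LLM gateway. Through a single authenticated API it routes, load-balances and manages Ollama, OpenAI and other LLM providers. It includes user-based access control, model mapping, token usage tracking and health-checked node pooling, giving developers a secure, controllable model access layer.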

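⚡ Features

Model Maestro authenticates every LLM request with a JWT Bearer token. A Next.js 16 admin dashboard provides visual management of users, nodes, models, groups and audit logs. Model mapping translates display names to real backend model names, stored in PostgreSQL with JSON-file caching. A mapping can also be bound to a specific node, so the same display name can resolve to different real names on different backends.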

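🛠 Installation (Docker)

Deployment is Docker-based. For an Antigravity node, add the Antigravity OAuth credentials to `.env`, restart the container with `docker compose restart maestro`, create a node with Node Type `antigravity` in the admin panel, and complete Google Auth. For an AWS Bedrock node, set the Base URL to your AWS region endpoint, add `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` on the node detail page, and click Sync Models to fetch the model list. A built-in tunnel feature exposes the local API publicly through Cloudflare or ngrok.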

🚀 Usage

Antigravity nodes support chat completions, streaming, tool calls and image input, including thinking models such as gemini-3-pro and claude-opus-4-6-thinking. A node prefix such as `node:antigravity:...` or `node:bedrock:...` forces a request to a specific node; alternatively, model mappings route requests transparently. Antigravity nodes also fall back across endpoints in the order Sandbox → Daily → Prod.

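⚙️ Configuration (env)

Configuration lives in the `.env` file. Before deploying, copy `.env.example` to `.env` and adjust the values for your environment. For public access, go to Settings → Tunnel in the admin panel and configure Cloudflare (Quick for a random URL, Named for a custom hostname) or ngrok. To start the tunnel through the API, send a POST request to `/admin/tunnel/start`. Change `ADMIN_TOKEN` before running in production to protect the admin endpoints.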

🔌 API

The full stack consists of PostgreSQL, Redis, FastAPI and Next.js. Admin endpoints under `/admin/*` authenticate with `ADMIN_TOKEN`. The API exposes OpenAI-compatible endpoints, including chat completions, and Bedrock nodes use the Bedrock Converse API with automatic AWS SigV4 signing. A Brave Search-compatible endpoint, `/res/v1/web/search`, lets OpenClaw and similar clients integrate web search.

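🔄 Workflows / modules

Model Maestro is designed as a backend for AI-powered IDEs and tools. It integrates with Claude Code (by overriding `ANTHROPIC_BASE_URL`), Codex and other development tools. Its Brave Search-compatible endpoint forwards to Ollama Web Search or a custom search proxy, so AI Agent workflows can use it as a standard search API.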

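❓ FAQ summary

If something goes wrong, restart the full stack with `docker compose -f docker-compose.dev.yml down` followed by `docker compose -f docker-compose.dev.yml up --build -d`. If the database schema looks wrong, run migrations manually with `docker exec maestro alembic upgrade head`. To reset data, run `docker exec maestro python -m app.seeder --reset` and then `docker exec maestro python -m app.seeder`. In production, make sure `ADMIN_TOKEN` has been changed to a strong random string.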

🎯 aiskill88 AI review · Grade A · 2026-05-25

A unified LLM gateway that supports multiple AI services; the reviewer rates the code quality as high.

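📚 Practical guide (long-tail questions)

Who it is for

Developers who use the Cursor editor and want more efficient AI-assisted coding
Teams building enterprise knowledge bases or RAG retrieval applications
Cross-border businesses and multilingual content operations teams

Best practices

For production, prefer Docker Compose to isolate dependencies, and mount volumes to persist data
For local deployment, prefer GGUF-quantized models to save VRAM while keeping responses fast
Keep Cursor rules under 80 lines; longer rules noticeably increase context cost

Common mistakes

Committing API keys to the git repository (use .env and add it to .gitignore)
Containers cannot reach the host's localhost; use host.docker.internal instead
Running out of VRAM (OOM): reduce the context length or switch to a smaller quantized model first
Python dependency conflicts: isolate environments with venv or uv

Deployment options

Docker: start the full stack with docker compose -f docker-compose.dev.yml up --build -d
CLI: install with npm install -g or pip install and call from the command line (not documented in the README)
Local deployment: from 8 GB of memory on CPU; 16 GB+ of VRAM recommended on GPU
Cloud hosting: can be hosted on PaaS platforms such as Vercel / Railway / Fly.io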

Original name	`model-maestro`
Original description	Open-source AI workflow: Unified LLM Gateway that proxies multiple providers (Ollama, OpenAI-compatible). ⭐6 · Python
Topics	`ai`, `python`, `workflow`
GitHub	https://github.com/skymoonsun/model-maestro
Language	Python
