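AI Skill Hub recommends total-agent-memory, an MCP tool with an AI composite score of 7.8 that performs steadily among comparable tools. If you are looking for a reliable MCP tool, it is worth a closer look.
An open-source MCP tool that provides persistent memory for Claude Code and Codex CLI. It automatically extracts a knowledge graph and retains context across multi-turn conversations, which suits AI application developers and researchers who need long-term memory and knowledge accumulation.
total-agent-memory is an AI tool extension that follows the MCP (Model Context Protocol) standard. Through MCP, mainstream AI clients such as Claude and Cursor can directly access and operate external tools, data sources, and services. File operations, database queries, and API calls can all be triggered in natural language from within an AI conversation.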
# Option 1: one-step install via the Claude Code CLI
claude skill install https://github.com/vbcherepanov/total-agent-memory
# Option 2: configure claude_desktop_config.json manually
{
"mcpServers": {
"total-agent-memory-mcp--": {
"command": "npx",
"args": ["-y", "total-agent-memory"]
}
}
}
# Config file locations
# macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
# Windows: %APPDATA%/Claude/claude_desktop_config.json
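The manual option above amounts to adding one entry under "mcpServers" without disturbing entries that are already there. A minimal Python sketch of that merge follows. The function names and the "total-agent-memory" key are hypothetical choices for illustration; the command and args are copied from the config shown above, and the sketch assumes the file is plain JSON.

```python
import json
from pathlib import Path


def merge_mcp_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of `config` with one MCP server entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, name: str, command: str, args: list[str]) -> None:
    """Read the JSON config (if present), merge the entry, and write it back."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = merge_mcp_server(existing, name, command, args)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Demonstrate on an in-memory config so nothing on disk is touched.
    before = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
    after = merge_mcp_server(before, "total-agent-memory", "npx", ["-y", "total-agent-memory"])
    print(json.dumps(after, indent=2))
```

To apply it for real, call update_config_file with the path for your platform from the list above, then restart the client.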
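# After installation, use it directly in a Claude conversation
# Example:
# User: Please use the total-agent-memory MCP tool to carry out the following task...
# Claude: [automatically calls the total-agent-memory MCP tool to handle the request]
# List the available tools
# In Claude, type: "List all available MCP tools"
// claude_desktop_config.json example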
{
"mcpServers": {
"total-agent-memory_mcp__": {
"command": "npx",
"args": ["-y", "total-agent-memory"],
"env": {
// "API_KEY": "your-api-key-here"
}
}
}
}
// Restart Claude Desktop after saving for the change to take effect
The only memory layer that learns how you work — not just what you said. Persistent, local memory for AI coding agents: Claude Code, Codex CLI, Cursor, any MCP client. Temporal knowledge graph · procedural memory · AST codebase ingest · cross-project analogy · 3D WebGL visualization.
Why this, not mem0 / Letta / Zep / Supermemory / Cognee? → docs/vs-competitors.md
---
| Capability | Tool | One-liner |
|---|---|---|
| 🧠 **Procedural memory** | workflow_predict / workflow_track | "How did I solve this last time?" — predicts steps with confidence |
| 🔗 **Cross-project analogy** | analogize | "Was there something like this in another repo?" — Jaccard + Dempster-Shafer |
| ⚠️ **Pre-edit risk warnings** | file_context | Surfaces past errors / hot spots on the file you're about to edit |
| 🛡 **Self-improving rules** | learn_error + self_rules_context | Bash failures → patterns → auto-consolidated behavioral rules at N≥3 |
| 🕰 **Temporal facts** | kg_add_fact / kg_at | Append-only KG with valid_from/valid_to — query what was true at any point |
| 🎯 **Task workflow phases** | classify_task / phase_transition | Automatic L1-L4 complexity classification, state machine across van/plan/creative/build/reflect/archive |
| 🧩 **Structured decisions** | save_decision | Options + criteria matrix + rationale + discarded → searchable decision records with per-criterion embeddings |
| 💸 **Token-efficient retrieval** | memory_recall(mode="index") + memory_get | 3-layer workflow: compact IDs → timeline → batched full fetch. ~83% token saving on typical queries |
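The "Temporal facts" row describes an append-only graph where each fact carries valid_from/valid_to, so you can ask what was true at a given moment. The sketch below illustrates that idea in plain Python. It is not the project's storage code: the Fact and TemporalFacts names are hypothetical, and closing the previous open fact when a new one arrives is an assumption made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means "still valid"


class TemporalFacts:
    """Toy fact log: rows are never deleted, only closed by setting valid_to."""

    def __init__(self) -> None:
        self._facts: list[Fact] = []

    def add_fact(self, subject: str, predicate: str, obj: str, at: datetime) -> None:
        # Assumption: a new fact closes any open fact for the same (subject, predicate).
        for fact in self._facts:
            if (fact.subject, fact.predicate) == (subject, predicate) and fact.valid_to is None:
                fact.valid_to = at
        self._facts.append(Fact(subject, predicate, obj, valid_from=at))

    def at(self, when: datetime) -> list[Fact]:
        # A fact holds at `when` if valid_from <= when < valid_to, or if it is still open.
        return [
            f for f in self._facts
            if f.valid_from <= when and (f.valid_to is None or when < f.valid_to)
        ]


if __name__ == "__main__":
    kg = TemporalFacts()
    kg.add_fact("my-api", "uses_auth", "sessions", datetime(2025, 1, 1))
    kg.add_fact("my-api", "uses_auth", "jwt", datetime(2025, 6, 1))
    print([f.obj for f in kg.at(datetime(2025, 3, 1))])  # state as of March
    print([f.obj for f in kg.at(datetime(2025, 7, 1))])  # state as of July
```

Keeping the closed rows is what makes point-in-time queries possible; overwriting the old value would lose the March answer.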
- lookup-memory / tam-lookup / ctm-lookup (legacy) CLIs are now installed alongside the total-agent-memory MCP server (registered as [project.scripts], so ./install.sh and ./update.sh put them on PATH automatically). Sub-agent prompts that reference the legacy ~/claude-memory-server/ollama/lookup_memory.sh script keep working; new prompts should prefer the package-installed name.
- Embedding backend: fastembed by default. Switch via V9_EMBED_BACKEND=openai-3-large (set MEMORY_EMBED_API_KEY). Re-embedding costs ~$0.10 per 5k rows, with an expected R@5 lift on conversational data.
- Reranker: ce-marco by default. V9_RERANKER_BACKEND=bge-v2-m3 (or off) switches at runtime.
- --subject-aware in benchmarks/locomo_bench_llm.py. Future: surface as MCP tool flag.
- Re-embed (only if switching embedding model, otherwise skip):
python -m scripts.reembed --backend openai-3-large --confirm
- Old bash sub-agent prompts that hardcode ~/claude-memory-server/ollama/lookup_memory.sh "query" will keep working. To use the new package install, replace the call with lookup-memory "query".
Then restart Claude Code: /mcp restart memory.
1. Cloud providers (only if you want to replace/augment Ollama):
export MEMORY_LLM_PROVIDER=openai # or "anthropic"
export MEMORY_LLM_API_KEY=sk-...
export MEMORY_LLM_MODEL=gpt-4o-mini # or "claude-haiku-4-5"
See Cloud providers for OpenRouter / per-phase routing / Cohere examples.
2. Install additional hooks (for UserPromptSubmit capture + citation):
./install.sh --ide claude-code # re-run installer; it now registers the user-prompt-submit.sh hook
The hook is additive: existing hooks keep working.
3. activeContext.md Obsidian integration (if you want markdown projection):
export MEMORY_ACTIVECONTEXT_VAULT=~/Documents/project/Projects # default
Two manual paths. Same 60+ tools, same dashboard, different deployment shapes.
All installers preserve ~/.tam/memory.db (legacy installs: ~/.claude-memory/memory.db) and your config files; only services + hook registrations are removed.
./install.sh --uninstall # macOS/Linux/WSL2 — removes LaunchAgents OR systemd units
.\install.ps1 -Uninstall # Windows — unregisters Scheduled Tasks + cleans settings.json
git clone https://github.com/vbcherepanov/total-agent-memory.git
cd total-agent-memory
bash install-docker.sh --with-compose
Brings up 5 services:
| Service | Role | Exposed |
|---|---|---|
| mcp | MCP server (HTTP transport) | 127.0.0.1:3737/mcp |
| dashboard | Web UI | 127.0.0.1:37737 |
| ollama | Local LLM runtime | 127.0.0.1:11434 |
| reflection | File-watch queue drainer | internal |
| scheduler | Ofelia cron (backfill + update check) | internal |
First run pulls qwen2.5-coder:7b (~4.7 GB) + nomic-embed-text (~275 MB) — 5–10 min cold start.
GPU note: Docker Desktop on macOS doesn't forward Metal. Native install is faster on Mac. On Linux with NVIDIA Container Toolkit, uncomment the deploy.resources.reservations.devices block in docker-compose.yml.
Without Ollama: works fully — raw content is saved, retrieval via BM25 + FastEmbed dense embeddings.
With Ollama: you also get LLM-generated summaries, keywords, question-forms, compressed representations, and deep enrichment (entities, intent, topics).
brew install ollama # or: curl -fsSL https://ollama.com/install.sh | sh
ollama serve &
ollama pull qwen2.5-coder:7b # default — best quality/speed on M-series
ollama pull nomic-embed-text # optional, alternative embedder
| Channel | Command | What it does |
|---|---|---|
| **npx** (Node) | npx -y total-agent-memory connect claude-code | Zero-install. Bootstraps a Python venv in ~/.tam/.venv via uv (or python3 fallback), pulls the PyPI server, wires the MCP entry into your IDE. Replace claude-code with codex / cursor / cline / continue / aider / windsurf / gemini-cli / opencode. |
| **uvx** (Python via uv) | uvx total-agent-memory | One-off run with no install. Best for trying without commitment. |
| **pipx** (Python isolated) | pipx install total-agent-memory | Installs the total-agent-memory, tam, tam-lookup, lookup-memory binaries on PATH in an isolated venv. |
| **brew** (macOS / Linuxbrew) | brew install vbcherepanov/tap/total-memory | Bottle-style install with tam and legacy claude-total-memory symlinks. |
| **Docker** (multi-arch) | docker run -p 37737:37737 -v ~/.tam:/data ghcr.io/vbcherepanov/total-agent-memory:12.0.0 | Containerized (linux/amd64 + linux/arm64). Dashboard on :37737. |
| **Manual clone** | git clone https://github.com/vbcherepanov/total-agent-memory ~/total-agent-memory && cd ~/total-agent-memory && ./install.sh --ide claude-code | Full control. Lets you hack on the server, run benchmarks, and pick which background services to enable. Detailed walkthrough below. |
All six channels land at the same MCP server. The npx and ./install.sh paths additionally configure IDE-specific MCP entries and hooks. Other channels start the server bare — you wire the IDE afterwards (see docs/installation.md).
Upgrade from v11.x? Whatever channel you pick will auto-migrate ~/.claude-memory/ → ~/.tam/ on first run and keep a symlink for backward compat. No manual data move required.
---
v11 default is MEMORY_MODE=fast. No LLM, no Ollama, no network in the save/search/recall hot path. To restore v10.5 synchronous-LLM behaviour, set export MEMORY_MODE=deep. Mode switching: LAUNCH.md § Tuning.
Once installed, in any Claude Code / Codex CLI / Cursor session:
1. Resume where you left off (auto on session start, but you can also invoke)
session_init(project="my-api")
→ {summary: "yesterday: migrated auth middleware to JWT",
next_steps: ["update OpenAPI spec", "notify frontend team"],
pitfalls: ["don't revert migration 0042 — dev DB already migrated"]}
2. Save a decision (agent does this automatically after hooks are registered)
memory_save(
type="decision",
content="Chose pgvector over ChromaDB for multi-tenant RLS",
context="WHY: single Postgres instance, per-tenant row-level security",
project="my-api",
tags=["database", "multi-tenant"],
)
3. Recall across sessions / projects
memory_recall(query="vector database choice", project="my-api", limit=5)
→ RRF-fused results from 6 retrieval tiers
4. Predict approach before starting a task
workflow_predict(task_description="migrate auth middleware to JWT-only")
→ {confidence: 0.82, predicted_steps: [...], similar_past: [...]}
5. Check a file's risk before editing (auto via hook, also manual)
file_context(path="/Users/me/my-api/src/auth/middleware.go")
→ {risk_score: 0.71, warnings: ["last 3 edits caused test failures in ..."], hot_spots: [...]}
6. Get full stats
memory_stats()
→ {sessions: 515, knowledge: {active: 1859, ...}, storage_mb: 119.5, ...}
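Step 3 above mentions RRF-fused results from several retrieval tiers. As a rough illustration of reciprocal rank fusion in general, here is a minimal Python sketch. It is not the project's implementation: the tier names in the demo are hypothetical, and k=60 is simply the constant commonly used for RRF, assumed here.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) per item."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical per-tier rankings (best hit first in each list).
    bm25_tier = ["a", "b", "c"]
    dense_tier = ["b", "a", "d"]
    graph_tier = ["b", "c"]
    for doc_id, score in rrf_fuse([bm25_tier, dense_tier, graph_tier]):
        print(f"{doc_id}  {score:.4f}")
```

Because only ranks are used, the tiers do not need comparable score scales, which is the usual reason to pick RRF for mixing lexical and dense retrieval.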
---
You: "remember we picked pgvector over ChromaDB because of multi-tenant RLS"
Claude: ✓ memory_save(type=decision, content="Chose pgvector over ChromaDB",
context="WHY: single Postgres, per-tenant RLS")
[3 days later, different session, possibly different project directory:]
You: "why did we pick pgvector again?"
Claude: ✓ memory_recall(query="vector database choice")
→ "Chose pgvector over ChromaDB for multi-tenant RLS. Single DB
instance, row-level security per tenant."
It's not just retrieval. It's procedural too:
You: "migrate auth middleware to JWT-only session tokens"
Claude: ✓ workflow_predict(task_description="migrate auth middleware...")
→ confidence 0.82, predicted steps:
1. read src/auth/middleware.go + tests
2. update session fixtures in tests/
3. run migration 0042
4. regenerate OpenAPI spec
similar past: wf#118 (success), wf#93 (success)
---
.venv/bin/python src/tools/merge_duplicate_nodes.py --dry-run
.venv/bin/python src/tools/merge_duplicate_nodes.py --apply --add-unique
Verified on a real production DB (8304 nodes): 102 duplicates merged, 1472 stale edges cleaned, UNIQUE constraint installed.
Bug #2 — model never calls memory_save on its own. Sonnet/Haiku skip the priority-10 save rule when SessionStart context fades. v11.1 adds in-session nudges: a counter in ~/.claude-memory/state/ tracks writes-vs-saves per session, and hooks/post-tool-use.{sh,ps1} emits a stdout line that Claude reads as system context on the next turn. Soft nudge at 3 edits with 0 saves, hard at 7, and a MEMORY_FINAL_WARNING on session stop. A new priority-10 rule instructs the model to treat MEMORY_NUDGE as an immediate command.
Tunables: MEMORY_NUDGE_DISABLE=1 to silence; MEMORY_NUDGE_SOFT / _HARD / _STEP to retune (defaults 3 / 7 / 3).
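The thresholds described above can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the sketch assumes any save in the session clears the nudge, it assumes the tunables expand to MEMORY_NUDGE_SOFT and MEMORY_NUDGE_HARD, and it does not model MEMORY_NUDGE_STEP because its exact semantics are not described here.

```python
from typing import Mapping, Optional


def thresholds_from_env(env: Mapping[str, str]) -> Optional[tuple[int, int]]:
    """Return (soft, hard) thresholds, or None when nudges are disabled."""
    if env.get("MEMORY_NUDGE_DISABLE") == "1":
        return None
    soft = int(env.get("MEMORY_NUDGE_SOFT", "3"))
    hard = int(env.get("MEMORY_NUDGE_HARD", "7"))
    return soft, hard


def nudge_level(edits: int, saves: int, soft: int = 3, hard: int = 7) -> Optional[str]:
    """Return "soft", "hard", or None for the current session counters."""
    if saves > 0:
        return None  # assumption: any save in the session clears the nudge
    if edits >= hard:
        return "hard"
    if edits >= soft:
        return "soft"
    return None


if __name__ == "__main__":
    limits = thresholds_from_env({})  # defaults: soft 3, hard 7
    if limits is not None:
        soft, hard = limits
        for edits in (1, 3, 7):
            print(edits, nudge_level(edits, saves=0, soft=soft, hard=hard))
```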
Test coverage: +24 graph tests, +12 nudge tests. Full details in CHANGELOG.md.
---
Use OpenAI, Anthropic, or any OpenAI-compat endpoint (OpenRouter, Together, Groq, DeepSeek, LM Studio, llama.cpp) instead of local Ollama.
OpenAI:
export MEMORY_LLM_PROVIDER=openai
export MEMORY_LLM_API_KEY=sk-...
export MEMORY_LLM_MODEL=gpt-4o-mini
Anthropic:
export MEMORY_LLM_PROVIDER=anthropic
export MEMORY_LLM_API_KEY=sk-ant-...
export MEMORY_LLM_MODEL=claude-haiku-4-5
OpenRouter (100+ models via one endpoint):
export MEMORY_LLM_PROVIDER=openai
export MEMORY_LLM_API_BASE=https://openrouter.ai/api/v1
export MEMORY_LLM_API_KEY=sk-or-...
export MEMORY_LLM_MODEL=anthropic/claude-haiku-4.5
Per-phase routing (cheap model for bulk, quality for compression):
export MEMORY_TRIPLE_PROVIDER=openai
export MEMORY_TRIPLE_MODEL=gpt-4o-mini
export MEMORY_ENRICH_PROVIDER=anthropic
export MEMORY_ENRICH_MODEL=claude-haiku-4-5
Embeddings (dimension must match existing DB or re-embed required):
export MEMORY_EMBED_PROVIDER=openai
export MEMORY_EMBED_MODEL=text-embedding-3-small # 1536d
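The dimension constraint can be checked before switching models. The Python sketch below is a hypothetical helper, not part of the project: it only compares the dimension already stored in the database (which you would need to look up yourself) with the dimension of the new model. The 1536 figure comes from the note above; the 384 in the demo is a made-up stored dimension.

```python
from typing import Mapping

# Dimensions we know from the text above; extend with models you actually use.
KNOWN_DIMS: Mapping[str, int] = {"text-embedding-3-small": 1536}


def check_embedding_switch(stored_dim: int, model: str,
                           known_dims: Mapping[str, int] = KNOWN_DIMS) -> str:
    """Classify a planned model switch by comparing vector dimensions only."""
    new_dim = known_dims.get(model)
    if new_dim is None:
        return "unknown model dimension"
    if new_dim == stored_dim:
        return "dimensions match"
    return "re-embed required"


if __name__ == "__main__":
    # 384 is a hypothetical stored dimension used purely for the demo.
    print(check_embedding_switch(384, "text-embedding-3-small"))
    print(check_embedding_switch(1536, "text-embedding-3-small"))
```

Matching dimensions is only the constraint named above; vectors from two different models are still not comparable, so switching models calls for a re-embed either way.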
Environment variables (all optional):
export MEMORY_ENRICH_TICK_SEC=0.1
export MEMORY_ENRICH_BATCH=5
export MEMORY_ENRICH_MAX_ATTEMPTS=3
export MEMORY_ENRICH_STALE_AFTER_SEC=60
Restart the MCP server. A background daemon thread now consumes enrichment_queue; you can watch it on the dashboard panel ⚡ v10.1 enrichment worker.
For Node.js / browser / any TS project that isn't an MCP-native agent:
npm i @vbch/total-agent-memory-client
import { connectStdio } from "@vbch/total-agent-memory-client";
const memory = await connectStdio();
await memory.save({
type: "decision",
content: "Picked pgvector over ChromaDB for multi-tenant RLS",
project: "my-api",
});
const hits = await memory.recallFlat({
query: "vector database choice",
project: "my-api",
limit: 5,
});
Also ships LangChain adapter example, procedural-memory integration, and HTTP transport (for team / serverless setups).
Package repo: github.com/vbcherepanov/total-agent-memory-client
---
When you only know the topic but not which records matter, use progressive disclosure:
1. memory_recall(query="auth refactor", mode="index", limit=20) → ~2 KB of {id, title, score, type, project, created_at} per hit. No content, no cognitive expansion.
2. memory_recall(query="auth refactor", mode="timeline", limit=5, neighbors=2) → top-K hits padded with ±neighbours from the same session, sorted chronologically.
3. memory_get(ids=[3622, 3606]) → full content for ONLY the IDs you chose (max 50 per call, detail="summary" truncates to 150 chars).

Typical saving: 80-90% fewer tokens vs memory_recall(detail="full", limit=20) when you end up using 2-3 of the 20 hits.
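To make the progressive-disclosure idea concrete, here is a toy Python sketch of the first and last layers: a compact index pass, then a full fetch for only the chosen IDs. Everything in it is a stand-in. ToyMemory, Record, and the substring matching are hypothetical and unrelated to the server's real ranking or storage; the timeline layer is left out, and only the 50-ID cap is taken from the text above.

```python
from dataclasses import dataclass


@dataclass
class Record:
    id: int
    title: str
    content: str


class ToyMemory:
    """In-memory stand-in for the server; not the real storage or ranking."""

    def __init__(self, records: list[Record]) -> None:
        self._by_id = {r.id: r for r in records}

    def recall_index(self, query: str, limit: int = 20) -> list[dict]:
        # Layer 1: compact hits only (id + title), no content.
        words = query.lower().split()
        hits = [
            r for r in self._by_id.values()
            if any(w in (r.title + " " + r.content).lower() for w in words)
        ]
        return [{"id": r.id, "title": r.title} for r in hits[:limit]]

    def get(self, ids: list[int], max_ids: int = 50) -> list[Record]:
        # Final layer: full content, but only for the IDs the caller picked.
        if len(ids) > max_ids:
            raise ValueError(f"at most {max_ids} ids per call")
        return [self._by_id[i] for i in ids if i in self._by_id]


if __name__ == "__main__":
    store = ToyMemory([
        Record(1, "auth refactor plan", "long notes " * 200),
        Record(2, "auth middleware bug", "stack trace " * 200),
        Record(3, "css cleanup", "unrelated " * 200),
    ])
    index = store.recall_index("auth refactor")
    chosen = store.get([index[0]["id"]])
    everything = store.get([hit["id"] for hit in index])
    print("index hits:", index)
    print("chars fetched (chosen):", sum(len(r.content) for r in chosen))
    print("chars fetched (all hits):", sum(len(r.content) for r in everything))
```

The saving comes from reading full content only for the few records that survive the index pass.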
<details> <summary><b>Core memory (15)</b></summary>
memory_recall · memory_get · memory_save · memory_update · memory_delete · memory_search_by_tag · memory_history · memory_timeline · memory_stats · memory_consolidate · memory_export · memory_forget · memory_relate · memory_extract_session · memory_observe
</details>
<details> <summary><b>Knowledge graph (6)</b></summary>
memory_graph · memory_graph_index · memory_graph_stats · memory_concepts · memory_associate · memory_context_build
</details>
<details> <summary><b>Episodic memory & skills (4)</b></summary>
memory_episode_save · memory_episode_recall · memory_skill_get · memory_skill_update
</details>
<details> <summary><b>Reflection & self-improvement (8)</b></summary>
memory_reflect_now · memory_self_assess · self_error_log · self_insight · self_patterns · self_reflect · self_rules · self_rules_context
</details>
<details> <summary><b>Temporal knowledge graph (4)</b></summary>
kg_add_fact · kg_invalidate_fact · kg_at · kg_timeline
</details>
<details> <summary><b>Procedural memory (3)</b></summary>
workflow_learn · workflow_predict · workflow_track
</details>
<details> <summary><b>Pre-flight guards & automation (8)</b></summary>
file_context (pre-edit risk scoring) · learn_error (auto-consolidating error capture) · session_init / session_end · ingest_codebase (AST, 9 languages) · analogize (cross-project analogy) · benchmark (regression gate)
</details>
Full JSON schemas: python -m total_agent_memory.cli tools --json or open the dashboard at localhost:37737/tools.
---
Public LongMemEval benchmark (xiaowu0162/longmemeval-cleaned, 470 questions, the dataset everyone publishes against):
R@5 (recall_any) on public LongMemEval:
| System | R@5 | Notes |
|---|---|---|
| **total-agent-memory v7.0** | **96.2%** | local, 38.8 ms, MIT |
| Mastra "Observational" | 95.0% | cloud |
| Supermemory | 85.4% | cloud, $0.01/1k tok |
Reproducible: evals/longmemeval-2026-04-17.json · Runner: benchmarks/longmemeval_bench.py
We're not replacing chatbot memory — we're occupying the coding-agent + MCP + local niche.
| | mem0 | Letta | Zep | Supermemory | Cognee | LangMem | **total-agent-memory** |
|---|---|---|---|---|---|---|---|
| Funding / status | $24M YC | $10M seed | $12M seed | $2.6M seed | $7.5M seed | in LangChain | self-funded OSS |
| Runs 100% local | 🟡 | ✅ | 🟡 | ❌ | 🟡 | 🟡 | **✅** |
| MCP-native | via SDK | ❌ | 🟡 Graphiti | 🟡 | ❌ | ❌ | **✅ 60+ tools** |
| Knowledge graph | 🔒 $249/mo | ❌ | ✅ | ✅ | ✅ | ❌ | **✅** |
| **Temporal facts** (kg_at) | ❌ | ❌ | ✅ | ❌ | 🟡 | ❌ | **✅** |
| **Procedural memory** | ❌ | ❌ | ❌ | ❌ | ❌ | 🟡 | **✅ workflow_predict** |
| **Cross-project analogy** | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | **✅ analogize** |
| **Self-improving rules** | ❌ | ❌ | ❌ | ❌ | 🟡 | ❌ | **✅ learn_error** |
| **AST codebase ingest** | ❌ | ❌ | ❌ | ❌ | 🟡 | ❌ | **✅ tree-sitter 9 lang** |
| **Pre-edit risk warnings** | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | **✅ file_context** |
| 3D WebGL graph viewer | ❌ | ❌ | 🟡 | ✅ | ❌ | ❌ | **✅** |
| Price for graph features | $249/mo | free | cloud | usage | free | free | **free** |
Full side-by-side with pricing, latency, accuracy, "when to pick each" → docs/vs-competitors.md.
---
| Question type | Count | Our R@5 |
|---|---|---|
| knowledge-update | 72 | **100.0%** |
| single-session-user | 64 | **100.0%** |
| multi-session | 121 | 96.7% |
| single-session-assistant | 56 | 96.4% |
| temporal-reasoning | 127 | 95.3% ← bi-temporal KG pays off |
| single-session-preference | 30 | 80.0% ← weakest spot |
| **TOTAL** | **470** | **96.2%** |
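The TOTAL row can be cross-checked as a count-weighted average of the per-type scores. The short Python sketch below does that arithmetic using only the numbers from the table; the function name is a hypothetical choice for illustration.

```python
# (question type, count, R@5 in percent), copied from the table above.
ROWS = [
    ("knowledge-update", 72, 100.0),
    ("single-session-user", 64, 100.0),
    ("multi-session", 121, 96.7),
    ("single-session-assistant", 56, 96.4),
    ("temporal-reasoning", 127, 95.3),
    ("single-session-preference", 30, 80.0),
]


def overall_recall(rows: list[tuple[str, int, float]]) -> tuple[int, float]:
    """Return (total questions, count-weighted mean R@5 in percent)."""
    total = sum(count for _, count, _ in rows)
    weighted = sum(count * score for _, count, score in rows) / total
    return total, weighted


if __name__ == "__main__":
    total, weighted = overall_recall(ROWS)
    print(total, round(weighted, 1))  # → 470 96.2
```

The weighting also shows why the weakest category moves the total so little: single-session-preference accounts for only 30 of the 470 questions.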
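Overview: a persistent, local memory layer that learns how you work, not just what you said, for AI coding agents: Claude Code, Codex CLI, Cursor, and any MCP client. It includes a temporal knowledge graph, procedural memory, AST codebase ingest, cross-project analogy, and 3D WebGL visualization.
Features: the project offers eight distinctive capabilities, including procedural memory, cross-project analogy, and pre-edit risk warnings, to help developers manage and maintain their code.
Dependencies and system requirements: the project needs Python and an MCP client, with Docker as an optional deployment path; see the documentation for specifics.
Installation: several installation methods are supported, including pip-based installs, Docker, and installing from source; see the documentation for the steps.
Usage: the project can be used through its CLI, its API, and MCP; see the documentation for details.
Configuration: configuration is done through environment variables, MCP configuration, and key parameters; see the documentation for details.
API / interfaces: the project provides an MCP tool reference and a TypeScript SDK, among other interfaces; see the documentation for details.
Workflows / modules: the project includes the token-efficient 3-layer workflow and progressive disclosure, among others; see the documentation for details.
FAQ summary: the documentation covers common questions on topics such as knowledge updates, single-session users, and multi-session use.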
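An innovative MCP memory solution that fills the memory gap in Claude tooling. The automated knowledge-graph extraction stands out, but the ecosystem is still immature, so it best suits early adopters.
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and is not a legal endorsement of the tool's functionality or quality.
We recommend validating the tool thoroughly in a sandbox or test environment and completing the necessary security assessment before deploying it to production.
✅ MIT License: one of the most permissive open-source licenses. You may use, modify, and distribute the software commercially, provided the copyright notice is retained.
Overall, total-agent-memory is a good-quality MCP tool that is reasonably competitive among comparable tools. AI Skill Hub will keep tracking its updates; consider bookmarking it and adopting it when it fits your own scenario.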
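| Field | Value |
|---|---|
| Original name | total-agent-memory |
| Original description | Open-source MCP tool: Persistent memory for Claude Code & Codex CLI. Auto-extracted knowledge graph, m. ⭐37 · Python |
| Topics | memory management · knowledge graph · Claude integration · MCP protocol · persistent storage |
| GitHub | https://github.com/vbcherepanov/total-agent-memory |
| License | MIT |
| Language | Python |
Listed: 2026-05-19 · Updated: 2026-05-19 · License: MIT · AI Skill Hub makes no legal endorsement of the accuracy of third-party content.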