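Recommended by AI Skill Hub: Recursive Language Model Runtime is a solid AI tool. Its AI composite score is 7.5, a steady result among comparable tools. If you are looking for a dependable AI tooling option, it is worth a closer look.
Recursive Language Model Runtime is an open-source tool written in Python, tagged ai, anthropic, and gpt. As a GitHub open-source project it has community support and ongoing releases, its code is fully open to audit, and it can be deployed locally to keep data private. It is suitable both for personal use and for integration into enterprise workflows.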
# Option 1: install with pip (recommended)
pip install pyrlm-runtime
# Option 2: install in a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install pyrlm-runtime
# Option 3: install from source (for the latest features)
git clone https://github.com/apenab/pyrlm-runtime
cd pyrlm-runtime
pip install -e .
# Verify the installation
python -c "import pyrlm_runtime; print('Installation successful')"
# Command-line usage
pyrlm-runtime --help
# Basic usage
pyrlm-runtime input_file -o output_file
# Calling from Python code
import pyrlm_runtime
# Example
result = pyrlm_runtime.process("input")
print(result)
# Example pyrlm-runtime configuration file (config.yml)
app:
  name: "pyrlm-runtime"
  debug: false
  log_level: "INFO"
# Specify the configuration file at run time
pyrlm-runtime --config config.yml
# Or configure through environment variables
export PYRLM_RUNTIME_API_KEY="your-key"
export PYRLM_RUNTIME_OUTPUT_DIR="./output"
Minimal Python runtime for Recursive Language Models (RLMs), inspired by the MIT CSAIL paper "Recursive Language Models".
RLMs solve the long-context problem: instead of sending huge contexts directly to an LLM (which truncates or degrades), the context lives as environment state in a Python REPL. The LLM writes code to inspect, search, and chunk the data, making recursive subcalls to smaller models when needed. Result: handle arbitrarily large contexts with constant token usage per step.
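The idea of "context as environment state" can be sketched with the standard library alone. The sketch below is not the pyrlm-runtime implementation: `make_environment`, `search`, and `run_steps` are hypothetical names, and the "model-written" steps are hard-coded strings standing in for code an LLM would produce. It only shows that the long text never travels with the prompt; short code snippets operate on it instead.

```python
import re


def make_environment(text: str) -> dict:
    """Build a REPL namespace in which the long context is ordinary state."""

    def peek(n: int, start: int = 0) -> str:
        # Return a small window of the context instead of the whole text.
        return text[start:start + n]

    def search(pattern: str, window: int = 40) -> list[str]:
        # Return short excerpts around every regex match.
        hits = []
        for match in re.finditer(pattern, text):
            lo = max(0, match.start() - window)
            hits.append(text[lo:match.end() + window])
        return hits

    return {"peek": peek, "search": search, "context_length": len(text)}


def run_steps(env: dict, steps: list[str]) -> dict:
    """Execute code snippets one at a time against the same namespace."""
    for code in steps:
        exec(code, env)
    return env


# A large haystack with one relevant sentence buried in the middle.
haystack = "filler. " * 5000 + "The access code is 7421. " + "filler. " * 5000
env = make_environment(haystack)

# Hard-coded stand-ins for code the model would write (assumption).
steps = [
    "hits = search(r'access code is \\d+')",
    "answer = hits[0] if hits else 'not found'",
]
result = run_steps(env, steps)
print(len(haystack), "chars in context;", len(steps[0]), "chars of code in step 1")
print(result["answer"])
```

Each step is a few dozen characters regardless of how large `haystack` grows, which is the property the paragraph above describes as constant token usage per step.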
uv sync
pip install pyrlm-runtime
Or with uv:
uv add pyrlm-runtime
For live terminal visualization of the REPL loop with rich:
pip install "pyrlm-runtime[rich]"
Requirements: Python 3.12+
Optional: For the secure Monty REPL backend (Rust sandbox):
pip install pydantic-monty
rlm = RLM(adapter=adapter, repl_backend="monty")
How MontyREPL handles complex objects: Python objects like Context can't run natively in the Rust sandbox. MontyREPL uses an object proxy system — methods are registered as external functions with {name}__{method} naming, and AST rewrites transform ctx.method() calls into ctx__method() calls transparently.
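The `{name}__{method}` rewrite can be illustrated with Python's `ast` module. This is a minimal sketch of the general technique, not MontyREPL's actual code: `ProxyCallRewriter`, `rewrite`, and the `find` method are hypothetical, and the real backend may handle more cases (attribute access, nested proxies, and so on).

```python
import ast


class ProxyCallRewriter(ast.NodeTransformer):
    """Rewrite `obj.method(...)` into `obj__method(...)` for proxied names."""

    def __init__(self, proxied: set[str]):
        self.proxied = proxied

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)  # rewrite calls nested in the arguments first
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id in self.proxied
        ):
            flat_name = f"{func.value.id}__{func.attr}"
            node.func = ast.copy_location(ast.Name(id=flat_name, ctx=ast.Load()), func)
        return node


def rewrite(source: str, proxied: set[str]) -> str:
    tree = ProxyCallRewriter(proxied).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


rewritten = rewrite("hits = ctx.find('needle', limit=3)", {"ctx"})
print(rewritten)

# The host registers a plain function under the flattened name (hypothetical).
namespace = {"ctx__find": lambda needle, limit: [needle] * limit}
exec(rewritten, namespace)
print(namespace["hits"])
```

Only calls on names listed in `proxied` are touched, so ordinary method calls such as `"abc".upper()` pass through unchanged.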
Variable persistence: MontyREPL uses AST-based detection of assignments, appending a capture dict to extract variable state from each execution.
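One way to picture the capture step is shown below. It is a sketch under stated assumptions: `assigned_names`, `with_capture`, and the `__captured__` variable are hypothetical names, only top-level assignments are detected, and plain `exec` stands in for the sandbox.

```python
import ast


def assigned_names(source: str) -> list[str]:
    """Collect names bound by top-level assignments, in first-seen order."""
    names: list[str] = []
    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.Assign):
            targets = stmt.targets
        elif isinstance(stmt, ast.AugAssign):
            targets = [stmt.target]
        elif isinstance(stmt, ast.AnnAssign) and stmt.value is not None:
            targets = [stmt.target]
        else:
            continue
        for target in targets:
            for node in ast.walk(target):
                # Store context means the name is being bound, not read.
                is_bound = isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
                if is_bound and node.id not in names:
                    names.append(node.id)
    return names


def with_capture(source: str, capture_var: str = "__captured__") -> str:
    """Append a dict literal that snapshots every assigned name."""
    entries = ", ".join(f"{name!r}: {name}" for name in assigned_names(source))
    return f"{source}\n{capture_var} = {{{entries}}}"


# Carry state across two separate executions, as a stateless sandbox would need.
state: dict = {}
for step in ["x = 2\ny, z = x + 1, x * 10", "x += y"]:
    scope = dict(state)              # re-inject variables from earlier steps
    exec(with_capture(step), scope)  # plain exec stands in for the sandbox
    state.update(scope["__captured__"])
print(state)
```

Because the capture dict is built from the AST rather than by diffing namespaces, the host only needs to read one known variable back out of the sandbox after each execution.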
Both backends implement the same REPLProtocol interface: exec(code) -> ExecResult, get(name), set(name, value).
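A shared interface of this shape can be expressed with `typing.Protocol`. The sketch below is an assumption-laden illustration: the three method names come from the text above, but the fields of `ExecResult` and the `DictREPL` class are hypothetical and are not taken from the library.

```python
import contextlib
import io
from dataclasses import dataclass
from typing import Any, Optional, Protocol, runtime_checkable


@dataclass
class ExecResult:
    # Hypothetical fields; the library's ExecResult may differ.
    stdout: str
    error: Optional[str] = None


@runtime_checkable
class REPLProtocol(Protocol):
    def exec(self, code: str) -> ExecResult: ...
    def get(self, name: str) -> Any: ...
    def set(self, name: str, value: Any) -> None: ...


class DictREPL:
    """Toy backend that keeps variables in a dict and runs code with exec()."""

    def __init__(self) -> None:
        self._ns: dict[str, Any] = {}

    def exec(self, code: str) -> ExecResult:
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                exec(code, self._ns)  # the builtin, not this method
        except Exception as exc:
            return ExecResult(buffer.getvalue(), f"{type(exc).__name__}: {exc}")
        return ExecResult(buffer.getvalue())

    def get(self, name: str) -> Any:
        return self._ns[name]

    def set(self, name: str, value: Any) -> None:
        self._ns[name] = value


repl = DictREPL()
repl.set("n", 20)
ok = repl.exec("total = sum(range(n + 1))\nprint(total)")
failed = repl.exec("1 / 0")
print(ok.stdout.strip(), repl.get("total"), failed.error)
```

Code that drives the loop can then depend on `REPLProtocol` alone, which is what makes swapping a plain Python backend for a sandboxed one a configuration change rather than a code change.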
First, install the optional Elasticsearch extra:
pip install "pyrlm-runtime[elasticsearch]"
from pyrlm_runtime import RLM
from pyrlm_runtime.adapters import OpenAICompatAdapter
from pyrlm_runtime.retrieval import ElasticsearchRetriever
retriever = ElasticsearchRetriever(
host="https://my-cluster.es.cloud.com",
api_key="xxx",
index="pdf_corpus",
embedding_model="text-embedding-3-small",
)
rlm = RLM(adapter=OpenAICompatAdapter(model="gpt-5"), retriever=retriever)
answer, trace = rlm.run("Who signed document X?") # No context needed
When a retriever is configured, four functions become available in the REPL:
es_search(query, top_k=10, filters=None) # BM25 keyword search
es_vector_search(query, top_k=10, filters=None) # Semantic similarity
es_hybrid_search(query, top_k=10, filters=None) # Combined (recommended)
es_get(doc_id) # Fetch full document
The retrieval layer is backend-agnostic: any object implementing the RetrieverProtocol (with search, vector_search, hybrid_search, get methods) works as a drop-in backend.
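To make the drop-in idea concrete, here is a toy in-memory backend with the four method names listed above. Everything else is an assumption for illustration: the exact signatures of `RetrieverProtocol` are not shown in this document, `InMemoryRetriever` is hypothetical, `filters` is accepted but ignored, and term overlap and bag-of-words cosine are crude stand-ins for BM25 and embedding similarity.

```python
import math
from collections import Counter
from typing import Any, Callable, Optional


class InMemoryRetriever:
    """Toy retriever: term overlap for keywords, bag-of-words cosine for 'semantic'."""

    def __init__(self, docs: dict[str, str]):
        self.docs = docs

    @staticmethod
    def _overlap(query: str, text: str) -> float:
        return float(len(set(query.lower().split()) & set(text.lower().split())))

    @staticmethod
    def _cosine(query: str, text: str) -> float:
        a, b = Counter(query.lower().split()), Counter(text.lower().split())
        dot = sum(a[word] * b[word] for word in a)
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    def _rank(self, scorer: Callable[[str, str], float], query: str, top_k: int) -> list[dict]:
        scored = [(scorer(query, text), doc_id) for doc_id, text in self.docs.items()]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, ties by id
        return [
            {"doc_id": doc_id, "preview": self.docs[doc_id][:60], "score": score, "metadata": {}}
            for score, doc_id in scored[:top_k]
        ]

    def search(self, query: str, top_k: int = 10, filters: Optional[dict] = None) -> list[dict]:
        return self._rank(self._overlap, query, top_k)

    def vector_search(self, query: str, top_k: int = 10, filters: Optional[dict] = None) -> list[dict]:
        return self._rank(self._cosine, query, top_k)

    def hybrid_search(self, query: str, top_k: int = 10, filters: Optional[dict] = None) -> list[dict]:
        blend = lambda q, t: 0.5 * self._overlap(q, t) + 0.5 * self._cosine(q, t)
        return self._rank(blend, query, top_k)

    def get(self, doc_id: str) -> dict[str, Any]:
        return {"doc_id": doc_id, "content": self.docs[doc_id], "metadata": {}}


retriever = InMemoryRetriever({
    "a": "Alice signed the lease agreement",
    "b": "Quarterly revenue report",
    "c": "Bob signed nothing",
})
hits = retriever.search("who signed the lease")
print([hit["doc_id"] for hit in hits])
print(retriever.get(hits[0]["doc_id"])["content"])
```

A backend like this is mainly useful in tests, where running an Elasticsearch cluster would be overkill.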
```python
from pyrlm_runtime import RLM, Context
from pyrlm_runtime.adapters import OpenAICompatAdapter
```
| Use case | Configuration |
|---|---|
| Small context (<8K chars) | Use SmartRouter — it will pick baseline automatically |
| Large corpus (10K+ docs) | RLM(adapter, retriever=ElasticsearchRetriever(...)) — search on demand |
| Large context (>100K chars) | RLM(adapter, conversation_history=True, parallel_subcalls=True) |
| Batch many independent prompts | Use llm_batch(prompts) — always parallel, no config needed |
| Cost-sensitive | Use a cheaper subcall_adapter for subcalls |
| Safety-critical code execution | repl_backend="monty" |
| Deterministic extraction | SmartRouter with DETERMINISTIC_FIRST profile |
| Complex multi-hop reasoning | recursive_subcalls=True, max_recursion_depth=2 |
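The "batch many independent prompts" row relies on running independent calls in parallel. How `llm_batch` does this internally is not described here, so the following is only a generic sketch of the pattern using a thread pool; `batch_in_parallel` and `fake_model` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def batch_in_parallel(
    prompts: list[str],
    call_model: Callable[[str], str],
    max_workers: int = 8,
) -> list[str]:
    """Run one model call per prompt concurrently and keep the input order."""
    if not prompts:
        return []
    workers = min(max_workers, len(prompts))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(call_model, prompts))


def fake_model(prompt: str) -> str:
    # Stand-in for a network call to an LLM endpoint.
    return f"summary of {prompt!r}"


answers = batch_in_parallel(["chunk 1", "chunk 2", "chunk 3"], fake_model)
print(answers)
```

Threads suit this workload because each call spends most of its time waiting on the network rather than executing Python code.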
| Example | Description | Requires API? |
|---|---|---|
| [minimal.py](examples/minimal.py) | Basic RLM flow with FakeAdapter | No |
| [rlm_vs_baseline.py](examples/rlm_vs_baseline.py) | Needle-in-haystack benchmark (MIT paper Figure 1) | Yes |
| [smart_router_demo.py](examples/smart_router_demo.py) | SmartRouter auto-selecting baseline vs RLM by context size | Yes |
| [bench_repl_python_vs_monty.py](examples/bench_repl_python_vs_monty.py) | Raw REPL performance: PythonREPL vs MontyREPL (no LLM calls) | No |
| [bench_rlm_repl_backends.py](examples/bench_rlm_repl_backends.py) | Full RLM loop benchmark with both REPL backends (FakeAdapter) | No |
Run any example:
uv run python examples/minimal.py
RLM_CONTEXT_SIZES=5,30 uv run python examples/rlm_vs_baseline.py
es_search(query, top_k=10, filters=None)
# BM25 full-text search → list of {doc_id, preview, score, metadata}
es_vector_search(query, top_k=10, filters=None)
# Semantic similarity search → list of {doc_id, preview, score, metadata}
es_hybrid_search(query, top_k=10, filters=None)
# Combined BM25 + semantic (recommended) → list of {doc_id, preview, score, metadata}
es_get(doc_id)
# Fetch full document → {doc_id, content, metadata}
```bash
LLM_BASE_URL="https://..."
export LLM_API_KEY="your-api-key-here"
```
from pyrlm_runtime import RLM, Context
from pyrlm_runtime.adapters import FakeAdapter
adapter = FakeAdapter(script=[
"snippet = peek(80)\nsummary = llm_query(f'Summarize: {snippet}')\nanswer = f'Summary -> {summary}'",
"FINAL_VAR: answer",
])
adapter.add_rule("You are a sub-LLM", "[fake] short summary")
context = Context.from_text("RLMs treat long prompts as environment state.")
output, trace = RLM(adapter=adapter).run("Summarize this.", context)
print(output) # Summary -> [fake] short summary
adapter = OpenAICompatAdapter(
    model="my-model",
    base_url="https://my-endpoint.com/v1",
)
Uses environment variables: `LLM_API_KEY` (or `OPENAI_API_KEY`), `LLM_BASE_URL`.
#### GenericChatAdapter
For non-standard APIs with custom request/response formats.
from pyrlm_runtime.adapters import GenericChatAdapter
adapter = GenericChatAdapter(
    base_url="https://custom-api.com",
    path="/chat/completions",
    model="custom-model",
    api_key="your-key",
    payload_builder=my_custom_builder,  # Custom request format
    response_parser=my_custom_parser,  # Custom response format
    timeout=60.0,
    max_retries=3,
)
Auto-retries on 429, 500, 502, 503, 504 with exponential backoff. Supports context manager (`with GenericChatAdapter(...) as adapter:`).
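Retrying with exponential backoff generally looks like the sketch below. The retryable status codes are the ones listed above; the delay schedule (0.5 s doubling per attempt), the `send` callable, and the function name `call_with_backoff` are assumptions made for illustration and do not describe the adapter's internals.

```python
import time
from typing import Callable

RETRYABLE = {429, 500, 502, 503, 504}


def call_with_backoff(
    send: Callable[[], tuple[int, str]],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send() until it returns a non-retryable status or retries run out."""
    status = 0
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in RETRYABLE:
            if status >= 400:
                raise RuntimeError(f"request failed with status {status}")
            return body
        if attempt < max_retries:
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
    raise RuntimeError(f"gave up after {max_retries} retries (last status {status})")


# Simulated endpoint: two transient failures, then success.
responses = iter([(503, ""), (429, ""), (200, "ok")])
delays: list[float] = []
body = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(body, delays)
```

Passing `sleep` as a parameter keeps the example testable without real waiting; production code would usually also add random jitter to the delay.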
#### FakeAdapter
Deterministic adapter for testing. No external API needed.
from pyrlm_runtime.adapters import FakeAdapter
adapter = FakeAdapter(
    script=["code step 1", "code step 2", "FINAL_VAR: result"]
)
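A scripted, deterministic model stand-in can be very small. The class below is a hypothetical illustration of the idea (replay a fixed script, with substring rules for sub-calls); it is not FakeAdapter's source, and the `complete` method name is an assumption.

```python
from collections import deque


class ScriptedModel:
    """Replays canned responses in order; rules answer prompts containing a trigger."""

    def __init__(self, script: list[str]):
        self._script = deque(script)
        self._rules: list[tuple[str, str]] = []

    def add_rule(self, trigger: str, reply: str) -> None:
        self._rules.append((trigger, reply))

    def complete(self, prompt: str) -> str:
        # Rules take priority so sub-calls do not consume the main script.
        for trigger, reply in self._rules:
            if trigger in prompt:
                return reply
        if not self._script:
            raise RuntimeError("script exhausted")
        return self._script.popleft()


model = ScriptedModel(["step one", "FINAL_VAR: answer"])
model.add_rule("You are a sub-LLM", "[fake] short summary")
replies = [
    model.complete("root prompt"),
    model.complete("You are a sub-LLM. Summarize this."),
    model.complete("root prompt, second turn"),
]
print(replies)
```

Because every response is fixed in advance, a test built on such a stand-in fails only when the surrounding loop changes behavior, never because a remote model answered differently.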
LLM_API_KEY="your-key" # Primary
OPENAI_API_KEY="your-key" # Fallback
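The primary/fallback order can be expressed in one line. This is a sketch of the lookup pattern only; `resolve_api_key` is a hypothetical helper, and how the adapters actually read these variables is not shown in this document.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Prefer LLM_API_KEY; fall back to OPENAI_API_KEY; None if neither is set."""
    # `or` also skips a variable that is set but empty.
    return env.get("LLM_API_KEY") or env.get("OPENAI_API_KEY")


print(resolve_api_key({"OPENAI_API_KEY": "fallback-key"}))
print(resolve_api_key({"LLM_API_KEY": "primary-key", "OPENAI_API_KEY": "fallback-key"}))
```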
For large corpora that don't fit in memory, the RLM can search external document indexes directly from the REPL loop. See the detailed architecture guide: docs/RETRIEVAL.md
The rlm_vs_baseline.py example reproduces the key finding from the MIT paper (Figure 1): RLMs maintain accuracy as context grows, while baseline approaches degrade due to truncation.

Figure 1: RLM accuracy remains high as distractor documents increase, while baseline accuracy drops.
answer, trace = rlm.run("What are the main themes across all documents?", context)
print(answer)
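pyrlm-runtime is a lightweight Python runtime built for Recursive Language Models (RLMs). Inspired by the MIT CSAIL paper on Recursive Language Models, the project targets the truncation and quality degradation that large language models (LLMs) show on very long contexts. Instead of sending the long text directly, an RLM keeps the context as environment state in a Python REPL, and the LLM writes code to inspect, search, and chunk it, which gives it precise control over large volumes of information.
Before using pyrlm-runtime, make sure your development environment has Python 3.12 or later. The uv tool is recommended for dependency management; `uv sync` synchronizes the project's dependencies.
You can install with pip or uv. With pip, run `pip install pyrlm-runtime`; with uv, run `uv add pyrlm-runtime`. For a live terminal view of the REPL loop rendered with `rich`, install the extra: `pip install "pyrlm-runtime[rich]"`. To use the secure Monty REPL backend, which runs in a Rust sandbox, also install `pydantic-monty`.
The project provides a flexible adapter mechanism. For everyday development, use `OpenAICompatAdapter`. Configuration differs by scenario: for small contexts (<8K characters), use `SmartRouter`, which picks the baseline automatically; for large corpora (10K+ documents), configure `ElasticsearchRetriever` to search on demand; for very long contexts (>100K characters), enable `conversation_history=True` and `parallel_subcalls=True`.
pyrlm-runtime is highly configurable. You can point it at a custom API endpoint (such as Ollama or LM Studio) to use local models. Once a retriever is configured, `es_search` (BM25 full-text search), `es_vector_search` (semantic similarity search), and `es_hybrid_search` (hybrid search, recommended) become callable in the REPL for working with large document sets.
Set the `LLM_API_KEY` environment variable to configure the API key. For testing, the built-in `FakeAdapter` lets you script the LLM's responses, so logic can be verified without calling an external API. The project also provides an API Reference covering the core classes and methods.
For large corpora that do not fit in memory, pyrlm-runtime provides Retrieval Integration: the RLM can query external document indexes directly from the REPL loop. For the architecture and workflow of the retrieval system, see the guide at `docs/RETRIEVAL.md`.
To ask a question over the whole context, call `rlm.run()`. It returns the final answer (`answer`) together with a trace of the execution (`trace`), so you can see how the model extracted information from the documents.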
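An efficient Recursive Language Model runtime that is worth watching.
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute any legal endorsement of the tool's functionality or quality.
Validate the tool thoroughly in a sandbox or test environment before deploying it to production, and carry out the necessary security assessment.
✅ MIT License: one of the most permissive open-source licenses. Commercial use, modification, and distribution are allowed, provided the copyright notice is retained.
Overall, Recursive Language Model Runtime is a good-quality AI tool that is reasonably competitive among comparable tools. AI Skill Hub will keep tracking its updates; consider bookmarking it and adopting it when it suits your use case.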
| Original name | pyrlm-runtime |
| Original description | Open-source AI tool: Minimal runtime for Recursive Language Models (RLMs) inspired by the MIT CSAIL p. ⭐26 · Python |
| Topics | ai, anthropic, gpt, llm, openai, python |
| GitHub | https://github.com/apenab/pyrlm-runtime |
| License | MIT |
| Language | Python |
Listed: 2026-05-25 · Updated: 2026-05-26 · License: MIT · AI Skill Hub provides no legal endorsement of the accuracy of third-party content.