Over the past year the word "Agent" has been done to death: something that can think, call tools, break a task down on its own, and keep executing.

Most tutorials are written in Python, but in scenarios that need high concurrency, controlled resource usage, and long-running stability (crawling, automated operations, on-chain monitoring, log analysis, trading risk control), Rust is actually the better fit: strong performance, memory safety, good observability, simple deployment.

This article walks you through writing an Agent in Rust that actually runs.

Note: this article is not tied to any single model vendor; the interface is wrapped in a swappable "LLM Client". You can plug in OpenAI, Azure, Anthropic, Volcengine, Tongyi, Zhipu, and so on.

A practical agent usually has these parts: an LLM client for planning and reasoning, a tool layer, short-term and long-term memory, and a control loop tying them together.

The minimal viable version (MVP) we will build: an LLM that emits strict-JSON actions, two tools (HTTP GET and file read), a windowed short-term memory, a JSONL long-term memory, and a loop bounded by max_steps.
A suggested directory layout:
agent-rs/
  src/
    main.rs
    agent/mod.rs
    agent/loop_.rs   // `loop` is a Rust keyword, so the module is named loop_
    llm/mod.rs
    tools/mod.rs
    tools/http.rs
    tools/fs.rs
    memory/mod.rs
    memory/short_term.rs
    memory/long_term.rs
    types.rs
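Before the code: the snippets below assume roughly these crates (a sketch; version numbers are indicative, pin whatever is current for you):

# Cargo.toml (sketch; versions indicative)
[dependencies]
tokio = { version = "1", features = ["full"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
anyhow = "1"
async-trait = "0.1"
reqwest = "0.12"
# for the production-hardening section later:
tracing = "0.1"
tracing-subscriber = "0.3"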
The core idea: nail down the "protocol" first. How does the LLM tell us it wants to call a tool? What does a tool return? How does the loop stop?
// src/types.rs
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ChatMessage {
    pub role: String,         // "system" | "user" | "assistant" | "tool"
    pub content: String,
    pub name: Option<String>, // tool name if role == "tool"
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ToolSpec {
    pub name: String,
    pub description: String,
    pub input_schema: serde_json::Value, // JSON Schema
}

#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(tag = "type")]
pub enum AgentAction {
    // The LLM asks us to call a tool
    ToolCall {
        tool_name: String,
        input: serde_json::Value,
    },
    // The LLM considers the task complete
    Final {
        answer: String,
    },
}
Here we use a simple convention: have the LLM output JSON and parse it into an AgentAction.
(Many vendors support native tool calling, but plain JSON output is more portable.)
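A quick unit test makes the convention concrete (a sketch using the types above; serde's tagged enum gives us the protocol parser for free):

use crate::types::AgentAction;

#[test]
fn parses_a_tool_call() {
    let raw = r#"{"type":"ToolCall","tool_name":"read_file","input":{"path":"./README.md"}}"#;
    let action: AgentAction = serde_json::from_str(raw).unwrap();
    match action {
        AgentAction::ToolCall { tool_name, .. } => assert_eq!(tool_name, "read_file"),
        other => panic!("expected ToolCall, got {other:?}"),
    }
}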
// src/tools/mod.rs
pub mod http;
pub mod fs;

use async_trait::async_trait;
use serde_json::Value;
use anyhow::Result;

#[async_trait]
pub trait Tool: Send + Sync {
    fn name(&self) -> &str;
    fn spec(&self) -> crate::types::ToolSpec;
    async fn call(&self, input: Value) -> Result<Value>;
}

pub struct ToolRegistry {
    tools: std::collections::HashMap<String, std::sync::Arc<dyn Tool>>,
}

impl ToolRegistry {
    pub fn new() -> Self {
        Self { tools: Default::default() }
    }

    pub fn register<T: Tool + 'static>(&mut self, tool: T) {
        self.tools.insert(tool.name().to_string(), std::sync::Arc::new(tool));
    }

    pub fn list_specs(&self) -> Vec<crate::types::ToolSpec> {
        self.tools.values().map(|t| t.spec()).collect()
    }

    pub fn get(&self, name: &str) -> Option<std::sync::Arc<dyn Tool>> {
        self.tools.get(name).cloned()
    }
}
Two tools: HTTP GET + file read (you can later extend this to DB / RPC / Kafka, etc.).
// src/tools/http.rs
use async_trait::async_trait;
use serde_json::{json, Value};
use anyhow::{Result, anyhow};

pub struct HttpGetTool;

#[async_trait]
impl crate::tools::Tool for HttpGetTool {
    fn name(&self) -> &str { "http_get" }

    fn spec(&self) -> crate::types::ToolSpec {
        crate::types::ToolSpec {
            name: self.name().into(),
            description: "Send HTTP GET request and return response text".into(),
            input_schema: json!({
                "type": "object",
                "properties": {
                    "url": {"type": "string"}
                },
                "required": ["url"]
            }),
        }
    }

    async fn call(&self, input: Value) -> Result<Value> {
        let url = input.get("url").and_then(|v| v.as_str()).ok_or_else(|| anyhow!("missing url"))?;
        let resp = reqwest::get(url).await?.text().await?;
        Ok(json!({ "text": resp }))
    }
}
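One caveat: reqwest::get builds a fresh default client with no overall timeout, which the production section below warns about. A hedged sketch of a hardened variant (the 10-second budget and the shared client are my additions, not part of the code above):

use std::time::Duration;
use anyhow::Result;

// Build one client up front (connection pooling) with an explicit request budget.
fn http_client() -> Result<reqwest::Client> {
    Ok(reqwest::Client::builder()
        .timeout(Duration::from_secs(10)) // total per-request budget (assumption)
        .build()?)
}

async fn get_text(client: &reqwest::Client, url: &str) -> Result<String> {
    Ok(client.get(url).send().await?.text().await?)
}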
// src/tools/fs.rs
use async_trait::async_trait;
use serde_json::{json, Value};
use anyhow::{Result, anyhow};

pub struct ReadFileTool;

#[async_trait]
impl crate::tools::Tool for ReadFileTool {
    fn name(&self) -> &str { "read_file" }

    fn spec(&self) -> crate::types::ToolSpec {
        crate::types::ToolSpec {
            name: self.name().into(),
            description: "Read a local text file (UTF-8)".into(),
            input_schema: json!({
                "type": "object",
                "properties": { "path": {"type": "string"} },
                "required": ["path"]
            }),
        }
    }

    async fn call(&self, input: Value) -> Result<Value> {
        let path = input.get("path").and_then(|v| v.as_str()).ok_or_else(|| anyhow!("missing path"))?;
        let text = tokio::fs::read_to_string(path).await?;
        Ok(json!({ "text": text }))
    }
}
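Note that as written, read_file will happily read any path the process can access. If the agent's input is untrusted, consider confining it to a base directory. A hedged sketch (AGENT_ROOT and checked_path are my own names, not part of the code above):

use std::path::Path;

const AGENT_ROOT: &str = "./workspace"; // hypothetical sandbox root

// Reject paths that escape the sandbox after symlink/".." resolution.
fn checked_path(requested: &str) -> anyhow::Result<std::path::PathBuf> {
    let root = Path::new(AGENT_ROOT).canonicalize()?;
    let full = root.join(requested).canonicalize()?;
    anyhow::ensure!(full.starts_with(&root), "path escapes sandbox: {requested}");
    Ok(full)
}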
Short-term memory is just the conversation context (a message list). Be sure to trim the window, or your token count will blow up.
// src/memory/short_term.rs
use crate::types::ChatMessage;

pub struct ShortTermMemory {
    pub messages: Vec<ChatMessage>,
    pub max_messages: usize,
}

impl ShortTermMemory {
    pub fn new(max_messages: usize) -> Self {
        Self { messages: vec![], max_messages }
    }

    pub fn push(&mut self, msg: ChatMessage) {
        self.messages.push(msg);
        if self.messages.len() > self.max_messages {
            let overflow = self.messages.len() - self.max_messages;
            self.messages.drain(0..overflow);
        }
    }

    pub fn all(&self) -> &[ChatMessage] {
        &self.messages
    }
}
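The trim is count-based, which is only a rough proxy for tokens, but the behavior is easy to verify (a quick test sketch):

use crate::types::ChatMessage;

#[test]
fn window_drops_oldest() {
    let mut mem = ShortTermMemory::new(2);
    for i in 0..3 {
        mem.push(ChatMessage { role: "user".into(), content: format!("m{i}"), name: None });
    }
    assert_eq!(mem.all().len(), 2);
    assert_eq!(mem.all()[0].content, "m1"); // "m0" was trimmed
}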
Long-term memory: start with the simplest local JSONL append-only log (you can swap in SQLite or a vector store later).
// src/memory/long_term.rs
use anyhow::Result;
use serde_json::Value;
use tokio::io::AsyncWriteExt;

pub struct LongTermMemory {
    path: String,
}

impl LongTermMemory {
    pub fn new(path: impl Into<String>) -> Self {
        Self { path: path.into() }
    }

    pub async fn append_event(&self, event: &Value) -> Result<()> {
        let mut f = tokio::fs::OpenOptions::new()
            .create(true)
            .append(true)
            .open(&self.path)
            .await?;
        f.write_all(event.to_string().as_bytes()).await?;
        f.write_all(b"\n").await?;
        Ok(())
    }
}
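A matching reader is handy for replay and debugging. A sketch (load_events is my own helper name; it loads the whole file, which is fine for small logs):

use anyhow::Result;
use serde_json::Value;

// Read the JSONL log back into memory (stream line-by-line for big logs).
pub async fn load_events(path: &str) -> Result<Vec<Value>> {
    let text = tokio::fs::read_to_string(path).await?;
    let mut events = Vec::new();
    for line in text.lines().filter(|l| !l.trim().is_empty()) {
        events.push(serde_json::from_str(line)?);
    }
    Ok(events)
}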
// src/llm/mod.rs
use async_trait::async_trait;
use anyhow::Result;
use crate::types::{ChatMessage, ToolSpec};

#[async_trait]
pub trait LlmClient: Send + Sync {
    async fn complete(
        &self,
        system_prompt: &str,
        messages: &[ChatMessage],
        tools: &[ToolSpec],
    ) -> Result<String>;
}
You can implement an OpenAIClient / AnthropicClient / LocalModelClient.
The focus of this article is the agent architecture, so we won't go into vendor details here; any implementation that returns a string will do.
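For local development and tests, a scripted client is enough to drive the whole loop with no network access. A minimal sketch (MockLlm is my own name, not a library type):

use std::collections::VecDeque;
use std::sync::Mutex;
use async_trait::async_trait;
use anyhow::{Result, anyhow};
use crate::types::{ChatMessage, ToolSpec};

// Replays a fixed script of JSON actions, one per call.
pub struct MockLlm {
    replies: Mutex<VecDeque<String>>,
}

impl MockLlm {
    pub fn new(script: Vec<String>) -> Self {
        Self { replies: Mutex::new(script.into()) }
    }
}

#[async_trait]
impl crate::llm::LlmClient for MockLlm {
    async fn complete(&self, _sys: &str, _msgs: &[ChatMessage], _tools: &[ToolSpec]) -> Result<String> {
        self.replies.lock().unwrap().pop_front().ok_or_else(|| anyhow!("script exhausted"))
    }
}

Feeding it a two-step script (one ToolCall, then a Final) is enough to integration-test the AgentLoop end to end.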
We have the LLM output strict JSON every round, in exactly one of two shapes:
{"type":"ToolCall","tool_name":"read_file","input":{"path":"..."}}{"type":"Final","answer":"..."}// src/agent/loop.rs
use anyhow::{Result, anyhow};
use serde_json::Value;
use crate::types::{AgentAction, ChatMessage};
pub struct AgentLoop<L: crate::llm::LlmClient> {
pub llm: std::sync::Arc<L>,
pub tools: crate::tools::ToolRegistry,
pub short_memory: crate::memory::short_term::ShortTermMemory,
pub long_memory: crate::memory::long_term::LongTermMemory,
pub system_prompt: String,
pub max_steps: usize,
}
impl<L: crate::llm::LlmClient> AgentLoop<L> {
pub async fn run(&mut self, user_goal: &str) -> Result<String> {
self.short_memory.push(ChatMessage {
role: "user".into(),
content: user_goal.into(),
name: None,
});
for step in 0..self.max_steps {
let tool_specs = self.tools.list_specs();
let raw = self.llm
.complete(&self.system_prompt, self.short_memory.all(), &tool_specs)
.await?;
let action: AgentAction = serde_json::from_str(&raw)
.map_err(|e| anyhow!("LLM output is not valid AgentAction JSON: {e}. raw={raw}"))?;
match action {
AgentAction::Final { answer } => {
self.long_memory.append_event(&serde_json::json!({
"type":"final",
"step": step,
"answer": answer
})).await?;
return Ok(answer);
}
AgentAction::ToolCall { tool_name, input } => {
let tool = self.tools.get(&tool_name)
.ok_or_else(|| anyhow!("tool not found: {tool_name}"))?;
let out = tool.call(input).await?;
// 记录工具返回到短期记忆(供下一轮推理)
self.short_memory.push(ChatMessage {
role: "tool".into(),
name: Some(tool_name.clone()),
content: out.to_string(),
});
// 也可记到长期记忆里,方便追溯
self.long_memory.append_event(&serde_json::json!({
"type":"tool_result",
"step": step,
"tool": tool_name,
"output": out
})).await?;
}
}
}
Err(anyhow!("max_steps reached without Final"))
}
}
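One robustness tweak worth calling out: tool.call above has no deadline, so a hung tool stalls the whole loop. Wrapping it with tokio::time::timeout is cheap (a sketch; the 30-second budget and the helper name are my choices):

use std::time::Duration;
use anyhow::{Result, anyhow};
use serde_json::Value;

// Wrap any tool future with a deadline; call this instead of tool.call(input).
async fn call_with_deadline(
    tool: std::sync::Arc<dyn crate::tools::Tool>,
    name: &str,
    input: Value,
) -> Result<Value> {
    tokio::time::timeout(Duration::from_secs(30), tool.call(input))
        .await
        .map_err(|_| anyhow!("tool {name} timed out"))?
}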
A good system prompt usually needs to: lock the output format down to strict JSON (one of the two action shapes), tell the model when to reach for tools and that tool inputs must follow each tool's schema, and give it clear stopping conditions.

Example:
const SYSTEM_PROMPT: &str = r#"
You are a helpful AI agent.
You MUST respond in valid JSON that matches one of:
1) {"type":"ToolCall","tool_name": "...", "input": {...}}
2) {"type":"Final","answer":"..."}
Rules:
- Use tools when you need external data.
- Tool input MUST follow the tool's JSON schema.
- If you have enough information, end with Final.
- If repeated tool calls do not improve progress, summarize and Final.
"#;
// src/main.rs
mod types;
mod llm;
mod tools;
mod memory; // memory/mod.rs declares: pub mod short_term; pub mod long_term;
mod agent;  // agent/mod.rs declares: pub mod loop_;

use tools::ToolRegistry;
use tools::http::HttpGetTool;
use tools::fs::ReadFileTool;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // 1) LLM client (bring your own impl of llm::LlmClient; MyLlmClient is a placeholder)
    let llm = std::sync::Arc::new(MyLlmClient::new_from_env()?);

    // 2) tools
    let mut registry = ToolRegistry::new();
    registry.register(HttpGetTool);
    registry.register(ReadFileTool);

    // 3) memory
    let short_memory = memory::short_term::ShortTermMemory::new(30);
    let long_memory = memory::long_term::LongTermMemory::new("agent_events.jsonl");

    // 4) agent loop
    let mut agent = agent::loop_::AgentLoop {
        llm,
        tools: registry,
        short_memory,
        long_memory,
        system_prompt: SYSTEM_PROMPT.into(),
        max_steps: 20,
    };

    let goal = "Read ./README.md and summarize it, then fetch https://example.com and compare topics.";
    let answer = agent.run(goal).await?;
    println!("{answer}");
    Ok(())
}
Writing the MVP is not hard; the hard part is keeping it running for a month without falling over. A few key points:
- Timeouts: configure them on reqwest.
- Retries: tokio-retry, or hand-roll exponential backoff (a sketch follows below).
- Rate limiting: governor (if you have done Tokio concurrency control before, this will feel natural).
- Parallelism: when the LLM plans several independent actions at once, execute them concurrently with tokio::join! / FuturesUnordered.

Note: once calls run concurrently, merge the observations and keep the output traceable (step id).
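Here is what a hand-rolled exponential backoff can look like (a sketch; the base delay, the 10s cap, and the with_backoff name are my choices):

use std::time::Duration;
use anyhow::Result;

// Retry an async operation with exponential backoff: 100ms, 200ms, 400ms, ...
async fn with_backoff<T, F, Fut>(mut op: F, max_tries: u32) -> Result<T>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T>>,
{
    assert!(max_tries >= 1);
    let mut delay = Duration::from_millis(100);
    for attempt in 1..=max_tries {
        match op().await {
            Ok(v) => return Ok(v),
            Err(e) if attempt == max_tries => return Err(e),
            Err(_) => {
                tokio::time::sleep(delay).await;
                delay = (delay * 2).min(Duration::from_secs(10)); // cap the backoff
            }
        }
    }
    unreachable!("the loop always returns on the last attempt")
}

Wrapping the LLM call is then just with_backoff(|| llm.complete(&sys, msgs, &specs), 3).await.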
- Observability: tracing + tracing-subscriber (minimal setup sketched below).
- Long-term memory: graduate from the plain JSONL log to something queryable (SQLite or a vector store, as noted above).
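Minimal tracing setup (a sketch; run_step is a hypothetical function, and log filtering depends on how you configure the subscriber):

use tracing::{info, instrument};

// #[instrument] opens a span per call and records the arguments as fields.
#[instrument]
async fn run_step(step: usize, tool: &str) {
    info!("executing tool call");
}

#[tokio::main]
async fn main() {
    // Install a global subscriber that formats events to stdout.
    tracing_subscriber::fmt::init();
    run_step(0, "http_get").await;
}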
Which scenarios is a Rust Agent a good fit for? If what you are building is crawling, automated operations, on-chain monitoring, log analysis, or trading risk control, that is, long-running, high-concurrency services where resources must stay under control,

then Rust really is a great fit.