SP1 审计报告 - 递归虚拟机

succinctlabs
发布于 2024-07-20 18:14
阅读 12

本文是由 rkm0959 审计，KALOS 发布的 SP1 审计报告，审计对象是 SP1 项目中的 Recursion VM 部分，commit hash 为 22f51bb8e1f343661c1a54140401a7cb3e365928，重点关注了 recursion/core/src 目录下的代码，审计时间为2024年4月15日至2024年5月31日。

## SP1 审计报告 - 递归 VM

由 rkm0959 审计。由 KALOS 发布报告。

## 概要

### 提交审计的代码库

https://github.com/succinctlabs/sp1

对于电路，我们审计 `eval()` 函数内部的 AIR 约束。

core/src

- alu - add_sub, bitwise, divrem, lt, mul, sll, sr
- bytes - air.rs, columns.rs
- cpu: air 子文件夹, columns 子文件夹
- memory: global.rs
- operations: 范围内文件夹中的所有 AIR
- program: 范围内文件夹中的所有 AIR
- syscall: precompiles/{edwards, keccak256, sha256, weierstrass, uint256} 中的所有 AIR
- air: builder.rs, word.rs, polynomial.rs, extension.rs, subbuilder.rs
- lookup: builder.rs
- utils: ec 子文件夹, buffer.rs
- stark: verifier.rs, folder.rs, machine.rs, permutation.rs, chip.rs, prover.rs, air.rs
- machine derive macro: https://github.com/succinctlabs/sp1/blob/main/derive/src/lib.rs

recursion/core/src

- cpu: 范围内文件夹中的所有 AIR
- memory: 范围内文件夹中的所有 AIR
- program: 范围内文件夹中的所有 AIR
- poseidon2: 范围内的所有 AIR
- poseidon2_wide: 范围内的所有 AIR
- fri_fold 范围内的所有 AIR
- multi: 范围内的所有 AIR
- range check: 范围内的所有 AIR
- air 文件夹
- stark 文件夹

zkvm

- zkvm/precompiles/src/secp256k1.rs
- zkvm/precompiles/src/io.rs
- zkvm/entrypoint/src/syscalls/ed25519.rs
- zkvm/entrypoint/src/syscalls/secp2561k1.rs
- zkvm/entrypoint/src/syscalls/sha_compress.rs
- zkvm/entrypoint/src/syscalls/io.rs

**此审计报告处理的是审计的 `recursion/core/src` 部分。** 最终提交的哈希值为 **22f51bb8e1f343661c1a54140401a7cb3e365928**，位于 `dev` 分支上。

### 审计时间表

- 2024/04/15 - 2024/05/31 (6 工程师周)

## 发现

### 1. [严重] `poseidon2/external` 允许在任意位置进行内存写入，并且哈希值也受到约束不足

`Poseidon2Chip` 在电路中评估 Poseidon2 哈希，使用 SBOX $x^7$，8 轮外部轮和 13 轮内部轮。代码库中有几个约束不足的地方，这会导致哈希函数评估和总体内存状态的健全性受到破坏。

首先，`local.rounds` 受到约束不足。目前唯一的约束是它们是布尔值，并且最多只能有一个为真。这些应以类似于 Plonky3 的 keccak 轮标志的方式处理，跟踪 24 周期。避免此处出现漏洞 #11。

`is_external_layer` 的计算不正确。迭代的范围必须是 `[2, rounds_p_beginning)` 和 `[rounds_p_end, 2 + rounds_p + rounds_f)`。

```rust
    // First half of the external rounds.
    let mut is_external_layer = (2..rounds_p_beginning)
        .map(|i| local.rounds[i].into())
        .sum::<AB::Expr>();

// Second half of the external rounds.
    is_external_layer += (rounds_p_end..rounds_p + rounds_f)
        .map(|i| local.rounds[i].into())
        .sum::<AB::Expr>();
```

从 `left_input` 和 `right_input` 的内存读取没有约束内存值不发生变化。这允许使用任意值覆盖内存。

系统调用在 `local.rounds[0]` 为真的行接收，因此 `dst_input` 的基础真实来源应在该行。但是，在 `eval_mem` 中，实际用于写入哈希值的地址是 `is_memory_write = local.rounds[23]` 为真的行中的 `dst_input` 值。由于没有检查 `dst_input` 在 24 周期内是否相等，因此可以忽略实际的系统调用，在任意位置写入哈希值。

我们建议检查 `clk, dst_input, left_input, right_input` 在 24 周期内是否相等。一般来说，一个好的注意点是仅在实际读取系统调用时才进行内存访问，并约束周期标志，而不管 `is_real` 如何。

#### 修复说明

此漏洞已在 [PR #747](https://github.com/succinctlabs/sp1/pull/747)、[PR #821 的此提交](https://github.com/succinctlabs/sp1/pull/821/commits/0f0a010d11c8473a03169146628990347f694856)、[PR #821 的此提交](https://github.com/succinctlabs/sp1/pull/821/commits/0ef836cd9c5efcd1600da833ce79f70a54d521b6) 和 [PR #821 的此提交](https://github.com/succinctlabs/sp1/pull/821/commits/38c93b2a5cee3748b063249b10dda32a45288043#diff-c35de0a3834bcf7574826b1c21c18c50db81122f0682d41e1dc8647bb2c3e145) 中修复。我们在下面总结了修复，回顾每个漏洞。

`is_external_layer` 的计算已修复。此外，无论 `is_real` 如何，行 24 周期都会受到约束，从 `rounds[0]` 为真开始，并且每行移动一格。此外，`clk, dst_input, left_input, right_input` 在 24 周期内保持相等。

`is_real` 的值在 24 周期内保持相等。此外，通过修复漏洞 #19，`recursion_eval_memory_access_single` 中的检查足以约束 `is_real` 为布尔值。当 `is_real` 为零时，`do_memory` 和 `do_receive` 都为零，导致不执行表接收和内存访问。在 `is_real = 1` 时，电路会约束。

此外，在 `is_memory_read`（即 `rounds[0]`）时，内存访问被约束为只读。

### 2. [低] `recursion` 的 `eval_memory_access` 应约束 `is_real` 为布尔值，就像在 `core` 中所做的那样

虽然 `core` 的 `eval_memory_access` 检查多重性是否为布尔值，但 `recursion` 的 `recursion_eval_memory_access` 中不存在此检查。为了保持一致性，除非有特殊原因，否则我们建议将此布尔检查添加到 `recursion` 中。

```rust
    // core
     fn eval_memory_access<E: Into<Self::Expr> + Clone>(
        &mut self,
        shard: impl Into<Self::Expr> + Clone>,
        clk: impl Into<Self::Expr>,
        addr: impl Into<Self::Expr>,
        memory_access: &impl MemoryCols<E>,
        do_check: impl Into<Self::Expr>,
    ) {
        let do_check: Self::Expr = do_check.into();
        let shard: Self::Expr = shard.into();
        let clk: Self::Expr = clk.into();
        let mem_access = memory_access.access();

self.assert_bool(do_check.clone());

// Verify that the current memory access time is greater than the previous's.
        self.eval_memory_access_timestamp(mem_access, do_check.clone(), shard.clone(), clk.clone());

// Add to the memory argument.
        let addr = addr.into();
        let prev_shard = mem_access.prev_shard.clone().into();
        let prev_clk = mem_access.prev_clk.clone().into();
        let prev_values = once(prev_shard)
            .chain(once(prev_clk))
            .chain(once(addr.clone()))
            .chain(memory_access.prev_value().clone().map(Into::into))
            .collect();
        let current_values = once(shard)
            .chain(once(clk))
            .chain(once(addr.clone()))
            .chain(memory_access.value().clone().map(Into::into))
            .collect();

// The previous values get sent with multiplicity = 1, for "read".
        self.send(AirInteraction::new(
            prev_values,
            do_check.clone(),
            InteractionKind::Memory,
        ));

// The current values get "received", i.e. multiplicity = -1
        self.receive(AirInteraction::new(
            current_values,
            do_check.clone(),
            InteractionKind::Memory,
        ));
    }

// recursion
    fn recursion_eval_memory_access<E: Into<Self::Expr> + Clone>(
        &mut self,
        timestamp: impl Into<Self::Expr>,
        addr: impl Into<Self::Expr>,
        memory_access: &impl MemoryCols<E, Block<E>>,
        is_real: impl Into<Self::Expr>,
    ) {
        let is_real: Self::Expr = is_real.into();
        let timestamp: Self::Expr = timestamp.into();
        let mem_access = memory_access.access();

self.eval_memory_access_timestamp(timestamp.clone(), mem_access, is_real.clone());

let addr = addr.into();
        let prev_timestamp = mem_access.prev_timestamp.clone().into();
        let prev_values = once(prev_timestamp)
            .chain(once(addr.clone()))
            .chain(memory_access.prev_value().clone().map(Into::into))
            .collect();
        let current_values = once(timestamp)
            .chain(once(addr.clone()))
            .chain(memory_access.value().clone().map(Into::into))
            .collect();

self.receive(AirInteraction::new(
            prev_values,
            is_real.clone(),
            InteractionKind::Memory,
        ));
        self.send(AirInteraction::new(
            current_values,
            is_real,
            InteractionKind::Memory,
        ));
    }
```

#### 修复说明

如 [PR #789](https://github.com/succinctlabs/sp1/pull/789/files#diff-f8e74178aadfae0c133554004485bce0a364dfe483c2c538db5d3f18fe9ee9f4) 中所建议的那样修复此问题。

### 3. [高] `MemoryGlobalChip` 允许为一个内存地址进行多次初始化，打破了内存参数

`MemoryGlobalChip` 允许初始化和完成内存。
在这里，没有检查每一行的内存地址是否不同。

```rust
    let main = builder.main();
        let local = main.row_slice(0);
        let local: &MemoryInitCols<AB::Var> = (*local).borrow();

// Verify that is_initialize and is_finalize are bool and that at most one is true.
        builder.assert_bool(local.is_initialize);
        builder.assert_bool(local.is_finalize);
        builder.assert_bool(local.is_initialize + local.is_finalize);

builder.send(AirInteraction::new(
            vec![
                local.timestamp.into(),
                local.addr.into(),
                local.value[0].into(),
                local.value[1].into(),
                local.value[2].into(),
                local.value[3].into(),
            ],
            local.is_initialize.into(),
            InteractionKind::Memory,
        ));
        builder.receive(AirInteraction::new(
            vec![
                local.timestamp.into(),
                local.addr.into(),
                local.value[0].into(),
                local.value[1].into(),
                local.value[2].into(),
                local.value[3].into(),
            ],
            local.is_finalize.into(),
            InteractionKind::Memory,
        ));
    }
```

这允许出现意外行为 - 允许读取 `prev_value` 的意外值。

我们用一个例子来说明这一点。将固定地址的内存写入为 `(value, timestamp)`。

- 初始化 `(5, 0)` 和 `(7, 0)`
- 在时钟 1，从时钟 0 读取先前值 5，然后变为 `(5, 0) -> (8, 1)`
- 在时钟 2，从时钟 0 读取先前值 7，然后变为 `(7, 0) -> (9, 2)`
- 在时钟 3，从时钟 1 读取先前值 8，然后变为 `(8, 1) -> (10, 3)`
- 在时钟 4，从时钟 2 读取先前值 9，然后变为 `(9, 2) -> (11, 4)`
- 完成 `(10, 3)` 和 `(11, 4)`

这允许我们读取不正确的先前值 - 在时钟 2，直观的先前值是 8，但读取的是 7。请注意，此攻击在每个时钟最多进行一次内存访问时有效。

应该强制每一行的地址都不同 - 这可以通过约束地址通过按位分解在行上增加来实现。请注意，仅在初始化阶段添加此检查就足以解决此问题。

此外，验证者需要强制仅存在一个 `MemoryGlobalChip` 表，以便不能使用多个表进行双重初始化/完成。

由于在 Core VM 中发现了类似的漏洞，因此可以在此处应用类似的防御措施。

#### 修复说明

如推荐的那样，已在 [此拉取请求](https://github.com/succinctlabs/sp1/pull/934/files) 中修复此问题。

### 4. [严重] 递归 VM 的 CPU 不约束在 LOAD 操作码中加载到寄存器的内存

对于加载操作码，CPU 应该从内存加载值并将其放入与 `local.a` 对应的寄存器中。但是，加载操作码的唯一检查是内存值在操作码之后不会改变。这意味着没有检查寄存器中的新值是否是来自内存的值本身，从而允许任意值。

```rust
    // Constraints on the memory column depending on load or store.
    // We read from memory when it is a load.
    builder.when(local.selectors.is_load).assert_block_eq(
        *memory_cols.memory.prev_value(),
        *memory_cols.memory.value(),
    );
    // When there is a store, we ensure that we are writing the value of the a operand to the memory.
    builder
        .when(local.selectors.is_store)
        .assert_block_eq(*local.a.value(), *memory_cols.memory.value());
```

我们建议也为 `is_load` 为真的情况添加检查 `a` 的值是否为内存的值。更改上面代码中第二个检查的选择器就足够了。

#### 修复说明

如 [拉取请求 #789](https://github.com/succinctlabs/sp1/pull/789/files#diff-bb73fc6e2645d5d723046d8384e5f7eb411486dec93a6b73ed9233981d42dd8a) 中所建议的那样添加了推荐的修复。

### 5. [高] 对于跳转操作码，递归 VM 的 `op_a` 寄存器值和 `fp` 值受到约束不足

以下是跳转指令的代码。

```rust
    pub fn eval_jump<AB>(
        &self,
        builder: &mut AB,
        local: &CpuCols<AB::Var>,
        next: &CpuCols<AB::Var>,
        next_pc: &mut AB::Expr,
    ) where
        AB: SP1RecursionAirBuilder<F = F>,
    {
        // Verify the next row's fp.
        builder
            .when_first_row()
            .assert_eq(local.fp, F::from_canonical_usize(STACK_SIZE));
        let not_jump_instruction = AB::Expr::one() - self.is_jump_instruction::<AB>(local);
        let expected_next_fp = local.selectors.is_jal * (local.fp + local.c.value()[0])
            + local.selectors.is_jalr * local.a.value()[0]
            + not_jump_instruction * local.fp;
        builder
            .when_transition()
            .when(next.is_real)
            .assert_eq(next.fp, expected_next_fp);

// Add to the `next_pc` expression.
        *next_pc += local.selectors.is_jal * (local.pc + local.b.value()[0]);
        *next_pc += local.selectors.is_jalr * local.b.value()[0];
    }
```

将其与递归 VM 的运行时行为进行比较。

```rust
    Opcode::JAL => {
        self.nb_branch_ops += 1;
        let (a_ptr, b_val, c_offset) = self.alu_rr(&instruction);
        let a_val = Block::from(self.pc);
        self.mw_cpu(a_ptr, a_val, MemoryAccessPosition::A);
        next_pc = self.pc + b_val[0];
        self.fp += c_offset[0];
        (a, b, c) = (a_val, b_val, c_offset);
    }
    Opcode::JALR => {
        self.nb_branch_ops += 1;
        let (a_ptr, b_val, c_val) = self.alu_rr(&instruction);
        let a_val = Block::from(self.pc + F::one());
        self.mw_cpu(a_ptr, a_val, MemoryAccessPosition::A);
        next_pc = b_val[0];
        self.fp = c_val[0];
        (a, b, c) = (a_val, b_val, c_val);
    }
```

我们看到以下差异，应相应地进行修复。

- `a` 值应使用与 `pc` 相关的相关值进行设置
- JALR 的预期下一个 `fp` 值为 `c_val[0]`，而不是 `a_val[0]`

#### 修复说明

这两个差异已在 [拉取请求 #789](https://github.com/succinctlabs/sp1/pull/789/files#diff-3283eead4e1efe6677ac4a7eecfdbfc734892d13fb4a0f50306e469e760763b8) 中修复。

### 6. [中] 递归 VM 的 `BNEINC` 操作码对新写入寄存器 `a` 的值约束不足

```rust
    // If the instruction is a BNEINC, verify that the a value is incremented by one.
    builder
        .when(local.is_real)
        .when(local.selectors.is_bneinc)
        .assert_eq(local.a.value()[0], local.a.prev_value()[0] + one.clone());
```

`BNEINC` 递增寄存器 `a` 中的值，并根据 `a` 中的新值是否与 `op_b_value` 匹配进行分支。但是，请注意，在上面的代码中仅约束了 `a.value()[0]` 部分。缺少一个约束，即 `a.value()[1..4]` 等于 `a.prev_value()[1..4]`，应将其添加到电路中。

我们还注意到关于 `BNEINC` 的注释不正确 - 实际上，操作码使用寄存器 `a` 中的**新**值，而不是如下面注释中所述的**先前**值。

```rust
    // Convert operand values from Block<Var> to BinomialExtension<Expr>.  Note that it gets the
    // previous value of the `a` and `b` operands, since BNENIC will modify `a`.
    let a_ext: BinomialExtension<AB::Expr> =
        BinomialExtensionUtils::from_block(local.a.value().map(|x| x.into()));
    let b_ext: BinomialExtension<AB::Expr> =
        BinomialExtensionUtils::from_block(local.b.value().map(|x| x.into()));
```

我们建议如上所述添加约束，并修复注释。

#### 修复说明

如建议的那样，已在 [拉取请求 #789](https://github.com/succinctlabs/sp1/pull/789/files#diff-061f406c5a5678477225937d3b2206499c0c76eaafd5ffa11446a2e715e53130) 中修复此问题。

### 7. [严重] 当 `CpuChip` 中的 `is_real` 为零时，应约束所有选择器为零且 `imm_b = imm_c = 1`

请注意，预处理的程序信息是从 `ProgramChip` 中获取的，在 `CpuChip` 中具有多重性 `is_real`。这意味着在 `is_real = 0` 时，该行的程序信息可以是任意的，并且未进行预处理。

```rust
    // Constrain the program.
    builder.send_program(local.pc, local.instruction, local.selectors, local.is_real);
```

因此，在这种情况下，应避免将任何系统调用或查找作为交互发送出去。通过强制所有选择器都为零，并且 `imm_b` 和 `imm_c` 等于 1，可以很容易地实现这一点。这将设置当 `is_real = 0` 时，所有内存访问和查找都以多重性零进行，如预期的那样。请注意，如果没有此附加约束，我们可以使用 `is_real = 0` 的行在任意时钟上进行任意内存访问。

#### 修复说明

已在 [此处](https://github.com/succinctlabs/sp1/pull/937/files#diff-6e5f27c21716211dcefc022f5bdd52bf1b41eb4c8a8129dbb01086709838db00R41-R58) 添加。请注意，`selector` 的 `is_noop` 已被启用为真，这是正确的。

### 8. [信息性] ALU 除法允许 `0/0` 为任意值

```rust
    // For div operation, we assert that b == a * c (equivalent to a == b / c).
    builder
        .when(local.selectors.is_div)
        .assert_ext_eq(b_ext, a_ext * c_ext);
```

为了检查 `a == b / c`，电路只会检查 `b == a * c`。这在通常情况下有效，但这允许当 `b = c = 0` 时，`a` 为任意值。虽然这本身不是一个严重的漏洞，但递归 VM 的所有用例都应该意识到这种行为。

#### 修复说明

作者通过在 [此处](https://github.com/succinctlabs/sp1/pull/937/files#diff-0f581efab1754ec2057245490521f6cc8a397f43a1fce4c7d8eab1a797fe57ce) 添加注释来表示已确认。

### 9. [中] 某些选择器未在 `is_op_a_read_only_instruction` 中考虑

某些指令仅将寄存器 `op_a` 用作只读，而某些指令则使用它来写入寄存器。只读行为检查如下。

```rust
    // If the instruction only reads from operand A, then verify that previous and current values are equal.
    let is_op_a_read_only = self.is_op_a_read_only_instruction::<AB>(local);
    builder
        .when(is_op_a_read_only)
        .assert_block_eq(*local.a.prev_value(), *local.a.value());
```

我们快速查看每个指令。

- 所有 ALU 显然都写入 `op_a`，因此它们不是只读的
- 对于分支，`BEQ` 和 `BNE` 是只读的，但 `BNEINC` 递增 `op_a`
- 堆扩展与 ADD 指令一起完成，因此不是只读的
- 对于跳转，指令将与 `pc` 相关的值写入 `op_a`
- 对于内存，`LOAD` 将内存值写入寄存器，但 `STORE` 将寄存器值写入内存，因此 `STORE` 是只读的，而 `LOAD` 不是只读的
- 系统调用使用 `op_a` 作为只读
- `COMMIT`、`TRAP`、`HALT` 应该是只读的

因此，`op_a` 只读指令是 `BEQ`、`BNE`、`STORE`、所有系统调用、`COMMIT`、`TRAP`、`HALT`。但是，似乎缺少了新添加的指令。

```rust
    /// Expr to check for instructions that only read from operand `a`.
    pub fn is_op_a_read_only_instruction<AB>(&self, local: &CpuCols<AB::Var>) -> AB::Expr
    where
        AB: SP1RecursionAirBuilder<F = F>,
    {
        local.selectors.is_beq
            + local.selectors.is_bne
            + local.selectors.is_fri_fold
            + local.selectors.is_poseidon
            + local.selectors.is_store
            + local.selectors.is_noop
            + local.selectors.is_ext_to_felt
    }
```

通常，我们建议查看新添加的指令的只读属性。

#### 修复说明

已在 [此处](https://github.com/succinctlabs/sp1/pull/937/files#diff-6e5f27c21716211dcefc022f5bdd52bf1b41eb4c8a8129dbb01086709838db00R231-R234) 的 `is_op_a_read_only_instruction` 中添加了新指令。

### 10. [高] `clk` 和 `pc` 未在 `CpuChip` 中初始化

Recursion VM `CpuChip` 中未约束 `clk` 和 `pc` 的第一行。这与例如 Recursion VM 的 `fp` 不同，后者在 `air/jump.rs` 中受到约束。

```rust
 // Verify the next row's fp.
    builder
        .when_first_row()
        .assert_eq(local.fp, F::from_canonical_usize(STACK_SIZE));
```

此外，请注意 `clk` 和 `pc` 通过以下方式在 Core VM 中受到约束

- 对于 `clk`，明确的是第一行的 `clk == 0`
- 对于 `pc`，约束是第一行将公共输入的 `pc` 作为程序计数器

我们建议为 `clk` 和 `pc` 的第一行添加适当的约束。

#### 修复说明

现在，初始 `clk` 和 `pc` 都被约束为零 [此处](https://github.com/succinctlabs/sp1/pull/937/files#diff-6e5f27c21716211dcefc022f5bdd52bf1b41eb4c8a8129dbb01086709838db00R61-R62)。

### 11. [高] MultiBuilder 对堆叠表的首行和末行的处理不正确，导致潜在的健全性中断

`MultiChip` 旨在在同一个表中“堆叠” `FriFoldChip` 和 `Poseidon2Chip`。为此，它有两个布尔列 `is_fri_fold` 和 `is_poseidon2`，表示该行是否属于 `FriFold` 块或 `Poseidon2` 块。之后，它约束这些块依次为 `FriFold` 块，然后是 `Poseidon2` 块，最后是一个仅为填充的块。`FriFoldChip` 的约束系统通过 `MultiBuilder` 调用，`MultiBuilder` 使用 `is_fri_fold` 作为当前行是否为要约束的“实际行”。此外，对内存访问和系统调用读取进行修改，使其仅在启用 `is_fri_fold` 时才启用。对 `Poseidon2Chip` 使用了类似的策略。

```rust
    let mut sub_builder =
            MultiBuilder::new(builder, local.is_fri_fold.into(), next.is_fri_fold.into());
    let fri_columns_local = local.fri_fold();
    sub_builder.assert_eq(
        local.is_fri_fold * FriFoldChip::<3>::do_memory_access::<AB::Var>(fri_columns_local),
        local.fri_fold_memory_access,
    );
    sub_builder.assert_eq(
        local.is_fri_fold * FriFoldChip::<3>::do_receive_table::<AB::Var>(fri_columns_local),
        local.fri_fold_receive_table,
    );
    let fri_fold_chip = FriFoldChip::<3>::default();
    fri_fold_chip.eval_fri_fold(
        &mut sub_builder,
        local.fri_fold(),
        next.fri_fold(),
        local.fri_fold_receive_table,
        local.fri_fold_memory_access,
    );

// similar for Poseidon2...
```

`MultiBuilder` 的工作方式如下。构建器接收一个 `local_condition`，指示当前行是否为“真实”，以及一个 `next_condition`，指示下一行是否为“真实”。然后，

- `is_first/last_row`: 使用与通用构建器相同的 `is_first/last_row`，并且必须启用 `local_condition` 才能实际放置约束
- `is_transition_window`: 使用与通用构建器相同的 `is_transition_window`，并且必须启用 `local_condition` 和 `next_condition` 才能放置约束
- `assert_zero`: 需要 `local_condition` 为真才能放置约束

在这里，虽然大多数约束都可以，但问题出现在 `is_first/last_row` 的情况下。例如，考虑 `Poseidon2Chip` 的第一行。在这里，`Poseidon2Chip` 可能对第一行有约束（实际上，在修复漏洞 #1 之后，确实如此）以进行初始化。但是，当通过 `MultiChip` 中的 `MultiBuilder` 进行约束时，可能不会放置此约束。实际上，如果顶部有一个非空的 `FriFoldChip` 堆栈，则第一行的条件将不会放置，因为 `is_poseidon2` 在第一行将为零。

这与预期行为不同，预期行为是 `Poseidon2Chip` 的 `is_first_row` 约束适用于 `Poseidon2Chip` 堆栈的第一行。 `FriFoldChip` 堆栈的 `is_last_row` 也适用类似的思想 - 这样的约束将需要该行是整个表的最后一行才能放置。

我们建议

- 删除 `MultiChip` 架构，仅单独使用两个表
- 在 `MultiChip` 中添加更多列以直接约束正确的堆叠
- 让 `is_first_row`、`is_last_row` 显式发送到 `MultiBuilder`

对于第三种情况，我们建议发送以下值。

`FriFoldChip`

- `is_first_row`: `is_first_row * is_fri_fold`
- `is_last_row`
  - 情况 1：我们在过渡窗口中，下一行不是 `FriFold`
    - `is_fri_fold * (1 - next.is_fri_fold)`
  - 情况 2：我们在最后一行，这是 `FriFold`
    - `is_fri_fold`

`Poseidon2Chip`

- `is_first_row`: 这一项非常棘手，因为你需要以某种方式访问前一行
  - 为此创建一个新列，称其为 `start`。按如下方式约束它
  - 在第一行，`start == is_poseidon2`
  - 在过渡窗口中，`next.start == is_fri_fold * next.is_poseidon2`
- `is_last_row`: 与 `FriFoldChip` 的 `is_last_row` 类似

#### 修复说明

这些修复已按建议在 [拉取请求 #997](https://github.com/succinctlabs/sp1/pull/997) 中实现。

### 12. [高] `FriFoldChip` 允许不正确的行为，因为系统调用读取和内存访问未正确连接

`FriFoldChip` 处理需要可变行数的系统调用。为了处理这个问题，使用了两列，我们将其表示为 `is_real` 和 `is_last_iteration`。我们注意到，在修复漏洞 #11 的情况下，如果在 `FriFoldChip` 堆栈上，则 `memory_access` 等于 `is_real` 且 `receive_table` 等于 `is_last_iteration`。此外，如果我们不在 `FriFoldChip` 堆栈上，则 `memory_access` 和 `receive_table` 都将被强制为零，因此不会执行任何系统调用或内存访问，正如我们所希望的那样。在这里，我们假设漏洞 #11 已修复。

```rust
    pub const fn do_receive_table<T: Copy>(local```
// 确保 FRI FOLD 调用的所有行都具有相同的 input_ptr、clk 和 sequential m 值。
    builder
        .when_transition()
        .when_not(local.is_last_iteration)
        .when(next.is_real)
        .assert_eq(next.m, local.m + AB::Expr::one());
    builder
        .when_transition()
        .when_not(local.is_last_iteration)
        .when(next.is_real)
        .assert_eq(local.input_ptr, next.input_ptr);
    builder
        .when_transition()
        .when_not(local.is_last_iteration)
        .when(next.is_real)
        .assert_eq(local.clk + AB::Expr::one(), next.clk);
```

我们在下面概述了各种攻击思路，并给出了具体的防御约束列表。

首先，存在实际上没有读取任何 syscall 但正在进行内存访问的情况。可以设置 `is_last_iteration` 为零，这样就不会读取任何 syscall，然后以不正确的方式放置 `is_real` 值和其他列值。在这里，还可以注意到，实际上没有对 `is_real` 施加任何约束，因此该值可以在零和一之间任意变化。

也可以从不同的角度来看待这个问题——请注意，这也可以看作是以下事实：没有检查一个所有 `is_real = 1` 的正在进行的块是否会在某个时候被最终确定。

此外，还存在这样一种情况：`is_last_iteration` 确实是 `1`，但 `is_real` 只是零——因此实际上没有进行内存访问。我们再次注意到 `is_real` 没有受到约束。

为了解决这个问题，目标应该是：

- `is_last_iteration == 1` 应该让所有相关行都具有 `is_real == 1`
- 具有 `is_last_iteration == 1` 的块内的任何行都不应具有 `is_real == 1`

为此，我们建议添加以下约束。

- #1: `local.is_last_iteration`、`local.is_real` 都是布尔值
- #2: `local.is_last_iteration == 0` 意味着 `local.is_real == next.is_real`
- #3: `local.is_last_iteration == 1` 意味着 `local.is_real == 1`
- #4: `local.is_real == 0` 意味着 `next.is_real == 0`
- #5: 在最后一行，我们有 `local.is_real == 0` 或 `is_last_iteration == 1`

我们注意到 #1 已经完成 (请注意，由于 issue #2 的修复，`is_real` 是布尔值)。

基本上，这个想法是让所有的 `is_real = 1` 行像往常一样在顶部，通过约束 #4 完成。
首先，我们必须约束 `is_last_iteration` 在底部的填充行中为零。这是通过约束 #3 完成的。现在，剩下的就是检查在最后的非填充行上，`is_last_iteration` 必须为真。如果这个最后的非填充行不是最后一行，约束 #2 足以约束这个事实。如果最后的非填充行只是最后一行，那么约束 #5 足以约束这个事实。完整性是相似的。

#### 修复说明

正如推荐的那样，这已在 [pull request #946](https://github.com/succinctlabs/sp1/pull/946) 中修复。

>- 原文链接： [github.com/succinctlabs/...](https://github.com/succinctlabs/sp1/blob/dev/audits/kalos.md)
>- 登链社区 AI 助手，为大家转译优秀英文文章，如有翻译不通的地方，还请包涵～

SP1 审计报告 - 递归 VM

由 rkm0959 审计。由 KALOS 发布报告。

概要

提交审计的代码库

https://github.com/succinctlabs/sp1

对于电路，我们审计 eval() 函数内部的 AIR 约束。

core/src

alu - add_sub, bitwise, divrem, lt, mul, sll, sr
bytes - air.rs, columns.rs
cpu: air 子文件夹, columns 子文件夹
memory: global.rs
operations: 范围内文件夹中的所有 AIR
program: 范围内文件夹中的所有 AIR
syscall: precompiles/{edwards, keccak256, sha256, weierstrass, uint256} 中的所有 AIR
air: builder.rs, word.rs, polynomial.rs, extension.rs, subbuilder.rs
lookup: builder.rs
utils: ec 子文件夹, buffer.rs
stark: verifier.rs, folder.rs, machine.rs, permutation.rs, chip.rs, prover.rs, air.rs
machine derive macro: https://github.com/succinctlabs/sp1/blob/main/derive/src/lib.rs

recursion/core/src

cpu: 范围内文件夹中的所有 AIR
memory: 范围内文件夹中的所有 AIR
program: 范围内文件夹中的所有 AIR
poseidon2: 范围内的所有 AIR
poseidon2_wide: 范围内的所有 AIR
fri_fold 范围内的所有 AIR
multi: 范围内的所有 AIR
range check: 范围内的所有 AIR
air 文件夹
stark 文件夹

zkvm

zkvm/precompiles/src/secp256k1.rs
zkvm/precompiles/src/io.rs
zkvm/entrypoint/src/syscalls/ed25519.rs
zkvm/entrypoint/src/syscalls/secp2561k1.rs
zkvm/entrypoint/src/syscalls/sha_compress.rs
zkvm/entrypoint/src/syscalls/io.rs

此审计报告处理的是审计的 recursion/core/src 部分。 最终提交的哈希值为 22f51bb8e1f343661c1a54140401a7cb3e365928，位于 dev 分支上。

审计时间表

2024/04/15 - 2024/05/31 (6 工程师周)

发现

1. [严重] `poseidon2/external` 允许在任意位置进行内存写入，并且哈希值也受到约束不足

Poseidon2Chip 在电路中评估 Poseidon2 哈希，使用 SBOX $x^7$，8 轮外部轮和 13 轮内部轮。代码库中有几个约束不足的地方，这会导致哈希函数评估和总体内存状态的健全性受到破坏。

首先，local.rounds 受到约束不足。目前唯一的约束是它们是布尔值，并且最多只能有一个为真。这些应以类似于 Plonky3 的 keccak 轮标志的方式处理，跟踪 24 周期。避免此处出现漏洞 #11。

is_external_layer 的计算不正确。迭代的范围必须是 [2, rounds_p_beginning) 和 [rounds_p_end, 2 + rounds_p + rounds_f)。

    // First half of the external rounds.
    let mut is_external_layer = (2..rounds_p_beginning)
        .map(|i| local.rounds[i].into())
        .sum::&lt;AB::Expr>();

    // Second half of the external rounds.
    is_external_layer += (rounds_p_end..rounds_p + rounds_f)
        .map(|i| local.rounds[i].into())
        .sum::&lt;AB::Expr>();

从 left_input 和 right_input 的内存读取没有约束内存值不发生变化。这允许使用任意值覆盖内存。

系统调用在 local.rounds[0] 为真的行接收，因此 dst_input 的基础真实来源应在该行。但是，在 eval_mem 中，实际用于写入哈希值的地址是 is_memory_write = local.rounds[23] 为真的行中的 dst_input 值。由于没有检查 dst_input 在 24 周期内是否相等，因此可以忽略实际的系统调用，在任意位置写入哈希值。

我们建议检查 clk, dst_input, left_input, right_input 在 24 周期内是否相等。一般来说，一个好的注意点是仅在实际读取系统调用时才进行内存访问，并约束周期标志，而不管 is_real 如何。

修复说明

此漏洞已在 PR #747、PR #821 的此提交、PR #821 的此提交和 PR #821 的此提交中修复。我们在下面总结了修复，回顾每个漏洞。

is_external_layer 的计算已修复。此外，无论 is_real 如何，行 24 周期都会受到约束，从 rounds[0] 为真开始，并且每行移动一格。此外，clk, dst_input, left_input, right_input 在 24 周期内保持相等。

is_real 的值在 24 周期内保持相等。此外，通过修复漏洞 #19，recursion_eval_memory_access_single 中的检查足以约束 is_real 为布尔值。当 is_real 为零时，do_memory 和 do_receive 都为零，导致不执行表接收和内存访问。在 is_real = 1 时，电路会约束。

此外，在 is_memory_read（即 rounds[0]）时，内存访问被约束为只读。

2. [低] `recursion` 的 `eval_memory_access` 应约束 `is_real` 为布尔值，就像在 `core` 中所做的那样

虽然 core 的 eval_memory_access 检查多重性是否为布尔值，但 recursion 的 recursion_eval_memory_access 中不存在此检查。为了保持一致性，除非有特殊原因，否则我们建议将此布尔检查添加到 recursion 中。

    // core
     fn eval_memory_access&lt;E: Into&lt;Self::Expr> + Clone>(
        &mut self,
        shard: impl Into&lt;Self::Expr> + Clone>,
        clk: impl Into&lt;Self::Expr>,
        addr: impl Into&lt;Self::Expr>,
        memory_access: &impl MemoryCols&lt;E>,
        do_check: impl Into&lt;Self::Expr>,
    ) {
        let do_check: Self::Expr = do_check.into();
        let shard: Self::Expr = shard.into();
        let clk: Self::Expr = clk.into();
        let mem_access = memory_access.access();

        self.assert_bool(do_check.clone());

        // Verify that the current memory access time is greater than the previous's.
        self.eval_memory_access_timestamp(mem_access, do_check.clone(), shard.clone(), clk.clone());

        // Add to the memory argument.
        let addr = addr.into();
        let prev_shard = mem_access.prev_shard.clone().into();
        let prev_clk = mem_access.prev_clk.clone().into();
        let prev_values = once(prev_shard)
            .chain(once(prev_clk))
            .chain(once(addr.clone()))
            .chain(memory_access.prev_value().clone().map(Into::into))
            .collect();
        let current_values = once(shard)
            .chain(once(clk))
            .chain(once(addr.clone()))
            .chain(memory_access.value().clone().map(Into::into))
            .collect();

        // The previous values get sent with multiplicity = 1, for "read".
        self.send(AirInteraction::new(
            prev_values,
            do_check.clone(),
            InteractionKind::Memory,
        ));

        // The current values get "received", i.e. multiplicity = -1
        self.receive(AirInteraction::new(
            current_values,
            do_check.clone(),
            InteractionKind::Memory,
        ));
    }

    // recursion
    fn recursion_eval_memory_access&lt;E: Into&lt;Self::Expr> + Clone>(
        &mut self,
        timestamp: impl Into&lt;Self::Expr>,
        addr: impl Into&lt;Self::Expr>,
        memory_access: &impl MemoryCols&lt;E, Block&lt;E>>,
        is_real: impl Into&lt;Self::Expr>,
    ) {
        let is_real: Self::Expr = is_real.into();
        let timestamp: Self::Expr = timestamp.into();
        let mem_access = memory_access.access();

        self.eval_memory_access_timestamp(timestamp.clone(), mem_access, is_real.clone());

        let addr = addr.into();
        let prev_timestamp = mem_access.prev_timestamp.clone().into();
        let prev_values = once(prev_timestamp)
            .chain(once(addr.clone()))
            .chain(memory_access.prev_value().clone().map(Into::into))
            .collect();
        let current_values = once(timestamp)
            .chain(once(addr.clone()))
            .chain(memory_access.value().clone().map(Into::into))
            .collect();

        self.receive(AirInteraction::new(
            prev_values,
            is_real.clone(),
            InteractionKind::Memory,
        ));
        self.send(AirInteraction::new(
            current_values,
            is_real,
            InteractionKind::Memory,
        ));
    }

修复说明

如 PR #789 中所建议的那样修复此问题。

3. [高] `MemoryGlobalChip` 允许为一个内存地址进行多次初始化，打破了内存参数

MemoryGlobalChip 允许初始化和完成内存。在这里，没有检查每一行的内存地址是否不同。

    let main = builder.main();
        let local = main.row_slice(0);
        let local: &MemoryInitCols&lt;AB::Var> = (*local).borrow();

        // Verify that is_initialize and is_finalize are bool and that at most one is true.
        builder.assert_bool(local.is_initialize);
        builder.assert_bool(local.is_finalize);
        builder.assert_bool(local.is_initialize + local.is_finalize);

        builder.send(AirInteraction::new(
            vec![
                local.timestamp.into(),
                local.addr.into(),
                local.value[0].into(),
                local.value[1].into(),
                local.value[2].into(),
                local.value[3].into(),
            ],
            local.is_initialize.into(),
            InteractionKind::Memory,
        ));
        builder.receive(AirInteraction::new(
            vec![
                local.timestamp.into(),
                local.addr.into(),
                local.value[0].into(),
                local.value[1].into(),
                local.value[2].into(),
                local.value[3].into(),
            ],
            local.is_finalize.into(),
            InteractionKind::Memory,
        ));
    }

这允许出现意外行为 - 允许读取 prev_value 的意外值。

我们用一个例子来说明这一点。将固定地址的内存写入为 (value, timestamp)。

初始化 (5, 0) 和 (7, 0)
在时钟 1，从时钟 0 读取先前值 5，然后变为 (5, 0) -> (8, 1)
在时钟 2，从时钟 0 读取先前值 7，然后变为 (7, 0) -> (9, 2)
在时钟 3，从时钟 1 读取先前值 8，然后变为 (8, 1) -> (10, 3)
在时钟 4，从时钟 2 读取先前值 9，然后变为 (9, 2) -> (11, 4)
完成 (10, 3) 和 (11, 4)

这允许我们读取不正确的先前值 - 在时钟 2，直观的先前值是 8，但读取的是 7。请注意，此攻击在每个时钟最多进行一次内存访问时有效。

应该强制每一行的地址都不同 - 这可以通过约束地址通过按位分解在行上增加来实现。请注意，仅在初始化阶段添加此检查就足以解决此问题。

此外，验证者需要强制仅存在一个 MemoryGlobalChip 表，以便不能使用多个表进行双重初始化/完成。

由于在 Core VM 中发现了类似的漏洞，因此可以在此处应用类似的防御措施。

修复说明

如推荐的那样，已在此拉取请求中修复此问题。

4. [严重] 递归 VM 的 CPU 不约束在 LOAD 操作码中加载到寄存器的内存

对于加载操作码，CPU 应该从内存加载值并将其放入与 local.a 对应的寄存器中。但是，加载操作码的唯一检查是内存值在操作码之后不会改变。这意味着没有检查寄存器中的新值是否是来自内存的值本身，从而允许任意值。

    // Constraints on the memory column depending on load or store.
    // We read from memory when it is a load.
    builder.when(local.selectors.is_load).assert_block_eq(
        *memory_cols.memory.prev_value(),
        *memory_cols.memory.value(),
    );
    // When there is a store, we ensure that we are writing the value of the a operand to the memory.
    builder
        .when(local.selectors.is_store)
        .assert_block_eq(*local.a.value(), *memory_cols.memory.value());

我们建议也为 is_load 为真的情况添加检查 a 的值是否为内存的值。更改上面代码中第二个检查的选择器就足够了。

修复说明

如拉取请求 #789 中所建议的那样添加了推荐的修复。

5. [高] 对于跳转操作码，递归 VM 的 `op_a` 寄存器值和 `fp` 值受到约束不足

以下是跳转指令的代码。

    pub fn eval_jump&lt;AB>(
        &self,
        builder: &mut AB,
        local: &CpuCols&lt;AB::Var>,
        next: &CpuCols&lt;AB::Var>,
        next_pc: &mut AB::Expr,
    ) where
        AB: SP1RecursionAirBuilder&lt;F = F>,
    {
        // Verify the next row's fp.
        builder
            .when_first_row()
            .assert_eq(local.fp, F::from_canonical_usize(STACK_SIZE));
        let not_jump_instruction = AB::Expr::one() - self.is_jump_instruction::&lt;AB>(local);
        let expected_next_fp = local.selectors.is_jal * (local.fp + local.c.value()[0])
            + local.selectors.is_jalr * local.a.value()[0]
            + not_jump_instruction * local.fp;
        builder
            .when_transition()
            .when(next.is_real)
            .assert_eq(next.fp, expected_next_fp);

        // Add to the `next_pc` expression.
        *next_pc += local.selectors.is_jal * (local.pc + local.b.value()[0]);
        *next_pc += local.selectors.is_jalr * local.b.value()[0];
    }

将其与递归 VM 的运行时行为进行比较。

    Opcode::JAL => {
        self.nb_branch_ops += 1;
        let (a_ptr, b_val, c_offset) = self.alu_rr(&instruction);
        let a_val = Block::from(self.pc);
        self.mw_cpu(a_ptr, a_val, MemoryAccessPosition::A);
        next_pc = self.pc + b_val[0];
        self.fp += c_offset[0];
        (a, b, c) = (a_val, b_val, c_offset);
    }
    Opcode::JALR => {
        self.nb_branch_ops += 1;
        let (a_ptr, b_val, c_val) = self.alu_rr(&instruction);
        let a_val = Block::from(self.pc + F::one());
        self.mw_cpu(a_ptr, a_val, MemoryAccessPosition::A);
        next_pc = b_val[0];
        self.fp = c_val[0];
        (a, b, c) = (a_val, b_val, c_val);
    }

我们看到以下差异，应相应地进行修复。

a 值应使用与 pc 相关的相关值进行设置
JALR 的预期下一个 fp 值为 c_val[0]，而不是 a_val[0]

修复说明

这两个差异已在拉取请求 #789 中修复。

6. [中] 递归 VM 的 `BNEINC` 操作码对新写入寄存器 `a` 的值约束不足

    // If the instruction is a BNEINC, verify that the a value is incremented by one.
    builder
        .when(local.is_real)
        .when(local.selectors.is_bneinc)
        .assert_eq(local.a.value()[0], local.a.prev_value()[0] + one.clone());

BNEINC 递增寄存器 a 中的值，并根据 a 中的新值是否与 op_b_value 匹配进行分支。但是，请注意，在上面的代码中仅约束了 a.value()[0] 部分。缺少一个约束，即 a.value()[1..4] 等于 a.prev_value()[1..4]，应将其添加到电路中。

我们还注意到关于 BNEINC 的注释不正确 - 实际上，操作码使用寄存器 a 中的新值，而不是如下面注释中所述的先前值。

    // Convert operand values from Block&lt;Var> to BinomialExtension&lt;Expr>.  Note that it gets the
    // previous value of the `a` and `b` operands, since BNENIC will modify `a`.
    let a_ext: BinomialExtension&lt;AB::Expr> =
        BinomialExtensionUtils::from_block(local.a.value().map(|x| x.into()));
    let b_ext: BinomialExtension&lt;AB::Expr> =
        BinomialExtensionUtils::from_block(local.b.value().map(|x| x.into()));

我们建议如上所述添加约束，并修复注释。

修复说明

如建议的那样，已在拉取请求 #789 中修复此问题。

7. [严重] 当 `CpuChip` 中的 `is_real` 为零时，应约束所有选择器为零且 `imm_b = imm_c = 1`

请注意，预处理的程序信息是从 ProgramChip 中获取的，在 CpuChip 中具有多重性 is_real。这意味着在 is_real = 0 时，该行的程序信息可以是任意的，并且未进行预处理。

    // Constrain the program.
    builder.send_program(local.pc, local.instruction, local.selectors, local.is_real);

因此，在这种情况下，应避免将任何系统调用或查找作为交互发送出去。通过强制所有选择器都为零，并且 imm_b 和 imm_c 等于 1，可以很容易地实现这一点。这将设置当 is_real = 0 时，所有内存访问和查找都以多重性零进行，如预期的那样。请注意，如果没有此附加约束，我们可以使用 is_real = 0 的行在任意时钟上进行任意内存访问。

修复说明

已在此处添加。请注意，selector 的 is_noop 已被启用为真，这是正确的。

8. [信息性] ALU 除法允许 `0/0` 为任意值

    // For div operation, we assert that b == a * c (equivalent to a == b / c).
    builder
        .when(local.selectors.is_div)
        .assert_ext_eq(b_ext, a_ext * c_ext);

为了检查 a == b / c，电路只会检查 b == a * c。这在通常情况下有效，但这允许当 b = c = 0 时，a 为任意值。虽然这本身不是一个严重的漏洞，但递归 VM 的所有用例都应该意识到这种行为。

修复说明

作者通过在此处添加注释来表示已确认。

9. [中] 某些选择器未在 `is_op_a_read_only_instruction` 中考虑

某些指令仅将寄存器 op_a 用作只读，而某些指令则使用它来写入寄存器。只读行为检查如下。

    // If the instruction only reads from operand A, then verify that previous and current values are equal.
    let is_op_a_read_only = self.is_op_a_read_only_instruction::&lt;AB>(local);
    builder
        .when(is_op_a_read_only)
        .assert_block_eq(*local.a.prev_value(), *local.a.value());

我们快速查看每个指令。

所有 ALU 显然都写入 op_a，因此它们不是只读的
对于分支，BEQ 和 BNE 是只读的，但 BNEINC 递增 op_a
堆扩展与 ADD 指令一起完成，因此不是只读的
对于跳转，指令将与 pc 相关的值写入 op_a
对于内存，LOAD 将内存值写入寄存器，但 STORE 将寄存器值写入内存，因此 STORE 是只读的，而 LOAD 不是只读的
系统调用使用 op_a 作为只读
COMMIT、TRAP、HALT 应该是只读的

因此，op_a 只读指令是 BEQ、BNE、STORE、所有系统调用、COMMIT、TRAP、HALT。但是，似乎缺少了新添加的指令。

    /// Expr to check for instructions that only read from operand `a`.
    pub fn is_op_a_read_only_instruction&lt;AB>(&self, local: &CpuCols&lt;AB::Var>) -> AB::Expr
    where
        AB: SP1RecursionAirBuilder&lt;F = F>,
    {
        local.selectors.is_beq
            + local.selectors.is_bne
            + local.selectors.is_fri_fold
            + local.selectors.is_poseidon
            + local.selectors.is_store
            + local.selectors.is_noop
            + local.selectors.is_ext_to_felt
    }

通常，我们建议查看新添加的指令的只读属性。

修复说明

已在此处的 is_op_a_read_only_instruction 中添加了新指令。

10. [高] `clk` 和 `pc` 未在 `CpuChip` 中初始化

Recursion VM CpuChip 中未约束 clk 和 pc 的第一行。这与例如 Recursion VM 的 fp 不同，后者在 air/jump.rs 中受到约束。

 // Verify the next row's fp.
    builder
        .when_first_row()
        .assert_eq(local.fp, F::from_canonical_usize(STACK_SIZE));

此外，请注意 clk 和 pc 通过以下方式在 Core VM 中受到约束

对于 clk，明确的是第一行的 clk == 0
对于 pc，约束是第一行将公共输入的 pc 作为程序计数器

我们建议为 clk 和 pc 的第一行添加适当的约束。

修复说明

现在，初始 clk 和 pc 都被约束为零此处。

11. [高] MultiBuilder 对堆叠表的首行和末行的处理不正确，导致潜在的健全性中断

MultiChip 旨在在同一个表中“堆叠” FriFoldChip 和 Poseidon2Chip。为此，它有两个布尔列 is_fri_fold 和 is_poseidon2，表示该行是否属于 FriFold 块或 Poseidon2 块。之后，它约束这些块依次为 FriFold 块，然后是 Poseidon2 块，最后是一个仅为填充的块。FriFoldChip 的约束系统通过 MultiBuilder 调用，MultiBuilder 使用 is_fri_fold 作为当前行是否为要约束的“实际行”。此外，对内存访问和系统调用读取进行修改，使其仅在启用 is_fri_fold 时才启用。对 Poseidon2Chip 使用了类似的策略。

    let mut sub_builder =
            MultiBuilder::new(builder, local.is_fri_fold.into(), next.is_fri_fold.into());
    let fri_columns_local = local.fri_fold();
    sub_builder.assert_eq(
        local.is_fri_fold * FriFoldChip::&lt;3>::do_memory_access::&lt;AB::Var>(fri_columns_local),
        local.fri_fold_memory_access,
    );
    sub_builder.assert_eq(
        local.is_fri_fold * FriFoldChip::&lt;3>::do_receive_table::&lt;AB::Var>(fri_columns_local),
        local.fri_fold_receive_table,
    );
    let fri_fold_chip = FriFoldChip::&lt;3>::default();
    fri_fold_chip.eval_fri_fold(
        &mut sub_builder,
        local.fri_fold(),
        next.fri_fold(),
        local.fri_fold_receive_table,
        local.fri_fold_memory_access,
    );

    // similar for Poseidon2...

MultiBuilder 的工作方式如下。构建器接收一个 local_condition，指示当前行是否为“真实”，以及一个 next_condition，指示下一行是否为“真实”。然后，

is_first/last_row: 使用与通用构建器相同的 is_first/last_row，并且必须启用 local_condition 才能实际放置约束
is_transition_window: 使用与通用构建器相同的 is_transition_window，并且必须启用 local_condition 和 next_condition 才能放置约束
assert_zero: 需要 local_condition 为真才能放置约束

在这里，虽然大多数约束都可以，但问题出现在 is_first/last_row 的情况下。例如，考虑 Poseidon2Chip 的第一行。在这里，Poseidon2Chip 可能对第一行有约束（实际上，在修复漏洞 #1 之后，确实如此）以进行初始化。但是，当通过 MultiChip 中的 MultiBuilder 进行约束时，可能不会放置此约束。实际上，如果顶部有一个非空的 FriFoldChip 堆栈，则第一行的条件将不会放置，因为 is_poseidon2 在第一行将为零。

这与预期行为不同，预期行为是 Poseidon2Chip 的 is_first_row 约束适用于 Poseidon2Chip 堆栈的第一行。 FriFoldChip 堆栈的 is_last_row 也适用类似的思想 - 这样的约束将需要该行是整个表的最后一行才能放置。

我们建议

删除 MultiChip 架构，仅单独使用两个表
在 MultiChip 中添加更多列以直接约束正确的堆叠
让 is_first_row、is_last_row 显式发送到 MultiBuilder

对于第三种情况，我们建议发送以下值。

FriFoldChip

is_first_row: is_first_row * is_fri_fold
is_last_row
- 情况 1：我们在过渡窗口中，下一行不是 FriFold
- is_fri_fold * (1 - next.is_fri_fold)
- 情况 2：我们在最后一行，这是 FriFold
- is_fri_fold

Poseidon2Chip

is_first_row: 这一项非常棘手，因为你需要以某种方式访问前一行
- 为此创建一个新列，称其为 start。按如下方式约束它
- 在第一行，start == is_poseidon2
- 在过渡窗口中，next.start == is_fri_fold * next.is_poseidon2
is_last_row: 与 FriFoldChip 的 is_last_row 类似

修复说明

这些修复已按建议在拉取请求 #997 中实现。

12. [高] `FriFoldChip` 允许不正确的行为，因为系统调用读取和内存访问未正确连接

FriFoldChip 处理需要可变行数的系统调用。为了处理这个问题，使用了两列，我们将其表示为 is_real 和 is_last_iteration。我们注意到，在修复漏洞 #11 的情况下，如果在 FriFoldChip 堆栈上，则 memory_access 等于 is_real 且 receive_table 等于 is_last_iteration。此外，如果我们不在 FriFoldChip 堆栈上，则 memory_access 和 receive_table 都将被强制为零，因此不会执行任何系统调用或内存访问，正如我们所希望的那样。在这里，我们假设漏洞 #11 已修复。

    pub const fn do_receive_table&lt;T: Copy>(local```
// 确保 FRI FOLD 调用的所有行都具有相同的 input_ptr、clk 和 sequential m 值。
    builder
        .when_transition()
        .when_not(local.is_last_iteration)
        .when(next.is_real)
        .assert_eq(next.m, local.m + AB::Expr::one());
    builder
        .when_transition()
        .when_not(local.is_last_iteration)
        .when(next.is_real)
        .assert_eq(local.input_ptr, next.input_ptr);
    builder
        .when_transition()
        .when_not(local.is_last_iteration)
        .when(next.is_real)
        .assert_eq(local.clk + AB::Expr::one(), next.clk);

我们在下面概述了各种攻击思路，并给出了具体的防御约束列表。

首先，存在实际上没有读取任何 syscall 但正在进行内存访问的情况。可以设置 is_last_iteration 为零，这样就不会读取任何 syscall，然后以不正确的方式放置 is_real 值和其他列值。在这里，还可以注意到，实际上没有对 is_real 施加任何约束，因此该值可以在零和一之间任意变化。

也可以从不同的角度来看待这个问题——请注意，这也可以看作是以下事实：没有检查一个所有 is_real = 1 的正在进行的块是否会在某个时候被最终确定。

此外，还存在这样一种情况：is_last_iteration 确实是 1，但 is_real 只是零——因此实际上没有进行内存访问。我们再次注意到 is_real 没有受到约束。

为了解决这个问题，目标应该是：

is_last_iteration == 1 应该让所有相关行都具有 is_real == 1
具有 is_last_iteration == 1 的块内的任何行都不应具有 is_real == 1

为此，我们建议添加以下约束。

1: local.is_last_iteration、local.is_real 都是布尔值
2: local.is_last_iteration == 0 意味着 local.is_real == next.is_real
3: local.is_last_iteration == 1 意味着 local.is_real == 1
4: local.is_real == 0 意味着 next.is_real == 0
5: 在最后一行，我们有 local.is_real == 0 或 is_last_iteration == 1

我们注意到 #1 已经完成 (请注意，由于 issue #2 的修复，is_real 是布尔值)。

基本上，这个想法是让所有的 is_real = 1 行像往常一样在顶部，通过约束 #4 完成。首先，我们必须约束 is_last_iteration 在底部的填充行中为零。这是通过约束 #3 完成的。现在，剩下的就是检查在最后的非填充行上，is_last_iteration 必须为真。如果这个最后的非填充行不是最后一行，约束 #2 足以约束这个事实。如果最后的非填充行只是最后一行，那么约束 #5 足以约束这个事实。完整性是相似的。

修复说明

正如推荐的那样，这已在 pull request #946 中修复。

原文链接： github.com/succinctlabs/...

登链社区 AI 助手，为大家转译优秀英文文章，如有翻译不通的地方，还请包涵～

本文参与登链社区写作激励计划，好文好收益，欢迎正在阅读的你也加入。

SP1 审计报告 - 递归虚拟机

SP1 审计报告 - 递归 VM

概要

提交审计的代码库

审计时间表

发现

1. [严重] `poseidon2/external` 允许在任意位置进行内存写入，并且哈希值也受到约束不足

修复说明

2. [低] `recursion` 的 `eval_memory_access` 应约束 `is_real` 为布尔值，就像在 `core` 中所做的那样

修复说明

3. [高] `MemoryGlobalChip` 允许为一个内存地址进行多次初始化，打破了内存参数

修复说明

4. [严重] 递归 VM 的 CPU 不约束在 LOAD 操作码中加载到寄存器的内存

修复说明

5. [高] 对于跳转操作码，递归 VM 的 `op_a` 寄存器值和 `fp` 值受到约束不足

修复说明

6. [中] 递归 VM 的 `BNEINC` 操作码对新写入寄存器 `a` 的值约束不足

修复说明

7. [严重] 当 `CpuChip` 中的 `is_real` 为零时，应约束所有选择器为零且 `imm_b = imm_c = 1`

修复说明

8. [信息性] ALU 除法允许 `0/0` 为任意值

修复说明

9. [中] 某些选择器未在 `is_op_a_read_only_instruction` 中考虑

修复说明

10. [高] `clk` 和 `pc` 未在 `CpuChip` 中初始化

修复说明

11. [高] MultiBuilder 对堆叠表的首行和末行的处理不正确，导致潜在的健全性中断

修复说明

12. [高] `FriFoldChip` 允许不正确的行为，因为系统调用读取和内存访问未正确连接

1: `local.is_last_iteration`、`local.is_real` 都是布尔值

2: `local.is_last_iteration == 0` 意味着 `local.is_real == next.is_real`

3: `local.is_last_iteration == 1` 意味着 `local.is_real == 1`

4: `local.is_real == 0` 意味着 `next.is_real == 0`

5: 在最后一行，我们有 `local.is_real == 0` 或 `is_last_iteration == 1`

修复说明

0 条评论

文章目录

SP1 审计报告 - 递归虚拟机

SP1 审计报告 - 递归 VM

概要

提交审计的代码库

审计时间表

发现

1. [严重] poseidon2/external 允许在任意位置进行内存写入，并且哈希值也受到约束不足

修复说明

2. [低] recursion 的 eval_memory_access 应约束 is_real 为布尔值，就像在 core 中所做的那样

修复说明

3. [高] MemoryGlobalChip 允许为一个内存地址进行多次初始化，打破了内存参数

修复说明

4. [严重] 递归 VM 的 CPU 不约束在 LOAD 操作码中加载到寄存器的内存

修复说明

5. [高] 对于跳转操作码，递归 VM 的 op_a 寄存器值和 fp 值受到约束不足

修复说明

6. [中] 递归 VM 的 BNEINC 操作码对新写入寄存器 a 的值约束不足

修复说明

7. [严重] 当 CpuChip 中的 is_real 为零时，应约束所有选择器为零且 imm_b = imm_c = 1

修复说明

8. [信息性] ALU 除法允许 0/0 为任意值

修复说明

9. [中] 某些选择器未在 is_op_a_read_only_instruction 中考虑

修复说明

10. [高] clk 和 pc 未在 CpuChip 中初始化

修复说明

11. [高] MultiBuilder 对堆叠表的首行和末行的处理不正确，导致潜在的健全性中断

修复说明

12. [高] FriFoldChip 允许不正确的行为，因为系统调用读取和内存访问未正确连接

1: local.is_last_iteration、local.is_real 都是布尔值

2: local.is_last_iteration == 0 意味着 local.is_real == next.is_real

3: local.is_last_iteration == 1 意味着 local.is_real == 1

4: local.is_real == 0 意味着 next.is_real == 0

5: 在最后一行，我们有 local.is_real == 0 或 is_last_iteration == 1

修复说明

0 条评论

文章目录

1. [严重] `poseidon2/external` 允许在任意位置进行内存写入，并且哈希值也受到约束不足

2. [低] `recursion` 的 `eval_memory_access` 应约束 `is_real` 为布尔值，就像在 `core` 中所做的那样

3. [高] `MemoryGlobalChip` 允许为一个内存地址进行多次初始化，打破了内存参数

5. [高] 对于跳转操作码，递归 VM 的 `op_a` 寄存器值和 `fp` 值受到约束不足

6. [中] 递归 VM 的 `BNEINC` 操作码对新写入寄存器 `a` 的值约束不足

7. [严重] 当 `CpuChip` 中的 `is_real` 为零时，应约束所有选择器为零且 `imm_b = imm_c = 1`

8. [信息性] ALU 除法允许 `0/0` 为任意值

9. [中] 某些选择器未在 `is_op_a_read_only_instruction` 中考虑

10. [高] `clk` 和 `pc` 未在 `CpuChip` 中初始化

12. [高] `FriFoldChip` 允许不正确的行为，因为系统调用读取和内存访问未正确连接

1: `local.is_last_iteration`、`local.is_real` 都是布尔值

2: `local.is_last_iteration == 0` 意味着 `local.is_real == next.is_real`

3: `local.is_last_iteration == 1` 意味着 `local.is_real == 1`

4: `local.is_real == 0` 意味着 `next.is_real == 0`

5: 在最后一行，我们有 `local.is_real == 0` 或 `is_last_iteration == 1`