⚠️ Draft Standards Track: Core

EIP-7748: 状态转换为 Verkle 树

描述了一种状态转换过程，用于将键值从现有的 Merkle Patricia Tree 迁移到 Verkle Tree。

Authors	Guillaume Ballet (@gballet), Ignacio Hagopian (@jsign), Gajinder Singh (@g11tech), Ansgar Dietrichs (@adietrichs), Gottfried Herold (@GottfriedHerold), Jamie Lokier (@jlokier), Tanishq Jasoria (@tanishqjasoria), Parithosh Jayanthi (@parithosh), Gabriel Rocheleau (@gabrocheleau), Karim Taam (@matkt)
Created	2024-07-23
Discussion Link	https://ethereum-magicians.org/t/eip-7748-state-conversion-to-verkle-tree/20625
Requires	EIP-7612

摘要

本 EIP 提出了一种过程，用于在每个区块上将固定数量的键值从现有的 Merkle Patricia Tree (MPT) 转换为 Verkle Tree (VKT)。

动机

账户状态太大，无法等待交易通过 Overlay Tree 有机地将其全部移动到 VKT。因此，我们需要一种策略，在合理的时间内转换所有状态。状态转换完成后，可以删除 EIP-7612 中引入的 Overlay Tree 抽象，并直接使用 VKT 进行所有状态访问。

规范

本文档中的关键词 “MUST”、“MUST NOT”、“REQUIRED”、“SHALL”、“SHALL NOT”、“SHOULD”、“SHOULD NOT”、“RECOMMENDED”、“NOT RECOMMENDED”、“MAY” 和 “OPTIONAL” 按照 RFC 2119 和 RFC 8174 中的描述进行解释。

常量

参数	值	描述
`CONVERSION_START_TIMESTAMP`	`TBD`	转换开始的时间戳。
`CONVERSION_STRIDE`	`TBD`	每个区块要转换的最大_转换单元_数

一个 转换单元 是：

一个合约存储槽。
一个账户数据（例如，余额、nonce、code-hash）和代码（如果有）。

执行规范的变更

在现有的 apply_body(...) 函数中包含以下代码：

def apply_body(state: State, ...) -> Tuple[Uint, Root, Root, Bloom, State, Root]:
    ...
    # <new_code>
    if block_time >= CONVERSION_START_TIMESTAMP and not state._conversion_finished:
        block_state_conversion(state, CONVERSION_STRIDE)
    # </new_code>
    
    for i, tx in enumerate(map(decode_transaction, transactions)):
        ...
    ...

在执行 txs 之前，它会调用 block_state_conversion(...)（如下所述），该函数为此区块执行状态转换步骤。

在 state.py 中，添加以下代码：

@dataclass
class StoragePhase:
    """
    账户的下一个转换步骤继续转换存储槽，
    其键大于等于 next_key。
    如果没有这样的存储槽，则账户必须移动到
    AccountDataPhase。
    """
    next_key: Bytes   
   
@dataclass
class AccountDataPhase:
    """
    账户的下一个转换步骤继续迁移账户
    代码（如果有）和基本数据。处理后，账户必须
    移动到 trie 中的下一个账户（如果它是
    最后一个）。
    """
    pass

@dataclass
class CurrentConvertingAccount:
    """
    包含状态转换的下一步骤。
    """
    address: Address
    phase : StoragePhase | AccountDataPhase

这些新结构允许 State 跟踪我们在转换过程中的位置。

通过添加以下属性来修改 State 类：

@dataclass
class State:
    # <new_code>
    _conversion_curr_account: Optional[CurrentConvertingAccount] = None
    _conversion_finished: bool = False
    # </new_code>
    ...
    

定义一个具有以下签名的函数：

def trie_get_next_at_key(trie: Trie[K, V], key_seek: Bytes) -> (K, V, Optional[Bytes]):
    # 返回 trie 中第一个 (key, value)，其中 trie-key >= key_seek。
    # 此方法只能在 secured=True 的 Trie 上使用，
    # 因为 key_seek 是 keccak256(K)。
    # 
    # 返回：
    # - K, V: 键和值（例如：地址/值，存储槽/值）
    # - next_key: trie 中存在的最小 trie-key，大于
    # key_seek，如果没有，则为 None。
    #
    # 由实现者决定最佳实现方式
    # 考虑到其客户端架构。

添加或修改以下函数：

# 新函数。
def get_conversion_account(state: State) -> CurrentConvertingAccount:
    # 启动转换时，使用 MPT 中的第一个账户初始化。
    if state._conversion_curr_account is None:
        # 使用账户 trie 中的第一个账户初始化。
        first_account = trie_get_next_at_key("0x0")
        # 账户转换从存储槽转换开始。
        phase = StoragePhase("0x0") # 从最低的存储槽键开始。
        state._conversion_curr_account = CurrentConvertingAccount(first_account, phase)
    
    return state._conversion_curr_account

# 新函数。
def conversion_move_to_next_account(state: State):
    curr_account = state.get_conversion_account()
    address, _, next_key = trie_get_next_at_key(state._main_trie, curr_account.phase.next_key)
    if next_key is None:
        # 我们完成了转换
        state._conversion_finished = True
    else:
        # 移动到下一个账户
        state._conversion_curr_account.address = address
        state._conversion_curr_account.phase = StoragePhase("0x00")

# 修改后的函数：添加新的 only_if_empty 可选参数。
def set_storage(
    state: State, addr: Address, key: Bytes, value: U256, only_if_empty: bool = True
) -> None:
    # <new_code>
    if only_if_empty:
        value = state._overlay_tree.get(get_tree_key_for_storage_slot(addr, key))
        if value is not None:
            return
    # </new_code>
    
    state._overlay_tree.set(get_tree_key_for_storage_slot(addr, key), value)

如前所述，下一个函数由 apply_body(...) 调用，以执行区块的转换步骤：

# 请注意，以下函数针对可读性进行了优化，而不是针对性能进行了优化。
def state_convert(state: State, stride: int):
    n = 0    
    while n < stride and not state._conversion_finished:
        curr_account = state.get_conversion_account()
        
        # 如果账户具有空的 code hash，则跳过翻译存储。
        # 跳过 nonce 为 0 且代码为空的账户的存储转换。
        # 这涵盖了 28 个 EIP-7610 账户，但也涵盖了所有
        # 其他链上的 pre-eip161 账户。
        if cur_account.nonce == 0 and len(cur_account.code) == 0:
            state._conversion_curr_account.phase = AccountDataPhase()
        
        # 账户存储。
        if curr_account.phase is StoragePhase:
            # 从 _storage_tries 中获取存储槽，它是 MPT 数据。
            trie = state._storage_tries.get(curr_account.address)
            
            if trie is not None:
                slot_num, slot_value, next_key = trie_get_next_at_key(trie, curr_account.phase.next_key)
                # Overlay Tree 将写入 VKT。我们使用新的
                # only_if_empty 参数来避免写入过时的值。
                set_storage(state, curr_account.address, slot_num, slot_value, only_if_empty=True)
                n += 1
    
                if next_key is not None:
                    # 还有更多的存储槽要转换，在此阶段继续。
                    state.conversion_curr_account.phase.next_key = next_key
                else:
                    # 没有更多的存储槽。移动到账户数据迁移。
                    state.conversion_curr_account.phase = AccountDataPhase()
            else:
                # 账户没有存储 trie，直接移动到
                # 迁移账户的数据和代码（如果有）。
                state.conversion_curr_account.phase = AccountDataPhase()
        # 账户代码和基本数据。
        else:
            # 从 Overlay Tree 获取代码是可以的，因为它承诺返回
            # 账户的完整代码，该代码将来自 MPT 或单独的代码数据库。
            account = get_account(state, curr_account.address)
            chunked_code = chunkify_code(account.code)
            
            for chunk_num in range(len(chunked_code)):
                state_set_codechunk(state, address, chunk_num, chunked_code[chunk_num])
                
            # 如果账户数据（即：nonce、balance、code-size、code-hash）存在于 MPT 中，
            # get_account 将从 MPT 中提取，然后我们写入到 VKT。如果账户
            # 数据已经存在于 VKT 中（即：它是由 tx 间接转换的），那么
            # 它将从 VKT 返回并再次写入（即：它是 noop）。
            # 因此，此操作在两种情况下都是正确的。也就是说，它不会
            # 写入过时的数据。
            account = get_account(state, curr_account.address)
            set_account(state, curr_account.address, account)
            n += 1
    
            state.conversion_move_to_next_account()

理由

区块执行中的状态转换步骤位置

在区块 txs 执行之前执行转换步骤有一些好处：

如果状态转换步骤在 txs 执行之后完成，则 txs 执行可能会与转换后的键值重叠，从而必须注意它们在同一区块中变得过时。通过提议的排序，它们只能因先前区块的写入而变得过时。
它可以降低优化的复杂性，例如在下一个区块到达之前抢先执行下一个区块的状态转换。

`CONVERSION_STRIDE` 建议值

进行了性能基准测试，以在以下方面实现适当的平衡：

不要让客户端承担太多额外的区块工作。
不要在可行的长时间重组期间在客户端中创建无法管理的负载。
尽快完成转换。

账户代码分块在一个步骤中完成

如果一个账户有代码，则将其分块并一次性插入到 VKT 中。另一种方法是包括一个 CodePhase 并让每个插入的块消耗一个 CONVERSION_STRIDE 单位。

我们决定不这样做是为了降低算法的复杂性。考虑到当前的最大代码大小，一个区块的最坏情况可能会使 CONVERSION_STRIDE 限制溢出 24k/31~=793 个单位。

删除 EIP-7610 账户

EIP-7610 提到存在 28 个账户，它们具有：

nonce 为 0，
没有代码
非空的存储树

这些存储槽将从转换过程中省略，因为它们无法访问。

预计完成转换的时间

TODO：我们有一个估计值，但可能值得在更接近提议的分叉时重新计算它，以获得更最新的状态大小估计。

错过槽

转换逻辑在每个区块的开始运行，因此错过的槽不会产生特殊情况。

账户存储 -> 账户数据顺序

提出的顺序与许多 EL 客户端的 flat-db 架构协同作用，从而最大限度地减少了随机磁盘 IO。

不计算 EIP-161 账户的 `CONVERSION_STRIDE` 限制

CONVERSION_STRIDE 参数试图限制有效写入的负载。由于我们尝试批量 EIP-158 删除剩余的账户，因此会跳过这些特殊账户。

这听起来可能很危险，因为如果存在 1k 个此类账户，并且所有账户都对应在同一区块中进行转换，那么我们将强制客户端迭代 1k 个账户，而不计算 CONVERSION_STRIDE 中的任何配额。要删除的剩余账户数量非常少（即：几十个）并且也不连续，因此这不应该是一个问题。

MPT preimage 解析

预计 EL 客户端至少满足以下条件之一：

它们具有适当的 flat-db 设计，不需要 preimage 解析。
它们具有可以解析 trie_key->preimage 的完整 preimage 数据库（但这可能具有较差的性能）。
它们已下载将在转换开始之前分发的 preimage 数据库映像。

向后兼容性

未发现向后兼容性问题。

测试用例

TODO：目前在外部文档中描述。

参考实现

github.com/gballet/go-ethereum 中的 transition-post-genesis 分支在命令行中将 --override.overlay-stride 设置为非零值时实现了这一点。

安全考虑

需要讨论。

版权

通过 CC0 放弃版权和相关权利。

Citation

Please cite this document as:

Guillaume Ballet (@gballet), Ignacio Hagopian (@jsign), Gajinder Singh (@g11tech), Ansgar Dietrichs (@adietrichs), Gottfried Herold (@GottfriedHerold), Jamie Lokier (@jlokier), Tanishq Jasoria (@tanishqjasoria), Parithosh Jayanthi (@parithosh), Gabriel Rocheleau (@gabrocheleau), Karim Taam (@matkt), "EIP-7748: 状态转换为 Verkle 树 [DRAFT]," Ethereum Improvement Proposals, no. 7748, July 2024. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-7748.