Cosmos Security: An Otter's Guide

osecio
Published 2025-06-10 17:12

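This article takes a close look at common security issues in Cosmos SDK development, including infinite loops, non-deterministic map iteration, misuse of `AnteHandler`s, and storage key collisions. It offers real-world cases and actionable advice to help developers build safer Cosmos-based projects. It stresses that Cosmos developers need a solid awareness of security, and it details the vulnerabilities they most easily overlook along with the corresponding defenses.

From infinite loops and map determinism to AnteHandler mishaps and storage key collisions, we highlight real-world vulnerabilities and actionable advice for building safer Cosmos-based projects.

![Title image for Cosmos Security: An Otter's Guide](https://img.learnblockchain.cn/2025/06/25/title.png)

## Introduction

The Cosmos SDK is an "L1 toolkit" for developers. It provides open-source tooling that makes it easier to build application-specific L1 chains while prioritizing flexibility and control over the entire runtime environment. Unfortunately, with the convenience of the Cosmos SDK, security can end up overlooked.

In this comprehensive post, we break down security issues that developers often overlook and provide examples from real projects. Our goal is to offer a practical exploration of security vulnerabilities, along with insight into how developers can identify and address them.

## Let the Loops Begin

Building an application-specific L1 with the SDK differs significantly from building contracts on an established L1 chain. In particular, it is important to recognize that keeping the blockchain stable is the developer's responsibility.

Below, we demonstrate the difference between writing a smart contract in Solidity and developing an L1 with the Cosmos SDK.

Here is a simple example for reference: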

```
function sumWithStride(
    uint64 start,
    uint64 stride,
    uint64[] memory arr
) public returns (uint64) {
    uint64 idx = start;
    uint64 sum = 0;
    uint64 end = arr.length;

    while (idx < end) {
        sum += arr[idx];
        idx += stride;
    }
    return sum;
}

```

```
type MsgSumWithStrideParams struct {
    Start uint64
    Stride uint64
    Arr []uint64
}

type MsgSumWithStrideResponse struct {
    Sum uint64
}

func (ms msgServer) SumWithStride(
    goCtx context.Context,
    msg *MsgSumWithStrideParams,
) (*MsgSumWithStrideResponse, error) {
    sum := uint64(0)
    end := uint64(len(msg.Arr))
    for idx := msg.Start; idx < end; idx += msg.Stride {
        sum += msg.Arr[idx]
    }
    return &MsgSumWithStrideResponse{Sum: sum}, nil
}

```

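Both the Solidity and the Cosmos snippets expose a public function that sums an array, beginning at a user-provided start index and advancing by `stride`. Note that this function is not robust. Keen observers may have spotted that if the user supplies a stride of 0, the code loops forever.

While an infinite loop is not ideal in Solidity, it may still be tolerable. The underlying blockchain on which the smart contract runs is responsible for metering gas and the compute budget, and it will step in and terminate execution at some point. Interestingly, this kind of "unhandled error" pattern is very common in contracts.

However, the same logic does not carry over directly to Cosmos. There, developers are responsible for implementing the entire L1, and no underlying compute budget tracker automatically stops code execution. As a result, any latent logic DoS or infinite loop can directly cause the custom Cosmos L1 chain to halt or stall.

This toy scenario captures why error handling, edge cases, and overall robustness deserve attention in Cosmos.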
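As a minimal sketch of what a more defensive version might look like, the standalone Go function below validates its inputs before looping. The `maxArrLen` cap, the error values, and the choice to reject an out-of-range start index are assumptions made for illustration; they are not part of the original example or of the Cosmos SDK.

```go
package main

import (
    "errors"
    "fmt"
)

// maxArrLen is a hypothetical cap on the input size, chosen for illustration.
const maxArrLen = 1024

var (
    errZeroStride = errors.New("stride must be greater than zero")
    errTooLong    = errors.New("array exceeds maximum length")
    errBadStart   = errors.New("start index out of range")
)

// sumWithStride validates its inputs first, so the loop always terminates.
func sumWithStride(start, stride uint64, arr []uint64) (uint64, error) {
    if stride == 0 {
        return 0, errZeroStride
    }
    if len(arr) > maxArrLen {
        return 0, errTooLong
    }
    end := uint64(len(arr))
    if start >= end {
        return 0, errBadStart
    }
    var sum uint64
    for idx := start; idx < end; {
        sum += arr[idx]
        // Stop before idx+stride could wrap around uint64.
        if stride >= end-idx {
            break
        }
        idx += stride
    }
    return sum, nil
}

func main() {
    sum, err := sumWithStride(0, 2, []uint64{1, 2, 3, 4, 5})
    fmt.Println(sum, err)
}
```

Besides the zero-stride check, the loop also guards against `idx += stride` overflowing, which is another way a seemingly bounded loop could misbehave on attacker-chosen input.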

### Real-World Examples

Now, let's examine some real-world instances.

In [this](https://learnblockchain.cn/article/17317) `CosmWasm` vulnerability, the helper method `write_to_contract` carelessly calls the untrusted Wasm function `"allocate"`.

[Permalink to the code snippet](https://github.com/CosmWasm/cosmwasm/blob/db426f9b15eabf18359df62878847bbaa7cb85ef/packages/vm/src/imports.rs#L409)

```
fn write_to_contract<A: BackendApi, S: Storage, Q: Querier>(
    env: &Environment<A, S, Q>,
    input: &[u8],
) -> VmResult<u32> {
    let out_size = to_u32(input.len())?;
    let result = env.call_function1("allocate", &[out_size.into()])?;
    let target_ptr = ref_to_u32(&result)?;
    if target_ptr == 0 {
        return Err(CommunicationError::zero_address().into());
    }
    write_region(&env.memory(), target_ptr, input)?;
    Ok(target_ptr)
}

```

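Because the user fully controls `allocate`, it can repeatedly call back into `write_to_contract` through other imported functions. This can exhaust the host stack and ultimately lead to a DoS.

Other real-world examples include [not returning the correct value for malformed txs](https://github.com/cosmos/cosmos-sdk/issues/16676).

## Order Is the Dream of Man

Unlike Solidity, Golang is not a domain-specific language for smart contracts. Developers must therefore watch out for certain pitfalls. One notable example is non-determinism.

Consider a scenario where an event needs to be emitted for every entry in a map. It may be tempting to implement it as follows: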

```
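type ObjectMap map[string]string

// EmitEntries emits one event per map entry.
// BUG: ranging over a map visits entries in an unspecified order.
func EmitEntries(ctx sdk.Context, objectMap ObjectMap) {
    for key, value := range objectMap {
        ctx.EventManager().EmitEvent(
            sdk.NewEvent(
                "MapContext",
                sdk.NewAttribute(key, value),
            ),
        )
    }
}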

```

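Note that Golang map iteration is unordered by design. As the quote from the Golang documentation below states, the order is unspecified, so running the same code on different validators can produce different event orders, potentially leading to consensus issues.

> When iterating over a map with a range loop, the iteration order is not specified and is not guaranteed to be the same from one iteration to the next.

To get a well-defined iteration order, developers must explicitly sort the keys of the `map`, then look up each value through the sorted key slice before emitting it.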

```
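type ObjectMap map[string]string

// EmitEntries emits one event per map entry in ascending key order.
func EmitEntries(ctx sdk.Context, objectMap ObjectMap) {
    keys := make([]string, 0, len(objectMap))
    for key := range objectMap {
        keys = append(keys, key)
    }
    // Sorting makes the emission order identical on every validator.
    sort.Strings(keys)

    for _, key := range keys {
        ctx.EventManager().EmitEvent(
            sdk.NewEvent(
                "MapContext",
                sdk.NewAttribute(key, objectMap[key]),
            ),
        )
    }
}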

```

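Code buried in external Golang dependencies makes it hard to avoid language quirks entirely. It is crucial to stay vigilant and not underestimate the severity of such lingering bugs.

### Real-World Examples

A real-world example of a `map` causing determinism issues can be found [here](https://github.com/cosmos/cosmos-sdk/pull/12487): iterating over the `rs.stores` map made the result of `buildCommitInfo` inconsistent.

[Permalink to the code snippet](https://github.com/cosmos/cosmos-sdk/blob/55054282d2df794d9a5fe2599ea25473379ebc3d/store/rootmulti/store.go#L909)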

```
func (rs *Store) buildCommitInfo(
    version int64,
) *types.CommitInfo {
    storeInfos := []types.StoreInfo{}
    for key, store := range rs.stores {
        if store.GetStoreType() == types.StoreTypeTransient {
            continue
        }
        storeInfos = append(storeInfos, types.StoreInfo{
            Name:     key.Name(),
            CommitId: store.LastCommitID(),
        })
    }
    return &types.CommitInfo{
        Version:    version,
        StoreInfos: storeInfos,
    }
}

```

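Other sources of non-determinism include the use of [time-sensitive functions](https://learnblockchain.cn/article/17315) and [race conditions](https://github.com/cosmos/cosmos-sdk/issues/16638).

## You Shall Not Pass... or Shall You?

When developing smart contracts, it is common to delegate certain low-level tasks, such as resolving `msg.value` and `msg.sender` and collecting transaction fees, to the underlying blockchain.

On Cosmos, there is no blockchain to lean on, because it is the L1 itself! To simplify the development of middleware-like functionality, `Cosmos-SDK` introduces `AnteHandler` decorators to help with this task. Although pre-written decorators exist, any other extraction of data from the transaction and blockchain state must be implemented by developers themselves.

To provide context, let's first look at how an `AnteHandler` is processed. Each `AnteHandler` is a state transition function that can:

1. Transform the block state related to the transaction and the execution context.
2. Decide the course of action for the transaction:
   1. Pass the transaction on to the next `AnteHandler`.
   2. Return a transaction error.

The bad news is that developing an `AnteHandler` is not easy. For example, consider a scenario where we need to ensure that every signer of a transaction holds a balance of at least X when the transaction executes.

An `AnteHandle` implementation might look like this: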

```
const (
    MIN_BALANCE = 100
)

func (abd AccountBalanceDecorator) AnteHandle(
    ctx sdk.Context,
    tx sdk.Tx,
    simulate bool,
    next sdk.AnteHandler,
) (sdk.Context, error) {
    sigTx, ok := tx.(authsigning.SigVerifiableTx)
    if !ok {
        return ctx, errorsmod.Wrap(
            sdkerrors.ErrTxDecode,
            "invalid tx type",
        )
    }

    // Reject the tx if any signer holds less than MIN_BALANCE.
    signers := sigTx.GetSigners()
    for _, signer := range signers {
        balance := abd.bk.GetBalance(ctx, signer, ATOM)
        if balance.Amount.LT(math.NewInt(MIN_BALANCE)) {
            return ctx, errorsmod.Wrap(
                ErrInsufficientBalance,
                "Insufficient Balance",
            )
        }
    }

    return next(ctx, tx, simulate)
}

```

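Where should this custom `AnteHandler` be placed relative to the other `AnteHandler`s provided by cosmos-sdk? Given that we only care about transactions that satisfy our check, inserting it right after `SetUpContextDecorator` should be fine, right?

[Permalink to the code snippet](https://github.com/cosmos/cosmos-sdk/blob/f0aec3f30dd952e1b4b3a5b25e0412c1af5baaac/x/auth/ante/ante.go#L41)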

```
anteDecorators := []sdk.AnteDecorator{
    NewSetUpContextDecorator(), // outermost AnteDecorator. SetUpContext must be called first
    // INSERT HERE
    NewExtensionOptionsDecorator(options.ExtensionOptionChecker),
    NewValidateBasicDecorator(),
    NewTxTimeoutHeightDecorator(),
    NewValidateMemoDecorator(options.AccountKeeper),
    NewConsumeGasForTxSizeDecorator(options.AccountKeeper),
    NewDeductFeeDecorator(options.AccountKeeper, options.BankKeeper, options.FeegrantKeeper, options.TxFeeChecker),
    NewSetPubKeyDecorator(options.AccountKeeper), // SetPubKeyDecorator must be called before all signature verification decorators
    NewValidateSigCountDecorator(options.AccountKeeper),
    NewSigGasConsumeDecorator(options.AccountKeeper, options.SigGasConsumer),
    NewSigVerificationDecorator(options.AccountKeeper, options.SignModeHandler),
    NewIncrementSequenceDecorator(options.AccountKeeper),
}

```

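Unfortunately, that ordering does not work. Other `AnteHandler`s further down the chain modify account balances; `DeductFeeDecorator`, for instance, deducts the transaction fee from the fee payer. By placing our decorator at the start of the chain, a transaction may pass our check and then have its signers' balances reduced before it reaches the end of the decorator chain and execution begins. The invariant we intended to enforce may therefore no longer hold, rendering our check useless.

The simplest "mitigation" is to move our decorator further down the chain. We say this lightly, because other factors matter too, such as whether nested `msgs` are allowed (for example, whether the authz module is present), so this precaution alone may not fully solve the problem. Without a thorough understanding of the entire system, it is still easy to make mistakes in the `AnteHandle` chain.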
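The ordering problem can be reproduced with a toy model. The sketch below does not use any Cosmos SDK types: `txContext`, `decorator`, `chain`, and the two decorators are hypothetical stand-ins that only mimic the "do some work, then call next" shape of an ante decorator chain.

```go
package main

import (
    "errors"
    "fmt"
)

// txContext is a toy stand-in for chain state; it is not an SDK type.
type txContext struct {
    balance int64
    fee     int64
}

type handler func(ctx *txContext) error

// decorator mimics the shape of an ante decorator: do work, then call next.
type decorator func(ctx *txContext, next handler) error

const minBalance = 100

var errInsufficientBalance = errors.New("insufficient balance")

// checkMinBalance enforces the invariant at the moment it runs, and no later.
func checkMinBalance(ctx *txContext, next handler) error {
    if ctx.balance < minBalance {
        return errInsufficientBalance
    }
    return next(ctx)
}

// deductFee lowers the balance, like a fee-charging decorator would.
func deductFee(ctx *txContext, next handler) error {
    ctx.balance -= ctx.fee
    return next(ctx)
}

// chain composes decorators so that the first argument is the outermost one.
func chain(ds ...decorator) handler {
    if len(ds) == 0 {
        return func(*txContext) error { return nil }
    }
    rest := chain(ds[1:]...)
    return func(ctx *txContext) error { return ds[0](ctx, rest) }
}

func main() {
    early := &txContext{balance: 105, fee: 10}
    fmt.Println(chain(checkMinBalance, deductFee)(early), early.balance)

    late := &txContext{balance: 105, fee: 10}
    fmt.Println(chain(deductFee, checkMinBalance)(late))
}
```

With the check placed first, the transaction is accepted even though execution would begin with a balance of 95. With the check placed after the fee deduction, the same transaction is rejected.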

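### Real-World Examples

One example of `AnteHandler` misuse is a [theft of funds vulnerability](https://learnblockchain.cn/article/17316) that was exploitable in Cronos.

In this case, `msgs` are multiplexed to different sets of `AnteHandler`s based on the user-controlled `ExtensionOptionsEthereumTx` option. However, because of missing tx validation, a `MsgEthereumTx` that does not specify `ExtensionOptionsEthereumTx` is routed to the non-Ethereum `AnteHandler`s, which fail to charge the user as intended. An attacker can therefore abuse the fee refund at the end of transaction processing to steal funds.

[Permalink to the code snippet](https://github.com/crypto-org-chain/ethermint/blob/82805507f7d2e83cad547736883dc22acfb52440/app/ante/ante.go#L33)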

```
func NewAnteHandler(
    ak evmtypes.AccountKeeper,
    bankKeeper evmtypes.BankKeeper,
    evmKeeper EVMKeeper,
    feeGrantKeeper authante.FeegrantKeeper,
    channelKeeper channelkeeper.Keeper,
    signModeHandler authsigning.SignModeHandler,
) sdk.AnteHandler {
    return func(
        ctx sdk.Context, tx sdk.Tx, sim bool,
    ) (newCtx sdk.Context, err error) {
        var anteHandler sdk.AnteHandler

        defer Recover(ctx.Logger(), &err)

        txWithExtensions, ok := tx.(authante.HasExtensionOptionsTx)
        if ok {
            opts := txWithExtensions.GetExtensionOptions()
            if len(opts) > 0 {
                switch typeURL := opts[0].GetTypeUrl(); typeURL {
                case "/ethermint.evm.v1.ExtensionOptionsEthereumTx":
                    // handle as *evmtypes.MsgEthereumTx

                    anteHandler = sdk.ChainAnteDecorators(
                        NewEthSetUpContextDecorator(), // outermost AnteDecorator. SetUpContext must be called first
                        ...
                        NewEthIncrementSenderSequenceDecorator(ak), // innermost AnteDecorator.
                    )

                default:
                    return ctx, stacktrace.Propagate(
                        sdkerrors.Wrap(sdkerrors.ErrUnknownExtensionOptions, typeURL),
                        "rejecting tx with unsupported extension option",
                    )
                }

                return anteHandler(ctx, tx, sim)
            }
        }

        // SHOULD CHECK TX IS NOT MsgEthereumTx HERE

        switch tx.(type) {
        case sdk.Tx:
            anteHandler = sdk.ChainAnteDecorators(
                authante.NewSetUpContextDecorator(), // outermost AnteDecorator. SetUpContext must be called first
                 ...
                authante.NewIncrementSequenceDecorator(ak), // innermost AnteDecorator
            )
        default:
            return ctx, stacktrace.Propagate(
                sdkerrors.Wrapf(sdkerrors.ErrUnknownRequest, "invalid transaction type: %T", tx),
                "transaction is not an SDK tx",
            )
        }

        return anteHandler(ctx, tx, sim)
    }
}

```

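Other examples of improper `AnteHandler` usage include [more bypassable checks and loss of funds](https://jumpcrypto.com/writing/bypassing-ethermint-ante-handlers) and [incorrect data passing between blockchains](https://github.com/cosmos/ibc-go/issues/853).

## Errors? Panics? I Can Handle Them

Smart contract developers are used to not handling errors rigorously. That is acceptable because most underlying blockchains revert all state changes when execution fails.

Cosmos aims to provide a similar experience. Whenever a message handler returns an error, its changes to persistent state are discarded. Panics are handled in a similar way: a recovery handler wraps message execution and converts panics into errors for downstream consumers.

This design is quite clever and lets developers write code in a rather lazy way. For example, the following code works perfectly well. If `k.keeper.TotalReward()` returns zero, the division panics and the `msg` execution is simply rolled back as if nothing had happened.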

```
func (k msgServer) AllocateReward(
    goCtx context.Context,
    msg *types.MsgAllocateReward,
) (*types.MsgAllocateRewardResponse, error) {
    // Panics with a division by zero if TotalReward() is 0; the
    // message-level recovery handler turns that panic into an error.
    RewardPerShare := k.keeper.Shares() / k.keeper.TotalReward()
    k.keeper.DistributeReward(RewardPerShare)

    return &types.MsgAllocateRewardResponse{}, nil
}

```

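However, the same assumption does not always hold. Some parts of Cosmos, such as `PreBlocker`, `BeginBlocker`, and `EndBlocker`, are not protected by this error handling mechanism. So if we move the reward allocation logic into `BeginBlocker` to distribute rewards automatically at the start of every block, the panic raised by the division by zero will halt the chain.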

```
func BeginBlocker(ctx context.Context, keeper keeper.Keeper) error {
    // No recovery handler wraps BeginBlocker: a division by zero here halts the chain.
    RewardPerShare := keeper.Shares() / keeper.TotalReward()
    keeper.DistributeReward(RewardPerShare)

    return nil
}

```
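One way to harden such a block-level hook is to check the divisor explicitly and to contain unexpected panics. The following is a minimal sketch with plain Go types; `beginBlock`, `rewardPerShare`, and the `distribute` callback are hypothetical stand-ins for the keeper calls above, not SDK APIs.

```go
package main

import (
    "errors"
    "fmt"
)

var errNoReward = errors.New("total reward is zero")

// rewardPerShare mirrors the division from the example above, but
// checks the divisor instead of relying on a recovery handler.
func rewardPerShare(shares, totalReward uint64) (uint64, error) {
    if totalReward == 0 {
        return 0, errNoReward
    }
    return shares / totalReward, nil
}

// beginBlock is a hypothetical stand-in for a BeginBlocker body.
// It treats "nothing to distribute" as a no-op and converts any
// unexpected panic into a logged message instead of crashing.
func beginBlock(shares, totalReward uint64, distribute func(uint64)) (distributed bool) {
    defer func() {
        if r := recover(); r != nil {
            fmt.Println("recovered in beginBlock:", r)
            distributed = false
        }
    }()
    perShare, err := rewardPerShare(shares, totalReward)
    if err != nil {
        return false
    }
    distribute(perShare)
    return true
}

func main() {
    fmt.Println(beginBlock(10, 0, func(uint64) {}))
    fmt.Println(beginBlock(10, 5, func(v uint64) { fmt.Println("per share:", v) }))
}
```

Note that recovering from a panic does not undo state writes made before it. A real implementation would presumably run the risky logic against a cached context and commit it only on success.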

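### Real-World Examples

Recently, developers have become increasingly aware of unprotected ABCI functions, but that has not stopped DoS bugs from appearing. So what is the problem?

The problem lies in an incomplete understanding of utility functions. The example here implements a bridge that mints wrapped BTC tokens in `PreBlocker` whenever a bridge event is observed. Notably, an error returned by `bankKeeper.SendCoinsFromModuleToAccount` bubbles up through `PreBlocker` and halts the chain. It turns out an attacker can force `SendCoinsFromModuleToAccount` to return an error by setting `recipient` to a `BlockedAddr`, leaving the code vulnerable to DoS.

[Permalink to the code snippet](https://github.com/mezo-org/mezod/blob/d3b1a049a9acce977fdadd245cb381252f101922/x/bridge/keeper/assets_locked.go#L170)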

```
func (pbh *PreBlockHandler) PreBlocker() sdk.PreBlocker {
    return func(
        ctx sdk.Context,
        req *cmtabci.RequestFinalizeBlock,
    ) (*sdk.ResponsePreBlock, error) {
        ...
        err := pbh.bridgeKeeper.AcceptAssetsLocked(ctx, events)
        if err != nil {
            return nil, fmt.Errorf("cannot accept AssetsLocked events: %w", err)
        }
        ...
    }
}

func (k Keeper) AcceptAssetsLocked(
    ctx sdk.Context,
    events types.AssetsLockedEvents,
) error {
    ...
    for _, event := range events {
        recipient, err := sdk.AccAddressFromBech32(event.Recipient)
        if err != nil {
            return fmt.Errorf("failed to parse recipient address: %w", err)
        }

        if bytes.Equal(event.TokenBytes(), sourceBTCToken) {
            err = k.mintBTC(ctx, recipient, event.Amount)
            if err != nil {
                return fmt.Errorf(
                    "failed to mint BTC for event %v: %w",
                    event.Sequence,
                    err,
                )
            }
        } else {
            ...
        }
    }
    ...
}

func (k Keeper) mintBTC(
    ctx sdk.Context,
    recipient sdk.AccAddress,
    amount math.Int,
) error {
    ...
    err = k.bankKeeper.SendCoinsFromModuleToAccount(
        ctx,
        types.ModuleName,
        recipient,
        coins,
    )
    if err != nil {
        return fmt.Errorf("failed to send coins: %w", err)
    }
    ...
}

```

```
func (k BaseKeeper) SendCoinsFromModuleToAccount(
    ctx context.Context, senderModule string, recipientAddr sdk.AccAddress, amt sdk.Coins,
) error {
    ...
    if k.BlockedAddr(recipientAddr) {
        return errorsmod.Wrapf(sdkerrors.ErrUnauthorized, "%s is not allowed to receive funds", recipientAddr)
    }
    ...
}

```

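This shows that, because of unforeseen invariant violations, even well-known bug classes resurface from time to time. Other examples include [incorrect decimal handling in the group module](https://hackerone.com/reports/3018307).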
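A common defensive pattern for this class of bug is to make sure that a failure caused by one user-controlled item cannot abort the whole block-level handler. The sketch below illustrates the idea with toy types; `event`, `mint`, and `acceptEvents` are hypothetical and do not reproduce the bridge's actual code or its fix.

```go
package main

import (
    "errors"
    "fmt"
)

// event is a simplified, hypothetical stand-in for a bridge event.
type event struct {
    Sequence  uint64
    Recipient string
    Amount    uint64
}

var errBlocked = errors.New("recipient is not allowed to receive funds")

// mint is a toy mint-and-send step that can fail on user-controlled input.
func mint(blocked map[string]bool, balances map[string]uint64, e event) error {
    if blocked[e.Recipient] {
        return errBlocked
    }
    balances[e.Recipient] += e.Amount
    return nil
}

// acceptEvents processes every event and returns the sequences it skipped.
// A failure caused by one event never aborts the whole batch.
func acceptEvents(blocked map[string]bool, balances map[string]uint64, events []event) []uint64 {
    var skipped []uint64
    for _, e := range events {
        if err := mint(blocked, balances, e); err != nil {
            fmt.Printf("skipping event %d: %v\n", e.Sequence, err)
            skipped = append(skipped, e.Sequence)
        }
    }
    return skipped
}

func main() {
    balances := map[string]uint64{}
    blocked := map[string]bool{"module-account": true}
    skipped := acceptEvents(blocked, balances, []event{
        {Sequence: 1, Recipient: "alice", Amount: 5},
        {Sequence: 2, Recipient: "module-account", Amount: 7},
    })
    fmt.Println(skipped, balances)
}
```

Whether a skipped event should be dropped, refunded, or quarantined is a design decision that depends on the protocol. The point of the sketch is only that the decision is made per event rather than by halting the chain.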

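## Same, Same... but Different

Cosmos exposes several consensus-level interfaces, such as `PrepareProposal`, `ProcessProposal`, `ExtendVote`, and `VerifyVoteExtension`. These ABCI methods let developers customize how blocks are built and inject supplementary data into each block.

The two best-known attack surfaces are:

1. Liveness failures, where over-validation in `ProcessProposal` (`VerifyVoteExtension`) rejects the output of `PrepareProposal` (`ExtendVote`).
2. Acceptance of malicious proposals and vote extensions that were not created by `PrepareProposal` (`ExtendVote`), because of insufficient validation in `ProcessProposal` (`VerifyVoteExtension`).

In essence, any discrepancy within a handler pair can lead to security issues.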
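One way to keep a handler pair in sync is to route both sides through the same validation rules and to test that everything the "prepare" side emits is accepted by the "process" side. The sketch below uses toy types; `maxTxBytes`, `validTx`, `prepareProposal`, and `processProposal` are hypothetical and far simpler than real ABCI handlers.

```go
package main

import "fmt"

// maxTxBytes is a hypothetical size budget for a proposal.
const maxTxBytes = 16

// validTx is the single source of truth shared by both handlers (toy rule: non-empty).
func validTx(tx []byte) bool { return len(tx) > 0 }

// prepareProposal keeps the valid transactions that fit into the budget.
func prepareProposal(mempool [][]byte) [][]byte {
    var out [][]byte
    total := 0
    for _, tx := range mempool {
        if !validTx(tx) || total+len(tx) > maxTxBytes {
            continue
        }
        total += len(tx)
        out = append(out, tx)
    }
    return out
}

// processProposal applies exactly the same rules when verifying a proposal.
func processProposal(txs [][]byte) bool {
    total := 0
    for _, tx := range txs {
        if !validTx(tx) {
            return false
        }
        total += len(tx)
    }
    return total <= maxTxBytes
}

func main() {
    mempool := [][]byte{[]byte("aaaaaaaa"), {}, []byte("bbbbbbbbbbbb"), []byte("cccc")}
    proposal := prepareProposal(mempool)
    fmt.Println(len(proposal), processProposal(proposal))
}
```

A round-trip property such as `processProposal(prepareProposal(x)) == true` for arbitrary `x` is a natural candidate for fuzzing.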

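There are also some lesser-known variants of these issues. One instance is the validation of `VoteExtensions` in `PrepareProposal`. To provide context, we first introduce CometBFT consensus and vote extensions.

Consensus starts with the leader creating a proposal and broadcasting it to every validator. Validators then vote on whether to accept the proposal. During the voting phase, `ExtendVote` is called to attach additional data to the vote. Once validators have collected enough valid votes that pass `VerifyVoteExtension`, the proposal is considered accepted and can be committed. After the proposal is committed, a new leader starts creating the next proposal, bringing us back to where we began.

So where is the attached vote extension data used? It turns out the leader is expected to include the vote extensions from the previous round of consensus in its proposal. It is tempting to conclude that every vote extension accepted by an honest leader has passed the `VerifyVoteExtension` check and is therefore valid, so we can inject all vote extensions straight into our proposal.

Unfortunately, CometBFT accepts late precommits directly, without checking them through `VerifyVoteExtension`. This opens a time window in which a Byzantine validator can sneak malicious votes into the next leader's cache, tricking the leader into including invalid vote extensions in its `Proposal`.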

```
func (cs *State) addVote(vote *types.Vote, peerID p2p.ID) (added bool, err error) {
    ...

    // A precommit for the previous height?
    // These come in while we wait timeoutCommit
    if vote.Height+1 == cs.Height && vote.Type == types.PrecommitType {
        ...
        // Late precommits are not checked by VerifyVoteExtension
        added, err = cs.LastCommit.AddVote(vote)
        ...
        return added, err
    }
    extEnabled := cs.state.ConsensusParams.Feature.VoteExtensionsEnabled(vote.Height)
    if extEnabled {
        ...
        if vote.Type == types.PrecommitType && !vote.BlockID.IsNil() &&
            !bytes.Equal(vote.ValidatorAddress, myAddr) { // Skip the VerifyVoteExtension call if the vote was issued by this validator.
            ...
            err := cs.blockExec.VerifyVoteExtension(context.TODO(), vote)
            ...
        }
    } else if {
        ...
    }
    ...
}

```

Developers who are unaware of these subtleties in how CometBFT handles vote extensions can easily forget to implement protections against such attacks.
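A defensive proposer can re-apply its own verification rule to every vote extension before injecting it, instead of assuming the consensus engine already did so. The sketch below shows the idea with toy types; `extendedVote`, `verifyExtension`, and `filterExtensions` are hypothetical names, and the 8-byte rule is an arbitrary placeholder for the application's real check.

```go
package main

import "fmt"

// extendedVote is a simplified, hypothetical view of a vote and its extension.
type extendedVote struct {
    Validator string
    Extension []byte
}

// verifyExtension stands in for the application's verification rule
// (toy rule: an extension must be exactly 8 bytes long).
func verifyExtension(ext []byte) bool { return len(ext) == 8 }

// filterExtensions re-applies the rule to every vote from the last commit,
// so late votes that skipped verification cannot reach the proposal.
func filterExtensions(lastCommit []extendedVote) []extendedVote {
    valid := make([]extendedVote, 0, len(lastCommit))
    for _, v := range lastCommit {
        if verifyExtension(v.Extension) {
            valid = append(valid, v)
        }
    }
    return valid
}

func main() {
    votes := []extendedVote{
        {Validator: "val1", Extension: []byte("12345678")},
        {Validator: "val2", Extension: []byte("malformed-extension")},
    }
    fmt.Println(len(filterExtensions(votes)))
}
```

After filtering, a real handler would also need to confirm that the remaining votes still represent enough voting power for the data to be trusted.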

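### Real-World Examples

An example of the bug we just described is shown here. `PrepareProposal` only checks, through `ValidateVoteExtensions`, that each vote was correctly signed by a validator; it does not validate the votes against the rules in `VerifyVoteExtension`. This leaves the leader prone to accepting malicious vote extensions into its proposal.

[Permalink to the code snippet](https://github.com/sedaprotocol/seda-chain/blob/66c1b593fa81c7d443ab5fa82757b45e68597f49/app/abci/handlers.go#L180)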

```
func (h *Handlers) PrepareProposalHandler() sdk.PrepareProposalHandler {
    return func(ctx sdk.Context, req *abcitypes.RequestPrepareProposal) (*abcitypes.ResponsePrepareProposal, error) {
        ...
        var injection []byte
        if req.Height > ctx.ConsensusParams().Abci.VoteExtensionsEnableHeight && collectSigs {
            //Fails to verify vote extensions with VerifyVoteExtension rules
            err := baseapp.ValidateVoteExtensions(ctx, h.stakingKeeper, req.Height, ctx.ChainID(), req.LocalLastCommit)
            if err != nil {
                return nil, err
            }
            injection, err = json.Marshal(req.LocalLastCommit)
            if err != nil {
                h.logger.Error("failed to marshal extended votes", "err", err)
                return nil, err
            }
            ...
        }
        defaultRes, err := h.defaultPrepareProposal(ctx, req)
        ...
        proposalTxs := defaultRes.Txs
        if injection != nil {
            proposalTxs = append([][]byte{injection}, proposalTxs...)
            h.logger.Debug("injected local last commit", "height", req.Height)
        }
        return &abcitypes.ResponsePrepareProposal{
            Txs: proposalTxs,
        }, nil
    }
}

```

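Beyond the more intricate variants, plain validation mismatches remain common even though they are a well-known attack surface. This stems from the various obscure checks buried in CometBFT that can reject a `Proposal` (`Vote`). For example, this commit fixes a bug where [PrepareProposal could return a Proposal larger than MaxTxBytes](https://github.com/babylonlabs-io/babylon/commit/aa827f875a16ebf85efee5d9a6c8c4e76dbfb7bd#diff-77659089b31367690393a968f4bfacfd1bf960ed300965729df216a6fb612699), which CometBFT would later reject.

## Keeper of the Keys

State (persistent storage) is another critical component of a state machine. Cosmos relies on a custom key-value store called `KVStore` to handle state efficiently. In a `KVStore`, both keys and values are plain byte slices, which requires developers to take care of serializing and deserializing more complex structures when working with storage.

The complexity behind correct data serialization often leads to flawed code and security vulnerabilities. Below, we present a relatively simple (but buggy) implementation and progressively address its problems until the code can be considered safe from exploitation.

Let's first consider a scenario where we need to persist the `PositionMap` structure below.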

```
type VaultId uint64
type Username string
type PositionName string
type Position struct {
    data []byte
}
type PositionMap map[VaultId]map[Username]map[PositionName]Position

```

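Given that `PositionMap` has three levels of keys, we should serialize these three map keys into a hierarchically searchable storage key. The simplest approach is to convert every field to a string and concatenate them.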

```
storageKey := fmt.Sprintf(
    "%d%s%s",
    vaultId,
    username,
    positionName,
)

```

While simple concatenation lets us construct storage keys easily, this implementation is clearly prone to key collisions.

```
vaultId = 1,  username = "2a", positionName = "b"
    => storageKey = "12ab"

vaultId = 12, username = "a",  positionName = "b"
    => storageKey = "12ab"

```

**So how do we mitigate this?**
Perhaps we can add a field separator between the fields, along these lines:

```
const (
    Seperator = "|"
)

storageKey := fmt.Sprintf(
    "%d%s%s%s%s",
    vaultId,
    Seperator,
    username,
    Seperator,
    positionName,
)

```

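Inserting a separator helps prevent most accidental collisions, but does it solve the problem completely?

Sadly, it does not. Because `username` and `positionName` are both strings that may contain arbitrary characters, including the separator, collisions can still occur.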

```
vaultId = 1, username = "a|", positionName = "b"
    => storageKey = "1|a||b"

vaultId = 1, username = "a",  positionName = "|b"
    => storageKey = "1|a||b"

```

To mitigate this further, we can encode every field so that the separator can never appear inside an individual field, making field injection impossible.

```
const (
    Seperator = "|"
)

// Hex output only contains [0-9a-f], so it can never include the separator.
usernameEncoded := hex.EncodeToString([]byte(username))
positionNameEncoded := hex.EncodeToString([]byte(positionName))

storageKey := fmt.Sprintf(
    "%d%s%s%s%s",
    vaultId,
    Seperator,
    usernameEncoded,
    Seperator,
    positionNameEncoded,
)

```

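We did it. We have finally eliminated all potential `storageKey` collisions.

So far, our focus has been on storing a single structure. In real applications, we often need to keep multiple structures in persistent state.

In the Cosmos framework, it is common for each `Module` to own a set of `KVStore`s and to have a separate `Keeper` managing access to the storage. It is also worth noting that each `KVStore` should be independent of the others, which spares developers from worrying about key collisions across different `Module`s.

That said, what if we need to maintain multiple structures within the same `KVStore`?

To demonstrate this, we introduce the `AddressMap` structure, which will live in the same `KVStore` we used earlier.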

```
type VaultId uint64
type Username string

type PositionName string
type Position struct {
    data []byte
}
type PositionMap map[VaultId]map[Username]map[PositionName]Position

type AddressName string
type Address struct {
    data []byte
}
type AddressMap map[VaultId]map[Username]map[AddressName]Address

```

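Following the previous example, each key field must be sanitized/encoded, with separators added between fields to prevent key collisions. Putting these measures into practice gives the following implementation: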

```
const (
    Seperator = "|"
)

func PositionMapKey(
    vaultId uint64,
    username, positionName []byte,
) []byte {
    // Hex-encode free-form fields so they can never contain the separator.
    usernameEncoded := hex.EncodeToString(username)
    positionNameEncoded := hex.EncodeToString(positionName)

    return []byte(fmt.Sprintf(
        "%d%s%s%s%s",
        vaultId,
        Seperator,
        usernameEncoded,
        Seperator,
        positionNameEncoded,
    ))
}

func AddressMapKey(
    vaultId uint64,
    username, addressName []byte,
) []byte {
    usernameEncoded := hex.EncodeToString(username)
    addressNameEncoded := hex.EncodeToString(addressName)

    return []byte(fmt.Sprintf(
        "%d%s%s%s%s",
        vaultId,
        Seperator,
        usernameEncoded,
        Seperator,
        addressNameEncoded,
    ))
}

```

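Unfortunately, when multiple kinds of entries share the same `KVStore`, the previous implementation is not enough to guarantee key uniqueness. It still prevents key collisions within each individual structure, but it does not prevent collisions across structures.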

```
vaultId = 1, username = "a", positionName = "b"
    => PositionMapKey = "1|61|62"

vaultId = 1, username = "a", addressName = "b"
    => AddressMapKey = "1|61|62"

```

To prevent this, add a structure-specific prefix at the start of each key to act as a domain separator.

```
const (
    Seperator = "|"
    PositionMapPrefix = "\x01"
    AddressMapPrefix = "\x02"
)

func PositionMapKey(
    vaultId uint64,
    username, positionName []byte,
) []byte {
    usernameEncoded := hex.EncodeToString(username)
    positionNameEncoded := hex.EncodeToString(positionName)

    // The leading prefix separates PositionMap keys from every other structure.
    return []byte(fmt.Sprintf(
        "%s%d%s%s%s%s",
        PositionMapPrefix,
        vaultId,
        Seperator,
        usernameEncoded,
        Seperator,
        positionNameEncoded,
    ))
}

func AddressMapKey(
    vaultId uint64,
    username, addressName []byte,
) []byte {
    usernameEncoded := hex.EncodeToString(username)
    addressNameEncoded := hex.EncodeToString(addressName)

    return []byte(fmt.Sprintf(
        "%s%d%s%s%s%s",
        AddressMapPrefix,
        vaultId,
        Seperator,
        usernameEncoded,
        Seperator,
        addressNameEncoded,
    ))
}
```
We now have a correct example of how to serialize storage keys.

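However, there is more to storage than that. As mentioned earlier, the store should keep supporting the original structure's functionality. For a `map`, the data should still be retrievable by its original keys.

Consider a case where we want to retrieve, from storage, the entire `map[Username]map[PositionName]Position` associated with a given `VaultId`. How can we do that safely?

Fortunately, Cosmos-SDK provides an API to fetch all entries associated with a `storageKey` prefix. Below is an attempt to fetch the data for a `vaultId`: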

```
// kvStore is assumed to be in scope, as in the original snippet.
func FetchPositionMapWithVaultId(
    vaultId uint64,
) (map[Username]map[PositionName]Position, error) {
    values := map[Username]map[PositionName]Position{}
    i := sdk.KVStorePrefixIterator(
        kvStore,
        []byte(fmt.Sprintf("%s%d", PositionMapPrefix, vaultId)),
    )
    defer i.Close()
    for ; i.Valid(); i.Next() {
        // Key layout: <prefix><vaultId>|<hex username>|<hex positionName>
        k := strings.Split(string(i.Key()), Seperator)
        if len(k) != 3 {
            return nil, fmt.Errorf("malformed key %q", i.Key())
        }

        username, err := hex.DecodeString(k[1])
        if err != nil {
            return nil, err
        }

        positionName, err := hex.DecodeString(k[2])
        if err != nil {
            return nil, err
        }

        if _, ok := values[Username(username)]; !ok {
            values[Username(username)] = map[PositionName]Position{}
        }

        values[Username(username)][PositionName(positionName)] = Position{
            data: i.Value(),
        }
    }
    return values, nil
}
```

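By now, you may have noticed that this implementation suffers from a field malleability problem. Imagine that `vaultId = 1` and `vaultId = 10` both exist. If we try to fetch the data under `vaultId = 1`, all entries under `vaultId = 10` are returned as well, simply because `1` is a prefix of `10`. To fix this, we must once again append the `Seperator` to the iterator prefix.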

```
i := sdk.KVStorePrefixIterator(
    kvStore,
    []byte(fmt.Sprintf("%s%d%s", PositionMapPrefix, vaultId, Seperator)),
)
```
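The difference between the two prefixes can be checked with a small standalone program. Here a slice of strings stands in for the store and `keysWithPrefix` is a hypothetical helper that imitates what a prefix iterator would visit. The keys follow the `<prefix><vaultId>|<hex username>|<hex positionName>` layout used above.

```go
package main

import (
    "fmt"
    "strings"
)

// positionMapPrefix mirrors the one-byte prefix used in the text.
const positionMapPrefix = "\x01"

// keysWithPrefix returns the keys a prefix iterator would visit (toy model).
func keysWithPrefix(keys []string, prefix string) []string {
    var out []string
    for _, k := range keys {
        if strings.HasPrefix(k, prefix) {
            out = append(out, k)
        }
    }
    return out
}

func main() {
    keys := []string{
        positionMapPrefix + "1|61|62",  // vaultId = 1
        positionMapPrefix + "10|63|64", // vaultId = 10
    }
    // Without the trailing separator, both vaults match.
    fmt.Printf("%q\n", keysWithPrefix(keys, positionMapPrefix+"1"))
    // With the trailing separator, only vault 1 matches.
    fmt.Printf("%q\n", keysWithPrefix(keys, positionMapPrefix+"1|"))
}
```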

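At first, spotting these serialization issues seems easy. But as data structures and `KVStore` usage grow more complex, developers can inadvertently overlook storage key parsing bugs.

Storage keys remain a tedious and persistent problem when building on Cosmos. Handle them with care during development to keep bugs from creeping into the code.

### Real-World Examples

`Cosmos-SDK` previously lacked protection against KVStore [key collisions](https://github.com/cosmos/cosmos-sdk/pull/9363). Because of this oversight, developers could inadvertently create two `KVStore`s that were not independent of each other.

[Permalink to the code snippet](https://github.com/cosmos/cosmos-sdk/blob/25bd118e4cc1d60ab2f9d2e0302d271416551aa9/types/store.go#L108)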

```
func NewKVStoreKeys(names ...string) map[string]*KVStoreKey {
    keys := make(map[string]*KVStoreKey)
    for _, name := range names {
        keys[name] = NewKVStoreKey(name)
    }

    return keys
}

```

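Thanks to the efforts of the core developers, a check is now enforced: `Cosmos-SDK` refuses to run if any `KVStore` key is a prefix of another. This relieves developers from worrying about key collisions at the `KVStore` level.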
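The idea behind such a check can be sketched in a few lines. The function below is an illustration only, under the assumption that store keys are plain strings; `checkNoPrefix` is a hypothetical name and this is not the SDK's actual implementation.

```go
package main

import (
    "fmt"
    "strings"
)

// checkNoPrefix returns an error if any name is a prefix of (or equal to)
// another name in the list.
func checkNoPrefix(names []string) error {
    for i, a := range names {
        for j, b := range names {
            if i != j && strings.HasPrefix(b, a) {
                return fmt.Errorf("store key %q is a prefix of %q", a, b)
            }
        }
    }
    return nil
}

func main() {
    fmt.Println(checkNoPrefix([]string{"bank", "staking", "gov"}))
    fmt.Println(checkNoPrefix([]string{"bank", "bankv2"}))
}
```

The quadratic loop is fine for the handful of store keys a typical application registers at startup.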

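Other storage key issues, such as subtle bugs within Cosmos-SDK itself, have led to [incorrect iterator behavior](https://github.com/cosmos/cosmos-sdk/issues/12661).

Notably, since Cosmos SDK v0.50, the gradual adoption of the [collections](https://github.com/cosmos/cosmos-sdk/tree/def657dafa615cb8e8bb072452663893157e073a/collections) storage helpers has made it harder to write flawed code. This shows the importance of keeping up with the latest SDK developments to benefit from architectural security improvements.

## Conclusion

The Cosmos SDK is a powerful tool for anyone who wants to create a custom blockchain. However, that flexibility comes with great responsibility. Developers must pay close attention to the subtleties, because they can expose a large number of potential attack surfaces.

To recap, we discussed some of the more fundamental parts of Cosmos-SDK and showed common mistakes developers are prone to make. It is important to note, though, that we have only scratched the surface. Other attack surfaces, such as authentication around the IBC interfaces, are definitely worth studying.

>- Original article: [osec.io/blog/2025-06-10-...](https://osec.io/blog/2025-06-10-cosmos-security)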
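>- Translated by the 登链社区 (LearnBlockchain community) AI assistant, which translates quality English articles for readers; please bear with any rough passages.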

从无限循环和 map 确定性到 AnteHandler 的失误和存储键冲突，我们重点介绍了真实世界的漏洞以及构建更安全的基于 Cosmos 的项目的可行建议。

简介

循环开始了

使用 SDK 构建特定应用的 L1 和在已建立的 L1 链上构建合约存在显著差异。尤其重要的是要认识到，维护区块链的稳定性取决于开发者。

下面，我们开始演示使用 Solidity 编写智能合约与使用 Cosmos SDK 开发 L1 之间的区别。

下面是一个简单的示例供参考：

function sumWithStride(
    uint64 start,
    uint64 stride,
    uint64[] memory arr
) public returns (uint64) {
    uint64 idx = start;
    uint64 sum = 0;
    uint64 end = arr.length;

    while (idx &lt; end) {
        sum += arr[idx];
        idx += stride;
    }
    return sum;
}

type MsgSumWithStrideParams struct {
    Start uint64
    Stride uint64
    Arr []uint64
}

type MsgSumWithStrideResponse struct {
    Sum uint64
}

func (ms msgServer) SumWithStride(
    goCtx context.Context,
    msg *MsgSumWithStrideParams,
) (*MsgSumWithStrideResponse, error) {
    sum := uint64(0)
    end := uint64(len(msg.Arr))
    for idx := msg.Start; idx &lt; end; idx += msg.Stride {
        sum += msg.Arr[idx]
    }
    return &MsgSumWithStrideResponse{Sum: sum}, nil
}

提供的 Solidity / Cosmos 代码片段具有一个公共函数，该函数使用提供的起始 idx 和 stride 计算数组的总和。至关重要的是要注意此函数缺乏健壮性。敏锐的观察者可能已经发现，如果用户提供步幅值为 0，则代码将导致无限循环。

这个玩具般的场景捕捉了 Cosmos 中对错误处理、边界情况和整体健壮性关注的重要性。

真实世界的例子

现在，让我们检查一些真实世界的实例。

在这个 CosmWasm 漏洞的案例中，辅助方法 write_to_contract 疏忽地调用了不受信任的 Wasm 函数 "allocate"。

代码片段的永久链接

fn write_to_contract&lt;A: BackendApi, S: Storage, Q: Querier>(
    env: &Environment&lt;A, S, Q>,
    input: &[u8],
) -> VmResult&lt;u32> {
    let out_size = to_u32(input.len())?;
    let result = env.call_function1("allocate", &[out_size.into()])?;
    let target_ptr = ref_to_u32(&result)?;
    if target_ptr == 0 {
        return Err(CommunicationError::zero_address().into());
    }
    write_region(&env.memory(), target_ptr, input)?;
    Ok(target_ptr)
}

由于用户完全可以控制 allocate，因此可以通过其他导入的函数重复回调 write_to_contract。这可能导致主机堆栈耗尽，最终导致 DoS。

其他真实世界的例子包括没有为格式错误的 txs 返回正确的值。

秩序是人类的梦想

与作为智能合约的特定领域语言的 solidity 不同，Golang 不是。因此，开发者必须注意特定的潜在问题。一个值得注意的例子是非确定性。

考虑一个场景，其中需要为 map 中的每个条目发出一个事件。可能很想按如下所示实现此目的：

type ObjectMap map[string]string

func EmitEntries(objectMap ObjectMap) {
    for key, value := range objectMap {
        ctx.EventManager.EmitEvent(
            sdk.NewEvent(
                "MapContext",
                sdk.NewAttribute(key, value),
            )
        )
    }
}

使用 range 循环迭代 map 时，迭代顺序未指定，并且不能保证每次迭代都相同。

为了正确实现迭代顺序，开发者必须显式地对 map 的键进行排序，然后在使用排序后的键数组发出值之前获取值。

type ObjectMap map[string]string

func EmitEntries(objectMap ObjectMap) {
    var keys []string
    for key := range objectMap {
        keys = append(keys, key)
    }
    sort.Strings(keys)

    for _, key := range keys {
        ctx.EventManager.EmitEvent(
            sdk.NewEvent(
                "MapContext",
                sdk.NewAttribute(key, objectMap[key]),
            )
        )
    }
}

外部 Golang 依赖项中隐藏的代码组合使得完全避免语言方面的怪癖变得困难。至关重要的是保持警惕，不要低估此类挥之不去的 bug 的严重性。

真实世界的例子

可以在此处找到导致确定性问题的 map 的真实示例，特别是由于迭代 rs.stores map 导致 buildCommitInfo 的结果不一致。

代码片段的永久链接

func (rs *Store) buildCommitInfo(
    version int64
) *types.CommitInfo {
    storeInfos := []types.StoreInfo{}
    for key, store := range rs.stores {
        if store.GetStoreType() == types.StoreTypeTransient {
            continue
        }
        storeInfos = append(storeInfos, types.StoreInfo{
            Name:     key.Name(),
            CommitId: store.LastCommitID(),
        })
    }
    return &types.CommitInfo{
        Version:    version,
        StoreInfos: storeInfos,
    }
}

导致确定性问题的其他因素是时间敏感函数和竞态条件的使用。

你不应该通过...还是应该？

在开发智能合约时，通常会将某些底层任务（例如解析 msg.value、msg.sender 和收取交易费用）委托给底层区块链。

在 Cosmos 上，由于它是 L1 本身，因此没有可依赖的区块链！为了简化中间件类功能的开发，Cosmos-SDK 引入了 AnteHandler 装饰器来帮助完成此任务。虽然有预先编写的装饰器，但从交易和区块链状态提取的所有其他数据必须由开发者自己执行。

为了提供上下文，让我们首先了解如何处理 AnteHandler。每个 AnteHandler 都是一个状态转换函数，可以：

转换与交易和执行上下文相关的区块状态。
确定交易的行动方案。
1. 将交易传递给下一个 AnteHandler。
2. 返回交易错误。

坏消息是开发 AnteHandler 并非易事。例如，让我们考虑一个场景，我们需要确保参与交易的所有签名者在交易执行时都具有大于 X 的余额。

AnteHandle 实现可能如下所示：

const (
    MIN_BALANCE = 100
)

func (abd AccountBalanceDecorator) AnteHandle(
    ctx sdk.Context,
    tx sdk.Tx,
    simulate bool,
    next sdk.AnteHandler,
) (sdk.Context, error) {
    sigTx, ok := tx.(authsigning.SigVerifiableTx)
    if !ok {
        return ctx, errorsmod.Wrap(
            sdkerrors.ErrTxDecode,
            "invalid tx type",
        )
    }

    signers := sigTx.GetSigners()
    for i, signer := range signers {
        balance := abd.bk.getBalance(ctx, signer, ATOM)
        if balance.Amount &lt; MIN_BALANCE {
            return ctx, errorsmod.Wrap(
                ErrInsufficientBalance,
                "Insufficient Balance",
            )
        }
    }

    return next(ctx, tx, simulate)
}

相对于 cosmos-sdk 提供的其他 AnteHandler，此自定义 AnteHandler 应放置在哪里？考虑到我们只关心满足我们检查的交易，将其插入 SetUpContextDecorator 之后应该可以，对吗？

代码片段的永久链接

anteDecorators := []sdk.AnteDecorator{
    NewSetUpContextDecorator(), // outermost AnteDecorator. SetUpContext must be called first
    // INSERT HERE
    NewExtensionOptionsDecorator(options.ExtensionOptionChecker),
    NewValidateBasicDecorator(),
    NewTxTimeoutHeightDecorator(),
    NewValidateMemoDecorator(options.AccountKeeper),
    NewConsumeGasForTxSizeDecorator(options.AccountKeeper),
    NewDeductFeeDecorator(options.AccountKeeper, options.BankKeeper, options.FeegrantKeeper, options.TxFeeChecker),
    NewSetPubKeyDecorator(options.AccountKeeper), // SetPubKeyDecorator must be called before all signature verification decorators
    NewValidateSigCountDecorator(options.AccountKeeper),
    NewSigGasConsumeDecorator(options.AccountKeeper, options.SigGasConsumer),
    NewSigVerificationDecorator(options.AccountKeeper, options.SignModeHandler),
    NewIncrementSequenceDecorator(options.AccountKeeper),
}

不幸的是，该顺序不起作用。这是因为还有其他 AnteHandler，例如 SigGasConsumeDecorator 和 ConsumeGasForTxSizeDecorator，它们会修改帐户余额。通过将我们的装饰器放置在链的开头，我们可能会通过检查，然后在到达装饰器链的末尾并开始交易执行之前扣除签名者的余额。因此，我们打算确保的不变性可能不再成立，从而使我们的检查无用。

最简单的 "缓解" 方法是将我们的装饰器向下移动到链列表中。我们轻描淡写地说，因为重要的是要考虑各种因素，例如是否允许嵌套 msgs（例如，是否存在 authz 模块），因为仅凭此预防措施可能不足以完全解决问题。如果不全面了解整个系统，仍有可能在 AnteHandle 链中犯错。

真实世界的例子

AnteHandler 滥用的一个例子是在 Cronos 合约中利用的资金盗窃漏洞。

在这种情况下，msgs 通过用户控制的 ExtensionOptionsEthereumTx 选项多路复用到不同的 AnteHandler 集合。但是，由于缺少 tx 验证，如果 MsgEthereumTx 未指定 ExtensionOptionsEthereumTx，则会将其路由到非 Ethereum AnteHandler，从而无法按预期向用户收取费用。因此，攻击者可以利用交易处理结束时的费用退款来窃取资金。

代码片段的永久链接

func NewAnteHandler(
    ak evmtypes.AccountKeeper,
    bankKeeper evmtypes.BankKeeper,
    evmKeeper EVMKeeper,
    feeGrantKeeper authante.FeegrantKeeper,
    channelKeeper channelkeeper.Keeper,
    signModeHandler authsigning.SignModeHandler,
) sdk.AnteHandler {
    return func(
        ctx sdk.Context, tx sdk.Tx, sim bool,
    ) (newCtx sdk.Context, err error) {
        var anteHandler sdk.AnteHandler

        defer Recover(ctx.Logger(), &err)

        txWithExtensions, ok := tx.(authante.HasExtensionOptionsTx)
        if ok {
            opts := txWithExtensions.GetExtensionOptions()
            if len(opts) > 0 {
                switch typeURL := opts[0].GetTypeUrl(); typeURL {
                case "/ethermint.evm.v1.ExtensionOptionsEthereumTx":
                    // handle as *evmtypes.MsgEthereumTx

                    anteHandler = sdk.ChainAnteDecorators(
                        NewEthSetUpContextDecorator(), // outermost AnteDecorator. SetUpContext must be called first
                        ...
                        NewEthIncrementSenderSequenceDecorator(ak), // innermost AnteDecorator.
                    )

                default:
                    return ctx, stacktrace.Propagate(
                        sdkerrors.Wrap(sdkerrors.ErrUnknownExtensionOptions, typeURL),
                        "rejecting tx with unsupported extension option",
                    )
                }

                return anteHandler(ctx, tx, sim)
            }
        }

        // SHOULD CHECK TX IS NOT MsgEthereumTx HERE

        switch tx.(type) {
        case sdk.Tx:
            anteHandler = sdk.ChainAnteDecorators(
                authante.NewSetUpContextDecorator(), // outermost AnteDecorator. SetUpContext must be called first
                 ...
                authante.NewIncrementSequenceDecorator(ak), // innermost AnteDecorator
            )
        default:
            return ctx, stacktrace.Propagate(
                sdkerrors.Wrapf(sdkerrors.ErrUnknownRequest, "invalid transaction type: %T", tx),
                "transaction is not an SDK tx",
            )
        }

        return anteHandler(ctx, tx, sim)
    }
}
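The check marked as missing in the snippet above could take roughly the following shape. This is a sketch under stated assumptions: `Msg`, `EthereumTxMsg`, `SendMsg`, and `RejectEthereumMsgs` are hypothetical simplified types and names, and the actual upstream patch may look different.

```go
package main

import (
	"errors"
	"fmt"
)

// Msg is a hypothetical stand-in for sdk.Msg.
type Msg interface{ Route() string }

// EthereumTxMsg stands in for *evmtypes.MsgEthereumTx.
type EthereumTxMsg struct{}

func (EthereumTxMsg) Route() string { return "evm" }

// SendMsg stands in for an ordinary Cosmos message.
type SendMsg struct{}

func (SendMsg) Route() string { return "bank" }

// RejectEthereumMsgs is the kind of check the non-Ethereum branch lacked:
// an Ethereum message must never reach the plain Cosmos decorators.
func RejectEthereumMsgs(msgs []Msg) error {
	for _, m := range msgs {
		if _, isEth := m.(EthereumTxMsg); isEth {
			return errors.New("MsgEthereumTx requires ExtensionOptionsEthereumTx")
		}
	}
	return nil
}

func main() {
	fmt.Println(RejectEthereumMsgs([]Msg{SendMsg{}}))
	fmt.Println(RejectEthereumMsgs([]Msg{SendMsg{}, EthereumTxMsg{}}))
}
```

The point of the design is that routing is decided by the message content itself, not only by an option the sender is free to omit.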

Other examples of AnteHandler misuse include further bypassable checks, loss of funds, and incorrect data passing between blockchains.

## Errors? Crashes? I Can Handle It

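Smart contract developers are used to handling errors loosely. This is acceptable because most underlying blockchains revert all state changes when execution fails.

This design is quite ingenious and lets developers write code in a fairly lazy way. For example, the following code works perfectly well. If `k.keeper.TotalReward()` returns zero, the division panics, the msg execution is simply rolled back, and it is as if nothing had happened.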

func (k msgServer) AllocateReward(
    goCtx context.Context,
    msg *types.MsgAllocateReward,
) (*types.MsgAllocateRewardResponse, error) {
    // Integer division panics if TotalReward() returns zero.
    RewardPerShare := k.keeper.Shares() / k.keeper.TotalReward()
    k.keeper.DistributeReward(RewardPerShare)

    return &types.MsgAllocateRewardResponse{}, nil
}

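However, the same assumption does not always hold. Some parts of Cosmos, such as `PreBlocker`, `BeginBlocker`, and `EndBlocker`, are not protected by this error-handling mechanism. If we move the reward allocation logic into `BeginBlocker` so that rewards are distributed automatically at the start of every block, the crash caused by a division by zero will halt the chain.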

func BeginBlocker(ctx context.Context, keeper keeper.Keeper) error {

    RewardPerShare := keeper.Shares() /  keeper.TotalReward()
    keeper.DistributeReward(RewardPerShare)

    return nil
}
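One way to keep such a hook from halting the chain is to validate inputs explicitly and to recover from panics around the risky logic. The following is a minimal sketch, assuming hypothetical helpers `rewardPerShare`, `safely`, and `beginBlocker` that take plain integers instead of a keeper.

```go
package main

import (
	"errors"
	"fmt"
)

// rewardPerShare mirrors the division from the example above, but returns
// an error instead of panicking when there is nothing to divide by.
func rewardPerShare(shares, totalReward uint64) (uint64, error) {
	if totalReward == 0 {
		return 0, errors.New("total reward is zero, skipping distribution")
	}
	return shares / totalReward, nil
}

// safely runs fn and converts any panic into an ordinary error.
func safely(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	fn()
	return nil
}

// beginBlocker is a hypothetical block hook that never propagates failure:
// a bad round of rewards is skipped instead of halting the chain.
func beginBlocker(shares, totalReward uint64) (distributed uint64) {
	_ = safely(func() {
		rps, err := rewardPerShare(shares, totalReward)
		if err != nil {
			return // a real module would log this and move on
		}
		distributed = rps
	})
	return distributed
}

func main() {
	fmt.Println(beginBlocker(10, 0)) // nothing distributed, no panic
	fmt.Println(beginBlocker(10, 5))
}
```

In a real module the guarded logic would typically also run on a cached context (for example via `ctx.CacheContext()`), so that partial writes made before a failure are discarded rather than committed.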

### Real-World Examples

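Recently, developers have become increasingly aware of unprotected ABCI functions, but that has not stopped DoS bugs from appearing. So what is the problem?

The problem lies in an incomplete understanding of utility functions. The example here implements a bridge that mints wrapped BTC tokens in `PreBlocker` whenever bridge events are observed. Notably, an error returned by `bankKeeper.SendCoinsFromModuleToAccount` bubbles up through `PreBlocker` and halts the chain. It turns out that an attacker can force `SendCoinsFromModuleToAccount` to return an error by setting `recipient` to some `BlockedAddr`, which leaves the code vulnerable to DoS.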


func (pbh *PreBlockHandler) PreBlocker() sdk.PreBlocker {
    return func(
        ctx sdk.Context,
        req *cmtabci.RequestFinalizeBlock,
    ) (*sdk.ResponsePreBlock, error) {
        ...
        err := pbh.bridgeKeeper.AcceptAssetsLocked(ctx, events)
        if err != nil {
            return nil, fmt.Errorf("cannot accept AssetsLocked events: %w", err)
        }
        ...
    }
}

func (k Keeper) AcceptAssetsLocked(
    ctx sdk.Context,
    events types.AssetsLockedEvents,
) error {
    ...
    for _, event := range events {
        recipient, err := sdk.AccAddressFromBech32(event.Recipient)
        if err != nil {
            return fmt.Errorf("failed to parse recipient address: %w", err)
        }

        if bytes.Equal(event.TokenBytes(), sourceBTCToken) {
            err = k.mintBTC(ctx, recipient, event.Amount)
            if err != nil {
                return fmt.Errorf(
                    "failed to mint BTC for event %v: %w",
                    event.Sequence,
                    err,
                )
            }
        } else {
            ...
        }
    }
    ...
}

func (k Keeper) mintBTC(
    ctx sdk.Context,
    recipient sdk.AccAddress,
    amount math.Int,
) error {
    ...
    err = k.bankKeeper.SendCoinsFromModuleToAccount(
        ctx,
        types.ModuleName,
        recipient,
        coins,
    )
    if err != nil {
        return fmt.Errorf("failed to send coins: %w", err)
    }
    ...
}

func (k BaseKeeper) SendCoinsFromModuleToAccount(
    ctx context.Context, senderModule string, recipientAddr sdk.AccAddress, amt sdk.Coins,
) error {
    ...
    if k.BlockedAddr(recipientAddr) {
        return errorsmod.Wrapf(sdkerrors.ErrUnauthorized, "%s is not allowed to receive funds", recipientAddr)
    }
    ...
}

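This shows that even well-known bug classes resurface from time to time because of unanticipated invariant violations. Other examples include incorrect decimal handling in the `group` module.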
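One possible way to avoid this class of DoS is to treat a failing event as a per-event rejection rather than as a block-level error. The sketch below assumes hypothetical simplified types (`Event`, `Bank`) and a policy of skipping failed events. Whether skipping is acceptable depends on the bridge design, since skipped deposits need some other recovery path.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a hypothetical, simplified bridge deposit.
type Event struct {
	Sequence  uint64
	Recipient string
	Amount    uint64
}

// Bank is a toy bank keeper with a set of blocked addresses.
type Bank struct {
	Blocked  map[string]bool
	Balances map[string]uint64
}

// Mint fails for blocked recipients, like SendCoinsFromModuleToAccount.
func (b *Bank) Mint(recipient string, amount uint64) error {
	if b.Blocked[recipient] {
		return errors.New(recipient + " is not allowed to receive funds")
	}
	b.Balances[recipient] += amount
	return nil
}

// AcceptAssetsLocked processes every event and records failures instead of
// returning them, so one attacker-chosen recipient cannot abort the block.
func AcceptAssetsLocked(b *Bank, events []Event) (rejected []uint64) {
	for _, ev := range events {
		if err := b.Mint(ev.Recipient, ev.Amount); err != nil {
			rejected = append(rejected, ev.Sequence)
		}
	}
	return rejected
}

func main() {
	bank := &Bank{
		Blocked:  map[string]bool{"module-account": true},
		Balances: map[string]uint64{},
	}
	events := []Event{
		{Sequence: 1, Recipient: "alice", Amount: 5},
		{Sequence: 2, Recipient: "module-account", Amount: 7},
		{Sequence: 3, Recipient: "bob", Amount: 1},
	}
	fmt.Println(AcceptAssetsLocked(bank, events), bank.Balances)
}
```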

## Same, Same... but Different

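Cosmos exposes several consensus-level interfaces, such as `PrepareProposal`, `ProcessProposal`, `ExtendVote`, and `VerifyVoteExtension`. These ABCI methods let developers customize how blocks are built and inject supplementary data into each block.

The two best-known attack surfaces are:

- Liveness failures caused by over-validation in `ProcessProposal` (`VerifyVoteExtension`), which rejects the output of `PrepareProposal` (`ExtendVote`).
- Acceptance of malicious proposals and vote extensions that were not created through `PrepareProposal` (`ExtendVote`), caused by under-validation in `ProcessProposal` (`VerifyVoteExtension`).

In essence, any discrepancy between the two handlers of a pair can lead to a security issue.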
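A common way to keep a handler pair in agreement is to route both sides through one shared validation function. The following is a minimal sketch with hypothetical names (`validInjection`, `prepareProposal`, `processProposal`) and an arbitrary size rule, not the real ABCI signatures.

```go
package main

import "fmt"

const maxInjectedBytes = 8 // hypothetical limit for this sketch

// validInjection is the single source of truth for what a proposal may carry.
func validInjection(data []byte) bool {
	return len(data) > 0 && len(data) <= maxInjectedBytes
}

// prepareProposal only emits data that validInjection accepts.
func prepareProposal(candidates [][]byte) [][]byte {
	var out [][]byte
	for _, c := range candidates {
		if validInjection(c) {
			out = append(out, c)
		}
	}
	return out
}

// processProposal accepts exactly what prepareProposal is able to produce.
func processProposal(proposal [][]byte) bool {
	for _, p := range proposal {
		if !validInjection(p) {
			return false
		}
	}
	return true
}

func main() {
	candidates := [][]byte{[]byte("ok"), {}, []byte("way too long data")}
	proposal := prepareProposal(candidates)
	fmt.Println(len(proposal), processProposal(proposal), processProposal(candidates))
}
```

Because both handlers call the same predicate, an honest proposal can never be rejected for over-validation, and anything the proposer would not have produced is refused.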

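There are also lesser-known variants of these issues. One instance is the validation of VoteExtensions in `PrepareProposal`. To provide context, we first introduce CometBFT consensus and vote extensions.

Consensus starts with the leader creating a proposal and broadcasting it to every validator. The validators then vote on whether to accept the proposal. During the voting phase, `ExtendVote` is called to attach additional data to the vote. Once a validator has collected enough valid votes that pass `VerifyVoteExtension`, the proposal is considered accepted and can be committed. After the proposal is committed, a new leader starts creating the next proposal, which brings us back to where we started.

So where is the attached vote extension data used? It turns out that the leader is expected to include the vote extensions from the previous round of consensus in its proposal. It is tempting to conclude that all vote extensions accepted by an honest leader have passed the `VerifyVoteExtension` check and are therefore valid, so we could inject all of them directly into our proposal.

Unfortunately, CometBFT accepts late precommits directly, without checking them through `VerifyVoteExtension`. This exposes a time window in which a Byzantine validator can sneak malicious votes into the next leader's cache and trick the leader into including invalid vote extensions in its proposal.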

func (cs *State) addVote(vote *types.Vote, peerID p2p.ID) (added bool, err error) {
    ...

    // A precommit for the previous height?
    // These come in while we wait timeoutCommit
    if vote.Height+1 == cs.Height && vote.Type == types.PrecommitType {
        ...
        // Late precommits are not checked by VerifyVoteExtension
        added, err = cs.LastCommit.AddVote(vote)
        ...
        return added, err
    }
    extEnabled := cs.state.ConsensusParams.Feature.VoteExtensionsEnabled(vote.Height)
    if extEnabled {
        ...
        if vote.Type == types.PrecommitType && !vote.BlockID.IsNil() &&
            !bytes.Equal(vote.ValidatorAddress, myAddr) { // Skip the VerifyVoteExtension call if the vote was issued by this validator.
            ...
            err := cs.blockExec.VerifyVoteExtension(context.TODO(), vote)
            ...
        }
    } else {
        ...
    }
    ...
}

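Developers who are unaware of these subtle details of vote extension handling in CometBFT can easily neglect to implement protections against such attacks.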

### Real-World Examples

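An example of the bug we just described is shown here. `PrepareProposal` only checks, through `ValidateVoteExtensions`, that each vote was correctly signed by a validator; it does not validate the votes against the rules in `VerifyVoteExtension`. This leaves the leader prone to accepting malicious vote extensions into its proposal.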


func (h *Handlers) PrepareProposalHandler() sdk.PrepareProposalHandler {
    return func(ctx sdk.Context, req *abcitypes.RequestPrepareProposal) (*abcitypes.ResponsePrepareProposal, error) {
        ...
        var injection []byte
        if req.Height > ctx.ConsensusParams().Abci.VoteExtensionsEnableHeight && collectSigs {
            //Fails to verify vote extensions with VerifyVoteExtension rules
            err := baseapp.ValidateVoteExtensions(ctx, h.stakingKeeper, req.Height, ctx.ChainID(), req.LocalLastCommit)
            if err != nil {
                return nil, err
            }
            injection, err = json.Marshal(req.LocalLastCommit)
            if err != nil {
                h.logger.Error("failed to marshal extended votes", "err", err)
                return nil, err
            }
            ...
        }
        defaultRes, err := h.defaultPrepareProposal(ctx, req)
        ...
        proposalTxs := defaultRes.Txs
        if injection != nil {
            proposalTxs = append([][]byte{injection}, proposalTxs...)
            h.logger.Debug("injected local last commit", "height", req.Height)
        }
        return &abcitypes.ResponsePrepareProposal{
            Txs: proposalTxs,
        }, nil
    }
}
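A defensive proposer can re-apply the application's own `VerifyVoteExtension` rules to every vote it receives before injecting them. The sketch below uses hypothetical simplified types (`Vote`) and an arbitrary placeholder rule. It illustrates the idea only and is not the fix applied to the project above.

```go
package main

import "fmt"

// Vote is a hypothetical, simplified extended commit vote.
type Vote struct {
	Validator string
	Extension []byte
}

// verifyVoteExtension stands in for the application's VerifyVoteExtension
// rules. The rule here (exactly 8 bytes) is an arbitrary placeholder.
func verifyVoteExtension(ext []byte) bool {
	return len(ext) == 8
}

// filterLastCommit re-applies the application rules to every vote received
// from the consensus engine and keeps only the ones that pass.
func filterLastCommit(votes []Vote) []Vote {
	kept := make([]Vote, 0, len(votes))
	for _, v := range votes {
		if verifyVoteExtension(v.Extension) {
			kept = append(kept, v)
		}
	}
	return kept
}

func main() {
	votes := []Vote{
		{Validator: "honest", Extension: []byte("12345678")},
		{Validator: "late-byzantine", Extension: []byte("bad")},
	}
	for _, v := range filterLastCommit(votes) {
		fmt.Println(v.Validator)
	}
}
```

Signature validation is still required on top of this, and dropping votes changes how much voting power the injected set represents, so a real handler must decide how to treat a commit that no longer meets its threshold.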

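Beyond the more complex variants, plain validation mismatches remain common even though they are a well-known attack surface. This stems from the various obscure checks hidden in CometBFT that can reject a Proposal (Vote). For example, one commit fixed a bug in which `PrepareProposal` could return a Proposal larger than `MaxTxBytes`, which CometBFT would later reject.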
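A proposer that injects extra data has to budget for it against the size limit. The following is a minimal sketch, assuming a hypothetical helper `fitTxs` that measures transactions by raw byte length. The consensus engine may account for additional encoding overhead, so a real implementation should leave headroom or use the engine's own size calculation.

```go
package main

import "fmt"

// fitTxs keeps transactions in order until adding the next one would push
// the total size past maxTxBytes. Raw byte length is used as the size here,
// which is an assumption of this sketch.
func fitTxs(txs [][]byte, maxTxBytes int64) [][]byte {
	var (
		out   [][]byte
		total int64
	)
	for _, tx := range txs {
		size := int64(len(tx))
		if total+size > maxTxBytes {
			break
		}
		total += size
		out = append(out, tx)
	}
	return out
}

func main() {
	injection := []byte("vote-ext") // injected data counts against the limit too
	txs := [][]byte{injection, []byte("tx-1"), []byte("tx-2")}
	fmt.Println(len(fitTxs(txs, 12)))
}
```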

## Keeper of the Keys

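State (persistent storage) is another key component of a state machine. Cosmos relies on a custom key-value store called KVStore to handle state efficiently. In a KVStore, both keys and values are plain byte slices, so developers must take care of serializing and deserializing more complex structures themselves.

Let's start with a scenario in which we need to persist the `PositionMap` structure below.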

type VaultId uint64
type Username string
type PositionName string
type Position struct {
    data []byte
}
type PositionMap map[VaultId]map[Username]map[PositionName]Position

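Since `PositionMap` has three layers of keys, we should try to serialize the three map keys into a hierarchically searchable storage key. The simplest approach is to convert all fields to strings and concatenate them.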

storageKey := fmt.Sprintf(
    "%d%s%s",
    vaultId,
    username,
    positionName,
)

Plain concatenation makes the storage key easy to construct, but this implementation is clearly prone to key collisions.

vaultId = 1,  username = "2a", positionName = "b"
    => storageKey = "12ab"

vaultId = 12, username = "a",  positionName = "b"
    => storageKey = "12ab"

So how do we mitigate this? Perhaps we can add a separator between the fields, along these lines:

const (
    Seperator = "|"
)

storageKey := fmt.Sprintf(
    "%d%s%s%s%s",
    vaultId,
    Seperator,
    username,
    Seperator,
    positionName,
)

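Inserting a separator helps prevent most accidental collisions, but does it solve the problem completely?

Sadly, it does not. Because `username` and `positionName` are both strings that may contain arbitrary characters, including the separator itself, collisions can still occur.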

vaultId = 1, username = "a|", positionName = "b"
    => storageKey = "1|a||b"

vaultId = 1, username = "a",  positionName = "|b"
    => storageKey = "1|a||b"

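To mitigate this further, we can encode every free-form field so that the separator can never appear inside a single field, which makes field injection impossible.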

const (
    Seperator = "|"
)

usernameEncoded := make(
    []byte,
    hex.EncodedLen(len(username)),
)
hex.Encode(usernameEncoded, []byte(username))

positionNameEncoded := make(
    []byte,
    hex.EncodedLen(len(positionName)),
)
hex.Encode(positionNameEncoded, []byte(positionName))

storageKey := fmt.Sprintf(
    "%d%s%s%s%s",
    vaultId,
    Seperator,
    usernameEncoded,
    Seperator,
    positionNameEncoded,
)

We did it. We have finally eliminated all potential `storageKey` collisions.
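The effect can be checked with a small standalone program. This is a sketch of the same hex-plus-separator scheme, where `storageKey` and `separator` are names chosen for the sketch. It feeds in the two inputs that previously collided.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

const separator = "|"

// storageKey hex-encodes the free-form fields so they can never contain
// the separator, then joins all fields with it.
func storageKey(vaultID uint64, username, positionName string) string {
	return fmt.Sprintf(
		"%d%s%s%s%s",
		vaultID,
		separator,
		hex.EncodeToString([]byte(username)),
		separator,
		hex.EncodeToString([]byte(positionName)),
	)
}

func main() {
	fmt.Println(storageKey(1, "a|", "b")) // prints 1|617c|62
	fmt.Println(storageKey(1, "a", "|b")) // prints 1|61|7c62
}
```

Hex encoding doubles the length of the string fields. A length-prefixed encoding is a common alternative when key size matters.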

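So far we have focused on storing a single structure. In real applications, we often need to persist several structures as state.

In the Cosmos framework, it is common for each Module to own a set of KVStores and to have a dedicated Keeper manage access to that storage. It is also important to note that KVStores should be independent of one another, which spares developers from worrying about key collisions between different Modules.

That said, what if we need to maintain several structures in the same KVStore?

To demonstrate this case, we introduce the `AddressMap` structure, which will be stored in the same KVStore we used earlier.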

type VaultId uint64
type Username string

type PositionName string
type Position struct {
    data []byte
}
type PositionMap map[VaultId]map[Username]map[PositionName]Position

type AddressName string
type Address struct {
    data []byte
}
type AddressMap map[VaultId]map[Username]map[AddressName]Address

const (
    Seperator = "|"
)

func PositionMapKey(
    vaultId uint64,
    username, positionName []byte,
) (key []byte) {
    usernameEncoded := make(
        []byte,
        hex.EncodedLen(len(username)),
    )
    hex.Encode(usernameEncoded, username)

    positionNameEncoded := make(
        []byte,
        hex.EncodedLen(len(positionName)),
    )
    hex.Encode(positionNameEncoded, positionName)

    return []byte(fmt.Sprintf(
        "%d%s%s%s%s",
        vaultId,
        Seperator,
        usernameEncoded,
        Seperator,
        positionNameEncoded,
    ))
}

func AddressMapKey(
    vaultId uint64,
    username, addressName []byte,
) (key []byte) {
    usernameEncoded := make(
        []byte,
        hex.EncodedLen(len(username)),
    )
    hex.Encode(usernameEncoded, username)

    addressNameEncoded := make(
        []byte,
        hex.EncodedLen(len(addressName)),
    )
    hex.Encode(addressNameEncoded, addressName)

    return []byte(fmt.Sprintf(
        "%d%s%s%s%s",
        vaultId,
        Seperator,
        usernameEncoded,
        Seperator,
        addressNameEncoded,
    ))
}

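Unfortunately, once several kinds of entries live in the same KVStore, the previous implementation no longer guarantees key uniqueness. It still prevents key collisions within each individual structure, but it does not prevent collisions across structures.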

vaultId = 1, username = "a", positionName = "b"
    => PositionMapKey = "1|61|62"

vaultId = 1, username = "a", addressName = "b"
    => AddressMapKey = "1|61|62"

To prevent this, add a structure-specific prefix at the start of every key to act as a domain separator.

const (
    Seperator = "|"
    PositionMapPrefix = "\x01"
    AddressMapPrefix = "\x02"
)

func PositionMapKey(
    vaultId uint64,
    username, positionName []byte,
) (key []byte) {
    usernameEncoded := make(
        []byte,
        hex.EncodedLen(len(username)),
    )
    hex.Encode(usernameEncoded, username)

    positionNameEncoded := make(
        []byte,
        hex.EncodedLen(len(positionName)),
    )
    hex.Encode(positionNameEncoded, positionName)

    return []byte(fmt.Sprintf(
        "%s%d%s%s%s%s",
        PositionMapPrefix,
        vaultId,
        Seperator,
        usernameEncoded,
        Seperator,
        positionNameEncoded,
    ))
}

func AddressMapKey(
    vaultId uint64,
    username, addressName []byte,
) (key []byte) {
    usernameEncoded := make(
        []byte,
        hex.EncodedLen(len(username)),
    )
    hex.Encode(usernameEncoded, username)

    addressNameEncoded := make(
        []byte,
        hex.EncodedLen(len(addressName)),
    )
    hex.Encode(addressNameEncoded, addressName)

    return []byte(fmt.Sprintf(
        "%s%d%s%s%s%s",
        AddressMapPrefix,
        vaultId,
        Seperator,
        usernameEncoded,
        Seperator,
        addressNameEncoded,
    ))
}

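We now have a correct example of how to serialize storage keys.

However, there is more to storage than that. The storage should keep supporting the functionality of the original structure. For a `map`, the data should still be retrievable by its original keys.

Consider a case in which we want to retrieve every `map[Username]map[PositionName]Position` associated with a given `VaultId` from storage. How can we do that safely?

Fortunately, the Cosmos SDK provides an API for fetching all entries that share a `storageKey` prefix. Below is an attempt to fetch the data for a `vaultId`: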

func FetchPositionMapWithVaultId(
    vaultId uint64,
) (map[Username]map[PositionName]Position, error) {
    values := map[Username]map[PositionName]Position{}
    i := sdk.KVStorePrefixIterator(
        kvStore,
        []byte(fmt.Sprintf("%s%d", PositionMapPrefix, vaultId)),
    )
    defer i.Close()
    for ; i.Valid(); i.Next() {
        // Key layout: <prefix><vaultId>|<hex(username)>|<hex(positionName)>
        k := bytes.Split(i.Key(), []byte(Seperator))
        if len(k) != 3 {
            return nil, fmt.Errorf("malformed key %q", i.Key())
        }

        username := make([]byte, hex.DecodedLen(len(k[1])))
        if _, err := hex.Decode(username, k[1]); err != nil {
            return nil, err
        }

        positionName := make([]byte, hex.DecodedLen(len(k[2])))
        if _, err := hex.Decode(positionName, k[2]); err != nil {
            return nil, err
        }

        if _, ok := values[Username(username)]; !ok {
            values[Username(username)] = make(map[PositionName]Position)
        }

        values[Username(username)][PositionName(positionName)] = Position{
            data: i.Value(),
        }
    }
    return values, nil
}

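As you may have noticed, this implementation suffers from a field malleability issue. Consider the case where `vaultId = 1` and `vaultId = 10` both exist. If we try to fetch the data under `vaultId = 1`, all entries under `vaultId = 10` are returned as well, simply because "1" is a prefix of "10". To fix this, the separator must once again be appended to the iterator prefix.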

i := sdk.KVStorePrefixIterator(
    kvStore,
    []byte(fmt.Sprintf("%s%d%s", PositionMapPrefix, vaultId, Seperator)),
)
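The difference between the two prefixes can be seen with a toy stand-in for the store. This is a minimal sketch in which `prefixScan` is a hypothetical helper that filters an in-memory key list and is not an SDK API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// prefixScan is a toy stand-in for a KVStore prefix iterator: it returns
// every key that starts with prefix, in sorted order.
func prefixScan(keys []string, prefix string) []string {
	var out []string
	for _, k := range keys {
		if strings.HasPrefix(k, prefix) {
			out = append(out, k)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// One entry under vaultId = 1 and one under vaultId = 10,
	// both carrying the \x01 structure prefix.
	keys := []string{"\x011|61|62", "\x0110|61|62"}

	fmt.Println(len(prefixScan(keys, "\x011")))  // prints 2: vault 10 leaks in
	fmt.Println(len(prefixScan(keys, "\x011|"))) // prints 1: only vault 1
}
```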

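At first, spotting these serialization issues seems easy. But as data structures and KVStore usage grow more complex, developers can inadvertently overlook storage key parsing bugs.

Storage keys remain a tedious and persistent problem when building on Cosmos. Treat them with care during development to keep bugs from creeping into the code.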

### Real-World Examples

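The Cosmos SDK previously lacked protection against KVStore key collisions. Because of this oversight, developers could inadvertently create two KVStores that were not independent of each other.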


func NewKVStoreKeys(names ...string) map[string]*KVStoreKey {
    keys := make(map[string]*KVStoreKey)
    for _, name := range names {
        keys[name] = NewKVStoreKey(name)
    }

    return keys
}

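Thanks to the efforts of the core developers, a check is now enforced: the Cosmos SDK refuses to run if any KVStore key is a prefix of another. This relieves developers from worrying about key collisions at the KVStore level.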
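The idea behind such a check can be sketched in a few lines. The helper below, `checkNoPrefix`, is a hypothetical illustration of the rule and not the SDK's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// checkNoPrefix returns an error if any store name is a prefix of another
// one (exact duplicates included).
func checkNoPrefix(names []string) error {
	for i, a := range names {
		for j, b := range names {
			if i != j && strings.HasPrefix(b, a) {
				return fmt.Errorf("store key %q is a prefix of %q", a, b)
			}
		}
	}
	return nil
}

func main() {
	fmt.Println(checkNoPrefix([]string{"bank", "staking"}))
	fmt.Println(checkNoPrefix([]string{"bank", "bankv2"}))
}
```

Running a check like this at application startup turns a silent state-corruption risk into an immediate and obvious failure.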

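Other storage key issues, such as subtle bugs in the Cosmos SDK itself, have led to incorrect iterator behavior.

It is worth noting that since Cosmos SDK v0.50, the gradual adoption of the `collections` storage helpers has made it harder to write buggy code. This shows how important it is to keep up with the latest SDK development in order to benefit from architectural security improvements.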

## Conclusion

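Original article: osec.io/blog/2025-06-10-...

Translated by the 登链社区 AI assistant, which translates high-quality English articles for readers; please bear with any passages that read awkwardly.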
