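A line-by-line reading of Solana's Hello World program

biakia
Published 2023-03-14 15:13

A line-by-line reading of Solana's introductory program

This article first appeared on CSDN, where it drew few readers. The atmosphere here seems better, so I polished it slightly and reposted it.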

# 1. Project source
https://github.com/solana-labs/example-helloworld/tree/master/src/program-rust

```rust
use borsh::{BorshDeserialize, BorshSerialize};
use solana_program::{
    account_info::{next_account_info, AccountInfo},
    entrypoint,
    entrypoint::ProgramResult,
    msg,
    program_error::ProgramError,
    pubkey::Pubkey,
};

/// Define the type of state stored in accounts
#[derive(BorshSerialize, BorshDeserialize, Debug)]
pub struct GreetingAccount {
    /// number of greetings
    pub counter: u32,
}

// Declare and export the program's entrypoint
entrypoint!(process_instruction);

// Program entrypoint's implementation
pub fn process_instruction(
    program_id: &Pubkey, // Public key of the account the hello world program was loaded into
    accounts: &[AccountInfo], // The account to say hello to
    _instruction_data: &[u8], // Ignored, all helloworld instructions are hellos
) -> ProgramResult {
    msg!("Hello World Rust program entrypoint");

    // Iterating accounts is safer than indexing
    let accounts_iter = &mut accounts.iter();

    // Get the account to say hello to
    let account = next_account_info(accounts_iter)?;

    // The account must be owned by the program in order to modify its data
    if account.owner != program_id {
        msg!("Greeted account does not have the correct program id");
        return Err(ProgramError::IncorrectProgramId);
    }

    // Increment and store the number of times the account has been greeted
    let mut greeting_account = GreetingAccount::try_from_slice(&account.data.borrow())?;
    greeting_account.counter += 1;
    greeting_account.serialize(&mut &mut account.data.borrow_mut()[..])?;

    msg!("Greeted {} time(s)!", greeting_account.counter);

    Ok(())
}
```
This is about the simplest example program for Solana. Below I walk through the source line by line.

# 2. Line-by-line walkthrough
```rust
use borsh::{BorshDeserialize, BorshSerialize};
```
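Rust uses the use keyword to bring external items into scope. Here it imports BorshDeserialize and BorshSerialize from the borsh crate. These two are traits (with derive macros of the same names) for deserialization and serialization: BorshDeserialize turns binary data back into a struct, and BorshSerialize turns a struct into binary data.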
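
To make the byte layout concrete, here is a minimal sketch that does by hand what the derived Borsh code does for a struct with a single u32 field. It uses only the standard library and relies on one fact from the Borsh format: a u32 is written as 4 little-endian bytes. The names Greeting, to_bytes and from_bytes are made up for this illustration and are not part of the borsh crate.

```rust
/// Hand-rolled stand-in for the one-field `GreetingAccount` (illustrative only).
#[derive(Debug, PartialEq)]
pub struct Greeting {
    pub counter: u32,
}

impl Greeting {
    /// Encode the struct as Borsh lays out a single u32: 4 little-endian bytes.
    pub fn to_bytes(&self) -> [u8; 4] {
        self.counter.to_le_bytes()
    }

    /// Decode from exactly 4 bytes; any other length is rejected.
    pub fn from_bytes(bytes: &[u8]) -> Option<Greeting> {
        if bytes.len() != 4 {
            return None;
        }
        let mut raw = [0u8; 4];
        raw.copy_from_slice(bytes);
        Some(Greeting { counter: u32::from_le_bytes(raw) })
    }
}
```

For example, a counter of 258 (0x0102) encodes to the bytes 02 01 00 00, and decoding those bytes gives 258 back.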
```rust
use solana_program::{
    account_info::{next_account_info, AccountInfo},
    entrypoint,
    entrypoint::ProgramResult,
    msg,
    program_error::ProgramError,
    pubkey::Pubkey,
};
```
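solana_program is Solana's official SDK crate for on-chain programs; it provides the data structures and helpers needed to write one. AccountInfo in the account_info module represents Solana's notion of an account. In Solidity, a contract holds both logic (its functions) and data (structs, mappings and so on), so code and state live together. A Solana program holds only logic; the data it works on is passed in, and what is passed in are accounts. You can think of an account as a file: each user has their own files, and whoever calls the program must hand over the files to operate on. Solana is designed this way for performance: when transactions touch different files, they can in principle be executed in parallel, which raises TPS considerably, whereas the EVM executes transactions serially.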
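
The "different files can be processed in parallel" idea can be sketched as a conflict check. This is a deliberately simplified model, not Solana's scheduler: TxAccess, can_run_in_parallel and access are hypothetical names, and accounts are identified by plain strings. The rule it encodes is that two transactions may run side by side when neither one writes an account the other reads or writes.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified view of a transaction: which accounts it reads and writes.
pub struct TxAccess {
    pub reads: HashSet<&'static str>,
    pub writes: HashSet<&'static str>,
}

/// Two transactions can run side by side when neither writes an account
/// that the other one reads or writes.
pub fn can_run_in_parallel(a: &TxAccess, b: &TxAccess) -> bool {
    let a_hits_b = a.writes.iter().any(|k| b.reads.contains(k) || b.writes.contains(k));
    let b_hits_a = b.writes.iter().any(|k| a.reads.contains(k) || a.writes.contains(k));
    !a_hits_b && !b_hits_a
}

/// Small helper to build a `TxAccess` from string slices.
pub fn access(reads: &[&'static str], writes: &[&'static str]) -> TxAccess {
    TxAccess {
        reads: reads.iter().copied().collect(),
        writes: writes.iter().copied().collect(),
    }
}
```

Two greetings sent to different greeting accounts do not conflict under this rule, while a write and a read of the same account do.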

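next_account_info is a function. Yes, Rust lets you import a function directly; without the import you would have to write the full path starting from solana_program on every call. It is not an iterator itself: it takes a mutable reference to an iterator over AccountInfo values and returns the next one, or an error if none are left.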
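
A helper of this shape is easy to write with the standard library alone. The sketch below is not the SDK's source: next_item, NotEnoughItems and sum_first_two are illustrative names, and plain u32 values stand in for AccountInfo. It shows the pattern of turning "iterator exhausted" into an error so that callers can use the ? operator.

```rust
/// Stand-in for the "not enough accounts" error (illustrative only).
#[derive(Debug, PartialEq)]
pub struct NotEnoughItems;

/// Pull the next element out of an iterator, turning "exhausted" into an error
/// so that callers can use `?` instead of handling `Option` by hand.
pub fn next_item<I: Iterator>(iter: &mut I) -> Result<I::Item, NotEnoughItems> {
    iter.next().ok_or(NotEnoughItems)
}

/// Example caller: expects at least two leading items and adds them up.
pub fn sum_first_two(items: &[u32]) -> Result<u32, NotEnoughItems> {
    let iter = &mut items.iter();
    let first = next_item(iter)?;
    let second = next_item(iter)?;
    Ok(first + second)
}
```

Calling the helper repeatedly walks through the slice in order, which is why the program can fetch its accounts one by one without indexing.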

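entrypoint is a macro provided by Solana that declares the program's entry point; its use is explained in detail below.
entrypoint::ProgramResult is the common return type, also defined by Solana: an alias for Result<(), ProgramError>.

msg is also a macro; it writes a message to the program log, much like println.

program_error::ProgramError is an enum of common errors defined by Solana.

pubkey::Pubkey is the public key of an account, which serves as its address; to operate on an account you need its public key. Think of it as the counterpart of an address in Solidity, such as the 0x-prefixed addresses on Ethereum.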
```rust
#[derive(BorshSerialize, BorshDeserialize, Debug)]
pub struct GreetingAccount {
    /// number of greetings
    pub counter: u32,
}
```
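#[derive(BorshSerialize, BorshDeserialize, Debug)] is a derive attribute. It is sometimes compared to inheritance, but what it actually does is ask the compiler to generate implementations of the BorshSerialize, BorshDeserialize and Debug traits for the type it is attached to. With it, GreetingAccount gains serialization, deserialization and debug formatting, and we can call the corresponding methods directly instead of writing them ourselves.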
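
To see what a derive saves you, compare a derived Debug with the same trait written by hand. The sketch uses only the standard library's Debug trait (the Borsh derives work on the same principle but generate more code); Derived and Manual are illustrative type names.

```rust
use std::fmt;

/// Version 1: let the compiler generate the `Debug` implementation.
#[derive(Debug)]
pub struct Derived {
    pub counter: u32,
}

/// Version 2: the same trait written out by hand.
pub struct Manual {
    pub counter: u32,
}

impl fmt::Debug for Manual {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.debug_struct("Manual").field("counter", &self.counter).finish()
    }
}
```

Formatting either value with {:?} prints the struct name followed by its field, for example Derived { counter: 3 }.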
```rust
// Declare and export the program's entrypoint
entrypoint!(process_instruction);
```
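This line is also a macro invocation. It declares the program's entry point and takes the name of the handler function. Here is its implementation (from entrypoint.rs in solana-program 1.7.9):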
```rust
/// Declare the entry point of the program and use the default local heap
/// implementation
///
/// Deserialize the program input arguments and call the user defined
/// `process_instruction` function. Users must call this macro otherwise an
/// entry point for their program will not be created.
#[macro_export]
macro_rules! entrypoint {
    ($process_instruction:ident) => {
        /// # Safety
        #[no_mangle]
        pub unsafe extern "C" fn entrypoint(input: *mut u8) -> u64 {
            let (program_id, accounts, instruction_data) =
                unsafe { $crate::entrypoint::deserialize(input) };
            match $process_instruction(&program_id, &accounts, &instruction_data) {
                Ok(()) => $crate::entrypoint::SUCCESS,
                Err(error) => error.into(),
            }
        }
        $crate::custom_heap_default!();
        $crate::custom_panic_default!();
    };
}
```
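#[macro_export] and macro_rules! are the standard way to define and export a declarative macro in Rust; see the Rust book (https://doc.rust-lang.org/book/ch19-06-macros.html).
entrypoint is the macro's name; once defined, it is invoked as entrypoint!.

($process_instruction:ident) works much like a match arm: if the argument is an identifier (ident), the arm matches and the body after => is expanded in its place. If the argument is anything else, no arm matches and compilation fails.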
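
The same mechanism can be tried out on a much smaller macro. This is an illustrative sketch, not Solana's macro: declare_entry, entry and handle are made-up names, and the handler takes a plain u32 instead of a raw pointer. It keeps the essential shape: the macro receives the name of a function and generates a wrapper that maps the function's Result to a numeric code.

```rust
/// Illustrative macro (not the Solana one): takes the *name* of a handler
/// function and generates a wrapper that maps its `Result` to a numeric code.
macro_rules! declare_entry {
    ($handler:ident) => {
        pub fn entry(input: u32) -> u64 {
            match $handler(input) {
                Ok(()) => 0,
                Err(code) => code,
            }
        }
    };
}

/// Hypothetical handler: accepts even inputs, rejects odd ones with code 7.
pub fn handle(input: u32) -> Result<(), u64> {
    if input % 2 == 0 { Ok(()) } else { Err(7) }
}

// Expands to the definition of `entry`, which calls `handle`.
declare_entry!(handle);
```

Passing something that is not an identifier, such as declare_entry!(1 + 2), is rejected at compile time because no arm of the macro matches.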

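#[no_mangle] is an attribute that tells the compiler not to mangle the function's name. By default the compiler rewrites symbol names (for example by adding prefixes and hashes), and #[no_mangle] keeps the name exactly as written. It is needed here because the function below is called from outside the Rust program (think of the Solana runtime as the caller), so its name must stay fixed or the caller could not find it.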

```rust
pub unsafe extern "C" fn entrypoint(input: *mut u8) -> u64  
```
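This defines an unsafe function. It is unsafe because it works with a raw pointer: the caller must guarantee that the pointer refers to a valid input buffer, which the compiler cannot check. Rust normally places strict rules on pointer use; unsafe lifts some of them in exchange for flexibility, and the price is that bugs become easier to write. This code is written by the Solana team, though, so bugs should be rare. extern "C" means the function uses the C calling convention (ABI), so that non-Rust code can call it. entrypoint is the function's name. Its parameter is a \*mut u8, a mutable raw pointer to u8 (for raw pointers see https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html), and it returns a u64.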
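
The following sketch shows a function with the same kind of signature, together with a safe wrapper that takes responsibility for the pointer's validity. It is an illustration under simplifying assumptions: first_u64 and first_u64_checked are hypothetical names, the function only reads one number, and the #[no_mangle] attribute is left out because the function is called from Rust here.

```rust
/// Illustrative C-ABI function: reads the first 8 bytes behind `input` as a
/// little-endian u64.
///
/// # Safety
/// `input` must point to at least 8 readable bytes.
pub unsafe extern "C" fn first_u64(input: *mut u8) -> u64 {
    // `read_unaligned` avoids assuming that the buffer is 8-byte aligned.
    let raw = unsafe { (input as *const u64).read_unaligned() };
    u64::from_le(raw)
}

/// Safe wrapper that upholds the contract before crossing into `unsafe`.
pub fn first_u64_checked(buf: &mut [u8]) -> Option<u64> {
    if buf.len() < 8 {
        return None;
    }
    // SAFETY: the length check above guarantees 8 readable bytes.
    Some(unsafe { first_u64(buf.as_mut_ptr()) })
}
```

The split mirrors the real situation: the runtime promises a well-formed buffer, and the unsafe function trusts that promise instead of checking it.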

```rust
let (program_id, accounts, instruction_data) = unsafe { $crate::entrypoint::deserialize(input) };
```
This calls the deserialize function in the entrypoint module, which returns three values: program_id, accounts and instruction_data.

```rust
match $process_instruction(&program_id, &accounts, &instruction_data) 
```
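This calls process_instruction, the function defined in our helloworld program, to run the actual logic. It returns a ProgramResult, which the match inspects: on Ok the entrypoint returns $crate::entrypoint::SUCCESS, and on Err it returns the numeric code the error converts into.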
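
The error.into() call works because the error type can be converted into a u64. A minimal sketch of that pattern follows. SUCCESS is 0 as in the SDK, but DemoError, its variants and the numeric codes 100 and 1000 are invented for this example and do not match Solana's real error codes.

```rust
/// Value returned to the runtime on success (0, as in `entrypoint::SUCCESS`).
pub const SUCCESS: u64 = 0;

/// Hypothetical error enum standing in for `ProgramError`.
#[derive(Debug)]
pub enum DemoError {
    WrongOwner,
    Custom(u32),
}

/// The numeric codes below are made up for this sketch.
impl From<DemoError> for u64 {
    fn from(err: DemoError) -> u64 {
        match err {
            DemoError::WrongOwner => 100,
            DemoError::Custom(code) => 1000 + u64::from(code),
        }
    }
}

/// Same shape as the `match` inside the macro: Ok becomes SUCCESS,
/// Err becomes whatever number the error converts into.
pub fn to_exit_code(result: Result<(), DemoError>) -> u64 {
    match result {
        Ok(()) => SUCCESS,
        Err(error) => error.into(),
    }
}
```

A single integer is all that crosses the boundary back to the caller, so every error variant needs a distinct non-zero code.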

```rust
$crate::custom_heap_default!();
$crate::custom_panic_default!();
```
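These two set up the default heap allocator and the default panic handler. They matter less here and are left for the reader to explore.

Now let's see what $crate::entrypoint::deserialize(input) actually does: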
```rust
#[allow(clippy::type_complexity)]
pub unsafe fn deserialize<'a>(input: *mut u8) -> (&'a Pubkey, Vec<AccountInfo<'a>>, &'a [u8]) {
    let mut offset: usize = 0;

    // Number of accounts present

    #[allow(clippy::cast_ptr_alignment)]
    let num_accounts = *(input.add(offset) as *const u64) as usize;
    offset += size_of::<u64>();

    // Account Infos

    let mut accounts = Vec::with_capacity(num_accounts);
    for _ in 0..num_accounts {
        let dup_info = *(input.add(offset) as *const u8);
        offset += size_of::<u8>();
        if dup_info == std::u8::MAX {
            #[allow(clippy::cast_ptr_alignment)]
            let is_signer = *(input.add(offset) as *const u8) != 0;
            offset += size_of::<u8>();

            #[allow(clippy::cast_ptr_alignment)]
            let is_writable = *(input.add(offset) as *const u8) != 0;
            offset += size_of::<u8>();

            #[allow(clippy::cast_ptr_alignment)]
            let executable = *(input.add(offset) as *const u8) != 0;
            offset += size_of::<u8>();

            offset += size_of::<u32>(); // padding to u64

            let key: &Pubkey = &*(input.add(offset) as *const Pubkey);
            offset += size_of::<Pubkey>();

            let owner: &Pubkey = &*(input.add(offset) as *const Pubkey);
            offset += size_of::<Pubkey>();

            #[allow(clippy::cast_ptr_alignment)]
            let lamports = Rc::new(RefCell::new(&mut *(input.add(offset) as *mut u64)));
            offset += size_of::<u64>();

            #[allow(clippy::cast_ptr_alignment)]
            let data_len = *(input.add(offset) as *const u64) as usize;
            offset += size_of::<u64>();

            let data = Rc::new(RefCell::new({
                from_raw_parts_mut(input.add(offset), data_len)
            }));
            offset += data_len + MAX_PERMITTED_DATA_INCREASE;
            offset += (offset as *const u8).align_offset(align_of::<u128>()); // padding

            #[allow(clippy::cast_ptr_alignment)]
            let rent_epoch = *(input.add(offset) as *const u64);
            offset += size_of::<u64>();

            accounts.push(AccountInfo {
                key,
                is_signer,
                is_writable,
                lamports,
                data,
                owner,
                executable,
                rent_epoch,
            });
        } else {
            offset += 7; // padding

            // Duplicate account, clone the original
            accounts.push(accounts[dup_info as usize].clone());
        }
    }

    // Instruction data

    #[allow(clippy::cast_ptr_alignment)]
    let instruction_data_len = *(input.add(offset) as *const u64) as usize;
    offset += size_of::<u64>();

    let instruction_data = { from_raw_parts(input.add(offset), instruction_data_len) };
    offset += instruction_data_len;

    // Program Id

    let program_id: &Pubkey = &*(input.add(offset) as *const Pubkey);

    (program_id, accounts, instruction_data)
}
```
Let's go through this code line by line:
```rust
let mut offset: usize = 0; 
```
This defines an offset that is used together with the \*mut u8 raw pointer to read the data at each position.
```rust
let num_accounts = *(input.add(offset) as *const u64) as usize;  
```
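This line packs in several steps. input.add(offset) computes the position to read; offset is 0, so this is the start of the buffer. as \*const u64 casts the mutable pointer to u8 into a const pointer to u64. The * operator then dereferences it, yielding a 64-bit value, which as usize finally converts to usize. The net effect is to read the number of AccountInfo entries that were passed in.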
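
The read-then-advance pattern can be reproduced without raw pointers. This is a safe sketch rather than the SDK's code: read_u64 is a hypothetical helper that works on a byte slice, assumes little-endian encoding, and returns None instead of reading out of bounds.

```rust
use std::mem::size_of;

/// Read a little-endian u64 at `*offset` and advance the offset past it.
/// Returns `None` instead of reading out of bounds.
pub fn read_u64(buf: &[u8], offset: &mut usize) -> Option<u64> {
    let end = offset.checked_add(size_of::<u64>())?;
    let bytes = buf.get(*offset..end)?;
    let mut raw = [0u8; 8];
    raw.copy_from_slice(bytes);
    *offset = end;
    Some(u64::from_le_bytes(raw))
}
```

The pointer version in the SDK skips the bounds checks because the runtime guarantees the buffer's layout; the logic of "read a field, then move the offset by its size" is the same.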
```rust
offset += size_of::<u64>(); 
```
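This advances the offset by the size of a u64 (8 bytes), that is, past num_accounts, ready for the next field.
```rust
let mut accounts = Vec::with_capacity(num_accounts);
```
This creates a Vec with capacity for num_accounts entries.
```rust
for _ in 0..num_accounts
```
Loop once per account.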
```rust
let dup_info = *(input.add(offset) as *const u8); 
```
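As with num_accounts above, this reads the next value, this time a single u8, and names it dup_info. It is a marker byte rather than real data: when the same account is passed more than once, the later occurrences carry only this marker, which holds the index of the earlier entry, instead of a full copy. That reduces the amount of data transferred.
```rust
offset += size_of::<u8>();
```
Advance the offset.
```rust
if dup_info == std::u8::MAX
```
If dup_info is 255, this entry is not a duplicate, and the full AccountInfo data has to be read.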
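
The two meanings of the marker byte can be written down as an enum. AccountMarker and classify are illustrative names, not SDK items; the sketch only restates the rule used by the if/else in deserialize.

```rust
/// What the one-byte marker in front of each serialized account means.
#[derive(Debug, PartialEq)]
pub enum AccountMarker {
    /// 0xFF: a full account record follows.
    Full,
    /// Any other value: this entry repeats the account at that earlier index.
    DuplicateOf(usize),
}

/// Classify the marker byte the same way the `dup_info == u8::MAX` check does.
pub fn classify(marker: u8) -> AccountMarker {
    if marker == u8::MAX {
        AccountMarker::Full
    } else {
        AccountMarker::DuplicateOf(marker as usize)
    }
}
```

Because 255 is reserved for "full record", a duplicate can only refer to an index between 0 and 254.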
```rust
let is_signer = *(input.add(offset) as *const u8) != 0; 
```
Read the next u8; is_signer is true if it is non-zero and false otherwise. is_signer is a field of AccountInfo.
```rust
offset += size_of::<u8>();
```
Advance the offset.
```rust
let is_writable = *(input.add(offset) as *const u8) != 0;
```
Read the next u8; is_writable is true if it is non-zero and false otherwise. is_writable is a field of AccountInfo.
```rust
offset += size_of::<u8>();
```
Advance the offset.
```rust
let executable = *(input.add(offset) as *const u8) != 0;
```
Read the next u8; executable is true if it is non-zero and false otherwise. executable is a field of AccountInfo.
```rust
offset += size_of::<u8>();
```
Advance the offset.
```rust
offset += size_of::<u32>(); 
```
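Four u8 values have been read so far (dup_info, is_signer, is_writable and executable), and the input format keeps fields aligned to 64 bits (8 bytes). Skipping the size of a u32, that is 4 more bytes, brings the offset to an 8-byte boundary.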
```rust
let key: &Pubkey = &*(input.add(offset) as *const Pubkey); 
```
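Interpret the next bytes as a Pubkey and take a reference to it with &; the reference is assigned to key.
```rust
offset += size_of::<Pubkey>();
```
Advance the offset.
```rust
let owner: &Pubkey = &*(input.add(offset) as *const Pubkey);
```
Interpret the next bytes as a Pubkey and take a reference to it with &; the reference is assigned to owner.
```rust
offset += size_of::<Pubkey>();
```
Advance the offset.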
```rust
let lamports = Rc::new(RefCell::new(&mut *(input.add(offset) as *mut u64)));
```
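This line also takes several steps. input.add(offset) gives the raw pointer to the next field, which is cast to \*mut u64, a mutable raw pointer to u64. The * operator dereferences it and &mut immediately re-borrows the result, so the combination yields a mutable reference to the u64 in place, without copying it. That reference is wrapped in a RefCell, and the RefCell in an Rc. For Rc and RefCell see:
https://doc.rust-lang.org/book/ch15-04-rc.html
https://doc.rust-lang.org/book/ch15-05-interior-mutability.html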
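
Why wrap a mutable reference this way? Rc lets several handles share one value, and RefCell lets each handle mutate it with the borrow check done at run time. The sketch below uses the same wrapping as the lamports field on an ordinary local variable; credit_via_shared_handle is a hypothetical function written for this illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Add `amount` through one handle and read the result back through a clone,
/// showing that both handles point at the same underlying u64.
pub fn credit_via_shared_handle(balance: &mut u64, amount: u64) -> u64 {
    // Same wrapping as the `lamports` field: Rc<RefCell<&mut u64>>.
    let handle: Rc<RefCell<&mut u64>> = Rc::new(RefCell::new(balance));
    let second_handle = Rc::clone(&handle);

    // Mutate through the first handle; the borrow ends with the statement.
    **handle.borrow_mut() += amount;

    // Read through the second handle.
    let seen = **second_handle.borrow();
    seen
}
```

The change is visible both through the second handle and in the caller's original variable, since all of them refer to the same eight bytes.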

```rust
offset += size_of::<u64>();
```
Advance the offset.
```rust
let data_len = *(input.add(offset) as *const u64) as usize;
```
This reads the length of data.
```rust
offset += size_of::<u64>();
```
Advance the offset.
```rust
let data = Rc::new(RefCell::new({from_raw_parts_mut(input.add(offset), data_len)}));
```
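from_raw_parts_mut is a low-level function that builds a mutable slice over the data_len bytes starting at this position, without copying them. The slice is then wrapped in RefCell and Rc.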
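
A small experiment shows that the slice is a view into the original buffer and not a copy. zero_region is a hypothetical helper written for this sketch; unlike the SDK code, it bounds-checks the range before making the unsafe call.

```rust
use std::slice::from_raw_parts_mut;

/// Build a mutable view over `len` bytes starting at `start` inside `buf`,
/// write through the view, and let the caller observe the change in `buf`.
pub fn zero_region(buf: &mut [u8], start: usize, len: usize) -> bool {
    // Refuse ranges that would leave the buffer; the unsafe call relies on this.
    match start.checked_add(len) {
        Some(end) if end <= buf.len() => {}
        _ => return false,
    }
    // SAFETY: the range was bounds-checked, and `buf` is exclusively borrowed here.
    let view: &mut [u8] = unsafe { from_raw_parts_mut(buf.as_mut_ptr().add(start), len) };
    for byte in view.iter_mut() {
        *byte = 0;
    }
    true
}
```

This is what makes in-place updates possible: when the program later writes to account.data, it writes straight into the input buffer.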
```rust
offset += data_len + MAX_PERMITTED_DATA_INCREASE; 
```
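This skips the data itself plus MAX_PERMITTED_DATA_INCREASE, the extra space reserved after it so that the account's data can grow.
```rust
offset += (offset as *const u8).align_offset(align_of::<u128>());
```
This pads the offset up to the alignment of u128, so that the next field starts on an aligned boundary.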
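
The padding arithmetic is the usual "round up to a multiple of a power of two". The helpers below are a sketch with hypothetical names (align_up, padding_for). They operate on plain numbers, and the alignment of 8 in the example afterwards is an assumed value chosen for illustration, not a statement about what align_of::<u128>() returns on a given target.

```rust
/// Round `offset` up to the next multiple of `align` (a power of two).
pub fn align_up(offset: usize, align: usize) -> usize {
    assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

/// Number of padding bytes needed to reach that boundary.
pub fn padding_for(offset: usize, align: usize) -> usize {
    align_up(offset, align) - offset
}
```

With an alignment of 8, an offset of 13 needs 3 bytes of padding to reach 16, while an offset that is already a multiple of 8 needs none.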
```rust
let rent_epoch = *(input.add(offset) as *const u64);
```
Read rent_epoch.
```rust
offset += size_of::<u64>();
```
Advance the offset.
```rust
accounts.push(AccountInfo { key, is_signer, is_writable, lamports, data, owner, executable, rent_epoch });
```
This builds an AccountInfo from the fields read above and pushes it onto accounts.
```rust
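else {
    offset += 7; // padding

    // Duplicate account, clone the original
    accounts.push(accounts[dup_info as usize].clone());
}
```
The else branch handles a duplicate account: dup_info is used directly as the index of the earlier AccountInfo, which is then cloned. The 7 bytes of padding together with the 1-byte marker keep the entry 8 bytes long. Because lamports and data are wrapped in Rc, the clone shares the same underlying values as the original.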
```rust
let instruction_data_len = *(input.add(offset) as *const u64) as usize;
```
Read the length of instruction_data.
```rust
offset += size_of::<u64>();
```
Advance the offset.
```rust
let instruction_data = { from_raw_parts(input.add(offset), instruction_data_len) };
```
Build a read-only slice over the instruction_data bytes.
```rust
offset += instruction_data_len;
```
Advance the offset.
```rust
let program_id: &Pubkey = &*(input.add(offset) as *const Pubkey);
```
Read program_id, which comes last in the buffer.
```rust
(program_id, accounts, instruction_data) 
```
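Return the three parsed values.

That completes the tour of the entrypoint macro. In short, it does three things:

1. Parse the binary input into program_id, accounts and instruction_data

2. Call process_instruction with the parsed program_id, accounts and instruction_data

3. Check what process_instruction returns: SUCCESS on success, otherwise the corresponding error code.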
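
The first of the three steps can be sketched end to end in safe Rust for the simplest possible input, one with zero accounts. Under that assumption the buffer shown above reduces to: account count, instruction data length, instruction data, program id. ParsedInput, take_u64 and parse_input are hypothetical names; the sketch copies data out instead of borrowing it, and it rejects any input that contains accounts.

```rust
/// Simplified mirror of the three outputs of `deserialize`, owning its data.
#[derive(Debug, PartialEq)]
pub struct ParsedInput {
    pub num_accounts: u64,
    pub instruction_data: Vec<u8>,
    pub program_id: [u8; 32],
}

/// Read a little-endian u64 at `*offset` and advance past it.
fn take_u64(buf: &[u8], offset: &mut usize) -> Option<u64> {
    let bytes = buf.get(*offset..offset.checked_add(8)?)?;
    let mut raw = [0u8; 8];
    raw.copy_from_slice(bytes);
    *offset += 8;
    Some(u64::from_le_bytes(raw))
}

/// Parse `[num_accounts][instruction_data_len][instruction_data][program_id]`.
/// Only the zero-account case is handled, to keep the sketch short.
pub fn parse_input(buf: &[u8]) -> Option<ParsedInput> {
    let mut offset = 0usize;
    let num_accounts = take_u64(buf, &mut offset)?;
    if num_accounts != 0 {
        return None; // account records are out of scope for this sketch
    }
    let data_len = take_u64(buf, &mut offset)? as usize;
    let data_end = offset.checked_add(data_len)?;
    let instruction_data = buf.get(offset..data_end)?.to_vec();
    let id_bytes = buf.get(data_end..data_end.checked_add(32)?)?;
    let mut program_id = [0u8; 32];
    program_id.copy_from_slice(id_bytes);
    Some(ParsedInput { num_accounts, instruction_data, program_id })
}
```

Truncated input makes parse_input return None, whereas the real deserialize has no such checks and relies on the runtime to supply a complete buffer.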

Now let's return to the user-written helloworld program, that is, process_instruction:
```rust
// Program entrypoint's implementation
pub fn process_instruction(
    program_id: &Pubkey, // Public key of the account the hello world program was loaded into
    accounts: &[AccountInfo], // The account to say hello to
    _instruction_data: &[u8], // Ignored, all helloworld instructions are hellos
) -> ProgramResult {
    msg!("Hello World Rust program entrypoint");

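    // Iterating accounts is safer than indexing
    let accounts_iter = &mut accounts.iter();

    // Get the account to say hello to
    let account = next_account_info(accounts_iter)?;

    // The account must be owned by the program in order to modify its data
    if account.owner != program_id {
        msg!("Greeted account does not have the correct program id");
        return Err(ProgramError::IncorrectProgramId);
    }

    // Increment and store the number of times the account has been greeted
    let mut greeting_account = GreetingAccount::try_from_slice(&account.data.borrow())?;
    greeting_account.counter += 1;
    greeting_account.serialize(&mut &mut account.data.borrow_mut()[..])?;

    msg!("Greeted {} time(s)!", greeting_account.counter);

    Ok(())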
}
```
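As we can see, the three parameters of process_instruction are exactly the parsed program_id, accounts and instruction_data. Put differently, we have to declare these three parameters, because that is the signature the macro expects.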
```rust
msg!("Hello World Rust program entrypoint"); 
```
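Log an entry message, which shows that execution has reached the helloworld program itself.
```rust
let accounts_iter = &mut accounts.iter();
```
accounts is a slice of AccountInfo (&[AccountInfo]), so we can get an iterator over it.
```rust
let account = next_account_info(accounts_iter)?; 
```
next_account_info fetches the first element of accounts. The ? is syntactic sugar: if next_account_info returns Ok, the value is assigned to account; otherwise the function returns the error right away.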
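
To see what ? stands for, here are two functions that behave the same, one using ? and one with the early return written out. The example is unrelated to Solana and uses only the standard library; the function names are made up for this sketch.

```rust
/// Parse a decimal string and double it, propagating any parse error with `?`.
pub fn double_with_question_mark(text: &str) -> Result<u32, std::num::ParseIntError> {
    let n: u32 = text.parse::<u32>()?;
    Ok(n * 2)
}

/// The same logic with the `?` written out as an explicit `match`.
pub fn double_with_match(text: &str) -> Result<u32, std::num::ParseIntError> {
    let n: u32 = match text.parse::<u32>() {
        Ok(value) => value,
        // `?` also converts the error into the function's error type via `From`.
        Err(err) => return Err(From::from(err)),
    };
    Ok(n * 2)
}
```

In process_instruction the same thing happens with ProgramError: any failing step ends the function early and hands the error to the entrypoint wrapper.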
```rust
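if account.owner != program_id {
    msg!("Greeted account does not have the correct program id");
    return Err(ProgramError::IncorrectProgramId);
}
```
This checks whether the owner of the first account equals program_id. In Solana every account is owned by exactly one program, and only the owning program may modify its data. If the account passed in belongs to another program, this program cannot change it, and the runtime would reject the attempt. The check is done up front so that a meaningful error code is returned.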
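
The guard can be modelled with plain types. In the sketch below Address, DemoAccount, IncorrectProgramId and ensure_owned_by are all stand-ins invented for illustration; only the comparison itself mirrors the program.

```rust
/// 32-byte address, standing in for `Pubkey` (illustrative type alias).
pub type Address = [u8; 32];

/// Minimal stand-in for the one `AccountInfo` field the check needs.
pub struct DemoAccount {
    pub owner: Address,
}

/// Stand-in for `ProgramError::IncorrectProgramId`.
#[derive(Debug, PartialEq)]
pub struct IncorrectProgramId;

/// Refuse to touch an account that this program does not own.
pub fn ensure_owned_by(
    account: &DemoAccount,
    program_id: &Address,
) -> Result<(), IncorrectProgramId> {
    if &account.owner != program_id {
        return Err(IncorrectProgramId);
    }
    Ok(())
}
```

Returning early with a specific error gives the caller a clear reason for the failure before any data is touched.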
```rust
let mut greeting_account = GreetingAccount::try_from_slice(&account.data.borrow())?;
```
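try_from_slice converts the binary data in the account into a GreetingAccount. It comes from BorshDeserialize; the internals are fairly involved and are not covered here.
```rust
greeting_account.counter += 1;
```
Increment greeting_account's counter by one. At this point the change exists only in the in-memory struct.
```rust
greeting_account.serialize(&mut &mut account.data.borrow_mut()[..])?;
```
Serialize GreetingAccount via BorshSerialize, writing the bytes back into the account's data buffer. Now the change is reflected in the account.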
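
The odd-looking &mut &mut has a simple explanation: the standard library implements the Write trait for &mut [u8], and writing through it overwrites the front of the slice and then moves the slice variable forward, so the writer itself has to be mutable. The sketch below reproduces the read, increment, write-back cycle with std only. bump_counter is a hypothetical function, and the 4-byte little-endian layout is the one Borsh uses for a u32.

```rust
use std::io::Write;

/// Read a little-endian u32 counter from the first 4 bytes of `data`,
/// add one, and write it back in place. Returns the new value.
pub fn bump_counter(data: &mut [u8]) -> std::io::Result<u32> {
    if data.len() < 4 {
        return Err(std::io::Error::new(
            std::io::ErrorKind::UnexpectedEof,
            "need 4 bytes",
        ));
    }
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&data[..4]);
    let counter = u32::from_le_bytes(raw) + 1;

    // `Write` is implemented for `&mut [u8]`: writing overwrites the front of
    // the slice and then advances `writer` past the bytes just written.
    let mut writer: &mut [u8] = data;
    writer.write_all(&counter.to_le_bytes())?;
    Ok(counter)
}
```

After the call, the caller's buffer holds the incremented counter, just as the account's data does after serialize.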
```rust
msg!("Greeted {} time(s)!", greeting_account.counter); 
```
Log the updated value.
```rust
Ok(())
```
Return success.

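# 3. Summary
That wraps up the analysis of Solana's helloworld project. A short summary:

1. entrypoint!(process_instruction); must be declared, otherwise the program has no entry point

2. The parameters of process_instruction are fixed; a function with a different signature will not work with the macro

3. Every AccountInfo has a data field that holds user-defined data. We can define a struct in the program and use BorshSerialize and BorshDeserialize to serialize and deserialize that data.

4. On success the program returns Ok; otherwise it returns Err with the corresponding error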

