Add opt in reentrancy to soroban #1491

heytdep · 2024-11-06T10:06:36Z

What

This PR adds two new host functions for contracts to opt in reentrancy and enable the immediate call to be re-entrant.

Implementation details

I've chosen to go with two separate host functions for two main reasons:

It allows to easily verify which contracts are and are not using reentrancy looking at the imports sigs.
It's fully backwards compatible.
There's no overhead thanks to streamlined linking (CAP 55 or 54 iirc).

Other implementation alternatives would be for the contract to have somewhere a flag set to enable reentrancy either in the binary (e.g stylus contracts, even though it's a different story there since theirs is a heavily memory based implementation), but it limits a contract that wants to use both reentrant and non-reentrant functions and is in general a worse implementation approach imo since it likely requires xdr changes for the flag to be set.

Why

This feature has been previously requested, it seems initially by orbitcdp (likely for non-approval-based flash loans), or on the dev server (https://discord.com/channels/897514728459468821/966788672164855829/1303384724181352449).

I believe that enabling opt-in re-entrancy is in general a good idea. Especially if added as separate host functions for discoverability (easy to discover since the binary would have the reentrancy call signature explicitly hardcoded). There are some use cases here, including doing better, and more similarly to the EVM defi logic (think callbacks).

Known limitations

It's very important that:

The sdk implementation advises against using reentrancy. For instance it could be set as an unsafe function.
Contracts that make cross contract calls understand that they need safeguards in place if the imported binary has reentrancy enabled. I think that this could be also exposed in the contractimport! macro, i.e reading from the binary's imports and raising a warning if the imported binary uses reentrant calls.

heytdep · 2024-11-06T10:16:32Z

I think it's also worth pointing out that this change makes it so that it's not the caller contract that decides whether the called contract can be reentrant or not. Given Soroban's current codebase layout this was the obvious implementation imo, but maybe we should change the host function's name because current naming might get developers confused and thinking that If they use call and not call_reentrant then everything is fine, which could result in hacks.

Tests

Also, about tests, I have to admit I'm not completely familiar with the Soroban SDK, so I just manually imported the external function into the contract. Happy to change that once/if the sdk supports this. Currently my test binaries look much different from the others, e.g

#![no_std]
use soroban_sdk::{contract, contractimpl, Env, Val, Address, Vec, Symbol, vec};

#[link(wasm_import_module = "d")]
extern "C" {
    #[allow(improper_ctypes)]
    #[link_name = "1"]
    pub fn call_reentrant(contract: i64, func: i64, args: i64, ) -> i64;
}

#[contract]
pub struct Contract;

#[contractimpl]
impl Contract {
    pub fn do_reentry(env: Env, caller: Address) {
        let args: Vec<Val> = vec![&env];
        let func = Symbol::new(&env, "do_nothing");
        let called_val = caller.as_val().get_payload() as i64;
        let func_val = func.as_val().get_payload() as i64;
        let args_val = args.as_val().get_payload() as i64;
        
        unsafe {
            call_reentrant(called_val, func_val, args_val);
        };
    }
}

heytdep · 2024-11-07T05:32:01Z

Following the discord discussions and concerns about the following situation: A calls B, B calls_reentrant A, A gets called without issues, i.e a situation where the caller has no control over whether the called contract makes a reentrant call or not; I've implemented a reentry guard to be specified by the caller contract which will make the linker check on whether the called contract (B in this case) uses reentry or not.

Now the caller contract looks like the following:

#![no_std]
use soroban_sdk::{contract, contractimpl, Env, Address, Symbol, TryIntoVal, Vec, Val};

#[link(wasm_import_module = "d")]
extern "C" {
    #[allow(improper_ctypes)]
    #[link_name = "_"]
    pub fn call_contract(contract: i64, func: i64, args: i64, ) -> i64;

    #[allow(improper_ctypes)]
    #[link_name = "3"]
    pub fn set_reentrant(enabled: i64, ) -> i64;
}

#[contract]
pub struct Contract;

#[contractimpl]
impl Contract {
    pub fn test_reentry(env: Env, called: Address) {
        let args: Vec<Val> = (env.current_contract_address(), ).try_into_val(&env).unwrap();
        let func = Symbol::new(&env, "do_reentry");
        let called_val = called.as_val().get_payload() as i64;
        let func_val = func.as_val().get_payload() as i64;
        let args_val = args.as_val().get_payload() as i64;

        let set_reentrant_flag = Val::from_bool(true).as_val().get_payload() as i64;
        
        unsafe {
            set_reentrant(set_reentrant_flag);
            call_contract(called_val, func_val, args_val);
        };
    }

    pub fn do_nothing(env: Env) {
        env.events().publish((Symbol::new(&env, "first_soroban_reentry"),), ());
    }
}

Note the set_reentrant, which is a new function that enables to switch reentry mode in the host's context.

This implementation makes the following assumptions which should be checked:

VMs instantiated from the cache will always need a reentry guard. I'm really doubtful on whether this actually is correct.
VMs instantiated for uploads and to check the protocol version should not need a reentry guard (also because these can't set the reentry enabled in the context so all uploads would fail).

leighmcculloch

@heytdep Thanks for looking at how reentry could be enabled.

Please correct me if I'm interpreting incorrectly, but I think this change proposes two ways to enable reentry.

I think there are benefits to there being a single way to enable the feature, and a single pattern to look out for to know if a contract is reentrant or not.

We should consider what the SDK interface will be and design down from that. I suspect when we do we'll find only one is needed.

The call_entrant approach has clear scope with a begin and end at a single call out. The set_reentrant has with raw usage an unbounded scope, which could lead to code being in scope that isn't intended to be, however it could have bounded scope if exposes to users in a way where set_reentrant(false) was always guaranteed to be called.

Both affects use in libraries. For example, a library may wish to make a call not knowing if the importer wants reentry or not. There needs to be a way for the importer to choose reentry. On the same note, a library may use reentry, but it's important that the scope doesn't escape the library.

I think the way I'd like to see this implemented in the SDK that would serve both examples above, while retaining scope, is to offer one SDK function that controls reentry for a scope of code, and the SDK in the implementation of that function call set_reentry_enabled(true|false) at the start and at the end of the function. For example:

use soroban_sdk::{contract, contractimpl, vec, symbol_short, BytesN, Env, Symbol, Vec};

#[contract]
pub struct Contract;

#[contractimpl]
impl Contract {
    pub fn exec(env: Env) {
        // ...
        env.allow_reentry(|| {
            client_for_other_contract.call_fn(...)
            library::fncallthatmaycallothercontract(...)
        });
        // ...
    }
}

leighmcculloch · 2024-11-12T06:06:36Z

soroban-env-common/env.json

+                        }
+                    ],
+                    "return": "Val",
+                    "docs": "Calls a function in another contract with arguments contained in vector `args`. If the call is successful, returns the result of the called function. Traps otherwise. This functions enables re-entrancy in the immediate cross-contract call.",


This functions enables re-entrancy in the immediate cross-contract call.

In an outgoing call graph that looks like this:

flowchart LR A --> B --> C --> D

Loading

Does it mean 1️⃣ that reentry into A is only allowed from B? For example:

flowchart LR A --> B --> C --> D B --> A

Loading

Or 2️⃣ that reentry into A is allowed from any contract that is executing further down the stack than A's call out? For example:

flowchart LR A --> B --> C --> D B --> A C --> A D --> A

Loading

I think 2️⃣ fits better with narrative on what it means to call out to another contract. What that other contract does, whether it is a monolith or a micro component is irrelevant to the calling contract. The calling contract merely needs to acknowledge that it is capable to handle being re-entered during the call. By who, is mostly irrelevant.

Are there cases where the who is important?

If I'm completely misunderstanding, please correct me 😄.

cc @dmkozh @sisuresh

Yeah this was also partially discussed on the thread. Currently the implementation is fairly unsafe because when reentrancy is enabled the called contract can be reentrant on any contract down the stack. This means that scenario 2 (which is what I think we should go for) is possible, but it also makes possible a scenario where e.g D is reentrant on B which didn't have reentrancy enabled.

leighmcculloch · 2024-11-12T06:10:22Z

soroban-env-common/env.json

+                    "export": "3",
+                    "name": "set_reentrant",
+                    "args": [
+                        {
+                            "name": "enabled",
+                            "type": "Bool"
+                        }
+                    ],
+                    "return": "Void",
+                    "docs": "Enables the current contract to specify the reentrancy rules.",
+                    "min_supported_protocol": 21


Is the idea with set_reentrant that it provides an alternative way to enable reentry on all the following regular call / try_call host fns?

Set reentrant just instructs the host that the contract in this context (or better, currently it sets the flag on the whole host). The current implementation requires two specialized host functions to have the callparam option with the reentry enabled (while call and try_call always have it disabled), and set_reentrant is an additional guard to make sure that if it's set to false call_reentrant will fail. So set_reentrant(true) and call in the current implementation would result in a trap anyways. This would change if we inherit the reentry guard from the context i.e what dmytro brought up in the discord.

heytdep · 2024-11-12T12:05:15Z

For future reference, this implementation is unsafe and will be changed into a call stack based reentry guard:

This scenario is wrongly allowed: A calls B, B sets reentrant to true and calls C, C is reentrant on A
Connecting to the above the set reentrant sets a single bool in the host which is shared among VMs in the call stack. So allowing a host function to change this is unsafe.

jayz22 · 2024-11-22T16:35:53Z

For future reference, this implementation is unsafe and will be changed into a call stack based reentry guard:

This scenario is wrongly allowed: A calls B, B sets reentrant to true and calls C, C is reentrant on A

Connecting to the above the set reentrant sets a single bool in the host which is shared among VMs in the call stack. So allowing a host function to change this is unsafe.

Sorry for being late into the discussion. I would just like to comment that in general exposing a mechanism allowing a contract to modify the shared global state of the host should be disallowed/strongly discouraged (unless for testing).

In this particular case, iiuc the linker-level reentry guard makes a specific assumption about the ordering of when contracts are instantiated (i.e. A sets reentry flag before calling B, which gets instantiated afterwards). We should make no such assumptions, as the host makes no guarantee of contract instantiation ordering. There is a cache containing pre-instantiated contract modules and it is highly possible the cache will be more global (e.g. across multiple ledgers) or include a fully instantiated contract VM (not just a wasmi::Module) in the future. Which invalidates any temporal assumption of contract instantiation ordering.

heytdep · 2024-11-24T13:50:13Z

@jayz22 yeah absolutely, both the linker guard (which was actually a "hotfix") and the global host reentry state would not be in the actual proposal since I plan for the guard to be inferred by the context stack as I mentioned above (it's better in every way).

heytdep added 2 commits November 6, 2024 16:55

add opt in reentrancy to soroban

43a3ed8

change supported protocol

efd3537

heytdep requested review from graydon, sisuresh, dmkozh and a team as code owners November 6, 2024 10:06

fmt

22f5770

heytdep added 3 commits November 7, 2024 12:16

allow caller contract to specify reentry rules

d3bc92b

add test where caller does not enable reentry

450cbf3

fmt

abe39c3

leighmcculloch reviewed Nov 12, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add opt in reentrancy to soroban #1491

Add opt in reentrancy to soroban #1491

heytdep commented Nov 6, 2024

heytdep commented Nov 6, 2024

heytdep commented Nov 7, 2024

leighmcculloch left a comment •

edited

Loading

leighmcculloch Nov 12, 2024

heytdep Nov 12, 2024

leighmcculloch Nov 12, 2024

heytdep Nov 12, 2024

heytdep commented Nov 12, 2024

jayz22 commented Nov 22, 2024

heytdep commented Nov 24, 2024

Add opt in reentrancy to soroban #1491

Are you sure you want to change the base?

Add opt in reentrancy to soroban #1491

Conversation

heytdep commented Nov 6, 2024

What

Implementation details

Why

Known limitations

heytdep commented Nov 6, 2024

Tests

heytdep commented Nov 7, 2024

leighmcculloch left a comment • edited Loading

Choose a reason for hiding this comment

leighmcculloch Nov 12, 2024

Choose a reason for hiding this comment

heytdep Nov 12, 2024

Choose a reason for hiding this comment

leighmcculloch Nov 12, 2024

Choose a reason for hiding this comment

heytdep Nov 12, 2024

Choose a reason for hiding this comment

heytdep commented Nov 12, 2024

jayz22 commented Nov 22, 2024

heytdep commented Nov 24, 2024

leighmcculloch left a comment •

edited

Loading