Avoid forcing the Ledger pulsers when serializing the Ledger State #4191

Open
dnadales opened this issue Mar 11, 2024 · 2 comments

@dnadales
Member

Storing the ledger state on disk requires forcing the pulser, which consumes a significant amount of CPU time and memory, and raises garbage collection activity significantly. As a consequence, slot leadership checks are missed.

Possible ways to mitigate this are:

  • Store a representation of the thunks involved in the pulser.
  • Drop the information used by the pulser, and restore it when we load the ledger state from disk.
@TimSheard
Contributor

TimSheard commented Mar 12, 2024

Some notes. The pulser is a complicated structure. It is a polymorphic data type that is used at only one type.
Here is the type it is used at: RewardPulser c ShelleyBase (RewardAns c)
Here are some data type definitions of things stored in the Pulser:

data FreeVars c = FreeVars
  { fvDelegs :: !(VMap VB VB (Credential 'Staking c) (KeyHash 'StakePool c))
  , fvAddrsRew :: !(Set (Credential 'Staking c))
  , fvTotalStake :: !Coin
  , fvProtVer :: !ProtVer
  , fvPoolRewardInfo :: !(Map (KeyHash 'StakePool c) (PoolRewardInfo c))
  }
data RewardAns c = RewardAns
  { accumRewardAns :: !(Map (Credential 'Staking c) (Reward c))
  , recentRewardAns :: !(RewardEvent c)
  }

Instantiating the RewardPulser at the only type it is ever used at, the type of its constructor is thus

 RSLP ::
    !Int ->
    !(FreeVars c) ->
    !(VMap.VMap VMap.VB VMap.VP (Credential 'Staking c) (CompactForm Coin)) ->
    !ans ->
    RewardPulser c m ans

Here is a pattern that matches RSLP at runtime. This helps us visualize what is stored inside, and which components need to be serialized, so that only the initial state is serialized.

RSLP itemsPerPulse 
     (FreeVars delegates rewaddress totalstake protocolversion rewinfo)
     itemsLeftToPulse
     (RewardAns accumAns deltaAnsLastPulse)

The itemsPerPulse never changes and is needed to reset to the initial state.
The whole FreeVars structure never changes and is needed to reset to the initial state.
The itemsLeftToPulse is an intermediate value; its initial value is not available.
The whole RewardAns is an intermediate value, but it should be reset to (RewardAns Map.empty Map.empty) for the initial state.

So to reset we need to remember the initial state of the itemsLeftToPulse, which is a huge data structure with type

(VMap.VMap VMap.VB VMap.VP (Credential 'Staking c) (CompactForm Coin))

Luckily, this structure consists of a wrapper and a huge array. Even better, the function that updates it at
each pulse, VMap.splitAt, creates a new wrapper but leaves the huge array unchanged. So we can keep two copies
with almost no increase in storage size.
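
To make the sharing argument concrete, here is a minimal sketch using a hypothetical Slice type (a wrapper holding an offset and a length over one immutable array). It is not VMap's actual representation, and sliceFromList, sliceToList and splitSlice are names invented for this illustration. It only shows why keeping both the initial value and the current value is cheap when splitting builds new wrappers over the same buffer.

```haskell
import Data.Array (Array, listArray, (!))

-- Hypothetical stand-in for a VMap-like structure: a small wrapper
-- (offset and length) around one large immutable array.
data Slice a = Slice
  { sliceOffset :: !Int
  , sliceLength :: !Int
  , sliceBuffer :: Array Int a -- shared between wrappers, never copied
  }

sliceFromList :: [a] -> Slice a
sliceFromList xs = Slice 0 n (listArray (0, n - 1) xs)
  where
    n = length xs

sliceToList :: Slice a -> [a]
sliceToList (Slice off len buf) = [buf ! i | i <- [off .. off + len - 1]]

-- Splitting only builds two new wrappers; both point at the same buffer.
splitSlice :: Int -> Slice a -> (Slice a, Slice a)
splitSlice k (Slice off len buf) =
  (Slice off k' buf, Slice (off + k') (len - k') buf)
  where
    k' = max 0 (min k len) -- clamp the split point to the slice bounds

main :: IO ()
main = do
  let initial = sliceFromList [1 .. 10 :: Int]
      (pulse1, rest1) = splitSlice 3 initial
      (pulse2, rest2) = splitSlice 3 rest1
  print (sliceToList pulse1)
  print (sliceToList pulse2)
  print (sliceToList rest2)
  -- The initial wrapper is untouched and still covers every item.
  print (sliceToList initial)
```

After two splits the initial wrapper still covers all ten items, while the only extra storage is a few small wrapper records.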

So the strategy is to change the Pulser constructor to have the type

 RSLP ::
    !Int ->
    !(FreeVars c) ->
    !(VMap.VMap VMap.VB VMap.VP (Credential 'Staking c) (CompactForm Coin)) ->
    -- ^ the initial value
    !(VMap.VMap VMap.VB VMap.VP (Credential 'Staking c) (CompactForm Coin)) ->
    -- ^ the current value
    !ans ->
    RewardPulser c m ans

To serialize, we get the protVer from the FreeVars to know how to serialize, then

  1. serialize the Int
  2. serialize the FreeVars
  3. serialize the initial value

To deserialize

  1. n <- deserialize
  2. free <- deserialize
  3. initial <- deserialize
  4. return (RSLP n free initial initial (RewardAns Map.empty Map.empty))

Hopefully completing this pulser will give the same answer as completing the one we serialized.
It will have to do some extra work.
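
The round trip can be checked on a toy model. The sketch below is an assumption-laden simplification, not the ledger code: Pulser, Snapshot, pulse, complete, toSnapshot and fromSnapshot are hypothetical names, the stake is a plain list, a single Integer rate stands in for FreeVars, and Snapshot stands in for the bytes written to disk. It stores only the initial state of a partially completed pulser, restores it, and compares the completed answers.

```haskell
import qualified Data.Map.Strict as Map

-- Simplified stand-ins (assumptions, not the ledger's real types):
-- stake is a list of (credential, coin) pairs and rewards are a plain Map.
type Stake = [(String, Integer)]
type Rewards = Map.Map String Integer

-- Toy pulser with the proposed shape: pulse size, fixed inputs,
-- initial items, remaining items, accumulated answer.
data Pulser = Pulser !Int !Integer Stake Stake !Rewards
  deriving (Eq, Show)

-- What would be written to disk: only the fields that define the initial state.
data Snapshot = Snapshot !Int !Integer Stake
  deriving (Eq, Show)

-- One pulse: consume up to n items (at least one) and fold them into the answer.
pulse :: Pulser -> Pulser
pulse (Pulser n rate initial current acc) =
  Pulser n rate initial rest (foldr addReward acc now)
  where
    (now, rest) = splitAt (max 1 n) current
    addReward (cred, coin) = Map.insertWith (+) cred (coin * rate)

-- Run pulses until nothing is left, then return the answer.
complete :: Pulser -> Rewards
complete p@(Pulser _ _ _ current acc)
  | null current = acc
  | otherwise = complete (pulse p)

-- Serialize: drop the current position and the partial answer.
toSnapshot :: Pulser -> Snapshot
toSnapshot (Pulser n rate initial _ _) = Snapshot n rate initial

-- Deserialize: restart from the initial state with an empty answer.
fromSnapshot :: Snapshot -> Pulser
fromSnapshot (Snapshot n rate initial) = Pulser n rate initial initial Map.empty

main :: IO ()
main = do
  let stake = [("alice", 10), ("bob", 20), ("carol", 30), ("dave", 40), ("alice", 5)]
      fresh = Pulser 2 3 stake stake Map.empty
      partial = pulse fresh -- pulsed once before being "stored"
      restored = fromSnapshot (toSnapshot partial)
  -- Both pulsers should complete to the same answer.
  print (complete partial == complete restored)
  print (complete restored)
```

In this model the restored pulser repeats the pulses that had already run before storing, which is the extra work mentioned above; the answers agree because the accumulation does not depend on where the pulser was interrupted.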

@TimSheard TimSheard mentioned this issue Mar 12, 2024
9 tasks
@dnadales
Member Author

Wow, thank you for the excellent analysis. The solution makes sense; I wonder if it'd be possible to benchmark the memory consumption of patch #4196 against the baseline (current master).
