Prevent an Application CONSENSUS_FAILURE #186
Tendermint block headers are populated in the following part of the Tendermint code: github.com/tendermint/tendermint/state/state.go:MakeBlock()
|
Because of how the Tendermint ABCI application interface is implemented, the application layer cannot reject a block during consensus when it does not consider that block valid. IMO that is a clear limitation of Tendermint. I created an issue to request that feature: tendermint/tendermint#3755
|
The block time is populated by MedianTime: the median of the timestamps of every validator's precommit, weighted by each validator's voting power.
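To make the weighting concrete, here is a minimal sketch of a voting-power-weighted median. It is not the Tendermint source; the `weightedTime` type, the `weightedMedian` and `demoVotes` functions, and the sample values are all hypothetical and only illustrate the idea described above.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// weightedTime pairs a precommit timestamp with the voting power of the
// validator that signed it. Hypothetical type, for illustration only.
type weightedTime struct {
	Time   time.Time
	Weight int64
}

// weightedMedian walks the votes in ascending time order and returns the
// first timestamp whose own weight covers what is left of half the total
// voting power. It returns the zero time when there are no votes.
func weightedMedian(votes []weightedTime) time.Time {
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]weightedTime(nil), votes...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Time.Before(sorted[j].Time)
	})

	var total int64
	for _, v := range sorted {
		total += v.Weight
	}

	remaining := total / 2
	for _, v := range sorted {
		if remaining <= v.Weight {
			return v.Time
		}
		remaining -= v.Weight
	}
	return time.Time{}
}

// demoVotes returns made-up precommit times for three validators.
func demoVotes() []weightedTime {
	base := time.Unix(1560000000, 0).UTC()
	return []weightedTime{
		{Time: base.Add(900 * time.Millisecond), Weight: 10},
		{Time: base.Add(100 * time.Millisecond), Weight: 10},
		{Time: base.Add(400 * time.Millisecond), Weight: 30},
	}
}

func main() {
	// The validator holding 30 of the 50 units of power decides the result.
	fmt.Println(weightedMedian(demoVotes()).Format(time.RFC3339Nano))
}
```

In this example the validator with the most voting power effectively sets the block time, which is why a single clock can pull two consecutive block times into the same second.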
|
According to the answer given by the Tendermint team on the issue tendermint/tendermint#3755, IT IS IMPOSSIBLE to generate two blocks within the same second if the The
|
When the invalid block was created, the consensus setup was as follows:
Therefore, it is possible that a timeout occurred, overriding the "default behaviour" mentioned above.
Therefore, we can rule out that the issue was caused by low values on the
|
After a team discussion about how to solve this issue, we reached the following conclusions:
Given the points above, the solution cannot involve complex or hacky code, and it has to rely on the Tendermint protocol. We also cannot remove the entire Ethereum header verification. The proposed solution is:
|
After the fix was applied, we identified another time correction in the Ethereum code, which forces nodes to sleep when blocks are persisted in the future. This improves the proposed solution: it slows the node down and adjusts block times without letting the block time drift too far into the future (4 seconds at most).
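The behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the Ethereum or Lightchain code: the function name `nextBlockTime` and the one-second granularity are hypothetical, and the 4-second bound mentioned above is not modelled.

```go
package main

import (
	"fmt"
	"time"
)

// nextBlockTime picks a Unix timestamp (in seconds) for a block built on top
// of a parent with timestamp parentUnix. The result is always strictly
// greater than parentUnix. When that pushes the timestamp ahead of the local
// clock, wait reports how long the node should sleep before continuing.
func nextBlockTime(parentUnix, nowUnix int64) (ts int64, wait time.Duration) {
	ts = nowUnix
	if ts <= parentUnix {
		// Same second as the parent (or earlier): move one second forward.
		ts = parentUnix + 1
	}
	if ts > nowUnix {
		// The block is in the future, so the node has to slow down.
		wait = time.Duration(ts-nowUnix) * time.Second
	}
	return ts, wait
}

func main() {
	// Parent and local clock fall within the same second.
	ts, wait := nextBlockTime(100, 100)
	fmt.Println(ts, wait)
}
```

A caller would pass `wait` to `time.Sleep` before sealing the block, so that consecutive blocks never share a timestamp and the chain's clock is pulled back toward wall-clock time.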
|
Lightchain uses Ethereum as its blockchain storage, so our application needs to comply with Ethereum restrictions. One of them is "timestamp equals parent's": two consecutive blocks cannot be produced within the same second. Otherwise the node hits an error that produces the following log output and causes a consensus failure from which it cannot recover.
Log output
How to reproduce it
To reproduce this issue safely, use the Standalone network and apply the consensus config values:
Change the genesis or the consensus params to use less than 1000ms
After that, run the workload test as follows. It may take more than one attempt.
Sample wal logs of failure