-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move settlement queue to the driver #3129
base: main
Are you sure you want to change the base?
Conversation
/// Archive node URL used to index CoW AMM | ||
archive_node_url: Option<Url>, | ||
} | ||
|
||
fn default_settle_queue_size() -> usize { | ||
3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it makes sense to keep more than 3 pending settlements, at least for mainnet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to still review better the spawning thread logic, but it would be really cool if we could have e2e tests in which we can hit the limit 🙏
Yes, before implementing it, I just wanted to finalize the approach itself to avoid back-and-forth changes. |
This pull request has been marked as stale because it has been inactive a while. Please update this pull request or it will be automatically closed. |
@squadgazzz should I review this or aren't we going to have the queue in the end? |
@m-lord-renkse, we still need the queue for solvers that cannot handle multiple settlements in the same block. |
"settle deadline exceeded. unable to return a response" | ||
); | ||
} | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we reach here is because the deadline hit before we could even call /settle
, it could be the first call or it could be the 5th one. Shouldn't we log here for debug purposes? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we reach here, response_sender
successfully sent the error, where receiver logs the error:
services/crates/driver/src/domain/competition/mod.rs
Lines 352 to 355 in c7dd81d
response_rx.await.map_err(|err| { | |
tracing::error!(?err, "Failed to dequeue /settle response"); | |
Error::SubmissionError | |
})? |
.await | ||
.unwrap_or_else(|_| Err(competition::Error::SubmissionError))?) | ||
let sender = state.settle_queue_sender(); | ||
let (response_tx, response_rx) = oneshot::channel(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be created within settle_queue_sender
? as we have now, we create sender
and then response_tx
just to be passed to sender
in send()
function. The queue sender should create its corresponding resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume, this is addressed now.
competition::Error::UnableToEnqueue | ||
})?; | ||
|
||
Ok(response_rx.await.map_err(|err| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we have a timeout here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, since the mempool internally checks for the submission deadline and the task will be dropped once it is reached:
services/crates/driver/src/domain/mempools.rs
Lines 124 to 133 in bcd024e
if block.number >= submission_deadline { | |
tracing::info!( | |
?hash, | |
deadline = submission_deadline, | |
current_block = block.number, | |
"tx not confirmed in time, cancelling", | |
); | |
self.cancel(mempool, settlement.gas.price, solver).await?; | |
return Err(Error::Expired); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea for this change makes sense to me but the implementation seems awkward and at the wrong place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM; just some small suggestions.
/// Solve an auction as part of this competition. | ||
pub async fn solve(&self, auction: &Auction) -> Result<Option<Solved>, Error> { | ||
if self.settle_queue.capacity() == 0 { | ||
tracing::warn!("settlement queue is full; auction is rejected"); | ||
return Err(Error::SubmissionError); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/solve
requests should not return errors related to submitting solutions. The request did not ask to submit anything yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. Added a new internal error type, but since it is not expected to change the API, the solver error is used:
competition::Error::SettlementQueueIsFull => Kind::SolverFailed, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved, assuming comments above will be addressed.
Currently blocked by #3133 |
Description
This is a follow-up to #3116, which addresses a suggestion of moving the settlement queue to the driver. This should help avoid increasing the block deadline since it can still be calculated starting from the simulation block.
Changes
https://github.com/cowprotocol/infrastructure/pull/2241 should be reverted.
How to test
Existing tests. More driver tests will be implemented once the approach is approved.