Storing call_data in quotes and order_quotes tables #3124

Open
wants to merge 53 commits into
base: main

Conversation

mstrug
Contributor

@mstrug mstrug commented Nov 14, 2024

Description

Added storing of the interactions that the solver returns in its /quote response into two new tables: quote_interactions and order_quote_interactions. Whether a quote was verified is now stored in the quotes and order_quotes tables.
Quote interactions are used for verifying and auditing orders.

Changes

Added a new column, verified, to the quotes and order_quotes tables.
Added a new table, quote_interactions, which stores the solver-returned interactions for a particular quote. Data in this table is transient and is removed together with the expired quote.
Added a new table, order_quote_interactions, which permanently stores the quote interactions for a particular order uid.
Added a database migration script and updated the readme file.

How to test

Existing tests; dedicated tests for the new functionality were also added.

Contributor

@MartinquaXD MartinquaXD left a comment

Given that we now want to store the data in the db the PR also needs a db migration file (database/sql directory) and an update in the db Readme for that.

crates/orderbook/src/database/orders.rs (outdated, resolved)

Reminder: Please update the DB Readme.


@mstrug mstrug marked this pull request as ready for review November 15, 2024 22:28
@mstrug mstrug requested a review from a team as a code owner November 15, 2024 22:28
@squadgazzz
Contributor

Reference discussion: link

Let's avoid sharing links to private discussions in open-source projects and instead share all the required information in the description. @mstrug, could you update the description explaining the reasoning and other considerations, if any?

@mstrug mstrug marked this pull request as draft November 23, 2024 00:20
@mstrug
Contributor Author

mstrug commented Nov 29, 2024

Do we really need 2 separate tables? Did you consider storing the interactions in the same table as an array?

I was thinking about using only one table for interactions and storing quote_id and order_uid for each interaction, but decided to use separate tables, because not all quote interactions end up in the order quote interactions table. In my opinion, it is better to have one table with transient, more frequently updated data and another one with persistent data.

crates/database/src/orders.rs (outdated, resolved)
crates/autopilot/src/database/auction.rs (resolved)
crates/database/src/quotes.rs (outdated, resolved)
Comment on lines 209 to 211
SELECT q.*, i.index, i.target, i.value, i.call_data FROM quotes q
JOIN quote_interactions i ON quote_id = id
WHERE id = $1
Contributor

Is it possible to aggregate the interactions directly in the quote row using SQL?

Contributor Author

Yes, it can be done using a Postgres aggregate function with a custom sqlx::FromRow trait implementation. I've developed a working solution, which I will commit after some cleanup.
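
A minimal sketch of the aggregation idea, done in memory rather than in SQL: it folds the flat rows that the JOIN above returns into one quote value holding a vector of interactions. In Postgres the same grouping could be pushed into the query with an aggregate function such as array_agg together with GROUP BY. The struct names and field types below are assumptions for illustration, not the code committed in this PR.

```rust
/// One flat row as the JOIN of quotes and quote_interactions returns it.
/// Field names mirror the selected columns; the Rust types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct JoinedRow {
    pub quote_id: i64,
    pub index: i32,
    pub target: Vec<u8>,
    pub value: String,
    pub call_data: Vec<u8>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Interaction {
    pub index: i32,
    pub target: Vec<u8>,
    pub value: String,
    pub call_data: Vec<u8>,
}

/// Hypothetical aggregated shape: the quote id plus all of its interactions.
#[derive(Debug, Clone, PartialEq)]
pub struct QuoteWithInteractions {
    pub id: i64,
    pub interactions: Vec<Interaction>,
}

/// Folds the flat rows of a single quote into one value, ordered by `index`.
/// Returns `None` when there are no rows or the rows belong to different quotes.
pub fn aggregate(rows: Vec<JoinedRow>) -> Option<QuoteWithInteractions> {
    let id = rows.first()?.quote_id;
    if rows.iter().any(|r| r.quote_id != id) {
        return None;
    }
    let mut interactions: Vec<Interaction> = rows
        .into_iter()
        .map(|r| Interaction {
            index: r.index,
            target: r.target,
            value: r.value,
            call_data: r.call_data,
        })
        .collect();
    interactions.sort_by_key(|i| i.index);
    Some(QuoteWithInteractions { id, interactions })
}
```

Note that an inner JOIN yields no row at all for a quote without interactions, which is why the sketch returns `None` for empty input; a LEFT JOIN would be needed to still get the quote in that case.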

crates/orderbook/src/database/orders.rs (outdated, resolved)
Comment on lines 280 to 281
Solver responding to quote request provides a list of interactions which need to be executed to fulfill the quote. When order is created
these interactions are copied from quote. These interactions are stored persistently and can be used to audit auction winning order.
Contributor

This should be treated like user facing documentation and should therefore use proper grammar.

Contributor Author

Updated.

);

-- Get a specific quote's interactions.
CREATE INDEX quote_id_interactions ON quote_interactions USING HASH (quote_id);
Contributor

Was there a particular reason to not use the default index type (BTree) here?

Contributor Author

I decided to use a hash index because we don't need range queries, and a hash index has slightly better lookup performance on average (O(1) vs O(log N) for a B-tree). Also, we don't need a unique index here.
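
The trade-off can be illustrated with Rust's standard collections as an analogy only; this is not a model of how Postgres implements its index types. A hash-based map answers equality lookups in O(1) on average but cannot answer range queries, while an ordered B-tree map answers both in O(log N). The function names are hypothetical.

```rust
use std::collections::{BTreeMap, HashMap};

/// Equality lookup, the only access pattern a hash-style index supports.
/// Average cost is O(1).
pub fn hash_lookup(index: &HashMap<i64, String>, quote_id: i64) -> Option<&String> {
    index.get(&quote_id)
}

/// Range scan over ordered keys, which needs a B-tree-style structure.
/// Locating the start of the range costs O(log N).
/// Panics if `from > to`, as `BTreeMap::range` does.
pub fn btree_range(index: &BTreeMap<i64, String>, from: i64, to: i64) -> Vec<&String> {
    index.range(from..=to).map(|(_, v)| v).collect()
}
```

Since the interactions are only ever fetched by an exact quote_id, only the first access pattern is needed here.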

impl TryFrom<database::quotes::QuoteWithInteractions> for QuoteData {
    type Error = anyhow::Error;

    fn try_from(input: database::quotes::QuoteWithInteractions) -> Result<QuoteData> {
Contributor

Either this:

Suggested change
- fn try_from(input: database::quotes::QuoteWithInteractions) -> Result<QuoteData> {
+ fn try_from((quote, interaction): database::quotes::QuoteWithInteractions) -> Result<QuoteData> {

or use a regular QuoteWithInteractions struct with named fields.

Contributor Author

QuoteWithInteractions has been refactored to contain the same fields as the Quote struct plus an interactions vector.
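
A rough sketch of what such a struct with named fields and its conversion might look like. Everything here is an assumption for illustration: the field set is a small hypothetical subset, amounts are simplified to u128, and the error type is a plain String instead of anyhow::Error so that the example depends only on the standard library.

```rust
use std::convert::TryFrom;

/// Database-side interaction row (hypothetical column types).
pub struct InteractionRow {
    pub target: Vec<u8>,
    pub value: String,
    pub call_data: Vec<u8>,
}

/// Database-side quote: a subset of quote fields plus the interactions vector.
pub struct QuoteWithInteractions {
    pub id: i64,
    pub sell_amount: String,
    pub interactions: Vec<InteractionRow>,
}

/// Domain-side types (simplified).
pub struct Interaction {
    pub target: [u8; 20],
    pub value: u128,
    pub call_data: Vec<u8>,
}

pub struct QuoteData {
    pub sell_amount: u128,
    pub interactions: Vec<Interaction>,
}

impl TryFrom<QuoteWithInteractions> for QuoteData {
    type Error = String;

    fn try_from(input: QuoteWithInteractions) -> Result<Self, Self::Error> {
        let sell_amount = input
            .sell_amount
            .parse::<u128>()
            .map_err(|e| format!("sell_amount: {}", e))?;
        // Convert every row; the first failing row aborts the whole conversion.
        let interactions = input
            .interactions
            .into_iter()
            .map(|row| -> Result<Interaction, String> {
                let target = <[u8; 20]>::try_from(row.target.as_slice())
                    .map_err(|_| "target must be 20 bytes".to_string())?;
                let value = row
                    .value
                    .parse::<u128>()
                    .map_err(|e| format!("value: {}", e))?;
                Ok(Interaction { target, value, call_data: row.call_data })
            })
            .collect::<Result<Vec<_>, String>>()?;
        Ok(QuoteData { sell_amount, interactions })
    }
}
```

With named fields, the conversion reads `input.sell_amount` and `input.interactions` directly instead of destructuring a tuple.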

@MartinquaXD
Contributor

Due to a recent change in the API we'd need to store a bunch more data for a quote. Specifically we'd also have to store pre-interactions and JIT orders returned in a quote.
I'm a bit worried that this will create even more tables. The pre-interactions could be stored in the same tables the PR already introduces (by adding another column indicating which kind of interaction it is), but the JIT orders would theoretically need 2 more tables in the current approach. :/

Since we don't really need to introspect the quote data in the backend, I'm considering whether it makes more sense to just add a single JSON column to the quotes and order_quotes tables that contains all the details: pre-interactions, regular interactions, JIT orders, clearing prices. WDYT?

@mstrug
Contributor Author

mstrug commented Dec 3, 2024

So we would need one column of type json with a predefined schema. The only potential downside is updating data in that column: to add data, we would have to read the JSON from the column and write the updated JSON back. But if this is not going to be a frequent operation, that should not be an issue.

@MartinquaXD
Contributor

I don't see a need to ever update the columns. The only thing that could happen is that in the future we want to store a different JSON format but in that case I'd say we should just leave the old values untouched.
If we want to return the data from an API we could simply not expect any format and return whatever we find in the DB. Overall this data is mostly for debugging purposes and not that critical.
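
A minimal sketch of the single-JSON-column idea: the quote details are serialized once when the quote is stored, and the resulting string is treated as opaque afterwards. The struct, the JSON key names, and the hand-rolled serializer are all assumptions for illustration. JIT orders and clearing prices are omitted for brevity, the field values are assumed to be hex or decimal strings that need no escaping, and real code would more likely use a JSON library.

```rust
/// Hypothetical interaction with all fields already encoded as strings.
pub struct Interaction {
    pub target: String,
    pub value: String,
    pub call_data: String,
}

/// Hypothetical payload for the JSON column.
pub struct QuoteMetadata {
    pub pre_interactions: Vec<Interaction>,
    pub interactions: Vec<Interaction>,
}

// Serializes one interaction; assumes the fields contain no characters
// that would need JSON escaping.
fn interaction_json(i: &Interaction) -> String {
    format!(
        r#"{{"target":"{}","value":"{}","callData":"{}"}}"#,
        i.target, i.value, i.call_data
    )
}

fn list_json(items: &[Interaction]) -> String {
    let parts: Vec<String> = items.iter().map(interaction_json).collect();
    format!("[{}]", parts.join(","))
}

/// Builds the value that would be written once into the JSON column.
pub fn to_json(m: &QuoteMetadata) -> String {
    format!(
        r#"{{"preInteractions":{},"interactions":{}}}"#,
        list_json(&m.pre_interactions),
        list_json(&m.interactions)
    )
}
```

Because the column is written once and never updated, a later change of format only affects newly stored rows; older rows keep whatever JSON they were stored with.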
