Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PECO-1263] Implement a .returned_as_direct_result property for AsyncExecution status #325

Open
wants to merge 6 commits into
base: peco-1263-staging
Choose a base branch
from

Conversation

susodapop
Copy link
Contributor

@susodapop susodapop commented Jan 22, 2024

Description

This PR implements an additional boolean flag on AsyncExecution that indicates whether this execution returned directResults. When this happens, results cannot be fetched from the server again. Because when a directResults query returns its operation is closed immediately.

I determined this is necessary while documenting the new execute_async() behaviour. For context, when we send a TExecuteStatementReq to the thrift server, there are two possible return conditions:

  1. If the query completes within five seconds, the resulting TExecuteStatementResp will actually include the results!
  2. If the query takes longer than five seconds to complete, the resulting TExecuteStatementResp will only include the query_id for the query, which we can use to poll for the results until they are ready.

In case 1, the result is sent within the initial TExecuteStatementResp and they cannot be fetched a second time. What this means for users is that if they are going to call .serialize() to persist a query_id and secret for use by another thread, they need a way to verify that the results will actually be available to another thread.

This new .returned_as_direct_result property makes that easier.

One unfortunate implication of this behaviour is that users cannot use one thread to kickoff all their queries and a separate thread to fetch their results. Because not all AsyncExecution objects can actually be picked up separately.

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
@benc-db
Copy link
Collaborator

benc-db commented Jan 22, 2024

If the query completes within five seconds, the resulting TExecuteStatementResp will actually include the results!
If the query takes longer than five seconds to complete, the resulting TExecuteStatementResp will only include the query_id for the query, which we can use to poll for the results until they are ready.

Oh man, why? This is really unfortunate behavior.

@@ -225,6 +228,15 @@ def last_sync_timestamp(self) -> Optional[datetime]:
"""The timestamp of the last time self.status was synced with the server"""
return self._last_sync_timestamp

@property
def is_available(self) -> bool:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How long is it available for? How does this limitation work in the CUJ of recovering from a user-space crash?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to check with the thrift server folks to get that answer. I believe it's available for a few hours. With cloud fetch enabled it's 24 hours.

What do you think of the name is_available? I'd like a name that's reflective but also easy enough to type.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its really a tough one; I think this might be a little misleading, because isn't the answer 'Unknown' for crash recovery CUJ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. Because under this situation, if the client crashes it will do so before the query_id is returned. Which means there would be no way to pick up the execution anyway.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so if it gets a query_id, then it must be available, unless its been a long time? Although, again, available doesn't sound quite right because the query could still be executing. Maybe just returned_as_direct_result?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I've done the rename and pushed.

Note: we could get around this by allowing users to disable directResults when the TExecuteStatementReq is emitted.

Jesse Whitehouse added 5 commits January 22, 2024 16:42
thrift server team.

Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
Signed-off-by: Jesse Whitehouse <jesse.whitehouse@databricks.com>
@susodapop susodapop changed the title [PECO-1263] Implement an .is_available property for AsyncExecution status [PECO-1263] Implement a .returned_as_direct_result property for AsyncExecution status Jan 22, 2024
@jadewang-db
Copy link

is really unfortunate behavior.

will this cause any issues?

@benc-db
Copy link
Collaborator

benc-db commented Jan 22, 2024

is really unfortunate behavior.

will this cause any issues?

Jade, this is the issue:

In case 1, the result is sent within the initial TExecuteStatementResp and they cannot be fetched a second time. What this means for users is that if they are going to call .serialize() to persist a query_id and secret for use by another thread, they need a way to verify that the results will actually be available to another thread.

@@ -81,6 +81,7 @@ class AsyncExecution:
]
_last_sync_timestamp: Optional[datetime] = None
_result_set: Optional["ResultSet"] = None
_returned_as_direct_result: bool = False

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need change get_result method to also check this flag?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants