Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle external_data table deploys mostly the same as for regular tables #6558

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

curtismorales
Copy link
Contributor

@curtismorales curtismorales commented Nov 25, 2024

Description

Fixes a bug introduced in #6529 -- this PR added support for tables that are defined with a metadata.yaml but no query.sql, but didn't check for tables that are built on external data (Google Sheets or CSV)

This adds logic to the deploy_table function to handle external tables differently. In the process I also did some refactoring of how we deploy external tables. I moved the metadata update logic that was in _deploy_external_data into an attach_external_data_config function in publish_metadata.py and removed the duplicated code.

Is this reasonable? This also cuts one of the passes we're doing through all the sql and metadata files during a deploy, so I think it's a performance benefit. Unless there was a reason that we weren't deploying external tables in the same threads as the rest of the table deploys?

Also, I should add some tests; this bug is kind of my fault for not adding tests in the first place. If it's okay with you, I'd like to do that in a separate PR.

Related Tickets & Documents

  • AD-501 (where this bug came up)

┆Issue is synchronized with this Jira Task

@curtismorales curtismorales marked this pull request as ready for review November 25, 2024 21:36
try:
if artifact_file.name == METADATA_FILE:
metadata = Metadata.from_file(artifact_file)
else:
metadata = Metadata.of_query_file(artifact_file)

if metadata.external_data:
if str(artifact_file).endswith("query.sql"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check for other types of query files, like query.py here as well. Maybe similar to

and not any(
file.suffix == ".sql" or file.name == "query.py"
for file in metadata_file.parent.iterdir()
if file.is_file()
)

@scholtzan
Copy link
Collaborator

Having some more unit tests for this would be nice, but it's fine to do that in a follow-up PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants