Automate Geoglam & NO2 dataset ingestion #155
Comments
Putting the discovery-items config within s3://<EVENT_BUCKET>/collections/ in the following format: https://github.com/US-GHG-Center/ghgc-data/blob/add/lpdaac-dataset-scheduled-config/ingestion-data/discovery-items/scheduled/emit-ch4plume-v1-items.json will trigger the discovery and subsequent ingestion of the collection items based on the schedule attribute.
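A minimal sketch of such a scheduled discovery-items config, assuming it follows the general shape of the linked emit-ch4plume-v1-items.json; the field names besides `schedule` and the bucket/prefix values are illustrative assumptions, not the confirmed format:

```python
import json

# Hypothetical scheduled discovery-items config. Field names are modeled on
# typical veda discovery configs and should be checked against the linked
# emit-ch4plume-v1-items.json example before use.
scheduled_discovery_config = {
    "collection": "no2-monthly",       # target collection
    "bucket": "covid-eo-data",         # source bucket holding the assets (placeholder)
    "prefix": "OMNO2d_HRM/",           # placeholder prefix where new assets land
    "filename_regex": "^.*\\.tif$",    # which objects to discover
    "discovery": "s3",
    # The schedule attribute drives the dataset-specific DAG; a weekly cron
    # cadence is assumed here, per the update in the comments below.
    "schedule": "0 0 * * 1",
}

# Writing this JSON under s3://<EVENT_BUCKET>/collections/ is what triggers
# the scheduled discovery and ingestion described above.
print(json.dumps(scheduled_discovery_config, indent=2))
```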
mcp-prod will need a new release of Airflow to include automated ingestion.
Update: We have decided to run these weekly instead of bi-weekly.
I added the scheduled collection configs from veda-data #177 to mcp-test and mcp-production.
Description
The NO2 (#89) and Geoglam (#167, #173) datasets require monthly ingestion as new assets are created. This is currently a manual process but should be automated.
veda-data-airflow has a feature that allows scheduled ingestion by creating dataset-specific DAGs. The file must still be transferred to the collection S3 bucket, and a JSON config must be uploaded to the Airflow event bucket; a sketch of such an upload is shown below.
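A minimal sketch of that upload using boto3, assuming placeholder bucket and key names; the real event bucket name and key layout come from the MWAA deployment for each environment:

```python
import json
import boto3


def upload_discovery_config(config: dict, event_bucket: str) -> str:
    """Upload a dataset discovery config to the Airflow event bucket.

    `event_bucket` is a placeholder; use the MWAA event bucket for the
    target environment (staging/UAH or production/MCP). The key layout
    under collections/ is an assumption.
    """
    key = f"collections/{config['collection']}-items.json"
    s3 = boto3.client("s3")
    s3.put_object(
        Bucket=event_bucket,
        Key=key,
        Body=json.dumps(config, indent=2).encode("utf-8"),
        ContentType="application/json",
    )
    return f"s3://{event_bucket}/{key}"


# Example call with a placeholder bucket name:
# upload_discovery_config(scheduled_discovery_config, "<EVENT_BUCKET>")
```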
Acceptance Criteria
- Transfer no2-monthly and no2-monthly-diff from the s3://covid-eo-data bucket to s3://veda-data-store-staging and s3://veda-data-store using the MWAA transfer DAG (see the copy sketch after this list)
- Scheduled discovery config for NO2 (no2-monthly, no2-monthly-diff) in the MWAA event bucket for staging (UAH) and production (MCP)
- Scheduled discovery config for Geoglam (geoglam) in the MWAA event bucket for staging (UAH) and production (MCP)
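For the first criterion, a rough sketch of the kind of S3-to-S3 copy the MWAA transfer DAG performs; this is not the actual DAG, and the prefix value is an assumption:

```python
import boto3


def transfer_collection(source_bucket: str, dest_bucket: str, prefix: str) -> int:
    """Copy all objects under `prefix` from source_bucket to dest_bucket.

    Illustrates the copy step for no2-monthly / no2-monthly-diff assets;
    the real transfer is handled by the MWAA transfer DAG.
    """
    s3 = boto3.client("s3")
    copied = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=source_bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            s3.copy(
                {"Bucket": source_bucket, "Key": obj["Key"]},
                dest_bucket,
                obj["Key"],
            )
            copied += 1
    return copied


# e.g. transfer_collection("covid-eo-data", "veda-data-store-staging", "OMNO2d_HRM/")
# and again with dest_bucket="veda-data-store" for production.
```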