targets integration #155
Comments
The function cr_build_targets(path=tempfile())
# adding custom environment args and secrets to the build
cr_build_targets(
  task_image = "gcr.io/my-project/my-targets-pipeline",
  options = list(env = c("ENV1=1234",
                         "ENV_USER=Dave")),
  availableSecrets = cr_build_yaml_secrets("MY_PW", "my-pw"),
  task_args = list(secretEnv = "MY_PW"))

Resulting in build:

==cloudRunnerYaml==
steps:
- name: gcr.io/google.com/cloudsdktool/cloud-sdk:alpine
  entrypoint: bash
  args:
  - -c
  - gsutil -m cp -r ${_TARGET_BUCKET}/* /workspace/_targets || exit 0
  id: get previous _targets metadata
- name: ubuntu
  args:
  - bash
  - -c
  - ls -lR
  id: debug file list
- name: gcr.io/my-project/my-targets-pipeline
  args:
  - Rscript
  - -e
  - targets::tar_make()
  id: target pipeline
  secretEnv:
  - MY_PW
timeout: 3600s
options:
  env:
  - ENV1=1234
  - ENV_USER=Dave
substitutions:
  _TARGET_BUCKET: gs://mark-edmondson-public-files/googleCloudRunner/_targets
availableSecrets:
  secretManager:
  - versionName: projects/mark-edmondson-gde/secrets/my-pw/versions/latest
    env: MY_PW
artifacts:
  objects:
    location: gs://mark-edmondson-public-files/googleCloudRunner/_targets/meta
    paths:
    - /workspace/_targets/meta/**
Tests are now working and confirm that a targets build can reuse a previous build's artifacts, and also reruns if the source is updated: https://github.com/MarkEdmondson1234/googleCloudRunner/pull/159/files
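As a rough illustration of the "reuse unless the source changed" behaviour, here is a minimal base R sketch. It is not how targets or googleCloudRunner implement it: build_if_changed() and the hash file are hypothetical, and a plain MD5 of one source file stands in for the much richer dependency tracking that targets does.

```r
# Hypothetical helper, not part of targets or googleCloudRunner:
# rerun `build` only when the source file's MD5 differs from the stored one.
build_if_changed <- function(source_file, hash_file, build) {
  new_hash <- unname(tools::md5sum(source_file))
  old_hash <- if (file.exists(hash_file)) readLines(hash_file) else ""
  if (identical(new_hash, old_hash)) {
    return(FALSE) # up to date: keep the previous artifacts
  }
  build(source_file)
  writeLines(new_hash, hash_file) # remember what was built
  TRUE
}

src <- tempfile(fileext = ".R")
hash <- tempfile()
no_op <- function(f) invisible(NULL) # stand-in for the real build

writeLines("x <- 1", src)
first <- build_if_changed(src, hash, no_op)  # nothing stored yet, so it builds
second <- build_if_changed(src, hash, no_op) # source unchanged, so it skips
writeLines("x <- 2", src)
third <- build_if_changed(src, hash, no_op)  # source edited, so it rebuilds
```

In a Cloud Build setting the hash file would have to live somewhere persistent between builds, which is roughly the role the _targets/meta folder in the bucket plays above.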
Need two modes(?): one where all target files use the upcoming GCS integration, which will download artifacts as needed, and one where the data is loaded from other sources (files etc.) kept in a normal GCS bucket.
Added
Getting some feedback here ropensci/targets#720
GCP already available via:
But I think there is an opportunity to move this in a more serverless direction, as the Cloud Build steps seem to map seamlessly to tar_target() calls, if a way of communicating between the steps can be found.
As an example, an equivalent googleCloudRunner to targets minimal example would be:
Normally I would put all the R steps in one buildstep sourced from a file, but I have added readRDS() %>% blah() %>% saveRDS() to illustrate functionality that I think targets could take care of.
This makes a yaml object that I think maps to targets closely:
(more build args here)
Do the build on GCP via the_build |> cr_build()
And/or each buildstep could be its own dedicated cr_build(), with the build's artefacts uploaded/downloaded after it has run.
This holds several advantages:
I see that as a tool that is better than Airflow for visualising DAGs, taking care of state management on whether each node needs to be run but with a lot of scale to build each step in a cloud environment.
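To make the readRDS() %>% blah() %>% saveRDS() idea above concrete, here is a minimal base R sketch of steps that communicate only through files in a shared folder, much as Cloud Build steps share /workspace. It is not googleCloudRunner or targets code: run_step(), the file names and the mtcars data are assumptions for illustration, and it uses the native pipe, so it needs R 4.1 or later.

```r
# Hypothetical helper: one "build step" that reads its input from disk,
# applies a function, and writes the result for the next step to pick up.
run_step <- function(f, input, output) {
  readRDS(input) |> f() |> saveRDS(output)
  invisible(output)
}

# A temporary folder stands in for the shared /workspace directory.
workspace <- tempfile("workspace")
dir.create(workspace)
raw_path <- file.path(workspace, "raw.rds")
clean_path <- file.path(workspace, "clean.rds")
summary_path <- file.path(workspace, "summary.rds")

# Step 0: stage some input data.
saveRDS(mtcars, raw_path)

# Step 1: keep only the 4-cylinder cars.
run_step(function(d) d[d$cyl == 4, ], raw_path, clean_path)

# Step 2: summarise the cleaned data.
run_step(function(d) mean(d$mpg), clean_path, summary_path)

result <- readRDS(summary_path)
```

Each step only knows its input and output paths, which is the boilerplate I would hope targets could generate and track, so that the step bodies are just the functions.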