The following is a description based on my experience and my understanding of the Couler code for scheduling Argo workflows. Please feel free to correct any misunderstanding on my part. If the following reported issue/enhancement proposal is already on the development radar, I hope this will at least provide further support for it or start a discussion around the idea! In addition, thanks for all the development that is underway on Couler and Argo :)
Summary
It would be amazing if Couler would support passing in/specifying parameters that are dedicated for interpretation/usage in the scripts that are submitted via couler.run_script(...). Currently, if args are specified they get used as container args in addition to being added to the template arguments and step argument specification. Following are some code blocks that serve for reproducibility:
import couler.argo as couler
from couler.argo_submitter import ArgoSubmitter
from couler.core.constants import ImagePullPolicy

def start():
    def _start():
        print('{{input.parameters}}')  # just print all of them to see what's in it
    return couler.run_script(
        'python:3.7',
        source=_start,
        args=[arg],  # main issue
        step_name='start',
        image_pull_policy=ImagePullPolicy.IfNotPresent,
    )

start()
submitter = ArgoSubmitter()
couler.run(submitter=submitter)
The script above results in the following log:
python: can't open file 'start': [Errno 2] No such file or directory
As you can see, the container command python is accompanied by the "start" string. The template does contain the parameters, so the _start code should be able to access the parameters. Here's the YAML spec:
name: couler-testing
inputs: {}
outputs: {}
metadata: {}
steps:
  - - name: start-112
      template: start
      arguments:
        parameters:
          - name: para-start-0
            value: start
---
templates:
  name: start
  inputs:
    parameters:
      - name: para-start-0  # this should be accessible
  outputs: {}
  metadata: {}
  script:
    name: ''
    image: 'python:3.7'
    command:
      - python
    args:
      - '{{inputs.parameters.para-start-0}}'  # but this makes things blow up
    resources: {}
    imagePullPolicy: IfNotPresent
    source: |
      print('{{input.parameters}}')
One way I am currently getting around this limitation is by setting environment variables with the passed parameters, as follows:
As you may imagine, it's challenging to choose the proper environment variable name: what if users overwrite an important name? What if you have users who may not have the necessary experience to have the idea of checking what environment parameters are overwritten? If something gets overwritten, the user is not notified/observability is limited. As a potentially unrelated note, the above example is part of a much bigger workflow I am currently developing; each function in this workflow would need its own parameters, as one may infer.
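To make the naming concern concrete, here is a minimal sketch of how the workaround could namespace its variables and at least warn when a name is already taken. The prefix SCRIPT_PARAM_ and the helpers to_env and from_env are hypothetical names made up for this illustration; none of them are part of Couler or Argo.

```python
import os
import warnings

# Hypothetical prefix, chosen only for this sketch; it is not a Couler convention.
PARAM_PREFIX = "SCRIPT_PARAM_"


def to_env(params, existing=None):
    """Map script parameters to prefixed environment variable names.

    Warns when a prefixed name would overwrite a variable that is already set.
    """
    existing = os.environ if existing is None else existing
    env = {}
    for key, value in params.items():
        name = PARAM_PREFIX + key.upper()
        if name in existing:
            warnings.warn(f"{name} is already set and will be overwritten")
        env[name] = str(value)
    return env


def from_env(environ=None):
    """Recover the script parameters from the environment inside the container."""
    environ = os.environ if environ is None else environ
    return {
        name[len(PARAM_PREFIX):].lower(): value
        for name, value in environ.items()
        if name.startswith(PARAM_PREFIX)
    }


env = to_env({"stage": "start"}, existing={})
print(env)            # {'SCRIPT_PARAM_STAGE': 'start'}
print(from_env(env))  # {'stage': 'start'}
```

A prefix lowers the chance of clobbering something important and the warning gives some observability, but it remains a convention the user has to know about, which is why first-class script parameters would still be preferable.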
What change needs making?
Potentially, a way to differentiate between script parameters and container parameters. I believe the inputs are a way to solve this but not all users have the desire to store everything in a bucket so parameters can be used as artifacts.
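As a starting point for discussion, the separation could look roughly like the sketch below. It is plain Python that builds a dictionary shaped like the script template in the YAML spec above; it is not Couler code, and the function build_script_template and its keywords script_params and container_args are hypothetical names used only to illustrate the idea.

```python
def build_script_template(name, image, source, script_params=None, container_args=None):
    """Build a dict shaped like the script template in the YAML spec above.

    script_params: exposed only as template inputs (hypothetical keyword).
    container_args: the only values appended to the container command.
    """
    script_params = script_params or {}
    container_args = list(container_args or [])
    return {
        "name": name,
        "inputs": {"parameters": [{"name": key} for key in script_params]},
        "script": {
            "image": image,
            "command": ["python"],
            "args": container_args,
            "source": source,
        },
    }


template = build_script_template(
    name="start",
    image="python:3.7",
    # Argo substitutes input parameters referenced in the script source.
    source="print('{{inputs.parameters.para-start-0}}')",
    script_params={"para-start-0": "start"},
)
print(template["script"]["args"])  # []
```

With this split, the parameter is declared in the template inputs and can be referenced from the script source, but it never reaches the container args, so python is not asked to open a file named start. The parameter values themselves would still be supplied through the step arguments, as in the spec above.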
I am happy to attempt to solve this problem but I would love to first have a discussion of a potential way of getting around this that would best solve community use cases.
Use Cases
Users who want to run quick experiments by submitting simple functions to Argo as they want to take advantage of the resource orchestration capacity of Argo. This supports increased development velocity by abstracting away the need to construct custom containers that have scripts which can be specified on the command line, along with their particular parameters. Python images can grow to 5Gi+, depending on the number of big packages users install. Therefore, a way to quickly submit functions for processing in containers that already have the required dependencies would be very valuable and perhaps promote wider adoption of Argo.
When would you use this?
I think I covered this in Use Cases, so please refer to that section.
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.
Thanks for the well-written proposal. Would you like to propose a way to distinguish the args? Also cc @rushtehrani who's previously contributed a fix on script template args.
Thanks for the reply @terrytangyuan. Sure, let me take a quick look at Argo, then here, and see what we have as options for potentially implementing this.
The change I made excluded artifacts but not arguments. I believe the behavior in run_container is the same at the moment.
I thought about using container_args/args or args/arguments but that would break backward compatibility and I didn't get a chance to dig deeper into it.