Optional vars files: dynamic DAGs #10554

Open
mike-grayhat opened this issue Sep 12, 2024 · 2 comments
Labels
A: pipelines (Related to the pipelines feature), feature request (Requesting a new feature), question (I have a question?)

Comments

mike-grayhat commented Sep 12, 2024

I have a rather unusual case where I rely on variables generated by the first stage of the pipeline. The problem is that on the first run the vars file doesn't exist yet, which in turn invalidates the whole yaml file.

vars:
  - items: {}
  - items.yaml # non-existent before the first run

stages:
  collect_items:
    ...
  process:
    foreach: ${items}
    do:
      ...

I don't see an easy way out of it (even Hydra works only on experiment runs, not on a plain dvc repro), so an option to skip missing vars files would help a lot.
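
One possible stopgap that stays outside DVC is to make sure the vars file exists before any dvc command parses dvc.yaml. The following is a minimal sketch, assuming an empty `items` mapping is enough to make the `foreach` valid. The helper name `ensure_vars_file` and the placeholder content are hypothetical and not part of DVC, and how a placeholder interacts with items.yaml being a tracked stage output is untested here:

```python
from pathlib import Path

# Hypothetical placeholder: an empty mapping so `foreach: ${items}` can resolve.
PLACEHOLDER = "items: {}\n"


def ensure_vars_file(path: str = "items.yaml", placeholder: str = PLACEHOLDER) -> bool:
    """Create the vars file with placeholder content if it is missing.

    Returns True if the file was created, False if it already existed.
    """
    target = Path(path)
    if target.exists():
        return False  # never overwrite content generated by a real run
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(placeholder, encoding="utf-8")
    return True
```

Calling `ensure_vars_file()` once on a fresh clone, before the first dvc command, would give the parser a file to read. It does not make the DAG dynamic; it only avoids the missing-file error.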

@shcheklein added the feature request and question labels Sep 12, 2024
@shcheklein
Member

I think DVC needs all vars resolved in such cases before it can run the pipeline. Your vars essentially define the pipeline: DVC reads and compiles it first. So even if we allow missing files, I think making it dynamic is a bigger change. @skshetry could confirm that.

Does the content of items.yaml change on every run?
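
To illustrate why the vars have to be available up front, here is a toy model of the two phases described above: variables are resolved first, and only then can `foreach` be expanded into concrete stages. This is not DVC's implementation; `expand_foreach` and the sample data are made up for illustration, and the `process@<key>` naming is used only to label the expanded stages.

```python
from typing import Dict, List, Optional


def expand_foreach(stage: str, items: Optional[Dict[str, object]]) -> List[str]:
    """Expand one templated stage into concrete stage names, one per key.

    Raises ValueError when the variable is unresolved (None), mimicking a
    pipeline file that cannot be compiled because a vars file is missing.
    """
    if items is None:
        raise ValueError(f"cannot expand '{stage}': 'items' is not resolved")
    return [f"{stage}@{key}" for key in items]


# With resolved vars, the full set of stages is known before anything runs.
print(expand_foreach("process", {"a": 1, "b": 2}))  # ['process@a', 'process@b']

# An empty mapping compiles, but yields no process stages at all.
print(expand_foreach("process", {}))  # []
```

In this toy model, items produced by `collect_items` during a run would only become stages on the next compile, which is the gap a dynamic DAG would have to close.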

@shcheklein added the awaiting response label Sep 12, 2024
@mike-grayhat
Author

The content of items.yaml is generated from external sources, so it changes from time to time. The problem we face right now is that in theory we can put items.yaml under DVC, but we can't even pull it on a fresh repo because dvc.yaml is not valid yet. Similarly, dvc diff doesn't work. The static nature of the DVC DAG is a limiting factor for us. We have worked around most of the problems except this one, where we have to rely on a separate pipeline to pull such files. I'm still thinking about a better solution and haven't come up with one yet.

@shcheklein changed the title from "Optional vars files" to "Optional vars files: dynamic DAGs" Sep 13, 2024
@shcheklein added the A: pipelines label and removed the awaiting response label Sep 13, 2024