about improving dvc move
#5926
Replies: 6 comments
-
Thanks @dashohoxha! I would split those into two issues - Also, just to give you more context, check #1489, which is related. It will give you some sense of why it's hard to implement a general "stage move": it's just tricky.
-
It seems indeed tricky to move a stage (because of dependencies). The safest way seems to be deleting a stage file ( |
-
@dashohoxha and it's not about dependencies ( |
-
That is clearly the responsibility of the user. I think that we should never touch the cmd field. So, the cleanest solution for moving a stage file is:
As a best practice, I would recommend that users use a bash script for building stage files:
#!/bin/bash
dvc run -f stage1.dvc ....
dvc run -f stage2.dvc ....
dvc run -f stage3.dvc ....
Some of the benefits are these:
Does this make sense?
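As a rough illustration of the build-script idea above, here is a minimal sketch with a dry-run mode. The names `DVC_BIN`, `build_stage`, and `build_pipeline`, as well as the stage paths and the `prepare.py`/`train.py` commands, are hypothetical and not taken from the thread. The `dvc run -f ... -d ... -o ...` form follows the usage shown above and may differ in other DVC versions.

```shell
#!/bin/bash
# Dry-run by default: prints the dvc commands instead of executing them.
# Set DVC_BIN=dvc in the environment to actually run them (assumption: dvc is installed).
DVC_BIN="${DVC_BIN:-echo dvc}"

# build_stage STAGE_FILE [dvc run args...]
# Hypothetical wrapper: keeps the stage file path in one obvious place.
build_stage() {
    local stage_file="$1"
    shift
    # DVC_BIN is intentionally unquoted so that "echo dvc" splits into two words.
    $DVC_BIN run -f "$stage_file" "$@"
}

# Hypothetical two-stage pipeline; file names are placeholders.
build_pipeline() {
    build_stage stages/prepare.dvc -d raw.csv -o clean.csv python prepare.py
    build_stage stages/train.dvc -d clean.csv -o model.pkl python train.py
}
```

Calling `build_pipeline` in dry-run mode only prints the commands, which makes it easy to review a changed stage path before rebuilding. Changing the path passed to `build_stage` does not by itself relocate an existing stage: the old stage file would presumably still have to be deleted first, as discussed earlier in the thread.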
-
@dashohoxha right, and we've seen people doing this already. To be honest, I'm not sure I like this approach (bash scripting); it feels like a workaround. In the long term I would like DVC to be self-sufficient to a greater or lesser extent. That said, I don't know yet how it should be done. Some Python DSL/decorators that DVC can extract the description from? Redesigning DVC files in a way that we don't duplicate the same structural info across multiple files and fields? I would love to explore other ideas: how graph definitions are done in other frameworks and how people manipulate them.
Re the cleanest ways. Remove is destructive - it'll take time to rebuild the outputs again, or you would need to run
What do you think about splitting the issue into two (repurpose this into
-
Another discussion and confusion - https://discordapp.com/channels/485586884165107732/563406153334128681/685125650901630996
-
There is a discussion here about deprecating dvc move. However, I think that it is possible to improve it instead, in order to make it more useful (see this and this). I am summarizing my suggestions here so that we can discuss them further.
We can make dvc move a bit more flexible if it can accept arguments of two different types:
.dvc file (making an implicit dvc commit).
.dvc file, then it moves the .dvc file only, and updates the paths and checksums inside it (making an implicit dvc commit).
This would make dvc move consistent and complete, being able to cover all the cases when moving files around is needed. (I believe that the current dvc move was conceived with data-tracking files in mind, those created by dvc add, so it doesn't work well with stage files.)
Admittedly, this would be like combining two different commands into one, which in general is not a good thing. But I think that in this case it is reasonable and justified.
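The idea of dispatching on the argument type could be sketched roughly as follows. This is not how dvc move is implemented; `move_target_kind` is a hypothetical helper, and treating every path that does not end in `.dvc` as a data file is an assumption made only for illustration.

```shell
#!/bin/bash
# Hypothetical helper, not part of DVC: report which of the two proposed
# behaviours a given argument to `dvc move` would select.
move_target_kind() {
    case "$1" in
        # A .dvc file: the proposal is to move only this file and then
        # update the paths and checksums recorded inside it.
        *.dvc) echo "dvc-file" ;;
        # Anything else is assumed here to be a data file (e.g. one tracked via `dvc add`).
        *)     echo "data-file" ;;
    esac
}
```

For example, `move_target_kind stage1.dvc` prints `dvc-file`, while `move_target_kind data/raw.csv` prints `data-file`. A real implementation would need more than a suffix check, for instance to tell stage files apart from the data-tracking `.dvc` files created by dvc add.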
If we would like to split those commands (maybe in the future), we could do it like this: dvc data move and dvc stage move. However, this obviously requires a new level of subcommands, which right now does not seem to be feasible.
EDIT:
After reading some previous discussions (#1489), it seems clear to me that dvc move cannot be made useful for stage files. So either we make it very clear (in the docs, in the help message, etc.) that it can be used only for data files, or we just deprecate and remove it.