`params` function registry #6506
Replies: 4 comments 6 replies
Great synopsis of the issues, @daavoo! Before we get to a function registry, I think it would be helpful to recap some of the problems in those issues, and some implementations to solve those particular issues. Abstracting any solutions into a function registry could be revisited as a next step if they aren't flexible enough to meet all users' needs.

Problems

Too many parameters to specify
There are too many parameters to explicitly hard code each one. If they are in
- #5477 (comment): "Problem here is that I can modify the learning rate, momentum, architecture details...."
- #6107: "The scripts could have a lot of args (this train.py has about 20 of them, all handled by Python argparse module)."

Repetition
Parameters have to be repeated even in simple scenarios (they need to be defined in the params file,
- #5477 (comment): "I will end up repeating most of my config file in the params.yaml."
- #6107: "I’d need to duplicate all arguments in two places: in params.yaml and in dvc.yaml, thus having three slightly different copies of the arguments."

Complex objects/expressions are not supported
Type constructors and other complex expressions that may be used in ML frameworks for parameter configuration cannot be parsed today as parameters.
- #5477 (comment): "At the end it would be super useful to just use my python config files as params directly. At this moment I cannot afford to remove type constructors of all my config files."
- #6512:

Modifying existing workflows
I can mitigate some of these issues if I adapt my workflow to what DVC expects. I can move my parameters to
- #5477 (comment): "At this moment I cannot afford to remove type constructors of all my config files."
- #6107: "And if I’m not sure I want to migrate and just want to give DVC a try I would want to keep all my current scripts unchanged to remove DVC later if I won’t be planning on using it."
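To make the "type constructors cannot be parsed" problem concrete, here is a minimal sketch. It assumes a loader that only accepts Python literals and uses the standard library's `ast.literal_eval` to stand in for that behaviour. The function name `split_params` and the sample config are hypothetical, and this is not DVC's actual parsing code.

```python
import ast


def split_params(source: str):
    """Split top-level `NAME = value` assignments into parsed and skipped.

    Only plain literals (numbers, strings, lists, dicts, ...) are parsed.
    Anything else, such as a constructor call, is reported as skipped.
    """
    parsed = {}
    skipped = []
    for node in ast.parse(source).body:
        is_simple_assign = (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
        )
        if not is_simple_assign:
            continue
        name = node.targets[0].id
        try:
            parsed[name] = ast.literal_eval(node.value)
        except ValueError:
            # Calls such as dict(...) or Optimizer(...) are not literals.
            skipped.append(name)
    return parsed, skipped


# Hypothetical config file content, in the style of a Python config.
config_source = (
    "lr = 0.01\n"
    "layers = [64, 32]\n"
    "optimizer = dict(type='SGD', momentum=0.9)\n"
)

parsed, skipped = split_params(config_source)
print(parsed)
print(skipped)
```

With a literal-only loader like this, `optimizer` cannot be tracked as a parameter unless the config file is rewritten, which is the workflow change the quoted users want to avoid.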
There are many related discussions/issues discussing ideas similar to a function registry for dependencies, which seem to expand this idea from params to any dependency:
Also, a lot of the parameter issues noted here were considered when first implementing params: #3393.
Another related discussion: https://discuss.dvc.org/t/dvc-and-hydra-integration/868/2
Looking at this discussion, it seems that it would be a good way to address the feature request in #8072 about integrating HyperPyYAML. Is this still something dvc wants to address?
Summary
It would be nice to expose part of the `params` functionality to some sort of function registry (e.g. using catalogue) in order to allow users to inject custom logic for handling params.

There have been several discussions and feature requests for extending or modifying `params`: #4780 #4883 #5477 #5777 #5868 #5971 #6107 #6129 #6511 #6512, and a Discord discussion about hydra.
Motivation
I think that allowing users to register custom functions could partially alleviate these requests, and `dvc` could focus on maintaining a minimal built-in functionality.

For example, having the registry would allow:
- `dvclive` to provide some opinionated `dvc params` functionality focused on specific ML frameworks as part of its integrations, without requiring changes to `dvc` core.
- `dvc params` to integrate with other configuration libraries like configparser, dynaconf, hydra, yacs, etc.
- Adding strict validation to `dvc params` (e.g. using pydantic).

And probably, just by exposing the options to users, some more ideas would come.
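As a rough illustration of the configuration-library integration idea, here is a minimal sketch of what a user-provided loader could look like for INI files, using only the standard library's `configparser`. The function name `load_ini_params` and the returned shape (a nested dict of strings) are assumptions for illustration, not an existing `dvc` interface.

```python
import configparser


def load_ini_params(text: str) -> dict:
    """Parse INI text into a nested dict: {section: {option: value}}.

    configparser returns every value as a string, so any type
    conversion or validation would have to happen in a later step.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


# Hypothetical params file content.
ini_text = "[train]\nlr = 0.01\nepochs = 10\n"
print(load_ini_params(ini_text))
```

A registry would let a function like this be plugged in by the user instead of being maintained in `dvc` core.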
Implementation
I see a few potential places to inject a registry:
- `load_param_overrides`
- `MODIFIERS`
- `LOADERS`
This would require being more explicit (in the docs and/or the code) about the expected inputs and outputs at each place where a registry is injected.
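To give a feel for the idea, here is a minimal sketch of a loader registry keyed by file suffix, written with the standard library only. Everything in it (`PARAMS_LOADERS`, `register_loader`, `load_params`, the `.kv` format) is hypothetical and only illustrates the pattern. It is not how `dvc` or catalogue implement it, and the explicit `str -> dict` contract is just one possible answer to the "expected inputs/outputs" question.

```python
import json
from pathlib import PurePath
from typing import Callable, Dict

# Assumed contract for this sketch: a loader takes file text, returns a dict.
ParamsLoader = Callable[[str], dict]

# Hypothetical registry mapping a file suffix to its loader.
PARAMS_LOADERS: Dict[str, ParamsLoader] = {}


def register_loader(suffix: str):
    """Decorator that registers a loader function for a file suffix."""

    def decorator(func: ParamsLoader) -> ParamsLoader:
        if suffix in PARAMS_LOADERS:
            raise ValueError(f"loader already registered for {suffix!r}")
        PARAMS_LOADERS[suffix] = func
        return func

    return decorator


@register_loader(".json")
def load_json(text: str) -> dict:
    # Built-in style loader.
    return json.loads(text)


@register_loader(".kv")
def load_key_value(text: str) -> dict:
    # User-defined loader for a made-up "key=value" format.
    pairs = (line.split("=", 1) for line in text.splitlines() if "=" in line)
    return {key.strip(): value.strip() for key, value in pairs}


def load_params(path: str, text: str) -> dict:
    """Dispatch to the loader registered for the suffix of `path`."""
    suffix = PurePath(path).suffix
    if suffix not in PARAMS_LOADERS:
        raise ValueError(f"no params loader registered for {suffix!r}")
    return PARAMS_LOADERS[suffix](text)


print(load_params("params.json", '{"lr": 0.1}'))
print(load_params("params.kv", "lr = 0.1\nepochs = 5"))
```

The same pattern could apply to modifiers or override handling, as long as each registry documents the signature it expects.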