-
Notifications
You must be signed in to change notification settings - Fork 201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Model Evaluation, Diagnostics and MLFlow Registry #1026
base: main
Are you sure you want to change the base?
Conversation
…e compliant with standards
…nclude in calc_metric
…conform with API guidelines and enable model registering
…ing an MMMEvaluator class and moving functions there as methods. Breaking out pm.compute_log_likelihood into its own method. Moving over diagnostic functions from MLFlow into this module
…V), adding remaining methods
@louismagowan, thanks for opening this one up again! Some tests (and the pre-commit) are failing. Do you need some help? |
@juanitorduz - thanks but I think I'm alright for now (or at least I think I can try and fix it myself later). I think I probably need to recreate my conda env (as I think it's a bit old now). @wd60622 - no need to review again yet, I'm still working on addressing your earlier feedback (I'll ping myself when I think I'm ready again) Thanks! |
…Adding log_model to autolog, along with log_loocv()
…metrics as distributions first, before then taking summary stats over the distributions of the metrics
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1026 +/- ##
==========================================
- Coverage 95.64% 93.69% -1.96%
==========================================
Files 39 40 +1
Lines 4089 4250 +161
==========================================
+ Hits 3911 3982 +71
- Misses 178 268 +90 ☔ View full report in Codecov by Sentry. |
Co-authored-by: Will Dean <57733339+wd60622@users.noreply.github.com>
…ng dst_path an arg in it
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
…h, add separate patch to autolog
…_model was dropped from autolog
Any idea why those tests are failing @wd60622 ? They aren't in the scope of my PR, I don't think 🤔 |
The tests are with fresh installs of libraries. Pydantic's latest version has changed error messages. I pushed a changed to Carlo's posterior pr. Feel free to ignore. Out of scope |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking very close to the finish line. Thanks for making the changes!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would like to keep support for all pymc models when not only for mmm
mlflow.register_model(model_uri, registered_model_name) | ||
|
||
|
||
def load_mmmm( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too many Ms
We don't need to call this directly again in MMM.fit patch, since that function calls | ||
pm.sample() internally anyway. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sample never calls compute_log_likelihood to my understand. CC @juanitorduz
Nor would we want it to
Do we need to adjust any logic to fit accordingly?
if log_loocv and "log_likelihood" not in idata.groups(): | ||
pm.compute_log_likelihood(idata=idata, model=model) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dont automatically call with sample. It can be high computation
|
||
mlflow.log_params( | ||
idata.attrs, | ||
) | ||
mlflow.log_param("nuts_sampler", kwargs.get("nuts_sampler", "pymc")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this move. This is helpful for all pymc models
mlflow.log_param("pymc_marketing_version", __version__) | ||
mlflow.log_param("pymc_version", pm.__version__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep for all pymc models
@@ -577,11 +999,41 @@ def new_fit(*args, **kwargs): | |||
"saturation_name", | |||
json.loads(idata.attrs["saturation"])["lookup_name"], | |||
) | |||
|
|||
# Align with the default values in pymc.sample | |||
tune = kwargs.get("tune", 1000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shoud be logged because of patched sample
if log_metadata_info: | ||
log_metadata(model=model, idata=idata) | ||
|
||
# mmm.fit() calls pm.sample() internally, so we need to make sure log-likelihood is only added once |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrong to my understanding
@@ -412,14 +463,359 @@ def log_inference_data( | |||
os.remove(save_file) | |||
|
|||
|
|||
def log_summary_metrics( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename to include mmm and something with evaluation
Sets | ||
---- | ||
idata.log_likelihood : az.InferenceData | ||
The InferenceData object with the log_likelihood group, unless it already exists. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outdated
.. code-block:: python | ||
import pandas as pd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think a space is needed here
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Alternatively, load a model that has been autologged to MLflow via `pymc_marketing.mlflow.autolog(log_mmm=True)`, from the [PyMC-Marketing MLflow module](https://github.com/pymc-labs/pymc-marketing/blob/main/pymc_marketing/mlflow.py)." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"autologged" is wrong based on the logic now
Hey @louismagowan, great PR. I really like your work, but I have something to ask haha, could we have a notebook that is something like "Evaluating model" where we show all the new metrics and how we can import and use them. It would be great if each addition had its own notebook! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the repeated patterns occur in logging functions, Could we encapsulate and re-use?
def log_and_remove_artifact(path: str | Path):
mlflow.log_artifact(str(path))
os.remove(path)
Then we replace on places like log_arviz_summary
, log_model_graph
, and log_inference_data
.
@@ -528,12 +928,29 @@ def autolog( | |||
""" | |||
arviz_summary_kwargs = arviz_summary_kwargs or {} | |||
|
|||
def patch_compute_log_likelihood(compute_log_likelihood): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type hints.
log_inference_data(idata, save_file="idata.nc") | ||
|
||
return idata | ||
|
||
return new_fit | ||
|
||
def patch_mmm_sample_posterior_predictive(sample_posterior_predictive): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type hints
Description
This PR is a recreation of this one which I was told to close and restart due to git issues on the pymc-marketing repo.
It could be nice to have some standardised model evaluation and diagnostic functions added to pymc-marketing. Ideally they'd be formulated in a way that makes them easy to log in MLFLow later on.
It would also be cool to build on top of the MLFlow module, to create a custom mlflow.pyfunc.PythonModel class to allow users to be able to register their models in the MLFlow registry. This would allow people to serve and maintain their MMMs more easily, and could help with MMM refreshes too.
Standard model metrics could include:
etc.
Diagnostic metrics could include:
Some additional plots (also useful for diagnosing models):
Model Registry / Additional Logging Code:
A wrapper for an MMM model to make it conform to the MLFlow api, enabling registering and easier deployment
Also an option to load models from the registry as well / or download the idata from MLflow too.
I'll open this as a Draft for now - since I'll need advice on how where best to put this code, as well as overall design etc.
Related Issue
Checklist
Modules affected
Type of change
📚 Documentation preview 📚: https://pymc-marketing--1026.org.readthedocs.build/en/1026/