docs(blog): classification metrics on the backend #10501
Seems like these would also be useful additions to IbisML!
I think so! I have given that a good bit of thought and I think it would be worth adding that capability with IbisML. I opened feat: ibis_ml.metrics #174 over there, so hopefully, we can discuss further and plan the approach.
Took a quick look. I personally like the detailed explanations; as you said, a lot of people may not have much ML exposure. I also think this is illustrative, but not necessarily efficient.
I'm guessing this should be a lot more efficient:
>>> tp = (t.actual * t.prediction).sum()
>>> tp
┌───┐
│ 4 │
└───┘
>>> fp = t.prediction.sum() - tp
>>> fp
┌───┐
│ 2 │
└───┘
>>> fn = t.actual.sum() - tp
>>> fn
┌───┐
│ 3 │
└───┘
>>> tn = t.actual.count() - tp - fp - fn
>>> tn
┌───┐
│ 3 │
└───┘
(I borrowed the logic from https://github.com/scikit-learn/scikit-learn/blob/a2448b5ce8778b76f8d8c6e7b0ef9b6cca9c7313/sklearn/metrics/_classification.py#L445, since I was too lazy to think it through myself.)
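For readers following along without a backend, the same arithmetic can be sketched in plain Python. This is only an illustration, not the Ibis code from the post: the `actual` and `prediction` lists are made-up 0/1 labels chosen so the counts match the output above, and `len()` stands in for `count()` under the assumption that there are no nulls.

```python
from collections import Counter

# Hypothetical 0/1 labels, chosen only so the counts match the thread
# (tp=4, fp=2, fn=3, tn=3); this is not the post's actual data.
actual = [1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0]
prediction = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Sum-based counts: for 0/1 labels, actual * prediction is 1 only when
# both are 1, so its sum is the number of true positives.
tp = sum(a * p for a, p in zip(actual, prediction))
fp = sum(prediction) - tp  # predicted positives that were not actual positives
fn = sum(actual) - tp  # actual positives that were not predicted
tn = len(actual) - tp - fp - fn  # whatever remains is a true negative

# Illustrative row-by-row labeling, used here only as a cross-check.
labels = Counter(
    ("t" if a == p else "f") + ("p" if p == 1 else "n")
    for a, p in zip(actual, prediction)
)

assert (tp, fp, fn, tn) == (labels["tp"], labels["fp"], labels["fn"], labels["tn"])
```

The sum-based version needs only three aggregates (two sums and a count) plus subtraction, which is why it is likely to be cheaper than labeling every row first.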
Since you do explicitly make a point about performance, maybe it makes sense to show the more efficient method after going through the illustrative labeling approach?
Edit: An alternative would be to just show the illustrative approach, add the efficient approach to IbisML, and call the IbisML function to demo the "efficient" path.
Thanks for the review and the feedback! I agree. The way you demonstrated calculating the true positives, false positives, etc., does seem much more efficient. It also demonstrates how we can break apart calculations and use them in other expressions with Ibis.
This is a great idea! The illustrative approach helps cement the concepts, and then the more efficient method would demonstrate assigning expressions to variables and using them in other expressions, something that is far less convenient to do with pure SQL. I'm happy to incorporate this!
What if we add the efficient approach above to the article as it is now, and I follow up with another blog post on regression metrics? Then a third post could close out the series by throwing back to the first two (e.g., "we've previously reviewed and demonstrated how to calculate classification and regression metrics with Ibis; in this post, we'll demonstrate how to perform these calculations out of the box with IbisML"), tying it all together into a nice mini-series of blog posts.
Sounds good to me! From my perspective, part of seeing your posts is also an indicator of what, if anything, somebody may actually want to use Ibis for in the ML space. Happy to use the blogs as a leading indicator. :)
I just updated it to incorporate this approach. Thank you for sharing those snippets! Hopefully it flows well - I'm happy to adjust as necessary.
t.select(
    accuracy=accuracy_expr,
    precision=precision_expr,
    recall=recall_expr,
    f1_score=f1_score_expr,
).limit(1)
.execute() should work (or .to_pyarrow().as_py() or some of the other .to_* export methods)
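As a reference point for the four metrics selected above, here is a minimal plain-Python sketch that derives them from the confusion-matrix counts shown earlier in this thread (tp=4, fp=2, fn=3, tn=3). It assumes the standard textbook definitions; the post's `accuracy_expr`, `precision_expr`, `recall_expr`, and `f1_score_expr` are not shown in this thread, so treat this as a cross-check rather than their actual implementation. Zero denominators are not handled.

```python
# Counts taken from the REPL output earlier in this thread.
tp, fp, fn, tn = 4, 2, 3, 3

# Standard definitions (assumed, not copied from the post's expressions).
accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of all rows predicted correctly
precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)  # share of actual positives that were found
f1_score = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(round(accuracy, 3), round(precision, 3), round(recall, 3), round(f1_score, 3))
# prints: 0.583 0.667 0.571 0.615
```

Plain floats like these are what a scalar export of each metric would be compared against.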
Added suggestions for the "efficient" paths, but I think for these there may be no meaningful difference if the computations are already warm on the backend? Probably something you could more easily test if you're interested; leave it up to you whether you want to use these shortcut formulas.
Some minor grammatical changes, but otherwise looks great to me!
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
64e025a to ef63e90
---
title: "Classification metrics on the backend"
author: "Tyler White"
date: "2024-11-15"
date: "2024-11-15"
date: "2024-11-25"
I always forget to update the date, so leaving this out as a reminder to adjust this when we are ready to merge.
I'm ready to go with this one if we're good with it (pending the date edit)! Thanks for your help and the thorough review @deepyaman, I think it greatly improves the post!
Hey @IndexSeek -- this looks good to me!
Do you have a particular date you'd like to release it on?
I feel like @lostmygithubaccount would tell us to not publish it on a Friday.
Sweet! Thank you for the review and approval. I think this upcoming Monday would work out well, given later in the week many potential US readers would rather be consuming turkey than consuming information on classification metrics. I edited my suggestion above so it is easier to tweak when we are ready to go if that date is okay.
Generally wouldn't recommend publishing on Friday, plus a lot of people will be out all of next week for Thanksgiving. But idk, maybe people want something to read still.
Great blog! Not necessary, but it could be cool to demonstrate a plot of the confusion matrix with one of the visualization libraries.
Also, this reminded me of what could be a cool follow-up blog on using binary classification to detect data drift over time (described as two-sample tests here: https://arxiv.org/abs/1610.06545 and in various other articles since). It's a really cool application, and in theory Ibis + XGBoost or LightGBM makes it trivial to implement on a ton of backends.
Thank you! This is a great idea; I will tweak the post to add a confusion matrix plot either this evening or over the weekend.
I haven't previously used binary classification to detect drift, but it does seem like a clever application! I like the idea a lot; exploring and providing a write-up showing how Ibis can make this easy regardless of where the data is would be cool. We could also use Ibis to detect feature drift; that is something else I've been thinking about a lot. I think the implementation would be more straightforward than alternatives.
Description of changes
Adding a blog post breaking down how to compute binary classification metrics with Ibis. I did a fair amount of background explanation on these models and these metrics because many Ibis users may not be as familiar with these topics, but we can scale that back if needed and get more to the point.