Switch MKI sorting from quicksort to mergesort #19
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@           Coverage Diff           @@
##             main      #19   +/-   ##
=======================================
  Coverage   98.74%   98.74%
=======================================
  Files          12       12
  Lines         478      478
=======================================
  Hits          472      472
  Misses          6        6
Thanks for catching this
Nice!
Quick follow-up question: We're not using quicksort anywhere else, are we?
A quick ripgrep of all our repos shows the following. I'm not sure whether any of these would have an impact. Perhaps only the comps one is a concern, @jeancochrane.
While working on ccao-data/data-architecture#422, I discovered that mki() doesn't produce stable results, i.e. it returns slightly different values on each run. This is because pandas.sort_values() defaults to quicksort, which is not a stable sort and so does not guarantee the relative order of tied rows.

Per this SO post, this PR switches the sort method to mergesort, which is stable.
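For reviewers, here is a minimal sketch of the behavior this PR relies on. It is not the actual mki() code: the DataFrame and its column names (ratio, label) are made up for illustration. It only shows that passing kind="mergesort" to sort_values() keeps tied rows in their original input order.

```python
import pandas as pd

# Hypothetical data (not from the repo): several rows share the same sort key.
df = pd.DataFrame(
    {
        "ratio": [1.0, 0.5, 1.0, 0.5, 1.0],
        "label": ["a", "b", "c", "d", "e"],
    }
)

# mergesort is a stable algorithm, so rows with equal "ratio" values
# keep the relative order they had in the input.
stable = df.sort_values("ratio", kind="mergesort")

print(stable.index.tolist())     # [1, 3, 0, 2, 4]
print(stable["label"].tolist())  # ['b', 'd', 'a', 'c', 'e']
```

With the default kind="quicksort" there is no such guarantee for ties, so anything computed from the sorted order (cumulative sums, for example) can depend on how the ties happen to land.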