Add support for double weights #203
Comments
Yeah... there has been a fair bit of discussion on this. The core question is what does the t-digest invariant actually mean with non-integer weights. Do you have thoughts on that? The key problems in the past include:
So, what do you think?
Centroids now have to maintain their cardinality (the number of samples). The exemption can then be based on the cardinality, not the weight (in fact, weight used to be a kind of cardinality, under the assumption of unit weight). With non-integer weights, the t-digest invariant Then, the invariant should behave identically to the one for equivalent integer weights. For example, non-integer weighted samples (sample value, sample weight)
with quantiles
are equivalent to these integer-weighted samples:
with quantiles
The difference is that a cluster
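The equivalence claimed in this comment can be checked numerically. The concrete samples from the comment did not survive on this page, so the data below is purely hypothetical, and `weighted_quantiles` is an illustrative helper, not part of the t-digest API. It is a minimal sketch showing that scaling every weight by the same constant leaves the cumulative quantile of each sample unchanged:

```python
from fractions import Fraction


def weighted_quantiles(samples):
    """Return (value, quantile) pairs for (value, weight) samples.

    The quantile of a sample is the fraction of the total weight
    that lies at or below it. Fractions keep the arithmetic exact.
    """
    ordered = sorted(samples)
    total = sum(Fraction(w) for _, w in ordered)
    running = Fraction(0)
    result = []
    for value, weight in ordered:
        running += Fraction(weight)
        result.append((value, running / total))
    return result


# Hypothetical data: fractional weights, and the same weights scaled by 4.
fractional = [(1.0, 0.25), (2.0, 0.5), (3.0, 0.25)]
integer = [(1.0, 1), (2.0, 2), (3.0, 1)]

frac_q = weighted_quantiles(fractional)
int_q = weighted_quantiles(integer)
print(frac_q == int_q)  # both map the samples to quantiles 1/4, 3/4, 1
```

What scaling does not preserve is the number of samples behind each weight, which is why the comment proposes tracking cardinality separately.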
What are your thoughts on supporting double weights, instead of integer weights only? This would allow the use of weights in `(0..1]`, which would be more convenient than mapping those weights to integers in user code. It would require distinguishing the semantics of `count` from `weight`, which could be beneficial in other use cases as well, e.g. #198. Obviously, this would introduce a breaking change to the API.
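For context, the user-side mapping that the issue calls inconvenient might look like the sketch below. The helper name `to_integer_weight` and the `scale` factor are assumptions made for illustration. They are not part of t-digest, and the resulting integers would still have to be passed to whatever integer-weighted add method the digest exposes.

```python
def to_integer_weight(weight, scale=1000):
    """Map a weight in (0, 1] to a positive integer weight (hypothetical helper)."""
    if not 0.0 < weight <= 1.0:
        raise ValueError("weight must be in (0, 1]")
    # Round to the nearest multiple of 1/scale, but never let a sample
    # drop to weight 0, which would silently discard it.
    return max(1, round(weight * scale))


# Hypothetical (value, weight) samples with weights in (0, 1].
samples = [(12.5, 1.0), (13.1, 0.5), (40.0, 0.0004)]
scaled = [(value, to_integer_weight(w)) for value, w in samples]
print(scaled)  # [(12.5, 1000), (13.1, 500), (40.0, 1)]
```

The last sample shows the cost of this workaround: a weight of 0.0004 is inflated to the equivalent of 0.001, so the choice of `scale` trades precision against the size of the integer weights.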