[solidago] feat: Asymetric uncertainty #1781

Closed
lfaucon wants to merge 2 commits

@lfaucon lfaucon commented Sep 24, 2023

#1780


Description

Uncertainty of individual scores should be asymmetric.
This PR updates the computation of individual scores for the Continuous Bradley-Terry model in solidago to use the "high likelihood range" method.

To-do

  • Remove the symmetric uncertainty in the solidago CBT model
  • Adapt Tournesol's individual rating model to store the asymmetric uncertainties
  • Adapt the downstream handling of uncertainty to support asymmetric uncertainty of individual ratings:
    • usage of the uncertainty in global score aggregation
    • display of the uncertainty in the public dataset
    • usage of the uncertainty in scaling
  • Make sure the new feature does not overly change the current video rankings
  • Increment Solidago's version

Checklist

  • I added the related issue(s) id in the related issues section (if any)
    • if not, delete the related issues section
  • I described my changes and my decisions in the PR description
  • I read the development guidelines of the CONTRIBUTING.md
  • The tests pass and have been updated if relevant
  • The code quality checks pass

❤️ Thank you for your contribution!

negative_exponential_term = np.exp((normalized_r_ab - 1) * theta_ab)
return np.where(
    np.abs(theta_ab) < EPSILON,
    1 / 2,

Probably log of 1/2 instead

if f(b, *args) == 0:
    return b
if f(a, *args) * f(b, *args) > 0:
    raise ValueError("Function `f` should have opposite sign on bounds `a` and `b`")

This could be optimized by evaluating f(a) and f(b) only once each.
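
A minimal sketch of that optimization, assuming a plain bisection root finder (the solver actually used in this PR may differ). The name `find_root` and its `tol` and `max_iter` parameters are hypothetical. Each endpoint value is computed once and then reused for both the zero check and the sign check:

```python
def find_root(f, a, b, args=(), tol=1e-9, max_iter=200):
    """Bisection sketch (hypothetical helper, not the PR's solver)."""
    # Evaluate each bound exactly once and keep the result.
    f_a = f(a, *args)
    if f_a == 0:
        return a
    f_b = f(b, *args)
    if f_b == 0:
        return b
    if f_a * f_b > 0:
        raise ValueError("Function `f` should have opposite sign on bounds `a` and `b`")

    for _ in range(max_iter):
        mid = (a + b) / 2
        f_mid = f(mid, *args)
        if f_mid == 0 or (b - a) / 2 < tol:
            return mid
        # Keep the half-interval whose endpoints still have opposite signs,
        # carrying the cached function value along with the bound.
        if f_a * f_mid < 0:
            b, f_b = mid, f_mid
        else:
            a, f_a = mid, f_mid
    return (a + b) / 2
```

Carrying `f_a` and `f_b` through the loop also avoids re-evaluating the bounds on every iteration, which matters if each call to `f` computes a likelihood over many comparisons.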

@@ -39,6 +53,64 @@ def Delta_theta(theta_ab):
).sum() ** (-0.5)


HIGH_LIKELIHOOD_RANGE_THRESHOLD = 1

@lenhoanglnh Can you confirm the idea of using a likelihood lower bound to compute the uncertainty interval, rather than a more standard 90% confidence interval?
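
For illustration, here is a sketch of what a likelihood-lower-bound interval could look like. It assumes the threshold is a drop in log-likelihood relative to its maximum and that the log-likelihood is unimodal. The function `high_likelihood_range`, its parameters, and the toy log-likelihood are hypothetical and do not reproduce the signature of `get_high_likelihood_range` in this PR:

```python
import math


def high_likelihood_range(log_likelihood, theta_hat, threshold=1.0,
                          search_width=10.0, tol=1e-9):
    """Return (lower, upper) where the log-likelihood stays within
    `threshold` of its value at `theta_hat` (assumed to be the maximizer)."""
    target = log_likelihood(theta_hat) - threshold

    def gap(theta):
        # Non-negative inside the high likelihood range, negative outside.
        return log_likelihood(theta) - target

    def bound(direction):
        inside = theta_hat
        outside = theta_hat + direction * search_width
        if gap(outside) >= 0:
            # The likelihood never drops enough within the search window:
            # clamp to the window edge (an arbitrary choice for this sketch).
            return outside
        # Bisection between a point inside the range and a point outside it.
        while abs(outside - inside) > tol:
            mid = (inside + outside) / 2
            if gap(mid) >= 0:
                inside = mid
            else:
                outside = mid
        return inside

    return bound(-1.0), bound(+1.0)


def toy_log_likelihood(theta):
    # Toy asymmetric log-likelihood with its maximum at theta = 0.
    return theta - math.exp(theta)


lower, upper = high_likelihood_range(toy_log_likelihood, theta_hat=0.0)
```

With this toy function the likelihood falls off more slowly on the left than on the right, so the lower margin `theta_hat - lower` comes out wider than the upper margin `upper - theta_hat`. A symmetric uncertainty cannot represent that.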

indices_b, _r_ab = coord_to_subset[idx_a]
indices_b, r_ab = coord_to_subset[idx_a]
lower_bound, upper_bound = get_high_likelihood_range(
    continuous_bradley_terry_log_likelihood,
@lfaucon lfaucon Sep 24, 2023


@lenhoanglnh When calculating the high likelihood range, should we include only the likelihood of observed comparisons, or also the prior/regularization (something like alpha * theta^2)?
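
To make the two options concrete, they could be written as follows. The notation is an assumption introduced here: $\ell(\theta)$ is the log-likelihood of the observed comparisons, $\tau$ is the threshold, $\alpha\theta^2$ stands in for the prior/regularization term, and each range is centered on the maximizer of its own objective.

```latex
% Option 1: likelihood of the observed comparisons only
\hat\theta_1 = \arg\max_\theta \ell(\theta), \qquad
R_1 = \{\theta : \ell(\theta) \ge \ell(\hat\theta_1) - \tau\}

% Option 2: likelihood plus a quadratic prior/regularization term
\hat\theta_2 = \arg\max_\theta \bigl[\ell(\theta) - \alpha\theta^2\bigr], \qquad
R_2 = \{\theta : \ell(\theta) - \alpha\theta^2 \ge \ell(\hat\theta_2) - \alpha\hat\theta_2^2 - \tau\}
```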

@GresilleSiffle GresilleSiffle added the Solidago Tournesol algorithms library label Sep 25, 2023
@lenhoanglnh lenhoanglnh mentioned this pull request Jan 13, 2024

lfaucon commented Jun 1, 2024

Replaced by #1973

@lfaucon lfaucon closed this Jun 1, 2024