A question about calculating precision@k #2
Comments
Do you mean if there's only one true positive code? Precision@k is generally defined as the fraction of the k highest-scored labels that are in the set of ground-truth labels (it is defined this way, at least, in our paper). So we always want the denominator to be k, even if there are fewer than k ground-truth labels.
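For reference, that definition corresponds to a sketch along these lines; the function name precision_at_k mirrors the one discussed in evaluation.py, but the actual code there may differ:

```python
import numpy as np

def precision_at_k(yhat_raw, y, k):
    # yhat_raw: (num_examples, num_labels) raw scores
    # y:        (num_examples, num_labels) binary ground-truth matrix
    # Take the k highest-scored labels per example and divide the number of
    # ground-truth labels among them by k -- the denominator is always k.
    topk = np.argsort(yhat_raw)[:, ::-1][:, :k]
    vals = []
    for i, tk in enumerate(topk):
        num_true_in_top_k = y[i, tk].sum()
        vals.append(num_true_in_top_k / float(k))
    return np.mean(vals)
```

Under this definition, a document whose top 5 contains a single ground-truth code scores 1/5, which is the case discussed in the original question below.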
Thanks for your reply.
Ah, I see what you mean now. Yes, you can define it that way. I'm not actually sure what the standard practice is in information retrieval, which is where I think this metric is most prevalent. I suspect the difference will not be major for this ICD coding task, as a trained model will usually predict more than 8 codes. For internal consistency I think I will update that comment, but keep the implementation as is for now.
The original implementation doesn't seem to be wrong. This article may help: https://medium.com/@m_n_malaeb/recall-and-precision-at-k-for-recommender-systems-618483226c54
Hello,
I have a question about the function precision_at_k in evaluation.py. I think the denominator should be the number of positive (1) predictions made among the top k predictions; however, in the code, the length of the top k is used. For example, if there is only one positive prediction in the top 5, the denominator should be 1, but in this case it would still be 5.
Here is my modification, roughly as in the sketch below. Could you take a look at it? Correct me if I am wrong.
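For illustration, one reading of this proposed change (following the convention in the article linked above) restricts both the numerator and the denominator to the labels actually predicted positive among the top k. The names yhat_raw, y, and thresh here are assumed for the sketch and may not match evaluation.py:

```python
import numpy as np

def precision_at_k_proposed(yhat_raw, y, k, thresh=0.5):
    # yhat_raw: (num_examples, num_labels) raw scores
    # y:        (num_examples, num_labels) binary ground-truth matrix
    # Among the k highest-scored labels, keep only those predicted positive
    # (score >= thresh) and use their count as the denominator.
    topk = np.argsort(yhat_raw)[:, ::-1][:, :k]
    vals = []
    for i, tk in enumerate(topk):
        pred_pos = tk[yhat_raw[i, tk] >= thresh]
        if len(pred_pos) > 0:
            vals.append(y[i, pred_pos].sum() / float(len(pred_pos)))
    return np.mean(vals) if vals else 0.0
```

On the example from the question (one predicted code in the top 5, and that code is correct), this would give 1/1 for the document instead of 1/5.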