Evaluation only considering single-label? #9

Open
nizhf opened this issue Jun 15, 2022 · 4 comments

Comments

@nizhf

nizhf commented Jun 15, 2022

Hi.
I have a question about using the vidor_eval.ipynb script to generate mAP. Does the script only support the single-label case? If a ground-truth human-object pair has multiple interactions, for example gt <human1, (watch, next to), obj2>, only one of them, say <human1, watch, obj2>, can be matched to a prediction. The gt pair <human1, obj2> is then added to gt_bbox_pair_matched and cannot be matched to any other prediction.
Thank you

@coldmanck
Owner

Hi @nizhf, thanks for your interest in our work! I think our vidor_eval.ipynb does support multi-label evaluation. We loop through all the predicted HOI triplets, and when there's a match, we append the specific triplet_class to gt_bbox_pair_matched. Note that the predicted HOI triplets may contain more than one triplet with the same subject and object.

@nizhf
Author

nizhf commented Jun 17, 2022

I think what you append to gt_bbox_pair_matched is the index of the gt pair. In gt_bbox_pair_matched.add(max_gt_id), max_gt_id is set by max_gt_id = k, and k comes from the line for k, gt_bbox_pair_id in enumerate(gt_bbox_pair_ids). So it is the index of a gt_bbox_pair_id, not a triplet.

@coldmanck
Owner

Let me clarify: the idea of evaluation is:

for each predicted HOI triplet
   for each ground truth HOI triplet (k)
      if there's a match
         set is_match to True
         record k or update with the maximum overlapping object-pair boxes
   if there's a match
      add the matched, predicted HOI triplet into the true positive set
   else
      add into the false positive set

As the ground truth HOI triplets are multi-label, the predictions can also match each of them.
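
The loop above could be sketched in Python roughly as follows. This is only an illustration and not the code in vidor_eval.ipynb: the triplet layout (subject_box, action, object_box), the helper names iou and match_predictions, the 0.5 IoU threshold, and scoring a pair by the smaller of its two IoUs are all assumptions made for this sketch.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_predictions(preds, gts, iou_thr=0.5):
    """Split predicted triplets into true positives and false positives.

    preds and gts are lists of (subject_box, action, object_box) tuples
    (an assumed layout). Each ground-truth triplet is a separate entry,
    so a multi-label pair appears once per action.
    """
    tp, fp = [], []
    for pred in preds:
        p_sub, p_act, p_obj = pred
        is_match, max_ov, max_gt_id = False, 0.0, None
        for k, (g_sub, g_act, g_obj) in enumerate(gts):
            if p_act != g_act:
                continue
            # Assumed pair score: the weaker of the two box overlaps.
            ov = min(iou(p_sub, g_sub), iou(p_obj, g_obj))
            if ov >= iou_thr and ov > max_ov:
                # Record k, or update it with the best-overlapping pair.
                is_match, max_ov, max_gt_id = True, ov, k
        if is_match:
            tp.append((pred, max_gt_id))
        else:
            fp.append(pred)
    return tp, fp
```

With one entry per ground-truth triplet, two predictions on the same box pair with different actions match different values of k. The sketch deliberately leaves out any bookkeeping of already-matched ground truths.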

@nizhf
Author

nizhf commented Jun 18, 2022

Thank you for the detailed clarification.

What confuses me is the step "for each ground truth HOI triplet (k)". In the evaluation script, it corresponds to for k, gt_bbox_pair_id in enumerate(gt_bbox_pair_ids), where gt_bbox_pair_ids = result['gt_bbox_pair_ids'].

I checked the result JSON file, and gt_bbox_pair_ids looks like, for example, 'gt_bbox_pair_ids': [[0, 1], [1, 0]]. If I understand correctly, these are indices into gt_boxes. So maybe the loop is really only "for each ground-truth pair (k)"? The ground-truth HOI triplet is obtained by gt_rel_cls = result['gt_action_labels'][k][j]. If there is a match, the ground-truth pair k is added to gt_bbox_pair_matched. This pair then cannot be matched to any other predicted triplet.
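
To make the layout I am describing concrete, here is a small sketch of how I read the result entry. The keys gt_boxes, gt_bbox_pair_ids and gt_action_labels are the ones from the JSON file; the box coordinates, the ACTIONS vocabulary, the label values and the helper gt_triplets are made up for illustration.

```python
# Hypothetical action vocabulary; the real one comes from the dataset.
ACTIONS = ["watch", "next_to", "hold"]

result = {
    # Made-up coordinates, one box per entity.
    "gt_boxes": [[10, 10, 50, 120], [60, 40, 100, 90]],
    # Each entry is [subject_index, object_index] into gt_boxes.
    "gt_bbox_pair_ids": [[0, 1], [1, 0]],
    # One multi-hot row per pair: gt_action_labels[k][j] == 1.0 means
    # pair k carries action j. Pair 0 has two labels here.
    "gt_action_labels": [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0]],
}


def gt_triplets(result, actions):
    """Expand the pair-level labels into (k, subject, action, object) triplets."""
    triplets = []
    for k, (sub_id, obj_id) in enumerate(result["gt_bbox_pair_ids"]):
        for j, name in enumerate(actions):
            if result["gt_action_labels"][k][j] == 1.0:
                triplets.append((k, sub_id, name, obj_id))
    return triplets


for triplet in gt_triplets(result, ACTIONS):
    print(triplet)
```

Both triplets of pair 0 share the same k, which is why tracking only k cannot tell them apart.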

Just a detailed example:
Assume we have two predicted HOIs: <human1, watch, obj2> and <human1, next_to, obj2>. The gt_bbox_pair_ids is [[0, 1]]. The gt_action_labels has 1.0 for watch and next_to.
We first process prediction <human1, watch, obj2>. We have k=0 and j=index_of_watch. Then we have result['gt_action_labels'][k][j]=1.0. This is a match, so we add <human1, watch, obj2> to tp and k=0 to gt_bbox_pair_matched.
Then we process prediction <human1, next_to, obj2>. We have k=0 and j=index_of_next_to. We also have result['gt_action_labels'][k][j]=1.0. This should be a match, but the check finds that k=0 is already in gt_bbox_pair_matched, so <human1, next_to, obj2> is falsely added to fp.
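
The same example as a small runnable sketch. It is a simplification of how I read the script, not the script itself: box overlap is skipped (each prediction is assumed to already overlap gt pair k), and the function count_tp_fp and its key_by_action switch are hypothetical names used only here.

```python
ACTIONS = ["watch", "next_to"]  # hypothetical action vocabulary

gt_bbox_pair_ids = [[0, 1]]      # a single gt pair <human1, obj2>
gt_action_labels = [[1.0, 1.0]]  # that pair has both watch and next_to

# Each prediction is (k, j): the gt pair its boxes overlap and its action index.
predictions = [
    (0, ACTIONS.index("watch")),
    (0, ACTIONS.index("next_to")),
]


def count_tp_fp(predictions, gt_action_labels, key_by_action):
    """Count true and false positives for already box-matched predictions.

    key_by_action=False records only the pair index k as matched, which is
    my reading of gt_bbox_pair_matched.add(max_gt_id).
    key_by_action=True records (k, j), so each label of a pair can be
    matched once.
    """
    matched = set()
    tp = fp = 0
    for k, j in predictions:
        key = (k, j) if key_by_action else k
        if gt_action_labels[k][j] == 1.0 and key not in matched:
            matched.add(key)
            tp += 1
        else:
            fp += 1
    return tp, fp


print(count_tp_fp(predictions, gt_action_labels, key_by_action=False))  # (1, 1)
print(count_tp_fp(predictions, gt_action_labels, key_by_action=True))   # (2, 0)
```

Keying the matched set by the pair index alone turns the second correct prediction into a false positive, while keying by (pair index, action index) counts both as true positives.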

I hope I have described my understanding of the vidor_eval.ipynb script clearly.
