Hello!

I would like to ask a question regarding the image quantization. I don't really understand why you divide the bounding box coordinates by `max_image_size` (= 512) instead of by the `patch_image_size`.

Assuming a bounding box `[x1, y1, x2, y2]` in an image with width `w` and height `h`, it seems to me that each coordinate should be quantized as, e.g., `x1 / w * (num_bins - 1)`. For example, for a bounding box `[120, 200, 150, 220]` with `w = 600` and `h = 800`, the quantized `x1` would be `120 / 600 * (num_bins - 1)`.

Could you also explain the choice behind the value of `max_image_size`?

Thanks :)
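The quantization scheme proposed in the question can be sketched as follows. This is a minimal illustration of normalizing by the image's own width and height, not the repository's actual implementation; the function name `quantize_box` is made up for this example.

```python
def quantize_box(box, w, h, num_bins=1000):
    """Map a pixel-space box [x1, y1, x2, y2] to discrete bin indices
    by normalizing x-coordinates by the image width and y-coordinates
    by the image height (the scheme proposed in the question)."""
    x1, y1, x2, y2 = box
    return [
        round(x1 / w * (num_bins - 1)),
        round(y1 / h * (num_bins - 1)),
        round(x2 / w * (num_bins - 1)),
        round(y2 / h * (num_bins - 1)),
    ]

# The example from the question: box [120, 200, 150, 220] in a 600x800 image.
print(quantize_box([120, 200, 150, 220], w=600, h=800))  # [200, 250, 250, 275]
```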
Maybe it's just a coordinate normalization operation applied consistently in both training and prediction.
However, when `bin2coord` is used, the decoded coordinates can fall outside the image, since `task.cfg.max_image_size >= task.cfg.patch_image_size`.
The code in question: `OFA/utils/transforms.py`, lines 240 to 243 at commit `a36b91c`.