Since f32 has 24 significand bits (23 explicit), the largest finite non-integer in f32 is 2^23 − 0.5. As a result, non-integers with absolute value in (2^15, 2^23) are returned as is instead of being rounded.
Either of the following would solve this issue:
- Work on i32 instead of i16
- Add a second iteration that handles magnitudes > 2^15
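To make the numbers concrete, here is a minimal scalar sketch in Python (not SFPU code). It checks that 2^23 − 0.5 is exactly representable in f32 and that the next f32 above it is the integer 2^23. The function `floor_i16_only` is a hypothetical model of the reported behavior, written only from the description in this issue; it is not the actual kernel.

```python
import math
import struct

def to_f32(x: float) -> float:
    """Round a Python float (f64) to the nearest f32 and widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def next_f32_up(x: float) -> float:
    """Return the next f32 above a positive, finite f32 value."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits + 1))[0]

def floor_i16_only(x: float) -> float:
    """Hypothetical model of the issue: floor only inside the i16 range."""
    if -2**15 <= x < 2**15:
        return float(math.floor(x))
    return x  # values outside [-2^15, 2^15) pass through unchanged

largest_non_integer = to_f32(2**23 - 0.5)

print(largest_non_integer)               # representable with a 0.5 fraction
print(next_f32_up(largest_non_integer))  # the next f32 is the integer 2^23
print(floor_i16_only(100.5))             # inside the i16 range: rounded
print(floor_i16_only(40000.5))           # in (2^15, 2^23): not rounded
```

Under this model, any non-integer whose magnitude lies between 2^15 and 2^23 comes back with its fractional part intact, which is the gap the two proposed fixes are meant to close.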
jdh8 changed the title from "Values in (2<sup>15</sup>, 2<sup>23</sup>) are not rounded by rounding ops" to "Values in (2^15, 2^23) are not rounded by rounding ops" on Oct 7, 2024
@mouliraj-mcw I do not believe support for float-to-int32 is going to become available as a single call similar to float_to_int16, because the underlying instruction does not support int32 as output to the conversion. We do have a manually implemented conversion that I made for the Typecast OP found here: https://github.com/tenstorrent/tt-llk-wh-b0/blob/eadfdb466dcc738803969f9eda76aacbb33b5ad6/common/inc/sfpu/ckernel_sfpu_typecast.h#L62-L97
I wouldn't recommend just dropping in this kernel in place of float_to_int, because that would be very slow. It would be better to combine the two so that there isn't as much of a perf hit.
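One way to read "combine the two" is a dispatch that keeps the fast int16-based path for small magnitudes and falls back to the slower wide conversion only where it is needed. The sketch below is a scalar Python model of that idea, not SFPU code. All function names are hypothetical, and modeling the int16 conversion as truncation toward zero is an assumption, since the thread does not state the rounding mode of the underlying instruction.

```python
import math

I16_LIMIT = 2.0**15      # the fast path is only valid for |x| < 2^15
F32_INT_LIMIT = 2.0**23  # every f32 with |x| >= 2^23 is already an integer

def floor_fast_i16(x: float) -> float:
    """Stand-in for the fast path built on a float-to-int16 conversion.

    Assumption: the conversion truncates toward zero, so negative
    non-integers need a correction of -1.
    """
    i = int(x)
    return float(i - 1) if x < i else float(i)

def floor_slow_wide(x: float) -> float:
    """Stand-in for the slower, manually implemented wide conversion."""
    return float(math.floor(x))

def floor_combined(x: float) -> float:
    """Floor that uses the slow path only for 2^15 <= |x| < 2^23."""
    if math.isnan(x) or abs(x) >= F32_INT_LIMIT:
        return x  # NaN, infinities, and large values need no rounding
    if abs(x) < I16_LIMIT:
        return floor_fast_i16(x)
    return floor_slow_wide(x)

for value in (100.5, -100.5, 40000.5, -40000.5, 8388607.5, 2.0**24):
    print(value, floor_combined(value))
```

On vector hardware the branches would more likely be per-lane predication than real control flow, so the actual cost depends on how many lanes fall into the slow range. The sketch only illustrates the range split.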
Currently, rounding ops ignore values outside the range of i16, i.e. [−2^15, 2^15). See tt-metal/tt_metal/hw/ckernels/wormhole_b0/metal/llk_api/llk_sfpu/ckernel_sfpu_floor.h, lines 31 to 33 in 3d33e8d.