Yan's comment:
For binary ops we can't use the full float32 precision. The reason is this: both input tiles (from A and B) sit in local SRAM in fp32 format. The unpacker then puts them into the SrcA and SrcB registers, which only support the TF32 format, immediately losing 13 bits of mantissa. The operands are then placed in the DST register back in fp32, but the precision has already been lost. We do support direct SRAM-to-DST unpacking at full precision, but only for one of the two unpackers, so this works for unary ops but not for binary ops.
Ref: comment, ticket
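To make the mantissa loss concrete, here is a minimal, standalone C++ sketch (plain host code, not tt-metal kernel code) that emulates the TF32 conversion by dropping the low 13 mantissa bits of an fp32 value. The hardware conversion may round rather than truncate, but the point is the same: fp32 has 23 mantissa bits, TF32 keeps only 10.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>

// Truncate a float32 value to TF32 precision by clearing the low 13 mantissa
// bits (bits [12:0]). Illustration only; real conversion may round to nearest.
static float to_tf32(float x) {
    uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    bits &= 0xFFFFE000u;
    std::memcpy(&x, &bits, sizeof(bits));
    return x;
}

int main() {
    float a = 1.0000001f;   // differs from 1.0f only in the low mantissa bits
    float b = 3.14159274f;  // pi rounded to fp32

    // Full-precision fp32 add vs. the same add after a TF32 round-trip,
    // mimicking the SRAM -> SrcA/SrcB -> DST path described above.
    float full  = a + b;
    float lossy = to_tf32(a) + to_tf32(b);

    std::printf("fp32 result:  %.9g\n", full);
    std::printf("tf32 result:  %.9g\n", lossy);
    std::printf("difference:   %.3g\n", full - lossy);
    return 0;
}
```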
New LLK for binary SFPU ops - rd/binary_sfpu_pow
Goal:
Incorporate full float32 precision into the current eltwise binary implementation without disturbing the existing implementation, i.e. add a separate compute kernel and program factory for fp32. The criterion for picking full float32 precision, for now, is that both inputs have float32 dtype (see the sketch after this list). #15483: Initial setup for binary sfpu ops #15557
Need to support pre- and post-activations on input and output
Need to support chained binary ops
Do we need a typecast on the output? I don't think so, since this kernel exists for the purpose of providing full float32 precision.
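As a rough illustration of the selection criterion above, here is a hedged sketch with hypothetical names (DataType, Tensor, and use_fp32_sfpu_binary are placeholders, not the actual ttnn/tt-metal API): the fp32 SFPU path is taken only when both inputs are float32; everything else falls back to the existing binary implementation. The comment outline at the end restates the per-tile flow from this issue (direct SRAM-to-DST unpack, pre/post-activations, chained binary ops), not a definitive kernel design.

```cpp
#include <cstdio>

// Hypothetical stand-ins for illustration only.
enum class DataType { FLOAT32, BFLOAT16 };

struct Tensor {
    DataType dtype;
    // ... shape, layout, buffer, etc.
};

// Selection rule from the goal above: take the full-precision SFPU binary path
// only when both inputs are float32; otherwise use the existing binary path.
bool use_fp32_sfpu_binary(const Tensor& a, const Tensor& b) {
    return a.dtype == DataType::FLOAT32 && b.dtype == DataType::FLOAT32;
}

// Conceptual per-tile flow inside the fp32 compute kernel (outline only; the
// real kernel would use the new binary SFPU LLK calls referenced in this issue):
//   1. Unpack the A and B tiles straight from SRAM into DST at full fp32
//      precision, bypassing the TF32-only SrcA/SrcB registers.
//   2. Apply any pre-activations to the operands in DST.
//   3. Run the binary SFPU op (and any chained binary ops) on the DST slots.
//   4. Apply any post-activation to the result.
//   5. Pack the result tile to the output circular buffer.

int main() {
    Tensor a{DataType::FLOAT32}, b{DataType::FLOAT32}, c{DataType::BFLOAT16};
    std::printf("f32 + f32  -> SFPU path: %d\n", use_fp32_sfpu_binary(a, b));
    std::printf("f32 + bf16 -> SFPU path: %d\n", use_fp32_sfpu_binary(a, c));
    return 0;
}
```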