SFPU shift operator issue when using sfpi #15514

Open
rdjogoTT opened this issue Nov 27, 2024 · 0 comments
Labels
bug Something isn't working LLK

Describe the bug
The following SFPU LLK, implemented with sfpi, produces incorrect results:

template <bool APPROXIMATION_MODE, int ITERATIONS = 8>
inline void calculate_left_shift(const uint shift_amt) {
#pragma GCC unroll 0
    for (int d = 0; d < ITERATIONS; d++) {
        vInt val = dst_reg[0];
        val = val << shift_amt;
        dst_reg[0] = val;

        dst_reg++;
    }
}

This version, which issues the SFPU instructions directly, works:

template <bool APPROXIMATION_MODE, int ITERATIONS = 8>
inline void calculate_left_shift(const uint shift_amt) {
#pragma GCC unroll 0
    for (int d = 0; d < ITERATIONS; d++) {
        TTI_SFPLOAD(0,4,3,0);
        TT_SFPSHFT(shift_amt,0,0,1);
        TTI_SFPSTORE(0,4,3,0);
        dst_reg++;
    }
}

Both versions load 32-bit values from Dest, left shift them by shift_amt, and store the result back. The sfpi version produces incorrect results when the shift causes bit 31 (the MSB) to change from 0 to 1 or from 1 to 0.
See #13415 for more details.
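To make the failing condition concrete, here is a minimal pure-Python sketch of a wrapping 32-bit left shift and a check for whether the shift changes bit 31. The helper names `shl32` and `msb_changes` are hypothetical and exist only for this illustration; they are not part of sfpi or ttnn.

```python
def shl32(x: int, shift: int) -> int:
    """Left shift a signed 32-bit value, wrapping around like int32 hardware."""
    u = (x << shift) & 0xFFFFFFFF                     # keep the low 32 bits
    return u - (1 << 32) if u & 0x80000000 else u     # reinterpret as signed


def msb_changes(x: int, shift: int) -> bool:
    """True if bit 31 differs before and after shifting x left by `shift`."""
    before = (x >> 31) & 1                # current MSB (sign bit)
    after = (x >> (31 - shift)) & 1       # bit that the shift moves into position 31
    return before != after


# 0x40000000 has bit 30 set, so a shift by 1 turns the MSB from 0 to 1.
print(shl32(0x40000000, 1), msb_changes(0x40000000, 1))
# -1 has every bit set, so the MSB stays 1 after the shift.
print(shl32(-1, 1), msb_changes(-1, 1))
```

Inputs for which `msb_changes` returns True are the ones the sfpi version is reported to get wrong.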

To Reproduce
Use this test:

from loguru import logger
import random
import pytest
import torch
import ttnn

from tests.ttnn.utils_for_testing import assert_with_pcc
from tests.ttnn.python_api_testing.sweep_tests import ttnn_ops


def run_bitwise_left_shift_tests(input_shape, dtype, dlayout, in_mem_config, output_mem_config, data_seed, shift_bits, device):
    torch.manual_seed(data_seed)

    x = torch.randint(-2147483647, 2147483648, input_shape[0], dtype=torch.int32)
    
    # Bit that the shift moves into position 31 (the MSB).
    changebit = 31 - shift_bits
    excludebit_mask = torch.bitwise_not(torch.tensor(2**changebit).to(torch.int32))
    includebit_mask = torch.tensor(2**changebit).to(torch.int32)

    # Force the MSB to flip: clear that bit for negative inputs, set it for non-negative ones.
    x = torch.where(x < 0, torch.bitwise_and(x, excludebit_mask), torch.bitwise_or(x, includebit_mask))
    # Uncomment the line below and comment out the line above for the unit test to pass
    #x = torch.where(x < 0, torch.bitwise_or(x, includebit_mask), torch.bitwise_and(x, excludebit_mask))
    
    try:
        # get ref result
        ref_value = torch.bitwise_left_shift(x, shift_bits)
        
        tt_x = ttnn_ops.setup_ttnn_tensor(x, device, dlayout[0], in_mem_config[0], dtype[0])
        
        tt_result = ttnn.bitwise_left_shift(tt_x, shift_bits)
        tt_result = ttnn_ops.ttnn_tensor_to_torch(tt_result, output_mem_config[0])

    except Exception:
        logger.warning("Operation execution crashed")
        raise

    assert len(tt_result.shape) == len(ref_value.shape)
    assert tt_result.shape == ref_value.shape
    assert_with_pcc(ref_value, tt_result, 0.99)


test_sweep_args = [
    (
        [(32, 96)],
        [ttnn.int32],
        [ttnn.TILE_LAYOUT],
        [ttnn.DRAM_MEMORY_CONFIG],
        [ttnn.L1_MEMORY_CONFIG],
        17799073,
        1,
    ),
    (
        [(3, 160, 224)],
        [ttnn.int32],
        [ttnn.TILE_LAYOUT],
        [ttnn.L1_MEMORY_CONFIG],
        [ttnn.DRAM_MEMORY_CONFIG],
        3121221,
        11,
    ),
    (
        [(6, 6, 224, 256)],
        [ttnn.int32],
        [ttnn.TILE_LAYOUT],
        [ttnn.DRAM_MEMORY_CONFIG],
        [ttnn.DRAM_MEMORY_CONFIG],
        10286194,
        23,
    ),
    (
        [(6, 10, 256, 256)],
        [ttnn.int32],
        [ttnn.TILE_LAYOUT],
        [ttnn.L1_MEMORY_CONFIG],
        [ttnn.L1_MEMORY_CONFIG],
        10286194,
        30,
    ),
]

@pytest.mark.parametrize(
    "input_shape, dtype, dlayout, in_mem_config, output_mem_config, data_seed, shift_bits",
    (test_sweep_args),
)
def test_bitwise_left_shift(input_shape, dtype, dlayout, in_mem_config, output_mem_config, data_seed, shift_bits, device):
    run_bitwise_left_shift_tests(input_shape, dtype, dlayout, in_mem_config, output_mem_config, data_seed, shift_bits, device)
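The mask construction in the test can be checked without a device. The sketch below mirrors it on plain Python integers, under the assumption that the reference result is a wrapping 32-bit shift. `force_msb_flip` and `shl32` are hypothetical names for this illustration, not ttnn or torch APIs.

```python
def shl32(x: int, shift: int) -> int:
    """Left shift a signed 32-bit value with wraparound."""
    u = (x << shift) & 0xFFFFFFFF
    return u - (1 << 32) if u & 0x80000000 else u


def force_msb_flip(x: int, shift_bits: int) -> int:
    """Mirror the test's masking: make the bit that lands in position 31 differ from the sign bit."""
    bit = 1 << (31 - shift_bits)
    if x < 0:
        return x & ~bit   # sign bit is 1, so clear the incoming bit
    return x | bit        # sign bit is 0, so set the incoming bit


# Shift amounts taken from test_sweep_args; sample values are arbitrary.
for shift_bits in (1, 11, 23, 30):
    for sample in (0, 5, 123456789, -1, -987654321, -2147483647):
        x = force_msb_flip(sample, shift_bits)
        # Masking never touches bit 31, and the shifted result always has the opposite sign.
        assert (x < 0) == (sample < 0)
        assert (shl32(x, shift_bits) < 0) != (x < 0)
```

Every masked input therefore exercises the case described above, which is why swapping the two masks (the commented-out line in the test) lets the test pass.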

Expected behavior
The sfpi version of the LLK should pass the test as well.
