Fix [group] attn matmul to not throw circular buffer size exception in debug build #15458

bbradelTT · 2024-11-26T14:30:44Z

7a15bc9#diff-1838d88a8afd70b0cb4ec91cfa4f7af1426019d7e1feaefbdbe27709e9f387aaR466 has added a check in tt_metal/impl/buffers/circular_buffer_types.cpp that fails for debug builds when running an attn matmul test:

tests/tt_eager/python_api_testing/unit_testing/misc/test_attn_matmul.py::test_group_attn_matmul_with_program_cache[True-5-in0_dtype0-in1_dtype0-output_dtype0-True] ...
Cannot set to globally allocated buffer. Circular buffer size 6144 B exceeds allocated L1 buffer bank size of 4096 B

This needs to be fixed.

Repro steps:

./build_metal.sh  -b Debug
pytest tests/tt_eager/python_api_testing/unit_testing/misc/test_attn_matmul.py

Expected:

Everything should pass

Actual:

Exception is thrown

Note: was on BH, with skips removed, but expect failures for other other arch as well.

The text was updated successfully, but these errors were encountered:

bbradelTT · 2024-11-26T16:18:57Z

Details about how to fix this issue from @tt-aho

This is the commit that fixes 2 ops: e620b63
One is concat, where the shard size of the first input tensor was used to size the cbs of all the other inputs, even when they could be bigger/smaller.
Other was ssm, it was a RM layout shard, but was being sized to what it would be in tiles.
Could be different reasons for why the total size set is incorrect, would be an op by op basis.
Neither of these actually result in l1 corruption I think since they’re used for reading input only, but we want the check to always be enabled to prevent any issues since there are no guarantees on how the user might use the cb.

bbradelTT added bug Something isn't working op_cat: mm P1 labels Nov 26, 2024

bbradelTT self-assigned this Nov 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix [group] attn matmul to not throw circular buffer size exception in debug build #15458

Fix [group] attn matmul to not throw circular buffer size exception in debug build #15458

bbradelTT commented Nov 26, 2024

bbradelTT commented Nov 26, 2024

Fix [group] attn matmul to not throw circular buffer size exception in debug build #15458

Fix [group] attn matmul to not throw circular buffer size exception in debug build #15458

Comments

bbradelTT commented Nov 26, 2024

bbradelTT commented Nov 26, 2024