Need support for ttnn.max_pool2d to accept block and width sharded input. #12810
Comments
fyi @saichandax
@mywoodstock is there a plan to support this as part of the yolov4 optimization efforts? cc: @mbahnasTT
Yes, the PR is nearly ready to be merged.
This is now in
Thanks for the update @mywoodstock! Great news! We will test this on yolov4, and once confirmed we can close this issue.
@mywoodstock @dvartaniansTT, I am able to pass block-sharded input to maxpool, and it executes without any issue. However, the PCC of the maxpool output is very low (~0.055). I have created a separate issue, #14206, for it.
@mywoodstock is this on your radar?
@dvartaniansTT Yes, it's being worked on: #14249
@dvartaniansTT, yes, we are getting almost 0 PCC. The bug is tracked in issue #14249, as Abhinav mentioned.
Describe the bug
ttnn.max_pool2d supports only height_sharded input tensors. Support is needed for block_sharded and width_sharded inputs.
To Reproduce
Steps to reproduce the behavior:
pytest tests/ttnn/integration_tests/yolov4/test_ttnn_neck.py
Expected behavior
ttnn.max_pool2d should accept block_sharded and width_sharded input layouts.
Screenshots
Please complete the following environment information:
Additional context
The input shape we pass to maxpool is 1,10,10,512 [NHWC]. Since the channel count is large relative to the spatial size, the op should run with width or block sharding to improve performance.
Current values when we use height sharding:
Attributes:
{'memory_config_':'MemoryConfig(memory_layout=TensorMemoryLayout::HEIGHT_SHARDED;buffer_type=BufferType::L1;shard_spec=ShardSpec(grid={[(x=0;y=0) - (x=3;y=0)]};shape={25; 0};orientation=ShardOrientation::ROW_MAJOR;halo=0))'; 'output_dtype_': 'DataType::BFLOAT16'; 'sliding_window_config_': 'SlidingWindowConfig(batch_size=1; input_hw=(10;10); window_hw=(5;5); stride_hw=(1;1); pad_hw=(2;2); dilation_hw=(1;1); num_cores_nhw=4; core_range_set_={[(x=0;y=0) - (x=3;y=0)]})'}
Core_count: 4
Kernel duration: 1077197 ns
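To illustrate why width or block sharding is attractive for this shape, here is a rough sketch of the per-core shard shapes for the 1,10,10,512 input under each layout. It is plain Python arithmetic, not ttnn API: `shard_shape` is a hypothetical helper, the 2x2 block grid is an assumed example, and the sketch ignores any tile alignment or padding the real op may apply. It treats the tensor as a flattened [N*H*W, C] matrix, which matches the shard height of 25 over 4 cores reported above.

```python
import math


def shard_shape(n, h, w, c, layout, grid):
    """Per-core (rows, cols) for a tensor flattened to [N*H*W, C].

    Hypothetical helper for illustration only; not part of ttnn.
    grid is (grid_rows, grid_cols). Assumes block sharding splits
    rows across grid rows and channels across grid columns.
    """
    rows = n * h * w
    grid_rows, grid_cols = grid
    num_cores = grid_rows * grid_cols
    if layout == "height":
        # Each core gets a slice of rows and all channels.
        return (math.ceil(rows / num_cores), c)
    if layout == "width":
        # Each core gets all rows and a slice of channels.
        return (rows, math.ceil(c / num_cores))
    if layout == "block":
        # Rows and channels are both split across the 2D grid.
        return (math.ceil(rows / grid_rows), math.ceil(c / grid_cols))
    raise ValueError(f"unknown layout: {layout}")


# Shape from this issue: N=1, H=10, W=10, C=512.
height = shard_shape(1, 10, 10, 512, "height", (1, 4))  # (25, 512)
width = shard_shape(1, 10, 10, 512, "width", (1, 4))    # (100, 128)
block = shard_shape(1, 10, 10, 512, "block", (2, 2))    # (50, 256)
print(height, width, block)
```

With height sharding, the 100 rows cap how many cores can be used, and every core still carries all 512 channels. Width and block sharding split the channel dimension instead, which is where most of the data is for this shape.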