Releases: tenstorrent/tt-metal
v0.53.1-rc1
Note
If you are installing from a release, please refer to the README, INSTALLATION instructions, and any other documentation packaged with the release, not on the main branch. There may be differences between the latest main and the previous release.
The changelog will now follow, showing the changes from last release.
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/12022082193
📦 Uncategorized
- [CCL] Add negative dim support
- PR: #15305
- #12151: Replace avg_pool2d with global_avg_pool2d
- PR: #14330
- Update Qwen README.md to remove Llama references
- PR: #15409
- Optimized Llama 3.x perf with sharded residual
- PR: #15142
- #15247: Add unit test to show segfault with sharded config problem
- PR: #15249
- Fix num cores for dram sharded MM
- PR: #15373
- #0: Dispatch RTAs early in some cases
- PR: #15391
- #0: New RiscV architecture extension attributes
- PR: #15403
- #15361: Conv2d width sharded fails with tilized input
- PR: #15369
- #6659: remove dead code
- PR: #15427
- CB Size Validation Fix Rollout
- PR: #15394
- #12979: Merge erisc data & bss sections
- PR: #15267
- Fold batches into channels and use grouped convolutions in UNet Shallow
- PR: #14437
- [TT-Train] Added Yaml Configs support
- PR: #15352
- #7493: Accidentally added two tests that should have been deleted durin…
- PR: #15431
- #0: Add InsertBraces: true to .clang-format
- PR: #15438
v0.53.0-rc51
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/12016640880
- no changes
v0.53.0-rc50
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/12001584495
- no changes
v0.53.0
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/12016702477
📦 Uncategorized
- #14773: Set default to true when getting active ethernet cores
- PR: #14776
- #11795: Update test_pgm_dispatch and sweep
- PR: #14806
- #14880: Ternary composite op clean up
- PR: #14888
- #14928: Ternary backward clean up
- PR: #14931
- #14930: Complex backward op clean up
- PR: #14932
- #0: Update Mixtral target
- PR: #14947
- #14665: add new moreh_clip_grad_norm and test in ttnn
- PR: #14667
- #14730: Support unequal ranked inputs for eltwise binary
- PR: #14803
- Fix double deallocate in llama3 attention
- PR: #14951
- #14862: fp32 support in unary
- PR: #14899
- Angle op fix
- PR: #14129
- Fix a non-c-typedef-for-linkage error
- PR: #14857
- Add experimental fused qk ROPE
- PR: #14860
- [skip ci] #14001: Add an ALIAS target for consuming TTNN
- PR: #14965
- #0: Disable llama test_model from all-post-commit CI pipeline
- PR: #14968
- float32 tilize support
- PR: #14963
- Move NUM_CIRCULAR_BUFFERS to hw/inc
- PR: #14908
- Mchiou/14961 disable gs profiler ring buffer
- PR: #14970
- #14990: Address feedback in Programming Mesh of Devices Tech Report
- PR: #14991
- #11512: Add sweep test for ttnn.transformers.attention_softmax
- PR: #14655
- #14826: Remove misoptimizations from init code
- PR: #14861
- Use cluster desc yaml on BH and pass PCIe NoC endpoint to device
- PR: #14945
- Increase packer precision for bfp8 formats
- PR: #14822
- Revert "Angle op fix"
- PR: #15014
- use do_crt1 like other cores
- PR: #15009
- Fixed incorrect mem size for DebugIErisc
- PR: #15021
- Dvartanians/mbahnas/yolov4 web demo traced
- PR: #14946
- [skip ci] Update CODEOWNERS
- PR: #15023
- Added tt-train to the tt-metal monorepo
- PR: #14875
- #0: Disable Unity builds to detect bitrot
- PR: #15017
- Update Resnet50 perf on n150
- PR: #15026
- [skip ci] Add GEMM techreport to explain WH performance
- PR: #14585
- Alignment fix for BH in I2S and S2I
- PR: #14627
- [skip ci] Update README.md (MM FLOPS)
- PR: #15029
- FD refactor + sub device support
- PR: #14400
- #0: Provide script for installing system dependencies
- PR: #14405
- Build with unity in build-artifact.yaml, don't use unity in build.yaml
- PR: #15027
- Move NOC_0_X/Y behind Hal
- PR: #14920
- Add reduce_scatter t3k perf to pipeline
- PR: #14950
- add initial fabric erisc data mover (EDM) impl
- PR: #14923
- Revert "Alignment fix for BH in I2S and S2I"
- PR: #15049
- Revert "use do_crt1 like other cores"
- PR: #15052
- Revert "#14826: Remove misoptimizations from init code"
- PR: #15053
- Reduce dependence on ARCH_NAME in dev_msgs.h
- PR: #14943
- graph trace update - extract_circular_buffers_peak_size_per_core
- PR: #15047
- Llama-Vision: Enable tracing, refactor generation code
- PR: #15005
- [tt-train] Added mesh support
- PR: #15031
- #13655: Fix sub-device tests for BH
- PR: #15050
- LlamaVision: Move xattn cache generation to text prefill forward
- PR: #15056
- Revert "Reduce dependence on ARCH_NAME in dev_msgs.h"
- PR: #15067
- Alignment fix for BH on I2S and S2I (fix after revert)
- PR: #15055
- Update size.hpp
- PR: #15072
- [skip ci] #0: update yolov4 READMEs
- PR: #15063
- #15073: Fix use after move in ttnn run_operation
- PR: #15074
- Restructure supported params table for ternary ops
- PR: #14992
- Change tt_SiliconDevice to tt::umd::Cluster
- PR: #14752
- #14546: Fix moreh_adamw power_tile reduce performance
- PR: #14927
- Update documentation for LERP
- PR: #14989
- Restructure supported params table for ternary backward ops
- PR: #14994
- #14999: Update scatter golden function
- PR: #14998
- Update ternary and backward ternary pybind examples
- PR: #15036
- (REDO) Reduce dependence on ARCH_NAME in dev_msgs.h
- PR: #15068
- #13521: New sweep for pytorch tracing - ttnn.add
- PR: #14877
- #14590: Move sfpi off LFS
- PR: #14758
- Add multi-block support for matmul_2d
- PR: #15012
- Enable CCache for builds
- PR: #15082
- Enable clang-tidy check for use after move
- PR: #15105
- #14688 Scan the repo with clang-tidy as part of post-commit
- PR: #15071
- Add tunneler tests to ci
- PR: #15098
- #11795: Added tests that dispatch randomly-generated Programs and alternate between using trace and not using trace
- PR: #14906
- #14895: enable gp-rel in kernels
- PR: #15043
- Add entry to MM benchmark
- PR: #15117
- Fix use-after-move
- PR: #15112
- #15123 This check is clean
- PR: #15126
- #5174: Uplifting microbenchmarks to run on BH
- PR: #14191
- Relax Max Pool Requirement For C To Be Power Of 2
- PR: #15022
- Remove LFS from tt-train
- PR: #15139
- #14985: Update the examples for binary backward doc
- PR: #15083
- Add integer support for eltwise ops
- PR: #14953
- #0: Use logical shape in validation check
- PR: #15092
- [CCL] Compute device utilization percentage
- PR: #15084
- #15144: Increase trace region for yolo to fix
- PR: #15149
- #15079: make ProgramCache::is_enabled_ initialized out-of-line
- PR: #15080
- [skip ci] Update GEMM_FLOPS.md
- PR: #15100
- [skip ci] Update README.md
- PR: #15154
- [skip ci] Update GEMM_FLOPS.md
- PR: #15153
- [skip ci] Add files via upload
- PR: #15155
- #14474: Fix OoO issues for Llama3 tests on CI
- PR: #15111
- #0: Revert "#14730: Support unequal ranked inputs for eltwise binary (#14803)"
- PR: #15169
- Manually address an issue that local clang-tidy trips over
- PR: #15134
- Add Qwen2-7B model on N150
- PR: #15044
- Add support for new logical sharding + alignment in TensorLayout and tensor creation
- PR: #14771
- Support dst_full_sync_en flag in the WH compute kernel config pybind
- PR: #15007
- Revert "Add tunneler tests to ci"
- PR: #15177
- #14634: Remove usage of ARCH_NAME sp constants MEM_L1_SIZE
- PR: #14878
- tilize_op float32 access
- PR: #15115
- Add build config struct to HAL with base FW and local init addrs
- PR: #15150
- Update test_pgm_dispatch_script
- PR: #15184
- #15123 Fix performance-for-range-copy
- PR: #15128
- #0: Improve functional generality of ttnn.concat
- PR: #14306
- #15167: explicitly check for rank 4 in reduce special cases
- PR: #15181
- #14985: Update binary bw example, Use logical shape
- PR: #15091
- Disable test from running on t3k
- PR: #15194
- Update CODEOWNERS
- PR: #15195
- #14985: Update bias_gelu_bw example, implementation
- PR: #15086
- Update Lerp op
- PR: #15085
- Update Qwen expected compile time
- PR: #15198
- #14985: Update binary bw docs
- PR: #15163
- #13676: Add unit tests for io_bw, tan_bw, and lerp
- PR: #15002
- Move llama single-device demo tests to perf pipeline for dashboard support
- PR: #15199
- #14826: reorganize crt startup
- PR: #15094
- #13929: Update the input range for ldexp test
- PR: #14996
- #0: Remove duplicate single-card demo llama3 tests
- PR: #15218
- #0: Add eth dispatch to test_pgm_dispatch sweeps
- PR: #15217
- Add a Debug preset
- PR: #15222
- #13127: Add physical_shard_shape to ShardSpec attributes
- PR: #15185
- #13720 Make reshape-view 0 cost when possible
- PR: #15118
- Convert Hal into a Singleton
- PR: #15116
- Add support for arrays in CoreRangeSet
- PR: #14967
- #0: Fix typo causing spurious perf warnings for concat
- PR: #15229
- Update perf and latest features for llm models (Nov 18)
- PR: #15223
- #15145: Add support for multi-device tensors in grouped convolution weight preprocessing
- PR: #14914
- [tt-train] Fix tt-train in main branch
- PR: #15232
- #15144: Up timeout for mamba to an obscene number because we seem to take longer for some reason that I don't understand
- PR: #15244
- #14985: Update examples for binary backward ops
- PR: #15203
- #15228: Fix error message in BaseShape when index is out of bounds
- PR: #15236
- Allow Concrete Hal Translation Units to have unique include paths
- PR: #15189
- Update binary examples and supported params Set 2
- PR: #15211
- Add TT-NN roadmap and overview
- PR: #15253
- Add data formats to perf report
- PR: #15170
- Mo/14961 remove op alignment check
- PR: #15097
- Organize contributing docs in a subdir and add notes on clang-tidy
- PR: #15235
- #13675: update supported range for tan_bw
- PR: #15119
- Fix N150 llama3 demo CI tests to properly save perf information to superset
- PR: #15220
- #0: Add sweep for rw bw test
- PR: #15264
- [tt-train] Free graph during backward pass
- PR: #15241
- Update binary examples
- PR: #15041
- #14974: ttnn::empty Tensor creation API for MeshDevice
- PR: #15191
- #14427: increase erisc kernel code size
- PR: #15193
- Update remove-stale-branches.yaml
- PR: #15248
- Consolidate action back into this repo
- PR: #15240
- Fix usage of deleted branch
- PR: #15277
- #15234: disable sharded tests on Blackhole until fix is introduced
- PR: #15237
- #15140: Fix UAF error when MeshDevice.close_devices() not invoked
- PR: #15250
- Fix s2i op when shard grid is larger than actual used grid
- PR: #15113
- Add a padding-aware, interleaved, tiled transpose HC with a fused padding value parameter
- PR: #15224
- Update examples of unary backward
- PR: #15210
- Remove CMake variable UMD_HOME
- PR: #15271
- #0: Remove alignment requirements for Row Major tensors
- PR: #15245
- #15078: Update clamp_bw, clip_bw with min, max tensor
- PR: #15255
- Add forward support for...
v0.53.0-rc49
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/11982863618
- no changes
v0.53.0-rc48
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/11964798351
- no changes
v0.53.0-rc47
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/11944913275
- no changes
v0.53.0-rc46
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/11925023291
- no changes
v0.53.0-rc45
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/11904491875
- no changes
v0.53.0-rc44
This release was generated by the CI workflow https://github.com/tenstorrent/tt-metal/actions/runs/11884158440
- no changes