Releases
v0.46.0
📦 Uncategorized
user-triggerable C++ post-commit suite
#6406 : add missing position_ids/attention_mask to bert demo
#6282 : Add AdamW
#6315 : Fix dprint tests for T3000
FD2: prefetch stall, dispatch wait, linear read, delay and cleanup
#6609 : update wording in demo section of main README.md
#6364 : Autocomplete for pybinded types
Asarje/ttnn rn50 b20
FD2.0 Test - Fix l1 buffer not page-size aligned after FD-on-eth changes to L1_UNRESERVED_BASE
#6593 : Add resharding to Llama2 model when possible.
#6572 : Fix ttnn.repeat_interleave example in documentation
#5780 : Re-enable 100K enqueue program stress test on grayskull
Enable basic width sharding support in all-gather
Alex/metal/remove cb wait markers
#6657 : Use sysmem manager cq size instead of recomputing it each time…
#0: (MINOR) Add Grayskull purchase link and update version to 0.46.0
#5063 : add TopK API to metal
#5480 : FD2.0 Test - Fix test_prefetcher for dram paged read test (-t 3) on whb0
Fix logit low pcc
Backward op - Fixed ldexp, hardsigmoid and asin
#6598 : Fix softplus
Add support for BFP4_B tensor serialization
Eltwise mul for different batch size
#6575 : Split docs into separate Metalium and nn docs
#0: Add two separate links for documentation (tt-metalium/ttnn) on README
#6361 : Update ttnn repeat to use correct shapes when formatting output
#0: Sayonaraaaaaaa
FD2.0 Test fix test_prefetcher add_paged_dram_data_to_worker_data dropping start_page
#5785 : Watcher ringbuffer implementation
Add FD 2.0 WriteHost Command
#0: Put back frequent api tests because I'm an idiot
Optimize All Gather Interleaved Worker send/receive
#0: changing all #include common/* to #include tt_metal/common/*
#6676 : Fix issues related to unary lte and gte
#5817 : Fix lerp
#6589 : Fix for relu_bw
#6633 : Backward test update
#0: Skip logit, logiteps test
#0: Testing CI fix
#5480 : Update test_prefetcher to pass added hugepage args to dispatch kernel
Fix l1 acc, add whb0 optimized conv tests
Alignment fix for eth core kernels
Add data parallel (multi-chip) for Falcon7b (prefill/decode) model and corresponding tests
CQ_DISPATCH_CMD_WRITE_PAGED support in test_dispatcher and passing tests
#6647 : disable failing ci cpp tests and reenable cpp pipeline on CI
Backward test updates
Ngrujic/check bugs
Add Llama matmul perf tests to main
TTLIB: removing working tests from broken
#6443 : Update backward asin and addcdiv logic
#0: Fix output cb size calculation in reshard op for bfp8b
#0: use smart ptrs in allocator
Jvasilje docs 0322
DRAM based device profiler with Tracy support
#6553 : Fix ttnn.reshape(..) handling for bfloat16, TILE_LAYOUT
PR: #6746
Add Llama2 demo to tt-metal docs
Mistral-7B WH demo
Revert "#0: Put back frequent api tests because I'm an idiot"
FP32 support
#0: Add back frequent api tests to run.sh
Bteng/watcher ci3
Remove cpuprof
logo update
#6184 : sharded row major silu support.
#6443 : Update div_bw and backward ops test file
#6705 : Relax forcing of keyword argument in ttnn.open_device
Forward op tests
#6691 : Allow blocking of inner dim within a core for sharded in0 for 2d and 1d systolic matmuls
#6662 : Width Sharding support for eltwise OP
Stable diffusion python API level perf improvements
Add get_compute_kernel_config_args function
#0: Add fd-2/main triggers for pull_request and push for post-commit
#5480 : FD2 refactor for pre/dis patch variants
#6654 : Add perf tests for ttnn ResNet50
#5480 : Fix fd gtest unit test test_write_host
#0: Set myself as setup.py owner
#6780 : Add mistral7b to demos list in getting started
#4003 : re-added TTNN_ENABLE_LOGGING as runtime flag
#0: Fix semaphore address gen bug
#6769 : Disable program caching for failing Llama tests.
#5480 : Fix zero sized write transaction request that could occur in write_linear_host
#6077 : Fix unet pcc issues
Remove DstSync from llk api templates
FP32 Support
#6680 : Reverting move op change
#6443 : Update asinh and softsign backward
Backward tests with updated test modules
Ngrujic/check bugs 1
#6654 : Moving init for self.compute_kernel_config
#6805 : reproduce the bug with sharded split_query_key_value_and_split_heads
#6832 : Account for tile-padding in softmax for mistral 7B
Enable support for uint32 format to be consumed by SFPU (issue #4624)
#4252 : fix clang build error since std::log2 only constexpr in gcc
#4003 : log, debug and add pre- and post- hooks only for top-level ttnn ops
#6823 : Fix core count to not include dispatch cores in op report
#6197 : Align pages for interleaved <-> sharded.
METALIUM_GUIDE
Bteng/watcher post commit
#6443 : update backward test file for relational ops and concat op
Revert "Bteng/watcher post commit"
#6443 : Update backward ops
Backward test updates
#0: Add the dim 0 support repeat backward
Update hard related test ops
#6757 : Remove set_profiler_location
#6443 : Update backward ops erfinv elu hypot cos sin
#6861 : Enable Watcher/dprint tests on T3000 CI
Update Mistral perf regression for CI, until issue is resolved
Mamba/perf v1
#0: remove data movement ops related to silu in SD
#4003 : added proper fallback for getitem of ttnn.Tensor. Slice the tensor only on the tile boundary but set the shape based on whatever user provided
#4003 : added proper fallbacks for every op that falls back to torch
#6731 : add fix to LN width sharding
#5797 : add back sweep test for ln
Integrate GroupNorm V2 to SD model
METALIUM_GUIDE.md updates
[Falcon7b] Fix bugs with inference throughput measurements in demo
#0: shallow unet add perf_mode
#6154 : 2d matmul in0 height, in1 width sharding
#5249 : Various Falcon40b test and demo cleanup
#0: fix incremental build
#0: remove upsample spill to DRAM
[Llama2 Prefill] Model Functionality completed
Watcher alignment checking for PCIe/DRAM <-> L1
#6920 : fixed the error in whisper
Update METALIUM_GUIDE.md
#6644 : save l1 buffers to data base
Update usage.rst
#6804 : fix ttnn falcon7b demo regression + add to CI regressions
#6285 : Add backward support for floor round and div_no_nan
[skip ci] Update INSTALLING.md
#6873 : Add more test combinations to tt_lib sweeps add, add_unary, su…
Ngrujic/check bugs 3
#6882 : Updated Mistral-7b perf estimate
#6850 : Update install links in Sphinx docs to point directly to INSTALLING.md
#6619 : Fix per op profiler sum
#6644 : sync before calling print l1 buffers
Barsic/ttlib ops check
Barsic/ttlib params fix
#6962 : Move cd tt-metal earlier in the command list of INSTALLING.md
#6819 : Add support for CreateKernel absolute file paths
#6356 : Remove half-half grid logic for bmms
#4003 : added a flag to disable ttnn fallbacks. Don't throw an error w…
#0: Correct FW versions, tt-smi versions, and add note about tt-topology
#0: Capitalize tt to TT consistently for marketing
#0: Add myself as CODEOWNER for INSTALLING.md
#6644 : ttnn visualizer
#6847 : Allow disabling individual watcher features
#6889 : Support printing/padding/tilizing multi-device tensors
#4003 : removed ttnn.print_l1_buffers and consolidated all ttnn flags into a CONFIG class
#6217 : tt_lib async mode support (single chip tensors supported)
Reshard With Ranges
#4003 : updated buffer report to show the input/output tensors, buffer report of the previous operation and the buttons to go to the reports of previous/next operations. Load ttnn.CONFIG from a json file and override it using a single environment variable
#4003 : disable all tests in test_reports
New TTNN sweeps
#0: Put sfpi/ CODEOWNERS directive back on separate line because I'm an idiot and broke it
#6957 : Upload artifacts regardless of the device perf results
#5592 : Optimize Falcon 7b lm head matmul
#4003 : set delete_reports_on_start to false in the visualizer
#6969 : Split watcher noc alignment checks for reads vs writes
#7012 : Add support for sharding in Mamba model
#6217 : Async Mode Changes
#6886 : ttnn slicing bug for padded input
#7023 : Use bfloat8 weights in Mamba block MLPs
#6937 : Silu fix for multiple calls. Bug fix. Some name changes.
#6306 : Enable N150,N300 ttnn unit tests in CI Regressions; disable failing ones
Fix minor grammatical errors in METALIUM-GUIDE.md
#4003 : ttnn visualizer
#4003 : re-enabled test_reports
Sharded attention in stable diffusion.
#7041 : GS watcher error
#7041 : GS watcher error
#0: update path to watcher.log
Ngrujic/check bugs
build C++ tests in release mode
#6443 : Update backward ops
#6443 : Update backward ops
#6443 : Update backward ops
[skip ci] Update CODEOWNERS
frequent pipeline updates
Clean up Mamba unit tests and configs
#6873 : TTLIB modified sweeps GS and WH
#6443 : Update Unary Div backward
More aggressive deallocation, fewer spills to DRAM.
#4003 : use reports_path instead of tmp_path
#6838 : Add tracy timeout for op reports
#6873 : Add more sweep combinations for tt_lib bcast and sum operations
#0: Add link to programming guide (METALIUM_GUIDE.md) instead of the bad paragraph we had before
#5489 : re-enable profiler regression on N300
TTNN sweep tests - zeros, zeros like, nextafter, empty, attention softmax inplace