Model | Batch | Hardware | ttft (s) | t/s/u | Target t/s/u | Release |
---|---|---|---|---|---|---|
Falcon7B-decode | 32 | e150 | | 4.2 | 4.4 | |
Falcon7B | 32 | n150 | 0.07 | 16.7 | 26 | v0.52.0-rc2 |
Mistral-7B | 32 | n150 | | 9.9 | 25 | v0.51.0-rc28 |
Mamba-2.8B | 32 | n150 | 0.04 | 12.3 | 41 | v0.51.0-rc26 |
LLaMA-3.1-8B | 1 | n150 | | 8.3 | 23 | v0.51.0-rc28 |
Falcon7B (data parallel) | 256 | QuietBox | 0.11 | 13.4 | 26 | v0.51.0-rc36 |
LLaMA-2-70B (tensor parallel) | 32 | QuietBox | | 10.4 | 20 | v0.52.0-rc14 |
LLaMA-3.1-70B (tensor parallel) | 32 | QuietBox | | 10.4 | 20 | v0.52.0-rc14 |
Falcon40B (tensor parallel) | 32 | QuietBox | | 5.3 | 36 | v0.52.0-rc12 |
Mixtral7Bx8 (tensor parallel) | 32 | QuietBox | 0.19 | 13.6 | 33 | v0.51.0-rc33 |
Falcon7B (data parallel) | 1024 | Galaxy | 0.27 | 4.1 | 26 | v0.52.0-rc14 |
Notes:
- The reported LLM performance is for an input sequence length (number of rows filled in the KV cache) of 128 for all models except Mamba (which can accept any sequence length).
- The t/s/u reported is the decode throughput measured at the first token generated after prefill, i.e. 1 / inter-token latency.
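To make the relationship in the note above concrete, here is a short sketch (using the Falcon7B-on-n150 figures from the table) of how t/s/u relates to inter-token latency and how aggregate throughput scales with batch size; the helper names are illustrative, not part of any API:

```python
# t/s/u (tokens per second per user) is the reciprocal of inter-token latency;
# aggregate throughput is t/s/u multiplied by the batch size (number of users).

def tokens_per_second_per_user(inter_token_latency_s: float) -> float:
    """t/s/u = 1 / inter-token latency (in seconds)."""
    return 1.0 / inter_token_latency_s

def aggregate_tokens_per_second(tsu: float, batch: int) -> float:
    """Total tokens/s produced across all users in the batch."""
    return tsu * batch

# Falcon7B on n150 (from the table): 16.7 t/s/u at batch 32
latency = 1.0 / 16.7                              # ~0.06 s between tokens per user
total = aggregate_tokens_per_second(16.7, 32)     # ~534 tokens/s aggregate
```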
Model | Batch | Hardware | fps | Target fps | Release |
---|---|---|---|---|---|
ResNet-50 (224x224) | 20 | e150 | 5,100 | 10,000 | |
ResNet-50 (224x224) | 16 | n150 | 4,100 | 7,000 | |
ResNet-50 (224x224) (data parallel) | 128 | QuietBox | 32,250 | 56,000 | |
ResNet-50 (224x224) (data parallel) | 512 | Galaxy | 66,150 | 224,000 | |
ResNet-50 (224x224) (data parallel) | 1024 | Two Galaxies | 128,800 | 448,000 | |
ViT | 8 | e150 | 860 | 2,000 | |
Stable Diffusion 1.4 (512x512) | 1 | n150 | 0.167 | 0.3 | |
Unet (shallow) | 2 | n150 | 51 | 1,000 | |
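For the data-parallel ResNet-50 rows, dividing the aggregate fps by the device count gives a rough per-device figure that can be compared against the single-chip n150 row. The device counts below (8 chips per QuietBox, 32 per Galaxy) are assumptions for illustration:

```python
# Rough per-device throughput for the data-parallel ResNet-50 rows above.
# Device counts are assumptions: 8 chips per QuietBox, 32 per Galaxy.

def per_device_fps(total_fps: float, num_devices: int) -> float:
    """Aggregate fps split evenly across devices."""
    return total_fps / num_devices

quietbox = per_device_fps(32_250, 8)    # ~4,031 fps/device, close to the 4,100 n150 row
galaxy = per_device_fps(66_150, 32)     # ~2,067 fps/device
```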
Model | Batch | Hardware | sen/sec | Target sen/sec | Release |
---|---|---|---|---|---|
BERT-Large | 12 | e150 | 370 | 410 | |
BERT-Large | 8 | n150 | 270 | 400 | |
T5 small | | e150 | 140 | | |
Bloom | | e150 | 70 | | |
For the latest model updates and features, please see MODEL_UPDATES.md
- Advanced Performance Optimizations for Models (updated Sept 11th)
- Programming Mesh of Devices (updated Sept 9th)
TT-Metalium is our low-level programming model, enabling kernel development for Tenstorrent hardware.
Get started with simple kernels.
- Matrix Engine (updated Sept 6th)
- Tensor Layouts (updated Sept 6th)
- Data Formats (updated Sept 7th)
- Saturating DRAM Bandwidth (updated Sept 6th)
- Flash Attention on Wormhole (updated Sept 6th)
- CNNs on TT Architectures (updated Sept 6th)
- Ethernet and Multichip Basics (updated Sept 12th)
- Blackhole Bring-Up Programming Guide (updated Sept 12th)