Llama3.2-Vision: Add reference submodule and tests #14051
Conversation
@tt-rkim could you check out …
Looks good. Where are we running the …
I'd like to add the multimodal tests to CI after I experiment locally to see if I can tighten up the PCC bounds. I'll make a separate PR to put the tests in CI.
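For context, the PCC check compares the TT-NN output against the reference model output and fails the test when the Pearson correlation coefficient drops below a threshold; tightening the bounds means raising that threshold. A minimal, self-contained sketch of the idea (the helper names below are hypothetical, not the repo's actual comparison utilities):

```python
import torch

def pcc(expected: torch.Tensor, actual: torch.Tensor) -> float:
    """Pearson correlation coefficient between two tensors, flattened to 1D."""
    x = expected.flatten().double()
    y = actual.flatten().double()
    x, y = x - x.mean(), y - y.mean()
    return float((x * y).sum() / (x.norm() * y.norm()))

def assert_pcc(expected: torch.Tensor, actual: torch.Tensor, threshold: float = 0.99) -> None:
    """Fail the test when the outputs correlate below `threshold`.

    Tightening the PCC bound means raising this value, e.g. 0.99 -> 0.999.
    """
    value = pcc(expected, actual)
    assert value >= threshold, f"PCC {value:.5f} is below required {threshold}"
```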
Sounds good. So these reference files are not used anywhere in CI currently? Or are they used for other parts of 3.2?
@cglagovichTT Shouldn't this PR also include the tests? This is the PR that fully includes multimodal Llama, so it's good practice to have it all good to go (tests included). But I'm ok with either approach and will help with the testing.
Due to popular demand, I will add the tests to this PR :)
THANK YOU SIR
@@ -208,7 +208,7 @@ def __init__(self, mesh_device, instruct=False, dummy_weights=False, max_batch_s
         self.compute_kernel_config_sdpa = ttnn.WormholeComputeKernelConfig(
             math_fidelity=ttnn.MathFidelity.HiFi4,
             math_approx_mode=False,
-            fp32_dest_acc_en=False,
+            fp32_dest_acc_en=True,
@yieldthought is it going to be an issue in the non-vision attention modules to have fp32 acc here?
love it!
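For reference, the change under discussion only flips the fp32 destination-accumulation flag on the SDPA compute kernel config. A sketch of the resulting config, using just the fields visible in the diff above (assuming the ttnn constructor shown there):

```python
import ttnn

# SDPA compute kernel config after this change: intermediate results are
# accumulated in fp32 for better numerical accuracy, at some cost in
# performance/register pressure.
compute_kernel_config_sdpa = ttnn.WormholeComputeKernelConfig(
    math_fidelity=ttnn.MathFidelity.HiFi4,
    math_approx_mode=False,
    fp32_dest_acc_en=True,  # was False before this PR
)
```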
…enstorrent#14051)

* tenstorrent#13368: Move repeat interleave to xattn cache generation.
* #0: Clean up demo, enable arbitrary padding for multimodal text sequence
* tenstorrent#13368: Add llama_models Meta reference for Llama3.2 as a submodule
* tenstorrent#13368: Change reference imports to use new submodule
* tenstorrent#13368: Clean up comments after pushing repeat_interleave into xattn_cache generation.
* tenstorrent#13368: Clean up vision tests. Unify assertions and pcc checks. Fix LM head splitting on T3k.
* tenstorrent#13368: Fix LM head splits calculation
* tenstorrent#13368: For all vision tests, get model-specific parameters from model_args rather than fixtures. This generalizes tests for base and finetuned 11B models.
* tenstorrent#13368: Add vision tests to unit, frequent, and demo
* tenstorrent#13368: Fixup mesh_device when not passed FAKE_DEVICE
* tenstorrent#13368: Remove llama_models as submodule. Move its install to llama3 requirements.txt.

---------

Co-authored-by: mtairum <mtairum@tenstorrent.com>
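One of the commits above moves the repeat-interleave of the vision keys/values into cross-attention (xattn) cache generation, so the expansion happens once when the cache is built rather than on every decode step. A hedged torch-level sketch of that idea (function and tensor names are illustrative, not the model's actual code):

```python
import torch

def build_xattn_cache(vision_k: torch.Tensor, vision_v: torch.Tensor, n_rep: int):
    """Expand grouped KV heads up-front while generating the cross-attention cache.

    vision_k / vision_v: [batch, n_kv_heads, vision_seq_len, head_dim]
    n_rep: query heads per KV head (n_heads // n_kv_heads)
    """
    # Repeating here means the per-token decode path can read the cache
    # directly instead of repeat-interleaving the KV heads on every step.
    k_cache = torch.repeat_interleave(vision_k, n_rep, dim=1)
    v_cache = torch.repeat_interleave(vision_v, n_rep, dim=1)
    return k_cache, v_cache
```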
Ticket
#13368
Problem description
The multimodal reference code was not tracked in git. We now have a fork of it, https://github.com/tenstorrent/llama-models/tree/main, which should be used as the reference code in tests and demos.
What's changed
models/demos/llama3
Checklist
llama_models as a submodule: https://github.com/tenstorrent/tt-metal/actions/runs/11574438253