
Llama3.2-Vision: Add reference submodule and tests #14051

Merged: 21 commits into main from llama32-vision, Oct 29, 2024

Conversation

cglagovichTT (Contributor) commented Oct 21, 2024

Ticket

#13368

Problem description

The multimodal reference code was not tracked in git. We now have a fork of it that we should use as the reference code in tests and demos: https://github.com/tenstorrent/llama-models/tree/main

What's changed

  • Add llama-models as a submodule
  • Move the TT multimodal modules and tests into their own folders under models/demos/llama3
  • Add tests to the t3k unit, t3k frequent, and t3k demo pipelines; for each pipeline, the tests run on both a 1x2 and a 1x8 mesh (see the sketch after this list)
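
A minimal sketch of what that per-mesh parametrization could look like in pytest. The mesh_device fixture and the indirect parametrization below are illustrative assumptions, not the PR's exact test code:

    import pytest

    # Illustrative only: assumes a "mesh_device" fixture that opens a device
    # mesh of the requested shape, following tt-metal conftest conventions.
    @pytest.mark.parametrize(
        "mesh_device",
        [(1, 2), (1, 8)],  # the 2-device and 8-device meshes used in this PR
        indirect=True,
    )
    def test_vision_block(mesh_device):
        # build the TT module on mesh_device, run it against the Meta
        # reference implementation, and assert the PCC bound
        ...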

cglagovichTT (Contributor, Author) commented:

@tt-rkim could you check out .gitmodules to make sure that looks alright?
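
(For reference, a .gitmodules entry for this kind of vendored reference code typically looks like the following; the path is an illustrative guess, the URL is the fork linked above:)

    [submodule "models/demos/llama3/reference/llama_models"]
        path = models/demos/llama3/reference/llama_models
        url = https://github.com/tenstorrent/llama-models.git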

tt-rkim (Collaborator) commented Oct 21, 2024

Looks good

Where are we running the multimodal tests exactly on CI?

cglagovichTT (Contributor, Author) commented Oct 21, 2024

> Where are we running the multimodal tests exactly on CI?

I'd like to add the multimodal tests to CI after I experiment locally to see whether I can tighten the PCC bounds. I'll put the tests into CI in a separate PR.
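
(For context: a PCC bound here is a Pearson correlation threshold between the reference outputs and the device outputs. tt-metal ships its own comparison helpers, so the following is only a generic sketch of the idea:)

    import torch

    def comp_pcc(golden: torch.Tensor, actual: torch.Tensor, required_pcc: float = 0.99):
        """Return (passed, pcc) for flattened reference vs. device outputs."""
        stacked = torch.stack([golden.flatten().float(), actual.flatten().float()])
        # the off-diagonal of the 2x2 correlation matrix is the Pearson correlation
        pcc = torch.corrcoef(stacked)[0, 1].item()
        return pcc >= required_pcc, pcc

Tightening the bounds then just means raising required_pcc toward 1.0 wherever local runs show margin.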

tt-rkim (Collaborator) commented Oct 21, 2024

Sounds good. So these reference files are not used anywhere in CI currently? Or are they used for other parts of 3.2?

mtairum (Contributor) commented Oct 21, 2024

@cglagovichTT Shouldn't this PR also include the tests?

This is the PR that fully brings in multimodal Llama, so it's good practice to have everything ready to go, tests included.

But I'm OK with either approach and will help with the testing.

cglagovichTT (Contributor, Author) commented:

Due to popular demand I will add the tests to this PR :)

cglagovichTT marked this pull request as draft on October 21, 2024 at 16:28.
tt-rkim (Collaborator) commented Oct 21, 2024

THANK YOU SIR

@@ -208,7 +208,7 @@ def __init__(self, mesh_device, instruct=False, dummy_weights=False, max_batch_s
         self.compute_kernel_config_sdpa = ttnn.WormholeComputeKernelConfig(
             math_fidelity=ttnn.MathFidelity.HiFi4,
             math_approx_mode=False,
-            fp32_dest_acc_en=False,
+            fp32_dest_acc_en=True,
cglagovichTT (Contributor, Author) commented on the diff:

@yieldthought is it going to be an issue in the non-vision attention modules to have fp32 acc here?
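
(For context: fp32_dest_acc_en controls whether matmul partials accumulate in fp32 destination registers rather than a lower-precision format, which generally improves PCC against the reference at some performance cost. A minimal sketch of the changed config, with values mirroring the diff above; treat it as illustrative rather than the model's full configuration:)

    import ttnn

    # fp32_dest_acc_en=True trades some throughput for fp32 accumulation in
    # the destination registers, tightening PCC against the reference model.
    compute_kernel_config_sdpa = ttnn.WormholeComputeKernelConfig(
        math_fidelity=ttnn.MathFidelity.HiFi4,
        math_approx_mode=False,
        fp32_dest_acc_en=True,
    )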

TT-billteng (Collaborator) left a review comment:

love it!

cglagovichTT merged commit b4d605f into main on Oct 29, 2024. 8 checks passed.
cglagovichTT deleted the llama32-vision branch on October 29, 2024 at 19:07.
ct-clmsn pushed a commit to ct-clmsn/tt-metal that referenced this pull request on Nov 12, 2024:

Llama3.2-Vision: Add reference submodule and tests (tenstorrent#14051)

* tenstorrent#13368: Move repeat interleave to xattn cache generation.

* #0: Clean up demo, enable arbitrary padding for multimodal text sequence

* tenstorrent#13368: Add llama_models Meta reference for Llama3.2 as a submodule

* tenstorrent#13368: Change reference imports to use new submodule

* tenstorrent#13368: Clean up comments after pushing repeat_interleave into xattn_cache generation.

* tenstorrent#13368: Clean up vision tests. Unify assertions and pcc checks. Fix LM head splitting on T3k.

* tenstorrent#13368: Fix LM head splits calculation

* tenstorrent#13368: For all vision tests, get model-specific parameters from model_args rather than fixtures. This generalizes tests for base and finetuned 11B models.

* tenstorrent#13368: Add vision tests to unit, frequent, and demo

* tenstorrent#13368: Fixup mesh_device when not passed FAKE_DEVICE

* tenstorrent#13368: Remove llama_models as submodule. Move its install to llama3 requirements.txt.

---------

Co-authored-by: mtairum <mtairum@tenstorrent.com>
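
The last commit above replaces the submodule with a pip-installed dependency. A requirements.txt entry for that would look something like the following, where the distribution name and the pinned ref are illustrative placeholders, not the repo's actual pin:

    # pin the reference fork to a known-good ref (placeholder below)
    llama-models @ git+https://github.com/tenstorrent/llama-models.git@<ref>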