Does it make static graph, when compilation? #1032

Open
clownchrys opened this issue Nov 12, 2024 · 1 comment


clownchrys commented Nov 12, 2024

Hi!

I am trying to use transformers-neuronx to compile a customized Hugging Face llama-3.1-8b model.
I use the model with beam search, which, as far as I know, builds a dynamic graph during generation.
If I compile the model with the Neuron SDK, does tracing produce a static graph?

Can I still use beam search after Neuron compilation?

My understanding may be wrong somewhere.
If so, please correct me.

Thanks for your effort.

@devesr-amzn
Contributor

Thank you for the question. It should be possible to set this parameter in the configuration (`GenerationConfig`) to `True`. This ensures that the traced graph is static and supports sampling on device.

When not sampling on device, it should be possible to use beam search after compilation, provided the model remains identical.
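
To illustrate why a static graph and beam search can coexist, here is a minimal pure-Python sketch. It does not use transformers-neuronx or any Neuron API: `static_step`, `NUM_BEAMS`, and `VOCAB` are hypothetical names, and the toy scoring rule stands in for a compiled model. The assumption is that the compiled forward pass has a fixed input shape, while the beam bookkeeping (expanding, scoring, pruning) runs on the host between calls.

```python
import math

VOCAB = 4       # toy vocabulary size (hypothetical)
NUM_BEAMS = 2   # batch dimension fixed when the graph is "compiled"

def static_step(batch):
    """Stand-in for a compiled forward pass: it only accepts NUM_BEAMS rows.

    A real static graph would also fix or bucket the sequence length;
    this toy ignores that and only pins the batch size.
    """
    assert len(batch) == NUM_BEAMS, "static graph: batch shape is fixed"
    out = []
    for seq in batch:
        last = seq[-1]
        # Toy logits: favour the token (last + 1) % VOCAB.
        logits = [2.0 if t == (last + 1) % VOCAB else 0.0 for t in range(VOCAB)]
        norm = math.log(sum(math.exp(x) for x in logits))
        out.append([x - norm for x in logits])  # log-softmax
    return out

def beam_search(prompt, steps):
    """Host-side beam search; the dynamic logic lives outside static_step."""
    # Start with one live beam and pad the rest with -inf scores so the
    # batch always has exactly NUM_BEAMS rows.
    beams = [(0.0, list(prompt))] + [(-math.inf, list(prompt))] * (NUM_BEAMS - 1)
    for _ in range(steps):
        log_probs = static_step([seq for _, seq in beams])
        candidates = [
            (score + lp, seq + [tok])
            for (score, seq), row in zip(beams, log_probs)
            for tok, lp in enumerate(row)
        ]
        # Keep the NUM_BEAMS highest-scoring continuations.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:NUM_BEAMS]
    return beams

beams = beam_search([1], steps=3)
best_score, best_seq = beams[0]
print(best_seq)
```

In this sketch the step function never changes shape or control flow; only the host loop decides which rows to feed it next, which is the sense in which beam search can sit on top of a traced model.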
