Hi!
I am trying to use transformers-neuronx to compile a customized Hugging Face Llama-3.1-8B model.
I use the model with beam search, and as far as I know, that produces a dynamic graph during generation.
If I compile the model with the Neuron SDK, does tracing produce a static graph?
Can I still use beam search after Neuron compilation?
My understanding may be wrong somewhere.
If so, please correct me.
Thanks for your effort.
Thank you for the question. It should be possible to set this parameter in the configuration (GenerationConfig) to True. This ensures that the traced graph is static and supports sampling on device.
When not sampling on device, it should be possible to use beam search after compilation, provided the model remains identical.
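To illustrate why a static graph and beam search are not necessarily in conflict, here is a minimal sketch in plain Python. It assumes the compiled model behaves like a fixed-shape function that always receives the same number of rows and returns one row of log-probabilities per beam, while the beam bookkeeping runs on the host. The names `static_step`, `beam_search`, `VOCAB`, and `NUM_BEAMS` are hypothetical and are not part of the transformers-neuronx API; the toy scoring rule merely stands in for a real forward pass.

```python
import math

VOCAB = 5      # toy vocabulary size (assumption for illustration)
NUM_BEAMS = 2  # fixed at "compile time": the step always sees this many rows


def static_step(batch_tokens):
    """Hypothetical stand-in for a compiled model with a fixed input shape.

    Takes exactly NUM_BEAMS token sequences and returns NUM_BEAMS rows of
    VOCAB log-probabilities. The toy rule favors (last + 1) % VOCAB.
    """
    assert len(batch_tokens) == NUM_BEAMS, "static graph: batch shape is fixed"
    rows = []
    for seq in batch_tokens:
        last = seq[-1]
        logits = [
            2.0 if tok == (last + 1) % VOCAB
            else 1.0 if tok == (last + 2) % VOCAB
            else 0.0
            for tok in range(VOCAB)
        ]
        log_z = math.log(sum(math.exp(x) for x in logits))
        rows.append([x - log_z for x in logits])
    return rows


def beam_search(prompt, steps):
    """Host-side beam search that only ever calls the fixed-shape step."""
    # Start with one live beam; pad the rest with -inf so the shape is constant.
    beams = [(0.0, list(prompt))]
    beams += [(-math.inf, list(prompt)) for _ in range(NUM_BEAMS - 1)]
    for _ in range(steps):
        log_probs = static_step([seq for _, seq in beams])
        candidates = []
        for (score, seq), row in zip(beams, log_probs):
            for tok, lp in enumerate(row):
                candidates.append((score + lp, seq + [tok]))
        # Keep the NUM_BEAMS highest-scoring continuations.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:NUM_BEAMS]
    return beams


best_score, best_seq = beam_search([0], steps=3)[0]
print(best_seq)  # [0, 1, 2, 3]
```

The point of the sketch is only that the search logic (expanding, scoring, pruning) can live outside the traced function, so the traced function itself never changes shape. Whether a particular compiled model exposes its logits this way depends on the library and configuration in use.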