Hi lightseq team, I noticed that lightseq's transformer architecture has an extra layer_norm at both the encoder and decoder level (outside the encoder/decoder layers):
lightseq/lightseq/training/ops/pytorch/transformer.py, line 103 (commit a7ab0da)
In Fairseq, this layer_norm is only added when normalize_before == True: https://github.com/facebookresearch/fairseq/blob/b30980349bcb2e870481d783ac8cb3f338361601/fairseq/models/transformer/transformer_encoder.py#L100

Due to this architectural difference, I'm unable to export a native Fairseq Transformer with post layer norm to protobuf/hdf5 format using https://github.com/bytedance/lightseq/blob/master/examples/inference/python/export/fairseq/native_fs_transformer_export.py, because my model was trained in Fairseq with normalize_before == False and therefore has no extra layer_norm at the encoder/decoder level.

I wonder why lightseq requires this extra layer_norm at the encoder/decoder level. Thanks!
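For context, here is a minimal sketch of the pattern at the linked fairseq line (illustrative only, not the actual fairseq or lightseq code; the class name, hyper-parameters, and use of nn.TransformerEncoderLayer are my own): the final LayerNorm after the layer stack is only instantiated in the pre-norm configuration, so a post-norm checkpoint simply has no weights for it, which is presumably what the lightseq export script fails to find.

```python
import torch
import torch.nn as nn


class ToyEncoder(nn.Module):
    """Illustrative encoder: the final LayerNorm after the layer stack only
    exists in the pre-norm (normalize_before=True) configuration, mirroring
    the conditional at the linked fairseq transformer_encoder.py line."""

    def __init__(self, embed_dim: int, num_layers: int, normalize_before: bool):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(
                d_model=embed_dim,
                nhead=8,
                norm_first=normalize_before,  # pre-norm vs. post-norm inside each layer
                batch_first=True,
            )
            for _ in range(num_layers)
        )
        # Post-norm models already end each layer with a LayerNorm, so no
        # extra final LayerNorm is created -> no weights for it in the checkpoint.
        self.layer_norm = nn.LayerNorm(embed_dim) if normalize_before else None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
        if self.layer_norm is not None:  # only applied in the pre-norm case
            x = self.layer_norm(x)
        return x


# A post-norm model has no final layer_norm, hence no parameters for the
# export script to pick up at the encoder/decoder level.
post_norm = ToyEncoder(embed_dim=512, num_layers=6, normalize_before=False)
print(post_norm.layer_norm)  # None
```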