Use tag::any for int8 matmul weight desc to create pd #155
Summary
An issue was found where int8 matmul runs into the `ref:any` kernel, which is very slow. It was found with stock PyTorch + oneDNN 3.0.

The cause is that dst scales have an impact on pd creation. When prepacking the weight, dst scales are not set when creating the pd (int8 and fp32 share the same `expected_weight_desc` function), so the pd gives a weight desc in layout A. But at runtime, dst scales are set and we specify weight layout A to create the pd. oneDNN may find that layout A is improper, and it finally runs into the `ref:jit` kernel.

Now we use `tag::any` for the weight desc to create the pd at runtime, regardless of the layout of the prepacked weight. The pd can then give a better layout for the weight. The prepacked weight will be reordered again on the first run.

Previously:
- Prepack: the pd is created with `tag::any` and without info of src/dst scales/zero points.
- Runtime: the pd is created with the prepacked weight layout A and with dst scales set. The `ref:any` kernel is used.

Now:
- Prepack: the pd is created with `tag::any` and without info of src/dst scales/zero points.
- Runtime: the pd is created with `tag::any` and with info of src/dst scales/zero points. The weight is only reordered on the first run. Later on, the weight is always in layout B, which is expected.
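
The before/after behaviour can be sketched with a small self-contained model. This is an illustration only, not the ideep or oneDNN code: `Layout`, `QuantAttr`, `ToyPd`, `create_pd`, `prepack`, `run_old` and `run_new` are hypothetical names. The rule that setting dst scales changes the preferred weight layout from A to B is an assumption taken from the description above, and the kernel names `"ref"` and `"optimized"` are placeholders.

```cpp
#include <string>

// Toy stand-ins, NOT oneDNN API: they only model the behaviour described in this PR.
enum class Layout { Any, A, B };

struct QuantAttr {
    bool has_dst_scale = false;  // whether dst scales are set when creating the pd
};

struct ToyPd {
    Layout weight_layout;  // weight layout reported by the "pd"
    std::string kernel;    // "optimized" or "ref" (placeholder kernel names)
};

// Assumption: without dst scales the library prefers layout A, with them layout B.
inline Layout preferred_layout(const QuantAttr& attr) {
    return attr.has_dst_scale ? Layout::B : Layout::A;
}

// Model of pd creation. Layout::Any lets the "library" pick the weight layout;
// a concrete layout that is not the preferred one falls back to the slow kernel.
inline ToyPd create_pd(Layout requested, const QuantAttr& attr) {
    const Layout best = preferred_layout(attr);
    if (requested == Layout::Any) return ToyPd{best, "optimized"};
    return ToyPd{requested, requested == best ? "optimized" : "ref"};
}

struct PackedWeight {
    Layout layout = Layout::Any;
    int reorder_count = 0;  // number of runtime reorders performed so far
};

// Prepack: the weight desc is "any" and no scales are known yet, so we get layout A.
inline PackedWeight prepack() {
    QuantAttr no_scales;
    PackedWeight w;
    w.layout = create_pd(Layout::Any, no_scales).weight_layout;
    return w;
}

// Old runtime path: pass the prepacked layout while dst scales are set.
inline std::string run_old(const PackedWeight& w) {
    QuantAttr attr;
    attr.has_dst_scale = true;
    return create_pd(w.layout, attr).kernel;
}

// New runtime path: ask with "any", then reorder the weight only if the layout differs.
inline std::string run_new(PackedWeight& w) {
    QuantAttr attr;
    attr.has_dst_scale = true;
    const ToyPd pd = create_pd(Layout::Any, attr);
    if (w.layout != pd.weight_layout) {
        w.layout = pd.weight_layout;  // stands in for a real reorder of the weight
        ++w.reorder_count;
    }
    return pd.kernel;
}
```

In this model, a weight returned by `prepack()` is in layout A. `run_old` then selects the placeholder `"ref"` kernel, while `run_new` selects `"optimized"`, reorders the weight to layout B on the first call, and performs no further reorders on later calls.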
Test plan
@jgong5 @XiaobingSuper @yanbing-j @leslie-fang-intel Please review. Thanks!