Issues: intel/neural-compressor
Issues list
Error while running Whisper model quantization with Intel neural compressor
#2056 opened Nov 5, 2024 by Shivani-k16
AssertionError of act_observer when using SmoothQuant for Llama-13b
#2033 opened Oct 16, 2024 by kyang-06
Qwen/Qwen2.5-7B-Instruct model layer_wise_quant function error
#2017 opened Sep 30, 2024 by hadoop2xu
Any example to quantise a text embedding model on Intel Gaudi2?
Label: aitce (AI TCE to handle it firstly)
#1919 opened Jul 14, 2024 by sleepingcat4
how to extract int8 weights from quantized model
Label: aitce (AI TCE to handle it firstly)
#1817 opened May 25, 2024 by chensterliu
'q_config' is needed when export an INT8 model
Label: aitce (AI TCE to handle it firstly)
#1736 opened Apr 18, 2024 by ZhangShuoAlreadyExists
neural_compressor/adaptor/ox_utils/quantizer.py dfs crash during "basic" tuning
#1621 opened Feb 22, 2024 by kmn1024
How to quantify google/vit-base-patch16-224 pytorch_model.bin to int8 type with neural-compressor
#1612 opened Feb 19, 2024 by yingmuying
PostTrainingQuantConfig(quant_level='auto', device='npu', backend="onnxrt_dml_ep") produces fp32 ops.
#1580 opened Jan 26, 2024 by kleiti
AWQ fails on ONNX model when a MatMul node's input is a model input/initializer
#1571 opened Jan 25, 2024 by jstoecker