intel / neural-compressor Public


Issues: intel/neural-compressor


34 Open 172 Closed


Issues list

Error while running Whisper model quantization with Intel neural compressor

#2056 opened Nov 5, 2024 by Shivani-k16

AssertionError of act_observer when using SmoothQuant for Llama-13b

#2033 opened Oct 16, 2024 by kyang-06

Coding error！！

#2027 opened Oct 11, 2024 by AheadSnail

Qwen/Qwen2.5-7B-Instruct model layer_wise_quant function error

#2017 opened Sep 30, 2024 by hadoop2xu

Failed to save quantized model

#2001 opened Sep 11, 2024 by lockeregg

how to evaluate AWQ ?

#1980 opened Aug 14, 2024 by chunniunai220ml

Quantization failed

#1972 opened Aug 11, 2024 by endomorphosis

Any example to quantise a text embedding model on Intel Gaudi2?

Label: aitce (AI TCE to handle it first)

#1919 opened Jul 14, 2024 by sleepingcat4

Error in fp8 quantization: Invalid scale factor : 1.70e+06, make sure the scale is not larger than : 6.55e+04

#1907 opened Jul 9, 2024 by yyChen233

FP4 encoding related

#1891 opened Jul 1, 2024 by Tiantian-Han

PTQ with IPEX backend and XPU device is not working

#1889 opened Jun 28, 2024 by paguilomanas

Is there any accuracy data related to FP4?

#1835 opened Jun 3, 2024 by PhzCode

how to extract int8 weights from quantized model

Label: aitce (AI TCE to handle it first)

#1817 opened May 25, 2024 by chensterliu

'q_config' is needed when export an INT8 model

Label: aitce (AI TCE to handle it first)

#1736 opened Apr 18, 2024 by ZhangShuoAlreadyExists

io.UnsupportedOperation: fileno

#1714 opened Apr 3, 2024 by jashokkumar83

AWQ Quantization padding error

#1699 opened Mar 26, 2024 by PatriceVignola

Model execution is single threaded?

#1663 opened Mar 12, 2024 by akhauriyash

neural_compressor/adaptor/ox_utils/quantizer.py dfs crash during "basic" tuning

#1621 opened Feb 22, 2024 by kmn1024

How to quantify google/vit-base-patch16-224 pytorch_model.bin to int8 type with neural-compressor

#1612 opened Feb 19, 2024 by yingmuying

How to perform int8 quantisation (not uint8) using ONNX?

#1610 opened Feb 16, 2024 by paul-ang

AWQ quantization is very slow for ONNX LLMs

#1609 opened Feb 10, 2024 by PatriceVignola

Unable to save llama2 after SmoothQuant

#1600 opened Feb 2, 2024 by dellamuradario

how to get layer_mappings for distillation?

#1590 opened Feb 1, 2024 by Michael-Fuu

PostTrainingQuantConfig(quant_level='auto', device='npu', backend="onnxrt_dml_ep") produces fp32 ops.

#1580 opened Jan 26, 2024 by kleiti

AWQ fails on ONNX model when a MatMul node's input is a model input/initializer

#1571 opened Jan 25, 2024 by jstoecker
