
After INT8 quantization with post_training_quantization, the TinyBert model fails at inference time on Linux_X86-64 #119

Open
zxzlogic opened this issue Aug 3, 2022 · 4 comments

Comments

zxzlogic commented Aug 3, 2022

  1. X2bolt -d onnx -m model -i PTQ   # outputs model_ptq_input.bolt
  2. ./post_training_quantization -p model_ptq_input.bolt -i INT8_FP32 -b true -q NOQUANT -c 0 -o false
  3. Inference then fails with:
     [ERROR] thread 121948 file /home/xxx/project/bolt/compute/tensor/src/fully_connected.cpp line 394: requirement mismatch.
     The corresponding line is: CHECK_REQUIREMENT(idt == qIDesc.dt);

Is there a tutorial on quantizing TinyBert, or how can I further pinpoint the cause of this error?

@yuxianzhi
Contributor

bolt provides a debug interface: recompile with --debug and run again, and you will get more detailed information.

If the model is not confidential, you can send us the pre- and post-quantization models and we will take a look: cos_wave@163.com

@yuxianzhi
Contributor

yuxianzhi commented Aug 3, 2022

The linux-x86_64 build is the serial code path and we do little maintenance on it.

You could instead try linux-x86_64_avx512 on an AVX-512 server, or android-aarch64 on an Armv8.2 phone; those are more likely to run.

@zxzlogic
Author

zxzlogic commented Aug 3, 2022

> The linux-x86_64 build is the serial code path and we do little maintenance on it.
>
> You could instead try linux-x86_64_avx512 on an AVX-512 server, or android-aarch64 on an Armv8.2 phone; those are more likely to run.

OK, I will try the Armv8 platform. Thanks for the reply.

@zxzlogic
Author

zxzlogic commented Aug 3, 2022

> bolt provides a debug interface: recompile with --debug and run again, and you will get more detailed information.
>
> If the model is not confidential, you can send us the pre- and post-quantization models and we will take a look: cos_wave@163.com

OK, I will add the debug information.
