Reproducing results from paper #4

Open
auphelia opened this issue May 4, 2022 · 0 comments
Labels
question (Further information is requested)

auphelia commented May 4, 2022

Hi,

I would like to reproduce the results from your paper "Integer-only Zero-shot Quantization for Efficient Speech Recognition" for int8 (or even int4, if possible) QuartzNet 15x5 on NVIDIA A10 and A100 GPUs, and additionally measure throughput.

I tried to use the Q-ASR repo for this, but I cannot find the TensorRT export. Is it published somewhere else? If I understand the code in the repo correctly, the execution in inference.py does not make use of the GPU's tensor cores. Am I overlooking something here?

Kind regards

auphelia added the question (Further information is requested) label May 4, 2022