Hi,
I would like to reproduce the results from your paper "Integer-only Zero-shot Quantization for Efficient Speech Recognition" for an int8 (or, if possible, int4) QuartzNet 15x5 on NVIDIA A10 and A100 GPUs, with additional throughput measurements.
I have been trying to use the Q-ASR repo for this, but I cannot find the TensorRT export. Is it published somewhere else? If I understand the code in the repo correctly, the execution in inference.py does not make use of the GPU's tensor cores. Am I overlooking something here?
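To clarify what I am after, here is a minimal sketch of how I would attempt the export and throughput measurement myself. The NeMo export call and the trtexec flags are my own assumptions, not something taken from the Q-ASR repo, and this would not reproduce your integer-only scheme without the quantization parameters from the paper:

```python
# Sketch (my assumption, not from the Q-ASR repo): export QuartzNet 15x5 to ONNX
# via NeMo, then build an INT8 TensorRT engine and let trtexec report throughput.
import subprocess

import nemo.collections.asr as nemo_asr

# Load the pretrained QuartzNet 15x5 checkpoint and export it to ONNX.
model = nemo_asr.models.EncDecCTCModel.from_pretrained(model_name="QuartzNet15x5Base-En")
model.export("quartznet15x5.onnx")

# Build an INT8 engine and benchmark it. A calibration cache (or the scales from
# your zero-shot quantization) would still be needed for a faithful reproduction,
# and the dynamic time dimension may require --minShapes/--optShapes/--maxShapes.
subprocess.run(
    [
        "trtexec",
        "--onnx=quartznet15x5.onnx",
        "--int8",
        "--saveEngine=quartznet15x5_int8.plan",
        "--iterations=1000",
    ],
    check=True,
)
```

This is why I am asking whether the TensorRT export you used for the paper is available somewhere, since that path would actually exercise the tensor cores.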
Kind regards