We have not benchmarked sherpa-onnx on Android. However, we have compared the RTF of sherpa-ncnn and sherpa-onnx on macOS and on a Raspberry Pi 4 Model B with a streaming zipformer model.
The following table compares the RTF for greedy search with 1 thread:

|                        | sherpa-ncnn | sherpa-onnx |
|------------------------|-------------|-------------|
| macOS                  | 0.159       | 0.125       |
| Raspberry Pi 4 Model B | 0.871       | 0.697       |
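
For context, RTF (real-time factor) is decoding time divided by audio duration, so lower is better. Below is a minimal sketch of how it can be measured; `real_time_factor` and `decode_fn` are hypothetical names, with `decode_fn` standing in for whichever recognizer (sherpa-onnx or sherpa-ncnn) you benchmark:

```python
# Minimal RTF measurement sketch (not the benchmark script used above).
import time
import wave

def real_time_factor(wav_path, decode_fn):
    """Return decoding time divided by audio duration (lower is better)."""
    with wave.open(wav_path) as f:
        audio_seconds = f.getnframes() / f.getframerate()

    start = time.perf_counter()
    decode_fn(wav_path)  # run the recognizer on the whole file
    elapsed = time.perf_counter() - start

    return elapsed / audio_seconds
```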
If speed is the only thing you care about, then I suggest that you choose sherpa-onnx.
That said, it is a pain to compile onnxruntime from source if you don't use the pre-compiled onnxruntime libs.
We have not managed to compile onnxruntime for 32-bit arm.
I don't know how easy it is to add a custom operator to onnxruntime.
The source code of ncnn is very readable and easy to extend. It also provides a tool, PNNX, to convert models from PyTorch. If there is an op that cannot be converted, it is straightforward to change PNNX and ncnn to support it.
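
As a rough sketch of that PNNX route (the model and input shape below are placeholders, not the zipformer export used in icefall): trace a PyTorch model to TorchScript, then run the `pnnx` command-line tool on it.

```python
# Trace a PyTorch model to TorchScript; pnnx (installed separately)
# then converts the traced file to ncnn format. Placeholder model/shape.
import torch
import torchvision

model = torchvision.models.mobilenet_v2().eval()
example = torch.rand(1, 3, 224, 224)

torch.jit.trace(model, example).save("model.pt")

# Then, from the shell:
#   pnnx model.pt inputshape=[1,3,224,224]
# which emits ncnn .param/.bin files (plus pnnx intermediates).
```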
One thing I want to mention is that libncnn.so for Android is less than 1.2 MB. If you customize the build, you can get an even smaller lib. I am not aware of any other open-source inference framework that can produce such a small lib.
Also, ncnn supports non-NVIDIA GPUs, e.g., the GPUs on mobile phones and ARM GPUs on embedded boards.
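
The GPU path goes through ncnn's Vulkan backend. Here is a minimal sketch using ncnn's Python binding, assuming a model already converted to ncnn's format (the .param/.bin file names are placeholders):

```python
# Sketch of enabling ncnn's Vulkan (GPU) backend via its Python binding.
import ncnn

net = ncnn.Net()
net.opt.use_vulkan_compute = True  # offload supported layers to the GPU

net.load_param("model.ncnn.param")
net.load_model("model.ncnn.bin")

ex = net.create_extractor()
# feed inputs with ex.input(...) and read outputs with ex.extract(...);
# layers without a Vulkan implementation fall back to the CPU.
```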
ncnn also supports RISC-V.
I've seen that for Icefall, the two ways to export models are ONNX (this package) and NCNN.
Has there been any benchmarking done for the two methods? I'm wondering which one would be faster.
I did find this page, k2-fsa/sherpa-ncnn#44, which includes some NCNN run times.