Feature Request: Support for ONNX backend for CrossEncoders. #3039

SupreethRao99 · 2024-11-06T14:22:52Z

Recently, I noticed that the SentenceTransformer class has gained the ability to use the ONNX backend, which is incredibly beneficial for inference performance, especially on CPUs.
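For context, the existing ONNX support mentioned above can be used roughly as follows. This is a minimal sketch, assuming a sentence-transformers release that includes the `backend` argument and an installation with the ONNX extra (`pip install "sentence-transformers[onnx]"`). The helper name `encode_with_onnx` and the default model name are illustrative assumptions, not part of the library or this issue.

```python
def encode_with_onnx(sentences, model_name="sentence-transformers/all-MiniLM-L6-v2"):
    """Encode sentences with a SentenceTransformer running on the ONNX backend.

    Hypothetical helper for illustration; the model name is an assumption.
    """
    # Imported lazily so this snippet can be loaded without the library installed.
    from sentence_transformers import SentenceTransformer

    # backend="onnx" asks the library to run the model through ONNX Runtime
    # instead of the default PyTorch backend.
    model = SentenceTransformer(model_name, backend="onnx")

    # encode() returns one embedding vector per input sentence.
    return model.encode(sentences)
```

Calling `encode_with_onnx(["How can I speed up reranking on CPU?"])` should return an array of embeddings. The request in this issue is for CrossEncoder to accept an equivalent backend option.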

I would like to request a similar feature for the CrossEncoder class. Adding support for the ONNX backend in CrossEncoder would be a significant enhancement. It would greatly accelerate reranking tasks on CPU, making the library even more powerful and efficient.

Here are some potential benefits:

- Improved Performance: Faster inference times on CPU, useful when GPUs are not available.
- Scalability: Ability to handle larger reranking workloads with reduced latency.
- Consistency: Ensuring that both the SentenceTransformer and CrossEncoder classes can leverage the same performance optimizations.

Thank you for considering this feature request.

tomaarsen · 2024-11-07T07:57:37Z

Hello!

Thanks for the suggestion. Since I took over this project, I have made various improvements to SentenceTransformer models, such as multi-GPU training, bf16, loss logging, new backends, etc. My intention is to spend some time starting from next week on extending these improvements to CrossEncoder: both on the training and on the inference side. That will include adding ONNX/OV backends to the CrossEncoder.

Tom Aarsen
