Feature Request: Support for ONNX backend for CrossEncoders. #3039

Open
SupreethRao99 opened this issue Nov 6, 2024 · 1 comment

@SupreethRao99

Recently, I noticed that the SentenceTransformer class gained support for the ONNX backend, which significantly improves inference performance, especially on CPUs.

I would like to request a similar feature for the CrossEncoder class. Adding support for the ONNX backend in CrossEncoder would be a significant enhancement. It would greatly accelerate reranking tasks on CPU, making the library even more powerful and efficient.

Here are some potential benefits:

  • Improved Performance: Faster inference times on CPU, useful when GPUs are not available.
  • Scalability: Ability to handle larger reranking workloads with reduced latency.
  • Consistency: Ensuring that both SentenceTransformers and CrossEncoder classes can leverage the same performance optimizations.

Thank you for considering this feature request.

@tomaarsen
Collaborator

Hello!

Thanks for the suggestion. Since I took over this project, I have made various improvements to SentenceTransformer models, such as multi-GPU training, bf16, loss logging, new backends, etc. Starting next week, I intend to spend some time extending these improvements to CrossEncoder, on both the training and the inference side. That will include adding ONNX/OV backends to the CrossEncoder.

- Tom Aarsen
