Recently, I noticed that the SentenceTransformer class has gained support for the ONNX backend, which can substantially speed up inference, especially on CPUs.
I would like to request the same feature for the CrossEncoder class. ONNX backend support in CrossEncoder would greatly accelerate reranking on CPU, making the library more powerful and efficient.
Here are some potential benefits:
Improved Performance: Faster inference times on CPU, useful when GPUs are not available.
Scalability: Ability to handle larger reranking workloads with reduced latency.
Consistency: Both the SentenceTransformer and CrossEncoder classes could leverage the same performance optimizations.
Thank you for considering this feature request.
Thanks for the suggestion. Since I took over this project, I have made various improvements to SentenceTransformer models, such as multi-GPU training, bf16, loss logging, and new backends. Starting next week, I intend to spend some time extending these improvements to CrossEncoder, on both the training and the inference side. That will include adding ONNX and OpenVINO (OV) backends to the CrossEncoder.