
Multi-Modal model support #1025

Open
doberst opened this issue Oct 4, 2024 · 0 comments
Labels: enhancement (New feature or request)

Comments

doberst (Contributor) commented Oct 4, 2024

We are very interested in integrating open-source, self-hosted multi-modal models into LLMWare. We have been watching the space closely and are looking for ideas and contributions on multi-modal models that work in conjunction with RAG and Agent-based automation pipelines.

Our key criteria are that there must be a use case tied to a business objective (e.g., not just image generation), that the model works reasonably well, and that it is self-hostable (e.g., a max of 10-15B parameters).

To implement, the key focus will be the construction of a new MultiModal model class and the design of the preprocessors and postprocessors required to handle multi-modal content, along with support for the underlying model packaging formats (e.g., GGUF, PyTorch, ONNX, OpenVINO). We would look to collaborate and will support the underlying inferencing technology required.
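To make the idea concrete, below is a minimal sketch of what such a class might look like. All names here (`MultiModalModel`, `MultiModalPreprocessor`, `prepare`, `inference`, `_run_backend`) are hypothetical and not part of llmware today; the sketch only illustrates the preprocess → backend inference → postprocess flow described above.

```python
from abc import ABC, abstractmethod
from typing import Optional


class MultiModalPreprocessor(ABC):
    """Converts raw multi-modal input (e.g., image bytes plus a text prompt)
    into the input format the underlying model expects."""

    @abstractmethod
    def prepare(self, prompt: str, media: bytes, media_type: str) -> dict:
        ...


class MultiModalModel:
    """Wraps a self-hosted multi-modal model behind one inference API,
    independent of the packaging format (GGUF, PyTorch, ONNX, OpenVINO)."""

    def __init__(self, model_name: str, backend: str = "gguf",
                 preprocessor: Optional[MultiModalPreprocessor] = None):
        self.model_name = model_name
        self.backend = backend              # selects the runtime loader
        self.preprocessor = preprocessor

    def inference(self, prompt: str, media: bytes,
                  media_type: str = "image") -> dict:
        # 1. preprocess - encode the media + prompt into model inputs
        inputs = self.preprocessor.prepare(prompt, media, media_type)
        # 2. run the backend-specific runtime (llama.cpp, torch, onnxruntime, ...)
        raw_output = self._run_backend(inputs)
        # 3. postprocess - normalize output so RAG / Agent pipelines can consume it
        return {"llm_response": raw_output, "usage": {}}

    def _run_backend(self, inputs: dict) -> str:
        # placeholder - each packaging format would get its own implementation
        raise NotImplementedError(f"backend '{self.backend}' is not wired up yet")
```

Keeping the preprocessor pluggable would let each model family supply its own media encoding while the pipeline-facing `inference` API stays uniform; again, this is just one possible shape for discussion.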

doberst added the enhancement label Oct 4, 2024