Improve memory usage of `ModelToEncoder` #48

mikethea1 · 2024-08-14T16:23:55Z

Thanks for creating and maintaining this great library!

What would you like to be added:

Today, `ModelToEncoder` calls `ModelToEncoding`, which statically initializes a dictionary of 7 encodings. 6 of the 7 are duplicates.

When each encoding is constructed, the constructor eagerly loads a bunch of data from manifest resources. As far as I can tell, this data gets loaded separately for each instance.

I would like to call `ModelToEncoder` and have it lazily load only the encoding I care about. Furthermore, I'd like it to share a single `Encoding` instance among all models that map to the same encoding.

An example implementation might look like this:

public static Encoding? TryFor(string modelName)
{
    switch (modelName)
    {
        case "gpt-4o":
            return O200KCache.Instance;
        case "gpt-4":
        ...
        case "text-embedding-3-large":
            return Cl100KCache.Instance;
        default:
            return null;
    }
}

private static class O200KCache
{
    public static readonly O200KBase Instance = new();
}

private static class Cl100KCache
{
    public static readonly Cl100KBase Instance = new();
}
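
A variation on the same idea, in case it helps the discussion: the sketch below uses `Lazy<T>` so that the load-on-first-use behavior is explicit rather than dependent on when the runtime chooses to run a type initializer. The `Encoding`, `O200KBase`, and `Cl100KBase` types here are hypothetical stand-ins (the real constructors load manifest resources), `LazyModelToEncoder` is a made-up name, and the `Constructed` counters exist only to make the behavior observable. Only the three model names shown above are mapped; the elided cases are left out.

```csharp
using System;
using System.Threading;

// Hypothetical stand-ins for the library's encoding types. The real
// constructors load data from manifest resources; these only count calls.
public abstract class Encoding { }

public sealed class O200KBase : Encoding
{
    public static int Constructed;
    public O200KBase() => Interlocked.Increment(ref Constructed);
}

public sealed class Cl100KBase : Encoding
{
    public static int Constructed;
    public Cl100KBase() => Interlocked.Increment(ref Constructed);
}

// Hypothetical name, not part of the library.
public static class LazyModelToEncoder
{
    // Lazy<T> defaults to LazyThreadSafetyMode.ExecutionAndPublication,
    // so each factory runs at most once, on the first access to .Value.
    private static readonly Lazy<Encoding> O200K = new(() => new O200KBase());
    private static readonly Lazy<Encoding> Cl100K = new(() => new Cl100KBase());

    // Returns the shared encoding for a known model, or null otherwise.
    public static Encoding? TryFor(string modelName) => modelName switch
    {
        "gpt-4o" => O200K.Value,
        "gpt-4" or "text-embedding-3-large" => Cl100K.Value,
        _ => null,
    };
}
```

With this shape, calling `TryFor("gpt-4")` and then `TryFor("text-embedding-3-large")` returns the same instance and constructs `Cl100KBase` once, while `O200KBase` is never constructed unless a model that maps to it is requested.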

Why is this needed:

Reduce memory footprint and startup time, especially as more models are added.

Anything else we need to know?

I'd be happy to file a PR for this if you're interested!
