Add OpenAI as a Provider for Descriptive Text Generation #828

dkotter · 2024-11-21T22:46:27Z

Description of the Change

In #785, we updated to GPT-4o mini in our OpenAI ChatGPT Provider. This model is multi-modal, which means you can do things with images, video, or audio, not just text.

So far we haven't take advantage of that but this PR brings OpenAI as a Provider for the Descriptive Text Generator Feature. Currently this Feature only runs on the Azure AI Vision Provider, so this brings a second option for that Feature.

Making requests to this model is the same as all of our text generation requests, other than we send the image URL in that request. We have a default prompt that is used and that can be modified from the settings screen, as needed. I tried to keep this prompt fairly generic but open to suggestions on improvements there. It gives decent results right now in the images I tested though does tend to be more verbose than what I'd want in just alt text, though noting the text here can be used as a caption or description, so hard to balance all three:

You are an assistant that generates descriptions of images that are used on a website. You will be provided with an image and will describe the main item you see in the image, giving details but staying concise. There is no need to say "the image contains" or similar, just describe what is actually in the image. This text will be important for screen readers, so make sure it is descriptive and accurate but not overly verbose

OpenAI requires images to be at least 512x512, so we return an error message if any image below that threshold is used. It also supports passing in the full image URL or a base64 encoded version of the image. For now I've used the image URL but we could look to go the encoded route, which would make things work in environments where images are publicly accessible (like locally). The downside here is it's slower and more expensive, as it uses more tokens.

Closes #826

Descriptive Text Generator settings screen

How to test the Change

Go to Tools > ClassifAI > Image Processing > Descriptive Text Generator
Select OpenAI as your Provider and add proper credentials
Ensure at least one Descriptive text fields is turned on
Go to your Media Library and choose an image without alt text and run the descriptive text scan. Ensure the text is saved properly
Try this from the single attachment page, using the metabox and ensure this works
Upload a new image and ensure alt text is added during that process
Test other methods as desired, like bulk processing on the Media Library list view or the WP-CLI command
Can also test adding custom prompts to ensure they work

Changelog Entry

Added - Add OpenAI ChatGPT as a Provider for the Descriptive Text Generator Feature.

Credits

Props @dkotter, @jeffpaul

Checklist:

I agree to follow this project's Code of Conduct.
I have updated the documentation accordingly.
I have added Critical Flows, Test Cases, and/or End-to-End Tests to cover my change.
All new and existing tests pass.

dkotter added 6 commits November 21, 2024 13:51

Allow ChatGPT to be used as a Provider for Descriptive Text Generation

f180a0a

Add a route to handle the descriptive text generation request

71f6993

Allow customizing the prompt in the settings. Fix typo

eac68e8

Modify the default prompt a bit

41b1c61

Add E2E tests

51ce65a

Set detail to auto

02fc829

dkotter added this to the 3.2.0 milestone Nov 21, 2024

dkotter self-assigned this Nov 21, 2024

dkotter requested review from jeffpaul and a team as code owners November 21, 2024 22:46

github-actions bot added the needs:code-review This requires code review. label Nov 21, 2024

dkotter added 2 commits November 21, 2024 15:50

Remove test that isn't needed

e76d989

Bring over test fixes from 815

1860aba

jeffpaul requested review from a team and faisal-alvi and removed request for a team and jeffpaul November 26, 2024 15:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add OpenAI as a Provider for Descriptive Text Generation #828

Add OpenAI as a Provider for Descriptive Text Generation #828

dkotter commented Nov 21, 2024 •

edited

Loading

Add OpenAI as a Provider for Descriptive Text Generation #828

Are you sure you want to change the base?

Add OpenAI as a Provider for Descriptive Text Generation #828

Conversation

dkotter commented Nov 21, 2024 • edited Loading

Description of the Change

How to test the Change

Changelog Entry

Credits

Checklist:

dkotter commented Nov 21, 2024 •

edited

Loading