Finetune on multi-image VQA dataset #531

rsr-droid opened this issue Oct 6, 2024 · 0 comments

Comments

@rsr-droid

Hi, is it possible to fine-tune this model on a multi-image VQA dataset? The format would be similar to (but not limited to) the following, where a list of images is passed with a single prompt:
[
  {
    "system_prompt": "You are a helpful AI assistant",
    "image": [
      "/path/to/image_1",
      "/path/to/image_2"
    ],
    "conversations": [
      {
        "from": "human",
        "value": "Provide a description of the attached images."
      },
      {
        "from": "gpt",
        "value": "The first image shows a Labradoodle dog running energetically through a grassy field while holding a stick in its mouth. The dog has curly, light-colored fur and floppy ears flying in the air as it sprints with a joyful expression.\n\nThe second image features a close-up of an orange-colored cat with its mouth wide open in a large yawn, showing its sharp teeth and tongue. The background is black, making the cat's fur and facial features stand out vividly."
      }
    ]
  }
]
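To make the intended format concrete, here is a minimal Python sketch of how a record like this could be checked and prepared before training. It is not taken from this repository: the `<image>` placeholder, the helper names `validate_record` and `add_image_placeholders`, and the rule of one placeholder per image in the first human turn are all assumptions borrowed from common LLaVA-style data loaders, so the real loader may expect something different.

```python
import json

# Assumed placeholder token; the actual model may use a different marker.
IMAGE_TOKEN = "<image>"


def validate_record(record):
    """Return a list of problems found in one multi-image record."""
    problems = []
    images = record.get("image")
    if not isinstance(images, list) or not images:
        problems.append("'image' must be a non-empty list of paths")
    elif not all(isinstance(p, str) for p in images):
        problems.append("every entry in 'image' must be a string path")

    turns = record.get("conversations")
    if not isinstance(turns, list) or not turns:
        problems.append("'conversations' must be a non-empty list")
    else:
        for i, turn in enumerate(turns):
            # Turns are expected to alternate: human, gpt, human, ...
            expected = "human" if i % 2 == 0 else "gpt"
            if turn.get("from") != expected:
                problems.append(f"turn {i}: expected 'from' == {expected!r}")
            if not isinstance(turn.get("value"), str):
                problems.append(f"turn {i}: 'value' must be a string")
    return problems


def add_image_placeholders(record):
    """Return a copy with one IMAGE_TOKEN per image prepended to the first human turn."""
    turns = [dict(t) for t in record["conversations"]]  # copy so the input is not mutated
    prefix = "\n".join([IMAGE_TOKEN] * len(record["image"]))
    turns[0]["value"] = prefix + "\n" + turns[0]["value"]
    return {**record, "conversations": turns}


# Hypothetical sample mirroring the format in the question (paths are placeholders).
sample = json.loads("""
[
  {
    "system_prompt": "You are a helpful AI assistant",
    "image": ["/path/to/image_1", "/path/to/image_2"],
    "conversations": [
      {"from": "human", "value": "Provide a description of the attached images."},
      {"from": "gpt", "value": "The first image shows a dog. The second image shows a cat."}
    ]
  }
]
""")

if __name__ == "__main__":
    for rec in sample:
        issues = validate_record(rec)
        if issues:
            print("skipping record:", issues)
        else:
            print(add_image_placeholders(rec)["conversations"][0]["value"])
```

Whether the model can actually consume several images per sample depends on its vision encoder and training code, which this sketch does not address.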
