audio-language

Here are 5 public repositories matching this topic...

OFA-Sys / ONE-PEACE

A general representation model across vision, audio, and language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

representation-learning multimodal vision-and-language contrastive-loss vision-language vision-transformer foundation-models audio-language

Updated Oct 6, 2024
Python

TXH-mercury / VAST

Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

dataset vision-language audio-language multimodal-foundation-model cross-modality-pretraining vision-audio-subtitle-text

Updated Mar 14, 2024
Jupyter Notebook

AudioLLMs / AudioLLM

Audio Large Language Models

audio-processing audio-language audio-understanding

Updated Nov 26, 2024

Sreyan88 / GAMA

Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

audio dataset question-answering reasoning large-language-model audio-language multimodal-large-language-models

Updated Nov 27, 2024
Python

Sreyan88 / CompA

Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models

audio nlp benchmark ai ml compositionality retreival audio-language

Updated Jul 10, 2024
Python
