[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
Exploring Visual Prompts for Adapting Large-Scale Models
[TPAMI] Searching prompt modules for parameter-efficient transfer learning.
[ICCV 2023 & AAAI 2023] Binary Adapters & FacT; [Tech report] Convpass
[NeurIPS2023] Official implementation and model release of the paper "What Makes Good Examples for Visual In-Context Learning?"
Official implementation for CVPR'23 paper "BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning"
👀 Visual Instruction Inversion: Image Editing via Visual Prompting (NeurIPS 2023)
[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
[CVPR 2023] VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval
[ICLR24] AutoVP: An Automated Visual Prompting Framework and Benchmark
[ICML 2024] Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models
[arXiv] "Uncovering the Hidden Cost of Model Compression" by Diganta Misra, Agam Goyal, Bharat Runwal, and Pin-Yu Chen
A simple GUI for experimenting with visual prompting
[IEEE BigData'24] Code used in Paper "Benchmarking Human and Automated Prompting in the Segment Anything Model"
These notes and resources are compiled from the short course "Prompt Engineering for Vision Models" offered by DeepLearning.AI.
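Taken together, these projects cover two flavors of visual prompting. The first overlays an explicit marker (a circle, arrow, or box) on the image itself before querying a multimodal model, as in ViP-LLaVA. Below is a minimal sketch of that idea using Pillow; the function name, file names, and coordinates are illustrative assumptions, not code from any of the repositories above.

```python
from PIL import Image, ImageDraw

def add_visual_prompt(image_path: str, box: tuple,
                      color: str = "red", width: int = 4) -> Image.Image:
    """Draw an ellipse over the region box = (left, top, right, bottom)."""
    image = Image.open(image_path).convert("RGB")
    draw = ImageDraw.Draw(image)
    draw.ellipse(box, outline=color, width=width)
    return image

# Circle a region of interest, save, and pair the annotated image with a
# text query such as "What is inside the red circle?" for a model like
# ViP-LLaVA. "example.jpg" and the box coordinates are placeholders.
prompted = add_visual_prompt("example.jpg", box=(120, 80, 260, 220))
prompted.save("example_prompted.jpg")
```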
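The second flavor, explored by repositories such as AutoVP and "Exploring Visual Prompts for Adapting Large-Scale Models", treats the prompt as learnable pixels (typically a frame around the image border) that are optimized while the pretrained backbone stays frozen. The PyTorch sketch below shows that border-prompt idea under assumed sizes; `PaddingPrompt` and `frozen_model` are hypothetical names for illustration, not APIs from the listed projects.

```python
import torch
import torch.nn as nn

class PaddingPrompt(nn.Module):
    """Learnable pixel frame added around the border of each input image."""

    def __init__(self, image_size: int = 224, pad: int = 16):
        super().__init__()
        # Full-size learnable tensor, masked so that only a border of
        # width `pad` is actually applied to the input.
        self.prompt = nn.Parameter(torch.zeros(3, image_size, image_size))
        mask = torch.zeros(1, image_size, image_size)
        mask[:, :pad, :] = 1.0
        mask[:, -pad:, :] = 1.0
        mask[:, :, :pad] = 1.0
        mask[:, :, -pad:] = 1.0
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 3, image_size, image_size). Broadcasting adds the
        # prompt to the border pixels only, leaving the center untouched.
        return x + self.prompt * self.mask

# Usage sketch: train only the prompt; the pretrained model stays frozen.
# prompt = PaddingPrompt()
# optimizer = torch.optim.Adam(prompt.parameters(), lr=1e-3)
# logits = frozen_model(prompt(images))  # frozen_model is hypothetical
```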