
[Draft] Visual Language Model Sample + Replay #884

Open
bwsw opened this issue Nov 18, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

bwsw (Contributor) commented Nov 18, 2024

We deploy a language model and formulate questions. Depending on the answers, we run certain classic models until the next VLM answer arrives.

We run the VLM once per second with the following questions:

  • cars in the viewport?
  • people in the viewport?
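
The VLM returns free-form text, so its answers to these questions have to be mapped to boolean decisions before they can drive the pipeline. A minimal sketch of that step, assuming the answers are plain affirmative/negative sentences (the real response schema of the VLM service may differ):

```python
# Hypothetical helper: interpret free-form VLM answers as boolean
# detections. The answer format (affirmative sentences starting with
# "yes", "there are", etc.) is an assumption, not the service's
# documented schema.

POSITIVE_MARKERS = ("yes", "there are", "there is", "i can see")


def is_positive(answer: str) -> bool:
    """Treat an answer as positive when it starts with an affirmative marker."""
    text = answer.strip().lower()
    return text.startswith(POSITIVE_MARKERS)


def interpret_answers(answers: dict) -> dict:
    """Map each question to a boolean decision."""
    return {question: is_positive(answer) for question, answer in answers.items()}
```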

We use Replay to store results and start a replay every time the model answers positively; while the replay is running and the model continues to answer positively, we prolong the replay processing; when the answer turns negative, we stop it.
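
The start/prolong/stop logic above is a small state machine. A sketch, with the Replay service calls injected as hypothetical callbacks (the actual Replay API is not shown here):

```python
class ReplayController:
    """Start a replay on the first positive VLM answer, prolong it while
    answers stay positive, and stop it on a negative answer.

    start_replay / prolong_replay / stop_replay are hypothetical hooks
    that would wrap calls to the Replay service API.
    """

    def __init__(self, start_replay, prolong_replay, stop_replay):
        self._start = start_replay
        self._prolong = prolong_replay
        self._stop = stop_replay
        self.active = False

    def on_vlm_answer(self, positive: bool) -> None:
        if positive and not self.active:
            # First positive answer: initialize replay processing.
            self.active = True
            self._start()
        elif positive and self.active:
            # Model keeps answering positively: extend the replay window.
            self._prolong()
        elif not positive and self.active:
            # Answer switched to negative: stop the replay.
            self.active = False
            self._stop()
```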

The replayed streams are sent to a secondary pipeline where YOLOv8-KeyPoint and YOLOv8 are deployed; we use only the ROIs corresponding to the VLM decisions to decide whether to launch one model, the other, or both.
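
The routing decision can be sketched as a pure function from the two VLM decisions to the set of secondary models to run. The mapping (cars -> YOLOv8, people -> YOLOv8-KeyPoint) is an assumption for illustration:

```python
# Hypothetical routing helper: which secondary models to launch for the
# replayed ROIs, given the per-question VLM decisions. The model names
# are illustrative identifiers, not real pipeline element names.

def select_models(cars: bool, people: bool) -> list:
    """Return the secondary models to run for the current replay window."""
    models = []
    if cars:
        models.append("yolov8")           # assumed: detector for vehicles
    if people:
        models.append("yolov8-keypoint")  # assumed: pose model for people
    return models
```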

The NVIDIA Jetson Platform Services VLM service can be used to run the model: https://docs.nvidia.com/jetson/jps/inference-services/vlm.html
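
The once-per-second query loop can be sketched independently of the VLM service's actual API by injecting the query and handling callbacks. `query` and `handle` are hypothetical hooks; the real service would be driven through its own HTTP interface:

```python
import time

QUESTIONS = ["cars in the viewport?", "people in the viewport?"]


def poll_vlm(query, handle, interval=1.0, iterations=None):
    """Ask the VLM the two questions once per `interval` seconds.

    query(question) -> str and handle(question, answer) are injected so
    this loop stays independent of the actual VLM service API.
    `iterations=None` runs forever; a number bounds the loop (for tests).
    """
    n = 0
    while iterations is None or n < iterations:
        for question in QUESTIONS:
            handle(question, query(question))
        n += 1
        if iterations is None or n < iterations:
            time.sleep(interval)
```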

@bwsw bwsw added the enhancement New feature or request label Nov 18, 2024
@bwsw bwsw changed the title Visual Language Model Sample Visual Language Model Sample + Replay Nov 18, 2024
@bwsw bwsw changed the title Visual Language Model Sample + Replay [Draft] Visual Language Model Sample + Replay Nov 18, 2024