shuoyang129/README.md

Hi there 👋

My primary research area is vision and language. I currently study how large language models (LLMs) can be applied to vision-and-language tasks, e.g., language-driven video understanding and open-vocabulary multi-label image recognition. Before this, I worked on hand detection, hand pose estimation, face recognition, and person re-identification. My publications are listed on Google Scholar.

📎 Homepages

🔥 News

Pinned

  1. eamat (Public)

    Entity-Aware and Motion-Aware Transformers for Language-driven Action Localization (IJCAI-22)

    Python 12 1

  2. Distrbution-based-Frame-Supervised-LDAL (Public)

    Probability Distribution Based Frame-supervised Language-driven Action Localization (ACM MM 2023)

    Python 1

  3. wangyongqi558/MMP_OV_VidVRD (Public)

    Multi-modal Prompting for Open-vocabulary Video Visual Relationship Detection (AAAI 2024)

    Python 6 2