I am a Master's student at HUST (Huazhong University of Science and Technology), supervised by Nong Sang.
Research-wise, I mainly focus on:
- Multi-modal Large Language Models
- Video Understanding, more specifically Weakly-supervised Temporal Action Localization (WSTAL) and Weakly-supervised Video Anomaly Detection (WSVAD).
I am open to:
- An internship or job offer in computer vision research and engineering.
Contact me by:
- Email: [email protected]
News:
- 2024-06-10: We released the code and model for "Arcana: Improving Multi-modal Large Language Model through Boosting Vision Capabilities".
- 2024-01-29: I started my internship at Baidu VIS, doing research on Multi-modal Large Language Models (MLLMs).
- 2023-12-09: One paper on point-supervised temporal action localization was accepted to AAAI 2024.