ArrowLuo / CLIP4Clip (Star 784, Python, updated Apr 12, 2024)
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
Topics: search, retrieval, ranking, clip, multimodality, multimodal-learning, multimodal, activitynet, retrieval-model, msvd, msrvtt, video-text-retrieval, lsmdc, didemo, video-clip-retrieval
xuguohai / X-CLIP (Star 112, Python, updated Apr 6, 2024)
An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval"
Topics: multimodal, activitynet, msvd, msrvtt, video-text-retrieval, lsmdc, didemo
shufangxun / MAC (Star 23, updated Dec 13, 2022)
An end-to-end masked contrastive video-and-language pre-training framework
Topics: pytorch, clip, mae, end-to-end-learning, multimodal, vision-and-language, activitynet, pretraining, msrvtt, contrastive-learning, vision-transformer, video-text-retrieval, video-language, didemo