
# Explore-And-Match

Official PyTorch implementation of "Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos".

## Getting Started

⚠️ Dependencies:

- CUDA == 10.2
- torch == 1.8.0
- torchvision == 0.9.0
- python == 3.8.11
- numpy == 1.20.3
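As a quick sanity check, here is a minimal snippet (ours, not part of the repo) to verify the pinned versions and that CUDA is visible to PyTorch:

```python
# Sanity check for the pinned dependency versions (illustrative, not part of the repo).
import sys
import numpy as np
import torch
import torchvision

print(f"python      : {sys.version.split()[0]}")    # expect 3.8.x
print(f"torch       : {torch.__version__}")         # expect 1.8.0
print(f"torchvision : {torchvision.__version__}")   # expect 0.9.0
print(f"numpy       : {np.__version__}")            # expect 1.20.3
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version  : {torch.version.cuda}")      # expect 10.2
```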

## Dataset Preparation

### Splits

- ActivityNet (train/val/test)
- Charades (train/test: 5338/1334)

### Download ActivityNet

Merge the 'v1-2' and 'v1-3' folders into a single folder named 'videos' (see the sketch below).
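A minimal sketch of the merge step, assuming the two extracted folders sit side by side (the paths here are illustrative; adjust them to wherever the archives were extracted):

```python
# Merge the ActivityNet 'v1-2' and 'v1-3' video folders into a single 'videos' folder.
# Paths are illustrative; adjust to where the archives were extracted.
import shutil
from pathlib import Path

src_dirs = [Path("v1-2"), Path("v1-3")]
dst = Path("videos")
dst.mkdir(exist_ok=True)

for src in src_dirs:
    for video in src.iterdir():
        target = dst / video.name
        if not target.exists():  # skip duplicates across the two releases
            shutil.move(str(video), str(target))
```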

### Download Charades

## Pre-trained Features

- C3D
- CLIP

## Preprocess

Get 64/128/256 frames per video:

```bash
bash preprocess/get_constant_frames_per_video.sh
```
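For reference, a minimal sketch of uniform frame sampling with OpenCV; this is our illustration of the idea, and the repo's script remains the reference for the exact behavior:

```python
# Uniformly sample a fixed number of frames from a video (illustrative sketch;
# the actual logic lives in preprocess/get_constant_frames_per_video.sh).
import cv2
import numpy as np

def sample_frames(video_path: str, num_frames: int = 64):
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # Pick num_frames indices evenly spaced across the whole video.
    indices = np.linspace(0, total - 1, num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    return frames
```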

## Extract Features with CLIP

Rename the 'val_1' CLIP encodings to 'val' and the 'val_2' encodings to 'test'.

```bash
bash preprocess/get_clip_features.sh
```
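A minimal sketch of what per-frame CLIP feature extraction looks like, assuming OpenAI's `clip` package; this is illustrative only, and the repo's script is the reference:

```python
# Encode sampled video frames and a sentence with CLIP (illustrative sketch;
# the actual pipeline is in preprocess/get_clip_features.sh).
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

@torch.no_grad()
def encode_frames(frame_paths):
    images = torch.stack([preprocess(Image.open(p)) for p in frame_paths]).to(device)
    return model.encode_image(images)   # (num_frames, 512)

@torch.no_grad()
def encode_sentence(sentence: str):
    tokens = clip.tokenize([sentence]).to(device)
    return model.encode_text(tokens)    # (1, 512)
```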

## Train

Replace `{dataset}` with `activitynet` or `charades`:

```bash
bash train_{dataset}.sh
```

## Evaluation

```bash
bash test_{dataset}.sh
```

## Configurations

See `lib/configs.py` for the available configuration options.

## Citation

```bibtex
@article{woo2022explore,
  title={Explore and Match: End-to-End Video Grounding with Transformer},
  author={Woo, Sangmin and Park, Jinyoung and Koo, Inyong and Lee, Sumin and Jeong, Minki and Kim, Changick},
  journal={arXiv preprint arXiv:2201.10168},
  year={2022}
}
```
