Data Preparation

We have successfully pre-trained and fine-tuned our VideoMAE on Kinetics400, Something-Something-V2, UCF101 and HMDB51 with this codebase.

  • The pre-processing of Something-Something-V2 can be summarized in 3 steps:

    1. Download the dataset from the official website.

    2. Preprocess the dataset by converting the videos from .webm to .mp4, keeping the original height of 240px.

    3. Generate the annotations needed by the dataloader (each line has the form "<path_to_video> <video_class>"). The annotations usually include train.csv, val.csv and test.csv (here test.csv is the same as val.csv). We share our annotation files (train.csv, val.csv, test.csv) via Google Drive. The format of each *.csv file is:

      dataset_root/video_1.mp4  label_1
      dataset_root/video_2.mp4  label_2
      dataset_root/video_3.mp4  label_3
      ...
      dataset_root/video_N.mp4  label_N
      
  • The pre-processing of Kinetics400 can be summarized in 3 steps:

    1. Download the dataset from the official website.

    2. Preprocess the dataset by resizing the short edge of each video to 320px. You can refer to the MMAction2 Data Benchmark for TSN and SlowOnly.
      Recommended: OpenDataLab provides a copy of the Kinetics400 dataset; you can download the version with a short edge of 320px from here.

    3. Generate the annotations needed by the dataloader (each line has the form "<path_to_video> <video_class>"). The annotations usually include train.csv, val.csv and test.csv (here test.csv is the same as val.csv). The format of each *.csv file is:

      dataset_root/video_1.mp4  label_1
      dataset_root/video_2.mp4  label_2
      dataset_root/video_3.mp4  label_3
      ...
      dataset_root/video_N.mp4  label_N
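
The annotation format shown above can be produced with a few lines of Python. This is only a minimal sketch, not the script used by the authors: the helper names `build_label_map` and `write_annotation` are hypothetical, and it assumes that labels are integer class indices and that the path and label are separated by a single space.

```python
from pathlib import Path


def build_label_map(class_names):
    """Map each distinct class name to an integer index (sorted order).

    Assumption: the dataloader expects integer class indices.
    """
    return {name: idx for idx, name in enumerate(sorted(set(class_names)))}


def write_annotation(samples, out_file, label_map, sep=" "):
    """Write one '<path_to_video><sep><video_class>' line per sample.

    `samples` is an iterable of (video_path, class_name) pairs.
    Returns the number of lines written.
    """
    lines = [f"{path}{sep}{label_map[name]}" for path, name in samples]
    Path(out_file).write_text("\n".join(lines) + "\n")
    return len(lines)


if __name__ == "__main__":
    # Hypothetical samples; real pairs would come from the dataset's label files.
    samples = [
        ("dataset_root/video_1.mp4", "class_b"),
        ("dataset_root/video_2.mp4", "class_a"),
    ]
    label_map = build_label_map(name for _, name in samples)
    write_annotation(samples, "train.csv", label_map)
```

Since test.csv is the same as val.csv here, the same sample list can simply be written to both files.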
      
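
The preprocessing in step 2 of each list (converting .webm to .mp4, or resizing the short edge to 320px) is commonly done with ffmpeg. The sketch below only builds the command lines and runs them on request. It assumes that ffmpeg is installed with the libx264 encoder, and the function names `build_ffmpeg_cmd` and `convert_tree` are hypothetical; the exact encoder settings used for VideoMAE are not stated here.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src, dst, short_edge=None):
    """Return an ffmpeg argument list that re-encodes `src` into `dst`.

    With short_edge=None the resolution is left unchanged (e.g. the 240px
    height of Something-Something-V2). Otherwise the shorter side is scaled
    to `short_edge` pixels; -2 lets ffmpeg pick the other side so that the
    aspect ratio is kept and the value stays even.
    """
    cmd = ["ffmpeg", "-y", "-i", str(src)]
    if short_edge is not None:
        scale = (
            f"scale='if(gt(iw,ih),-2,{short_edge})':"
            f"'if(gt(iw,ih),{short_edge},-2)'"
        )
        cmd += ["-vf", scale]
    # Assumption: H.264 via libx264; the source does not name an encoder.
    cmd += ["-c:v", "libx264", str(dst)]
    return cmd


def convert_tree(src_root, dst_root, src_ext=".webm", short_edge=None, dry_run=True):
    """Build (and optionally run) one ffmpeg command per video under src_root."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    cmds = []
    for src in sorted(src_root.rglob(f"*{src_ext}")):
        dst = (dst_root / src.relative_to(src_root)).with_suffix(".mp4")
        cmd = build_ffmpeg_cmd(src, dst, short_edge)
        cmds.append(cmd)
        if not dry_run:
            dst.parent.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return cmds
```

For Something-Something-V2 one might call `convert_tree(raw_dir, out_dir, src_ext=".webm")`, and for Kinetics400 `convert_tree(raw_dir, out_dir, src_ext=".mp4", short_edge=320)`. Pass `dry_run=False` to actually run ffmpeg.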

Note:

  1. We use decord to decode the videos on the fly during both the pre-training and fine-tuning phases.
  2. All experiments on Kinetics-400 in VideoMAE are based on this version.
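
As an illustration of on-the-fly decoding with decord, the sketch below reads a short clip from one video. It is not the VideoMAE dataloader: the helper names, the centred sampling strategy, and the defaults of 16 frames with a stride of 4 are assumptions chosen for the example.

```python
def sample_frame_indices(total_frames, num_frames=16, sampling_rate=4):
    """Pick `num_frames` indices spaced `sampling_rate` apart, centred in the video.

    Indices are clamped to the last frame when the video is too short.
    """
    span = (num_frames - 1) * sampling_rate + 1
    start = max((total_frames - span) // 2, 0)
    return [
        min(start + i * sampling_rate, total_frames - 1)
        for i in range(num_frames)
    ]


def load_clip(video_path, num_frames=16, sampling_rate=4):
    """Decode only the sampled frames of one video with decord."""
    # Imported lazily so the sampling helper can be used without decord installed.
    from decord import VideoReader, cpu

    vr = VideoReader(str(video_path), ctx=cpu(0))
    indices = sample_frame_indices(len(vr), num_frames, sampling_rate)
    # get_batch decodes just the requested frames; the result has shape
    # (num_frames, height, width, 3) with dtype uint8.
    return vr.get_batch(indices).asnumpy()
```

Because only the requested frames are decoded, no frames need to be extracted to disk beforehand.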