How did you get the audio for "datasets/srcdata/msrvtt/audios"? #17

Closed
wonzin opened this issue Apr 14, 2024 · 3 comments

wonzin commented Apr 14, 2024

The original msrvtt folder structure is shown below.

msrvtt
├── annotation
│ ├── MSR_VTT.json
├── high-quality
│ ├── structured-symlinks
│ │ ├── jsfusion_val_caption_idx.pkl
│ │ ├── ... many other files....
├── structured-symlinks
│ ├── jsfusion_val_caption_idx.pkl
│ ├── ... many other files....
├── videos
│ ├── all
│ │ ├── video1.mp4
│ │ ├── ....
│ │ ├── video9999.mp4
│ ├── tmp
│ │ ├── MSRVTT.zip
│ ├── vids
│ │ ├── data
│ │ │ ├── MSRVTT.zip

However, there is no audios folder for msrvtt.

  1. How did you get the audio?
    Is there a specific way to extract it, for example a required bitrate, sample rate, number of audio channels, or codec?
    Or is any kind of audio file valid?

  2. Does "datasets/src/data/msrvtt/videos" correspond to "msrvtt/videos/all"?


wonzin commented Apr 15, 2024

ffmpeg -i video.mp4 -ac 1 -ar 16000 audio.wav
I used these options to convert the videos into audio files, but I still get another error.

04/15/2024 14:39:07 - INFO - main - data_cfg_msrvtt_cap_val_batch_size : 64
04/15/2024 14:39:07 - INFO - main - msrvtt_cap Using clip mean and std.
04/15/2024 14:39:07 - INFO - main - msrvtt_cap transforms crop_flip
04/15/2024 14:39:07 - INFO - main - Create Dataset msrvtt_cap Success
04/15/2024 14:39:07 - INFO - main - loader cap%tvas--msrvtt_cap , ratio 10170 , bs_pergpu 16, n_workers 8

not have audios video6446
not have audios video51
...
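
To apply the same conversion to every video rather than one file at a time, the ffmpeg call can be wrapped in a short loop. The following is only a sketch under assumptions: the function names, the `.wav` output extension, and the directory paths in the usage note are hypothetical and are not confirmed by the VAST maintainers. It reuses the `-ac 1 -ar 16000` options from the command above and adds `-vn` so that only audio is written.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(video: Path, wav: Path) -> list[str]:
    """Build an ffmpeg command that writes mono, 16 kHz audio."""
    return [
        "ffmpeg",
        "-y",              # overwrite an existing output file
        "-i", str(video),  # input video
        "-vn",             # ignore the video stream
        "-ac", "1",        # one audio channel (mono)
        "-ar", "16000",    # 16 kHz sample rate
        str(wav),
    ]


def extract_all(video_dir: Path, audio_dir: Path) -> list[str]:
    """Convert every .mp4 in video_dir and return the ids where ffmpeg failed."""
    audio_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for video in sorted(video_dir.glob("*.mp4")):
        wav = audio_dir / (video.stem + ".wav")
        result = subprocess.run(build_ffmpeg_cmd(video, wav), capture_output=True)
        if result.returncode != 0:
            # ffmpeg exits with a non-zero code when it cannot write the output,
            # which typically includes inputs that have no audio stream.
            failed.append(video.stem)
    return failed
```

A call such as `extract_all(Path("msrvtt/videos/all"), Path("msrvtt/audios"))` would then return the ids of the videos for which no audio file could be written; both paths here are assumed from the folder listing earlier in the thread.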

@DelusionalLogic

I used the script at https://github.com/TXH-mercury/VAST/blob/410ca47acf40d4ab098e345b76159df66bc42239/utils/offline_process_data.py to extract the audio.

As for the "not have audios" messages you still get: I think some of the videos don't contain any audio track, so those messages are expected.
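
One way to check whether a given video really has no audio track is to ask ffprobe for its audio streams. This is a sketch, not part of the VAST code base: the helper names are hypothetical, and it assumes `ffprobe` is installed and on the PATH. With `-select_streams a` and `-of csv=p=0`, ffprobe prints one `audio` line per audio stream and nothing when there is none.

```python
import subprocess
from pathlib import Path


def build_ffprobe_cmd(video: Path) -> list[str]:
    """Build an ffprobe command that lists only the audio streams of a file."""
    return [
        "ffprobe",
        "-v", "error",                         # suppress banner and info logs
        "-select_streams", "a",                # audio streams only
        "-show_entries", "stream=codec_type",  # print just the stream type
        "-of", "csv=p=0",                      # bare values, one per line
        str(video),
    ]


def has_audio(ffprobe_output: str) -> bool:
    """Return True if the ffprobe output lists at least one audio stream."""
    return any(
        line.strip().startswith("audio") for line in ffprobe_output.splitlines()
    )


def videos_without_audio(video_dir: Path) -> list[str]:
    """Return the ids of all .mp4 files in video_dir that have no audio stream."""
    missing = []
    for video in sorted(video_dir.glob("*.mp4")):
        result = subprocess.run(
            build_ffprobe_cmd(video), capture_output=True, text=True
        )
        if not has_audio(result.stdout):
            missing.append(video.stem)
    return missing
```

If the ids returned by `videos_without_audio` match the ids in the "not have audios" messages, those messages can reasonably be attributed to silent source videos rather than to a failed extraction.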


wonzin commented Jun 3, 2024

Thank you, you saved my day :)

@wonzin wonzin closed this as completed Jun 3, 2024