Sparse Coding Semantic Hyperlapse

Project

This project contains the code and data used to generate the results reported in the paper A Weighted Sparse Sampling and Smoothing Frame Transition Approach for Semantic Fast-Forward First-Person Videos, presented at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018. It implements a Semantic Fast-Forward method for First-Person Videos with a proper stabilization method based on an adaptive frame selection via the Minimum Sparse Reconstruction problem and Smoothing Frame Transition.

For more information and visual results, please access the project page.

Contact

Authors

Michel Melo da Silva - PhD student - UFMG - [email protected]
Washington Luis de Souza Ramos - PhD student - UFMG - [email protected]
João Pedro Klock Ferreira - Undergraduate Student - UFMG - [email protected]
Felipe Cadar Chamone - Undergraduate Student - UFMG - [email protected]
Mario Fernando Montenegro Campos - Advisor - UFMG - [email protected]
Erickson Rangel do Nascimento - Advisor - UFMG - [email protected]

Institution

Federal University of Minas Gerais (UFMG)
Computer Science Department
Belo Horizonte - Minas Gerais - Brazil

Laboratory

VeRLab: Laboratory of Computer Vision and Robotics
https://www.verlab.dcc.ufmg.br

Dataset

DoMSEV is an 80-hour dataset of multimodal (RGB-D, IMU, and GPS) semantic egocentric videos that covers a wide range of activities. You can find more information and download the dataset on the following page:

DoMSEV – Dataset of Multimodal Semantic Egocentric Video.

Code

Dependencies

MATLAB 2016a
OpenCV 2.4 (Tested with 2.4.9 and 2.4.13)
Doxygen 1 (for documentation only - Tested with 1.8.12)
Check the MIFF code dependencies if you want to run the egocentric video stabilizer.

1. I want to run it on a pre-processed example!

Just follow the steps in the Example.md file.

2. I want to run it on my raw video!

Usage

The project processing is described by the following flowchart:

Optical Flow Estimator:

The first processing step is to estimate the Optical Flow (OF) of the input video.
1. The folder _Vid2OpticalFlowCSV contains the Poleg et al. 2014 Flow Estimator code, modified to run on Linux.
2. Navigate to the folder and compile the code.
3. Inside the _Vid2OpticalFlowCSV folder, run the command:

optflow -v < video_filename > -c < config.xml > -o < output_filename.csv >

Options	Description	Type	Example
`< video_filename >`	Path and filename of the video.	String	`~/Data/MyVideos/myVideo.mp4`
`< config.xml >`	Path to the configuration XML file.	String	`../default-config.xml`
`< output_filename.csv >`	Path to save the output CSV file.	String	`myVideo.csv`

Save the output file using the same name as the input video, with the extension .csv.
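The naming rule above can be automated. The following is a minimal Python sketch, assuming the compiled `optflow` binary sits in the current directory and that `../default-config.xml` is the configuration you want. The helper name `build_optflow_command` is hypothetical and not part of the project.

```python
from pathlib import Path


def build_optflow_command(video_path, config_path="../default-config.xml",
                          binary="./optflow"):
    """Return the optflow argument list for one video.

    The output CSV gets the same base name as the input video.
    The binary location and default config path are assumptions.
    """
    video = Path(video_path)
    # myVideo.mp4 -> myVideo.csv (file name only, as in the options table)
    output_csv = video.with_suffix(".csv").name
    return [binary, "-v", str(video), "-c", str(config_path), "-o", output_csv]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it.
    print(" ".join(build_optflow_command("Data/MyVideos/myVideo.mp4")))
```

The returned list can be passed to `subprocess.run` from inside the _Vid2OpticalFlowCSV folder once the code is compiled.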

Semantic Extractor:

The second step is to extract the semantic information over all frames of the input video and save it to a CSV file.

Go to the folder _SemanticFastForward_JVCI_2018, which contains the Multi Importance Fast-Forward (MIFF) code [Silva et al. 2018].

On the MATLAB console, go to the "SemanticScripts" folder inside the MIFF project and run the command:

>> ExtractAndSave(< video_filename >, < semantic_extractor_name >)

Parameters	Description	Type	Example
`< video_filename >`	Path and filename of the video.	String	`~/Data/MyVideos/Video.mp4`
`< semantic_extractor_name >`	Semantic extractor algorithm.	String	`'face'` or `'pedestrian'`

Transition Costs Estimation:

The third step is to calculate the transition costs over all frames of the input video and save them in a MAT file. On the MATLAB console, go to the "Util" folder inside the MIFF project and run the command:

>> GenerateTransistionCosts(< video_dir >, <experiment>, < semantic_extractor_name >, <speed_up>)

Parameters	Description	Type	Example
`< video_dir >`	Complete path to the video.	String	`~/Data/MyVideos`
`< experiment >`	Name to identify the experiment.	String	`Biking_0p`
`< semantic_extractor_name >`	Semantic extractor algorithm.	String	`'face'` or `'pedestrian'`
`<speed_up>`	Desired speed-up rate.	Integer	`10`

This function also saves the Semantic Costs in a CSV file, which will be used by the Video Stabilizer. The files are saved in the same folder as the video (< video_dir >).

Yolo Extractor

To use the Yolo Extractor:

Clone the Yolo repository: git clone https://github.com/pjreddie/darknet.git
Go to darknet folder: cd darknet/
To make sure you are using the same code, go back to a specific commit: git reset b3c4fc9f223d9b6f50a1652d8d116fcdcc16f2e8 --hard
Copy the files from _Darknet (in the 2018-cvpr-silva-sparsecoding directory) to the src/ folder
Modify the Makefile to match your setup. Note that for our purpose the OpenCV option is mandatory, so change the line OPENCV=0 to OPENCV=1
Run make
To download the weights run: wget https://www.verlab.dcc.ufmg.br/repository/hyperlapse/data/cvpr2018_yolo/yolo.weights
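The Makefile change in the steps above can also be scripted. The following is a minimal Python sketch, assuming the Makefile contains a line that reads OPENCV=0 (optionally with spaces around the equals sign). The names `enable_opencv` and `patch_makefile` are hypothetical helpers, not part of darknet or this project.

```python
import re
from pathlib import Path


def enable_opencv(makefile_text):
    """Return the Makefile text with the OPENCV=0 line switched to OPENCV=1.

    Only a line consisting of OPENCV=0 is changed; all other lines,
    including blank ones, are left untouched.
    """
    return re.sub(r"^OPENCV[ \t]*=[ \t]*0[ \t]*$", "OPENCV=1",
                  makefile_text, flags=re.MULTILINE)


def patch_makefile(path="Makefile"):
    """Rewrite the Makefile in place (run from inside the darknet/ folder)."""
    makefile = Path(path)
    makefile.write_text(enable_opencv(makefile.read_text()))


if __name__ == "__main__":
    # In-memory demonstration; call patch_makefile() to edit the real file.
    print(enable_opencv("GPU=0\nCUDNN=0\nOPENCV=0\nDEBUG=0\n"))
```

Any other Makefile options (for example GPU support) still need to be set by hand to match your machine.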

To use the extractor run:

./darknet detector demo <data file> <cfg file> <weights> <video file> <output file>

Fields	Description	Type	Example
`< data file >`	Data configuration file.	String	`cfg/coco.data`
`< cfg file >`	Model configuration file.	String	`cfg/yolo.cfg`
`< weights >`	Weights file for the desired model.	String	`yolo.weights`
`< video file >`	Video file to extract the detections from.	String	`example.mp4`
`< output file >`	File created to save yolo results.	String	`example_yolo_raw.txt`

The output file contains all information extracted from the video. Example:

3, 4.000000
0, 0.407742, 490, 13, 543, 133
58, 0.378471, 982, 305, 1279, 719
58, 0.261219, 80, 5, 251, 121
1, 5.000000
58, 0.451681, 981, 307, 1279, 719

The first line contains two pieces of information: the number of boxes detected and the frame number. Each of the following lines describes one detected box. The output is formatted as:

<Number of boxes>, <frame number>
<Class>, <Confidence>, <left>, <top>, <right>, <bottom>
<Class>, <Confidence>, <left>, <top>, <right>, <bottom>
<Class>, <Confidence>, <left>, <top>, <right>, <bottom>
<Number of boxes>, <frame number>
<Class>, <Confidence>, <left>, <top>, <right>, <bottom>
...
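If you want to inspect the raw detections yourself, the format above maps onto a small parser. The following is a minimal Python sketch, assuming the comma-separated layout shown in the example output. The function name `parse_yolo_output` is hypothetical and not part of the project, and this is not the code used by generate_descriptor.py.

```python
def parse_yolo_output(lines):
    """Parse raw extractor output into {frame_number: [detection, ...]}.

    Each detection is a dict with the class id, the confidence and the
    box corners. Assumes comma-separated fields, as in the example output.
    """
    frames = {}
    rows = iter(line.strip() for line in lines)
    for header in rows:
        if not header:
            continue  # skip blank lines
        count_text, frame_text = header.split(",")
        count = int(count_text)
        # Frame numbers are written as floats, e.g. 4.000000
        frame = int(float(frame_text))
        boxes = []
        for _ in range(count):
            fields = next(rows).split(",")
            boxes.append({
                "class": int(fields[0]),
                "confidence": float(fields[1]),
                "left": int(fields[2]),
                "top": int(fields[3]),
                "right": int(fields[4]),
                "bottom": int(fields[5]),
            })
        frames[frame] = boxes
    return frames


if __name__ == "__main__":
    sample = [
        "3, 4.000000",
        "0, 0.407742, 490, 13, 543, 133",
        "58, 0.378471, 982, 305, 1279, 719",
        "58, 0.261219, 80, 5, 251, 121",
        "1, 5.000000",
        "58, 0.451681, 981, 307, 1279, 719",
    ]
    for frame, boxes in parse_yolo_output(sample).items():
        print(frame, len(boxes))
```

To parse a real file, pass an open file object, for example `parse_yolo_output(open("example_yolo_raw.txt"))`.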

After extracting all this information, you need to generate the descriptor. Go back to the project folder "2018-cvpr-silva-sparsecoding/" and run:

python generate_descriptor.py <video_path> <yolo_extraction> <desc_output>

Fields	Description	Type	Example
`< video_path >`	Path to the video file.	String	`example.mp4`
`< yolo_extraction >`	Path to the yolo extraction.	String	`example_yolo_raw.txt`
`< desc_output >`	Path to the descriptor.	String	`example_yolo_desc.csv`
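To keep the intermediate files consistent, the file names can be derived from the video name. The following is a minimal Python sketch that follows the naming used in the examples above (example.mp4, example_yolo_raw.txt, example_yolo_desc.csv). Whether later steps require exactly these names is an assumption, and `yolo_file_names` and `descriptor_command` are hypothetical helpers, not part of the project.

```python
from pathlib import Path


def yolo_file_names(video_path):
    """Derive the raw-detection and descriptor file names for a video.

    example.mp4 -> (example_yolo_raw.txt, example_yolo_desc.csv),
    placed next to the video. The naming pattern is taken from the
    examples in the tables and is an assumption, not a documented rule.
    """
    video = Path(video_path)
    raw = video.with_name(video.stem + "_yolo_raw.txt")
    desc = video.with_name(video.stem + "_yolo_desc.csv")
    return str(raw), str(desc)


def descriptor_command(video_path):
    """Return the generate_descriptor.py argument list for one video."""
    raw, desc = yolo_file_names(video_path)
    return ["python", "generate_descriptor.py", str(video_path), raw, desc]


if __name__ == "__main__":
    print(" ".join(descriptor_command("example.mp4")))
```

The same `yolo_file_names` result can be used for the `<output file>` argument of the darknet command, so both steps agree on the raw detection file.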

Semantic Fast-Forward

After the previous steps, you are ready to accelerate the input video. On the MATLAB console, go to the "LLC" folder inside the project directory and run the command:

>> accelerate_video_LLC( < input_video > , < semantic_extractor > );

Fields	Description	Type	Example
`< input_video >`	Filename of the input video.	String	`example.mp4`
`< semantic_extractor >`	Descriptor used in the semantic extraction.	String	`'face'` or `'pedestrian'`

Citation

If you are using it for academic purposes, please cite:

M. M. Silva, W. L. S. Ramos, J. P. K. Ferreira, F. C. Chamone, M. F. M. Campos, E. R. Nascimento, A Weighted Sparse Sampling and Smoothing Frame Transition Approach for Semantic Fast-Forward First-Person Videos. In CVPR, 2018.

Bibtex entry

@InProceedings{Silva2018,
title = {A Weighted Sparse Sampling and Smoothing Frame Transition Approach for Semantic Fast-Forward First-Person Videos},
booktitle = {2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
author = {M. M. Silva and W. L. S. Ramos and J. P. K. Ferreira and F. C. Chamone and M. F. M. Campos and E. R. Nascimento},
Year = {2018},
Address = {Salt Lake City, USA},
month = {Jun.},
intype = {to appear in},
pages = {},
volume = {},
number = {},
doi = {},
ISBN = {}
}

Enjoy it.