
# ViT for Monocular Depth Estimation

Vision Transformers: relative and absolute depth estimation.

## Usage

1. Images can either be copied manually into the `input` folder or downloaded via the DuckDuckGo API using the script:

        python fetch_sample_images.py -i <search image> -u <no. of urls>

2. Select one of the four models:
   - `DPT_Large`: largest model
   - `DPT_Hybrid`
   - `MiDaS`
   - `MiDaS_small`

3. Run inference (a sketch of a minimal inference step follows this list):

        python inference.py -i ../input -o ../output -t DPT_Large
        python inference.py -i ../input -o ../output -t DPT_Hybrid
        python inference.py -i ../input -o ../output -t MiDaS
        python inference.py -i ../input -o ../output -t MiDaS_small
4. Absolute depth estimation

   The models perform relative depth estimation. To approximately estimate absolute depth, the method prescribed in Section 5 of the paper has been implemented (see the alignment sketch after this list). Also have a look at the following issues: #36, #37, #42, #63, #148, #171.

   To perform absolute depth estimation, use the script below:

        python inference.py -i ../input -o ../output -t <model_name> -a true

5. Output

   Results are saved in the `output` folder in PNG format. The output for any of the images can be visualized using the script:

        python plot.py
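For reference, here is a minimal sketch of the relative-depth inference step, assuming the models are fetched through `torch.hub` from `intel-isl/MiDaS` (the official release mechanism for these four checkpoints). The exact arguments and pre/post-processing in `inference.py` may differ, and the image path is only an example:

```python
# Minimal sketch: load a MiDaS/DPT model from torch.hub and predict relative depth.
# Assumes the standard intel-isl/MiDaS hub entry points; the file path is an example.
import cv2
import torch

model_type = "DPT_Large"  # or "DPT_Hybrid", "MiDaS", "MiDaS_small"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

midas = torch.hub.load("intel-isl/MiDaS", model_type).to(device).eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.dpt_transform if "DPT" in model_type else transforms.small_transform

img = cv2.cvtColor(cv2.imread("../input/example.jpg"), cv2.COLOR_BGR2RGB)

with torch.no_grad():
    prediction = midas(transform(img).to(device))
    # Upsample the prediction back to the input image resolution.
    prediction = torch.nn.functional.interpolate(
        prediction.unsqueeze(1), size=img.shape[:2],
        mode="bicubic", align_corners=False,
    ).squeeze()

relative_depth = prediction.cpu().numpy()  # relative inverse depth: larger = closer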

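The alignment sketch referenced in step 4: the models output relative *inverse* depth, so one common way to approximate absolute depth is to fit a scale `s` and shift `t` against a few pixels whose metric depth is known. This is an illustration of that idea, not necessarily the exact procedure in `inference.py`; the reference pixels and depths are hypothetical:

```python
# Minimal sketch: align relative inverse depth to absolute depth with a
# least-squares scale/shift fit at a few reference points (hypothetical values).
import numpy as np

def align_to_absolute(rel_inv_depth, ref_pixels, ref_depths_m):
    """Fit s, t so that s * prediction + t ≈ 1 / metric_depth at the
    reference pixels, then convert the whole map to metres."""
    pred = np.array([rel_inv_depth[r, c] for r, c in ref_pixels], dtype=np.float64)
    target_inv = 1.0 / np.asarray(ref_depths_m, dtype=np.float64)
    A = np.stack([pred, np.ones_like(pred)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, target_inv, rcond=None)
    aligned_inv = np.clip(s * rel_inv_depth + t, 1e-8, None)
    return 1.0 / aligned_inv  # absolute depth in metres

# Example with made-up reference measurements: (row, col) -> depth in metres.
# absolute = align_to_absolute(relative_depth, [(120, 340), (400, 200)], [2.5, 6.0])
```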
**NOTE:**

A training script is not provided by the original authors; refer to issue #43. The authors utilize the strategies proposed in the paper "Multi-Task Learning as Multi-Objective Optimization" for training on different datasets with different objectives. The authors have shared the loss function as PyTorch code here.
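For orientation only, below is a simplified sketch of the scale- and shift-invariant idea behind such a depth regression loss. The authors' shared code is the reference; this function is an illustration, not their implementation:

```python
# Sketch of a scale- and shift-invariant regression loss: align the prediction
# to the target with a per-image closed-form least-squares (s, t), then take MSE.
# Illustration only; not the authors' released loss code.
import torch

def scale_shift_invariant_mse(pred, target, mask):
    """pred, target: (B, H, W); mask: (B, H, W), 1 where the target is valid."""
    pred, target, mask = pred.flatten(1), target.flatten(1), mask.flatten(1).float()
    n = mask.sum(dim=1).clamp(min=1.0)

    # Closed-form solution of the 2x2 normal equations for s, t per image.
    sum_p = (mask * pred).sum(1)
    sum_y = (mask * target).sum(1)
    sum_pp = (mask * pred * pred).sum(1)
    sum_py = (mask * pred * target).sum(1)
    det = (n * sum_pp - sum_p ** 2).clamp(min=1e-8)
    s = (n * sum_py - sum_p * sum_y) / det
    t = (sum_y - s * sum_p) / n

    aligned = s.unsqueeze(1) * pred + t.unsqueeze(1)
    return ((mask * (aligned - target) ** 2).sum(1) / n).mean()
```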
