
Resource-Conscious High-Performance Models for 2D-to-3D Single View Reconstruction


Capstone Project, Research Work and Research Paper

Our work was presented and published at the IEEE TENCON 2021 conference held in Auckland, New Zealand. The paper can be viewed and cited via the IEEE Xplore link below:

Resource-Conscious High-Performance Models for 2D-to-3D Single View Reconstruction


Acknowledgements

We, Dhruv Srikanth, Suraj Bidnur and Rishab Kumar, would like to thank Dr. Sanjeev G for his guidance throughout our research and capstone project in our final year of undergraduate engineering. We would also like to thank the IEEE for publishing our paper, "Resource-Conscious High-Performance Models for 2D-to-3D Single View Reconstruction", by Suraj Bidnur, Dhruv Srikanth and Sanjeev G.


Objective

We aim to reconstruct 3D voxel models from single 2D images using deep learning. Our approach differs from existing techniques, methods and models in that it reduces resource utilization, increases computational efficiency, and shortens training time, all while improving performance and accuracy.


Inspiration

The Pix2Vox and 3D-R2N2 architectures provided the inspiration for our work. We based our approach on a similar model and then made alterations for single-view image reconstruction without any data augmentation. The papers for the Pix2Vox and 3D-R2N2 architectures can be found below:

  1. Pix2Vox: Context-aware 3D Reconstruction from Single and Multi-view Images
  2. 3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction

Motivation

  • Lack of 3D content despite increasing demand from industries such as gaming, medicine and cinema.
  • Growing popularity and proven success of deep learning techniques such as CNNs and GANs in recent years.
  • High resource requirements and computation costs in existing approaches.

Dataset

We trained our models on the ShapeNet dataset. Links to the 2D rendering files and the 3D binvox files are given below:

  1. ShapeNet Rendering
  2. ShapeNet Binvox

The dataset contains 13 different object classes with over 700,000 images.
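
For reference, the .binvox files use a simple run-length-encoded format: an ASCII header followed by (value, count) byte pairs. The following is a minimal reader sketch, assuming the standard binvox layout; the repository's own data loader may differ.

```python
import numpy as np

def read_binvox(path):
    """Read a .binvox file into a boolean (dim, dim, dim) occupancy grid."""
    with open(path, "rb") as f:
        if not f.readline().startswith(b"#binvox"):
            raise ValueError("not a binvox file")
        dims = None
        line = f.readline()
        while line:
            if line.startswith(b"dim"):
                dims = [int(v) for v in line.split()[1:]]
            elif line.startswith(b"data"):
                break
            line = f.readline()
        # Payload is run-length encoded as (value, count) byte pairs.
        raw = np.frombuffer(f.read(), dtype=np.uint8)
    voxels = np.repeat(raw[::2], raw[1::2]).astype(bool)
    # binvox stores voxels in x, z, y order; transpose to x, y, z.
    return voxels.reshape(dims).transpose(0, 2, 1)
```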


Proposed models

We propose two models for use in different scenarios; a sketch contrasting their connection styles follows the list.

  1. AE-Dense: This model gives the best results (highest IoU), but at the cost of much higher GPU memory utilization (close to 9GB). It is suited to situations where GPU memory is not a limitation.
  2. 3D-SkipNet: This model performs slightly worse than AE-Dense but uses around 2GB less GPU memory (close to 7GB). It is suited to situations where GPU memory availability is critical.
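
The memory difference comes from how the two styles reuse features. The Keras sketch below contrasts them; the block structure, filter counts and layer choices are illustrative assumptions, not the exact blocks from the paper.

```python
import tensorflow as tf
from tensorflow.keras import layers

def skip_block(x, filters):
    # Skip (residual) connection: the block's input is added back to its
    # output, so relatively few feature maps are held in memory.
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    if x.shape[-1] != filters:
        x = layers.Conv2D(filters, 1, padding="same")(x)  # match channels
    return layers.ReLU()(layers.Add()([x, y]))

def dense_block(x, growth_rate, num_layers):
    # Dense connections: each layer sees the concatenation of all previous
    # feature maps, improving IoU at the cost of higher memory use.
    features = [x]
    for _ in range(num_layers):
        inp = features[0] if len(features) == 1 else layers.Concatenate()(features)
        features.append(
            layers.Conv2D(growth_rate, 3, padding="same", activation="relu")(inp))
    return layers.Concatenate()(features)
```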

Project setup and running

All the details needed to set up and run the project are given in the "setup_instructions.txt" file.

Metrics

  1. Performance Metric - Intersection over Union (IoU)
  2. Loss - Binary Cross-Entropy (BCE)
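
For voxel grids, IoU is computed over occupied cells after thresholding the predicted occupancy probabilities. A minimal TensorFlow sketch follows; the 0.5 threshold is an assumption, not a value taken from the paper.

```python
import tensorflow as tf

def voxel_iou(y_true, y_pred, threshold=0.5):
    # Binarize predicted occupancy probabilities, then compute the ratio
    # of intersecting to unioned occupied voxels.
    pred = tf.cast(y_pred >= threshold, tf.float32)
    true = tf.cast(y_true >= 0.5, tf.float32)
    intersection = tf.reduce_sum(pred * true)
    union = tf.reduce_sum(tf.clip_by_value(pred + true, 0.0, 1.0))
    return intersection / union

# Per-voxel binary cross-entropy between true and predicted occupancy.
bce = tf.keras.losses.BinaryCrossentropy()
```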

Training Configuration

  1. Epochs: 150
  2. Learning Rate: 0.001
  3. Input shape: (224, 224, 3)
  4. Batch size: 32
  5. Output shape: (32, 32, 32) voxel grid
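
To make the configuration above concrete, here is how it would map onto a Keras training setup. This is a minimal sketch with a placeholder model body, and the optimizer choice (Adam) is an assumption, since this README does not name one; only the shapes, learning rate, batch size and epoch count come from the configuration above.

```python
import tensorflow as tf

# Placeholder model body: a pooled feature vector mapped to a voxel grid.
# This stands in for AE-Dense / 3D-SkipNet; it is not the paper's architecture.
inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.layers.GlobalAveragePooling2D()(inputs)          # placeholder encoder
x = tf.keras.layers.Dense(32 * 32 * 32, activation="sigmoid")(x)
outputs = tf.keras.layers.Reshape((32, 32, 32))(x)
model = tf.keras.Model(inputs, outputs)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),  # Adam is an assumption
    loss=tf.keras.losses.BinaryCrossentropy(),
)
# model.fit(train_images, train_voxels, batch_size=32, epochs=150)
```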

Hardware Configuration

  1. GPU: Nvidia Tesla T4 with 16GB VRAM
  2. CPU and RAM: 4 vCPUs and 28GB RAM
  3. OS: Ubuntu 18.04 running in a Microsoft Azure VM

Software Configuration

  1. Tensorflow: 2.4.0
  2. CUDA: 11.0
  3. cuDNN: 8.0
  4. Python: 3.6-3.8

Training Results

Given below are the mean IoUs for each of the models we trained:

  1. AE-Res: 0.6787
  2. AE-Dense: 0.7258
  3. 3D-SkipNet: 0.6871
  4. 3D-SkipNet with kernel splitting: 0.6626

For comparison, given below are the mean IoUs of the state-of-the-art baseline models:

  1. Pix2Vox: 0.6340
  2. 3D-R2N2: 0.5600



Takeaways

  • Skip connections and dense connections present a trade-off between performance and resource utilization.
  • We recommend dense connections in environments that are not resource-constrained.
  • We hope our models demonstrate that 3D object reconstruction can be achieved with minimal resources, supporting environmental sustainability and accessibility on edge devices.