A repository for our multi-manuscript document image dataset IMMI and code associated with our paper 'Deformable Deep Networks for Instance Segmentation of Overlapping Multi Page Handwritten Documents'

ihdia/im2pages

Deformable Deep Networks for Instance Segmentation of Overlapping Multi Page Handwritten Documents

Dependencies and Installation

Manual Setup

The IM2PAGES code has been tested with:

  • Python (3.8.10)
  • PyTorch (1.9.0)
  • Detectron2 (0.4.1)
  • CUDA (10.2)
  • cuDNN (7.6.5-CUDA-10.2)

To set up Detectron2, please follow the official documentation.

Automatic Setup (From an Env File)

We provide environment files for both Conda and pip. Please use either one of the following.

Using Conda

conda env create -f environment.yml

Using Pip

pip install -r requirements.txt
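
After either setup route, it can be useful to confirm that the installed packages match the versions listed under Manual Setup. The following is a minimal sketch, not part of the repository: the helper names installed_version and compare_versions are hypothetical, and it assumes the packages are registered under the distribution names torch and detectron2.

```python
from importlib import metadata

# Versions listed in the "Manual Setup" section of this README.
TESTED_VERSIONS = {"torch": "1.9.0", "detectron2": "0.4.1"}


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(tested=TESTED_VERSIONS):
    """Map each package to (tested_version, installed_version, matches)."""
    report = {}
    for package, wanted in tested.items():
        found = installed_version(package)
        # Local build tags such as "1.9.0+cu102" still count as a match.
        matches = found is not None and found.split("+")[0] == wanted
        report[package] = (wanted, found, matches)
    return report


if __name__ == "__main__":
    for package, (wanted, found, matches) in compare_versions().items():
        status = "ok" if matches else "differs"
        print(f"{package}: tested {wanted}, installed {found} ({status})")
```

A mismatch does not necessarily mean the code will fail; it only indicates that the environment differs from the tested one.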

Usage

Initial Setup:

  • Download the IMMI dataset [Dataset Link].
  • Place the
    • dataset under the images directory
    • COCO-pretrained model weights and the Palmira pretrained weights in the init_weights directory
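
Before launching a run, the layout described above can be checked with a few lines of Python. This is a sketch under assumptions: the helper missing_dirs is hypothetical, and it assumes the images and init_weights directories sit directly under the repository root.

```python
from pathlib import Path

# Directory names taken from the setup steps above.
REQUIRED_DIRS = ("images", "init_weights")


def missing_dirs(repo_root="."):
    """Return the required directories that are absent or empty under `repo_root`."""
    root = Path(repo_root)
    missing = []
    for name in REQUIRED_DIRS:
        path = root / name
        # A directory that exists but has no entries is treated as missing.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_dirs()
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("Dataset and weight directories look populated.")
```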

SLURM Workloads

If your compute cluster uses SLURM, please load these (or equivalent) modules at the start of your experiments. Ensure that all other modules are unloaded.

module add cuda/10.2
module add cudnn/7.6.5-cuda-10.2

Training

Palmira and variants

To train the presented networks:

python train_net_palmira.py \
    --config-file configs/palmira/Palmira.yaml \
    --num-gpus 4

  • Any required hyper-parameter changes, including the initial weights, can be made in the corresponding config file.
  • To run experiments with Palmira and its variants, change the config file passed via --config-file.
  • To resume from a checkpoint, add --resume to the above command.
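
When several variants are trained in sequence, the training command can be assembled programmatically. The sketch below is illustrative only: build_train_command is a hypothetical helper, and it uses just the flags shown in this README (--config-file, --num-gpus, --resume).

```python
import shlex


def build_train_command(config_file, num_gpus=4, resume=False):
    """Assemble the training command from this README as an argument list."""
    command = [
        "python", "train_net_palmira.py",
        "--config-file", config_file,
        "--num-gpus", str(num_gpus),
    ]
    if resume:
        # --resume picks training up from a checkpoint, as noted in this README.
        command.append("--resume")
    return command


if __name__ == "__main__":
    cmd = build_train_command("configs/palmira/Palmira.yaml", resume=True)
    # Print a shell-safe string; pass `cmd` to subprocess.run to launch it.
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps paths with spaces intact when the command is handed to subprocess.run.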

Inference

Quantitative

To run inference and obtain quantitative results on the test set:

python train_net_palmira.py \
    --config-file corresponding_config.yaml \
    --eval-only \
    MODEL.WEIGHTS <path-to-model-file> 

  • This writes two JSON files to the output directory specified in the config:
    • coco_instances_results.json - an encoded format that must be parsed to obtain the qualitative results
    • immi_test_coco_format.json - the regular COCO format, which is human-readable
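
For a quick look at the predictions without rendering images, the results file can be summarised directly. This sketch assumes coco_instances_results.json follows the standard COCO results layout (a JSON list of records, each with at least image_id and score); the helper count_confident_predictions and the 0.5 threshold are illustrative choices, not part of the repository.

```python
import json
from collections import Counter


def count_confident_predictions(results_path, score_threshold=0.5):
    """Count predictions per image whose score meets `score_threshold`.

    Assumes the standard COCO results layout: a JSON list of records,
    each carrying at least `image_id` and `score`.
    """
    with open(results_path) as handle:
        predictions = json.load(handle)
    counts = Counter()
    for prediction in predictions:
        if prediction["score"] >= score_threshold:
            counts[prediction["image_id"]] += 1
    return dict(counts)
```

Calling count_confident_predictions on the results file in the config's output directory returns a dictionary mapping each image id to its number of retained predictions.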

Qualitative

This step can be run only after quantitative inference, or on the validation outputs produced at the end of each training epoch.

This parses the output JSON and overlays predictions on the images.

python visualise_json_results.py \
    --inputs <path-to-output-file-1.json> [... <path-to-output-file-2.json>] \
    --output outputs/qualitative/ \
    --dataset immi_test

NOTE: To compare multiple models, pass multiple input JSON files. This produces a single vertically stitched image combining the predictions from each JSON file.

Custom Images

To run the model on your own images without training, please download the provided weights from [here] for FT-Palmira-AS and from [here] for FT-Vanilla Mask RCNN. A Colab notebook for the same demo is linked here.

python demo.py \
    --input <path-to-image-directory-*.jpg> \
    --output <path-to-output-directory> \
    --config corresponding_config.yaml \
    --opts MODEL.WEIGHTS <Pretrained_model_weights.pth>

Sample Outputs

Contact

For any queries, please contact Dr. Ravi Kiran Sarvadevabhatla.

License

This project is open-sourced under the MIT License.
