Feature: PoC for the bounding box reflection in bbox_shift_scale_rotate #1125

Open · wants to merge 7 commits into base: main
Conversation

@i-aki-y (Contributor) commented on Feb 15, 2022

About this PR

I implemented bbox reflection functionality for bbox_shift_scale_rotate for my own specific use case. I think this functionality will be beneficial to other users, so I opened this PR.

My goal is to integrate bbox reflection modes (such as cv2.BORDER_REFLECT and cv2.BORDER_WRAP) into existing transforms like ShiftScaleRotate.

However, that turned out not to be an easy task.

So in this PR, I provide only a functional version (bboxes_shift_scale_rotate_reflect) as a first step. I think it is sufficient for looking into how it works and what challenges exist. If you think this implementation seems promising to merge, I would like to move on to the next step.

I'm not sure this PR can be merged, but I hope this implementation and analysis inspire someone.

Demo

This is a demo: an input image (left), the result of bbox_shift_scale_rotate (center), and the result of the bboxes_shift_scale_rotate_reflect proposed in this PR (right).

[image: bbox_reflection]

The full runnable code is here:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
import cv2
import skimage
import albumentations as A

# define helper funcs
def add_bbox(ax, bbox):
    label = 0    
    if len(bbox) > 4:
        bbox, label = bbox[:4], bbox[4]        
    bbox_color = plt.get_cmap("tab10").colors[label]
    x_min, y_min, x_max, y_max = bbox
    w, h = x_max - x_min, y_max - y_min
    pat = Rectangle(xy=(x_min, y_min), width=w, height=h, fill=False, lw=3, color=bbox_color)
    ax.add_patch(pat)

def plot_image_and_bboxes(image, bboxes, ax):
    ax.imshow(image)
    for i in range(len(bboxes)):
        add_bbox(ax, bboxes[i])

def get_shift_scale_rotate_bboxes(bboxes, params):
    bboxes_out = []
    for bbox in bboxes:
        label = []
        if len(bbox) > 4:
            label.append(bbox[4])
        bbox = A.normalize_bbox(bbox, params["rows"], params["cols"])
        bbox = A.bbox_shift_scale_rotate(bbox, **params)
        bbox = A.denormalize_bbox(bbox, params["rows"], params["cols"])
        bboxes_out.append((*bbox, *label))
    return bboxes_out

def get_shift_scale_rotate_reflect_bboxes(bboxes, params):
    bboxes, labels = A.to_ndarray_bboxes(bboxes)
    bboxes = A.normalize_bboxes2(bboxes, params["rows"], params["cols"])
    bboxes = A.bboxes_shift_scale_rotate_reflect(bboxes, **params)
    bboxes = A.denormalize_bboxes2(bboxes, params["rows"], params["cols"])
    bboxes = A.to_tuple_bboxes(bboxes, labels)
    return bboxes

# setup
image = skimage.data.astronaut()
rows, cols = image.shape[:2]
bboxes = [[170, 30, 280, 180, 0], [350, 80, 460, 290, 1], [140, 350, 200, 420, 2]]
img_params = dict(angle=45, scale=0.5, dx=0.1, dy=0.2, border_mode=cv2.BORDER_REFLECT)
box_params = img_params.copy()
box_params.update({"rows": rows, "cols": cols})
fig, axes = plt.subplots(1, 3, figsize=(18, 6))

# show original image
axes[0].set_title("original image")
plot_image_and_bboxes(image, bboxes, axes[0])

## shift_scale_rotate
axes[1].set_title("shift_scale_rotate")
print("image transform")
%timeit image_tr = A.shift_scale_rotate(image, **img_params)
print("shift_scale_rotate")
%timeit bboxes_no_reflect = get_shift_scale_rotate_bboxes(bboxes, box_params)
plot_image_and_bboxes(image_tr, bboxes_no_reflect, axes[1])

## shift_scale_rotate_reflect
axes[2].set_title("shift_scale_rotate_reflect")
print("shift_scale_rotate with reflection")
%timeit bboxes_reflect = get_shift_scale_rotate_reflect_bboxes(bboxes, box_params)
plot_image_and_bboxes(image_tr, bboxes_reflect, axes[2])
#plt.savefig("./bbox_reflection.jpg")

About implementation

This implementation is very straightforward.

A summary is here:

  1. Generate flipped copies and lay them out around the original image (a sketch of this step is shown after the diagram below).
  2. Apply the affine transform to the bboxes, including the copied ones.
  3. Apply a center crop and remove invisible bboxes.
        step.1       step.2       step.3

                       q p q
        +-+-+-+      +-+-+-+
        |q|p|q|      | |d|b|d
        +-+-+-+      +-+-+-+       +-+
  b  -> |d|b|d|  ->  | |q|p|q  ->  |q|
        +-+-+-+      +-+-+-+       +-+
        |q|p|q|      | | | |
        +-+-+-+      +-+-+-+
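
To make step 1 concrete, here is a minimal sketch of the tiling idea in normalized coordinates. This is an illustration only, not the code in this PR; reflect_tile_bboxes is a hypothetical name, and it ignores the one-pixel difference between BORDER_REFLECT and BORDER_REFLECT_101 discussed below.

import numpy as np

def reflect_tile_bboxes(bboxes):
    # bboxes: ndarray of shape (N, 4), normalized (x_min, y_min, x_max, y_max) in [0, 1]
    tiles = []
    for ty in (-1, 0, 1):      # vertical tile offset
        for tx in (-1, 0, 1):  # horizontal tile offset
            b = bboxes.copy()
            if tx != 0:
                # mirror across the vertical border shared with the center tile (x = 0 or x = 1)
                x_ref = 0.0 if tx < 0 else 1.0
                b[:, [0, 2]] = 2 * x_ref - b[:, [2, 0]]
            if ty != 0:
                # mirror across the horizontal border shared with the center tile (y = 0 or y = 1)
                y_ref = 0.0 if ty < 0 else 1.0
                b[:, [1, 3]] = 2 * y_ref - b[:, [3, 1]]
            tiles.append(b)
    return np.concatenate(tiles, axis=0)  # shape (9 * N, 4), ready for step 2

Steps 2 and 3 then apply the usual affine transform to all 9 * N boxes and keep only the ones that remain visible in the central unit square.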

This works, but it is not efficient because it creates many bbox copies that are eventually removed.
I searched for a more efficient algorithm for this task but could not find one, so I kept this method.

To mitigate the performance disadvantage, I implemented this functionality on vectorized (ndarray) bboxes.
This is why this PR adds several bboxes_xyz functions, which are vectorized versions of the existing bbox_xyz counterparts; a short sketch of the idea follows.
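
As an illustration of what "vectorized" means here (a hypothetical sketch, not necessarily how normalize_bboxes2 is implemented in this PR), the whole (N, 4) array is handled in one numpy operation instead of a per-bbox loop:

import numpy as np

def normalize_bboxes_vectorized(bboxes, rows, cols):
    # bboxes: ndarray of shape (N, 4) as (x_min, y_min, x_max, y_max) in pixels
    scale = np.array([cols, rows, cols, rows], dtype=float)
    return bboxes / scale  # all N boxes normalized at once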

This PR contains many changes, but I tried not to modify any existing code in order to avoid unintended problems.

About performance

The standard output of the example above is:

image transform
3.39 ms ± 880 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
shift_scale_rotate
155 µs ± 16.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
shift_scale_rotate with reflection
385 µs ± 9.66 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

The 385 µs is not bad compared with the 3.39 ms of the image transform, but it is still slower than the original bbox_shift_scale_rotate despite vectorization.
The benefit of vectorization shows up as the number of bboxes grows.
To see this, I ran the following code:

def get_grid_bboxes(n_grid, bbox_size, rows, cols):
    # lay out an n_grid x n_grid array of boxes of size bbox_size (in pixels)
    bboxes = []
    d_y = rows / n_grid
    d_x = cols / n_grid
    for i_y in range(n_grid):
        y_min = (i_y + 0.5) * d_y
        y_max = y_min + bbox_size
        for i_x in range(n_grid):
            x_min = (i_x + 0.5) * d_x
            x_max = x_min + bbox_size            
            bboxes.append([x_min, y_min, x_max, y_max])
    return bboxes

#grid_bboxes = get_grid_bboxes(3, 32, rows, cols)
#fig, ax = plt.subplots(1, 1, figsize=(6, 6))
#plot_image_and_bboxes(image, grid_bboxes, ax)

box_params["border_mode"] = cv2.BORDER_REFLECT_101
#box_params["border_mode"] = cv2.BORDER_CONSTANT

grid_bboxes = get_grid_bboxes(1, 32, rows, cols)
print(f"{len(grid_bboxes)} boxes")
%timeit get_shift_scale_rotate_bboxes(grid_bboxes, box_params)
%timeit get_shift_scale_rotate_reflect_bboxes(grid_bboxes, box_params)

grid_bboxes = get_grid_bboxes(3, 32, rows, cols)
print(f"{len(grid_bboxes)} boxes")
%timeit get_shift_scale_rotate_bboxes(grid_bboxes, box_params)
%timeit get_shift_scale_rotate_reflect_bboxes(grid_bboxes, box_params)

grid_bboxes = get_grid_bboxes(10, 32, rows, cols)
print(f"{len(grid_bboxes)} boxes")
%timeit get_shift_scale_rotate_bboxes(grid_bboxes, box_params)
%timeit get_shift_scale_rotate_reflect_bboxes(grid_bboxes, box_params)

The results are shown below (the parameters are the same as in the previous example):

1 boxes
45.3 µs ± 1.01 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
293 µs ± 5.56 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
9 boxes
402 µs ± 7.01 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
418 µs ± 15.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
100 boxes
4.48 ms ± 71.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.68 ms ± 36.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

The vectorized version surpasses the original one as the number of bboxes increases. Also, note that bboxes_shift_scale_rotate_reflect processes several times more bboxes than bbox_shift_scale_rotate does, because bbox_shift_scale_rotate does not process any of the reflected bboxes.

Known issues and some notes

Here are some known issues and notes that I found during the implementation.

1. Computational cost depends on parameters.

For example, suppose a very small scale factor such as scale=0.01 is set. In that case, the number of bboxes is multiplied by roughly 10^4, because the shrunken content has to be tiled about 100 times in each direction to cover the output frame.
This can cause performance issues, so users should be careful with small scale factors. A rough estimate is sketched below.
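
As a back-of-the-envelope estimate (an illustration only, not code from this PR; approx_tile_count is a hypothetical helper):

import math

def approx_tile_count(scale):
    # mirrored tiles needed per axis to cover the output, squared for 2-D
    tiles_per_axis = math.ceil(1.0 / scale) + 1
    return tiles_per_axis ** 2

print(approx_tile_count(0.5))   # 9
print(approx_tile_count(0.01))  # 10201, i.e. on the order of 10^4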

2. Extra care for tracking bboxes and labels is needed

Albumentations allows label information to be added as a separate target by using label_fields.
Since bbox reflection changes the number of input bboxes, it becomes difficult to track the correspondence between the bboxes and the label_fields.
I think some extra care about label_fields is needed to integrate this functionality into the albumentations pipeline.

--> The label_fields are automatically concatenated to the bboxes, so no extra work is needed for this.
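
For context, the label_fields mechanism referred to here is used like this (a minimal, standard Albumentations example, unrelated to the code in this PR):

import numpy as np
import albumentations as A

transform = A.Compose(
    [A.ShiftScaleRotate(p=1.0)],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)
# labels are passed as a separate target and must stay aligned with the bboxes
out = transform(
    image=np.zeros((512, 512, 3), dtype=np.uint8),
    bboxes=[(170, 30, 280, 180)],
    labels=["astronaut"],
)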

3. Extra care for mismatches between numpy ndarray and tuple bboxes is needed

Albumentations allows label information to be added to a bbox as an extra element, and that label can be a string.
So a bbox has to be split into its pure coordinates and its label, so that np.array(bboxes) does not create a str ndarray.
To solve this, I introduced the to_ndarray_bboxes(bboxes) and to_tuple_bboxes(bboxes, labels) functions.
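
A small sketch of the dtype problem and the split (an illustration of the idea; not necessarily how to_ndarray_bboxes / to_tuple_bboxes are implemented in this PR):

import numpy as np

bboxes = [(0.1, 0.2, 0.3, 0.4, "cat"), (0.5, 0.5, 0.7, 0.8, "dog")]

# converting directly coerces everything to strings, which breaks arithmetic
print(np.array(bboxes).dtype)  # a string dtype such as <U32

# splitting coordinates and labels keeps the coordinates numeric
coords = np.array([b[:4] for b in bboxes], dtype=float)  # shape (N, 4), float64
labels = [b[4] for b in bboxes]                          # labels kept as plain Python objects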

4. This implementation does not care about the difference between BORDER_REFLECT and BORDER_REFLECT_101

Since I am not sure this gives a significant disadvantage in the results, this implementation does not distinguish between BORDER_REFLECT and BORDER_REFLECT_101, to avoid extra complication (the two modes differ only in whether the border pixel itself is duplicated, which amounts to a one-pixel offset of the mirrored copies).
I may need to do something about it, but at the moment I have no idea how to implement BORDER_REFLECT_101 precisely.
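
The difference between the two modes can be seen directly with plain OpenCV (a standalone illustration, not code from this PR):

import numpy as np
import cv2

row = np.array([[1, 2, 3, 4]], dtype=np.uint8)
# BORDER_REFLECT duplicates the border pixel
print(cv2.copyMakeBorder(row, 0, 0, 3, 0, cv2.BORDER_REFLECT)[0])      # [3 2 1 1 2 3 4]
# BORDER_REFLECT_101 reflects about the border pixel itself (no duplication)
print(cv2.copyMakeBorder(row, 0, 0, 3, 0, cv2.BORDER_REFLECT_101)[0])  # [4 3 2 1 2 3 4]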

5. A well-established algorithm is wanted

This implementation is easy to understand, but it would be better if there were a more efficient, well-established algorithm for this task.

@Dipet (Collaborator) commented on Feb 16, 2022

Thank you for your contribution. This is great and complex work!
Unfortunately, we need some time to look deeper into this PR and understand how it works.
Thus, reviewing this PR may take some time.

@i-aki-y (Contributor, Author) commented on Feb 17, 2022

Of course, please take your time.
This is not a simple patch; it includes things to consider and design decisions that need to be made.

@Dipet added the WIP label on Jun 11, 2022
Some vectorized bbox functions are also added
@i-aki-y (Contributor, Author) commented on Sep 4, 2022

I made some minor changes to follow up on recent albumentations updates, and I integrated the bbox reflection functionality into the ShiftScaleRotate transform by adding a new argument, reflec_annotation.

@Dipet removed the WIP label on Sep 19, 2022