A proposal of a framework to handle multi-image augmentation (Including mosaic augmentation) #1420

i-aki-y · 2023-03-08T05:59:42Z

I propose a new batch-based augmentation framework as a natural extension of the Compose framework.

With this feature, I want to make it easy to implement augmentations that use multiple image inputs, such as Mosaic and MixUp augmentations.

I hope you will be interested and would appreciate it if you could review this PR.

About the PR

As a natural extension of the Compose, I introduced a new batch-based compose, BatchedCompose, and associate classes.
In the BatchedCompose, we can seamlessly combine single-image and multi-image transforms with minimum constraints.
The BatchCompose supports most features provided by the Compose. And the existing transforms contained in the BatchCompose, such as OneOf and HorizontalFlip, work as expected.

To demonstrate how this framework will work, I include a complete implementation of a Mosaic augmentation in this PR.
Thanks to the new framework, the implementation and usage are much simpler and cleaner than my previous PR #1147.

This is an example:

## Define augmentation
demo1 = A.BatchCompose(
    [
        A.ForEach([A.OneOf([A.VerticalFlip(p=1), A.HorizontalFlip(p=1)], p=1),A.RandomScale(scale_limit=0.2)]),
        A.Mosaic4(out_height=250, out_width=250, replace=False, value=(127, 127, 127), out_batch_size=1, p=1),
        A.ForEach([A.RandomCrop(height=200, width=200, p=1)]),
    ],
    bbox_params=A.BboxParams(format="coco", label_fields=["blabels_batch"]),
    keypoint_params=A.KeypointParams(format="xy", label_fields=["klabels_batch"]),
)

## Apply augmentation
data1 = demo1(
    image_batch=image_batch,
    bboxes_batch=bboxes_batch,
    mask_batch=mask_batch,
    keypoints_batch=kpts_batch,
    blabels_batch=blabel_batch,
    klabels_batch=klabel_batch
)

(You can see the complete code in the last section below)

The inputs and output are here:

The BatchCompose expects batched targets as inputs, so the outputs also are batched.
The ForEach is a helper container introduced in this PR. This works to bridge single-image transforms and multi-image transforms.
The Mosaic4 is an example of a multi-image transform.

Note that this example uses all standard targets (image, bboxes, mask, keypoints, and label_fields), and they work as expected.

Another demo demonstrates how powerful the framework is. (You do not need to understand the detail of the transforms).

demo2 = A.BatchCompose(
    [
        A.Repeat([
            A.Repeat([
                A.OneOf([
                    A.ForEach([
                        A.OneOf([A.VerticalFlip(p=1), A.HorizontalFlip(p=1)], p=1),
                        A.Resize(100, 100, p=1),
                    ]),
                    A.ForEach([A.RandomScale(scale_limit=(-0.5, -0.8)), A.ToSepia(p=0.5)]),
                ], p=1),
            ], n=4),
            A.Mosaic4(out_height=200, out_width=200, replace=False, value=(127, 127, 127), out_batch_size=1, p=1),
            A.ForEach([A.CoarseDropout(p=0.7, max_height=20, max_width=20), A.RandomScale(scale_limit=(-0.2, 0.0), p=1)]),
        ], n=4),
        A.ForEach([A.Rotate(limit=30, p=1)]),
        A.Mosaic4(out_height=400, out_width=400, replace=False, value=(127, 127, 127), out_batch_size=1, p=1),
        A.ForEach([
            A.RandomCrop(height=350, width=350, p=1)
        ]),        
    ]
)
image_single = skimage.data.astronaut()
data2 = demo2(image_batch=[image_single])

Note that the input is a single-image batch. The Repeat is another helper container that applies transforms n times and concatenates the output batches.
Combining the ForEach and the Repeat make this batch-based mechanism very powerful and flexible.

How to use

The user should obey the following rule to work with the BatchCompose.
I think complying with these constraints is not so hard and can cover most usecase.

All targets should be batch, and the names should have _batch suffixes. This rule is also applied to the label_fields and additional_targets parameters.
When you use the single-image transforms, you must enclose the transforms by the ForEach.
A multi-image transform should inherit the BatchBasedTransform.

Mosaic augmentation

This PR includes an implementation of the mosaic augmentation.
Implementation is straightforward, except that it is batch-based.
One notable feature I added is the out_batch_size parameter. This allows the user to specify the output's batch size.
Similar behavior can be achieved using a Repeat container, but each internal transform can control the behavior in more detail.
I think supporting out_batch_size can enrich the batch-base transform's flexibility.

Compatibility

The BatchCompose is implemented as a subclass of the Compose.
I have modified the Compose to add customization points but kept the behavior unchanged.

Future work

Now the BatchBasedTransform does not support the functionality of the ReplayCompose.
If this PR can be approved, I will work on this in a different PR.

Currently, Mosaic augmentation is the only example. I hope this can accelerate support for other multi-image transforms.

Full example code

Most lines are for data preparation and visualization. The only important parts are those already quoted.

import random
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
import numpy as np
import cv2
import albumentations as A
import skimage.data

## Helper funcs
def get_sample_data(char):
    IMAGE_SHAPE = (100, 100, 3)    
    image = np.full(IMAGE_SHAPE, 255, dtype=np.uint8)
    cv2.putText(
        image, 
        char, 
        (15, 85), 
        fontFace=cv2.FONT_HERSHEY_SIMPLEX,
        fontScale=3.5,
        color=(0, 0, 0),
        thickness=8,
        lineType=cv2.LINE_AA           
    )
    image_bin = (image < 128).astype(np.uint8)
    contours, _ = cv2.findContours(image_bin[:, :, 0], cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    pts = contours[0].reshape(-1, 2)
    xmin, ymin, xmax, ymax = pts[:, 0].min(), pts[:, 1].min(), pts[:, 0].max(), pts[:, 1].max()
    bboxes = [(xmin, ymin, xmax - xmin, ymax - ymin)]    
    mask = np.full_like(image, 255, dtype=np.uint8)
    cv2.fillPoly(mask, pts.reshape(1, -1, 2), color=(255, 255, 127))
    keypoints = [(x, y) for i, (x, y) in enumerate(contours[0].reshape(-1, 2).tolist()) if i % 10 == 0]
    return image, bboxes, mask, keypoints

def get_sample_batches(chars):    
    image_batch = []
    bboxes_batch = []
    mask_batch = []
    keypoints_batch = []
    blabel_batch = []
    klabel_batch = []
    for char in chars:
        image, bboxes, mask, keypoints = get_sample_data(char)
        image_batch.append(image)
        bboxes_batch.append(bboxes)
        mask_batch.append(mask)
        keypoints_batch.append(keypoints)
        blabel_batch.append(char)
        klabel_batch.append([char for p in keypoints])
    return image_batch, bboxes_batch, mask_batch, keypoints_batch, blabel_batch, klabel_batch

COLOR_MAP = {"A": "red", "B": "blue", "C": "green", "D": "brown"}
def add_bbox(ax, bbox, label):
    x_min, y_min, w, h = bbox[:4]
    pat = Rectangle(xy=(x_min, y_min), width=w, height=h, fill=False, lw=2, color=COLOR_MAP[label], alpha=0.7)
    ax.add_patch(pat)

def visualize_image_and_annotations(image, bboxes, mask, keypoints, blabels, klabels, ax):

    alpha = 0.5
    ax.imshow((alpha * image + (1 - alpha) * mask).astype(np.uint8))
    for i in range(len(bboxes)):
        add_bbox(ax, bboxes[i], blabels[i])
    pts = np.array(keypoints).reshape(-1, 2)
    c = [COLOR_MAP[l] for l in klabels]
    ax.scatter(x=pts[:, 0], y=pts[:, 1], c=c, s=20)
    h, w = image.shape[:2]
    ax.set_xlim(0, w)
    ax.set_ylim(h, 0)


## Make Reproducible
random.seed(1)
np.random.seed(1)
    
## Prepare data
image_batch, bboxes_batch, mask_batch, kpts_batch, blabel_batch, klabel_batch = get_sample_batches("ABCD")
image_single = skimage.data.astronaut()

## Define and Apply
demo1 = A.BatchCompose(
    [
        A.ForEach([A.OneOf([A.VerticalFlip(p=1), A.HorizontalFlip(p=1)], p=1), A.RandomScale(scale_limit=0.2)]),
        A.Mosaic4(out_height=250, out_width=250, replace=False, value=(127, 127, 127), out_batch_size=1, p=1),
        A.ForEach([A.RandomCrop(height=200, width=200, p=1)]),
    ],
    bbox_params=A.BboxParams(format="coco", label_fields=["blabels_batch"]),
    keypoint_params=A.KeypointParams(format="xy", label_fields=["klabels_batch"]),
)
data1 = demo1(
    image_batch=image_batch,
    bboxes_batch=bboxes_batch,
    mask_batch=mask_batch,
    keypoints_batch=kpts_batch,
    blabels_batch=blabel_batch,
    klabels_batch=klabel_batch
)

demo2 = A.BatchCompose(
    [
        A.Repeat([
            A.Repeat([
                A.OneOf([
                    A.ForEach([
                        A.OneOf([A.VerticalFlip(p=1), A.HorizontalFlip(p=1)], p=1),
                        A.Resize(100, 100, p=1),
                    ]),
                    A.ForEach([A.RandomScale(scale_limit=(-0.5, -0.8)), A.ToSepia(p=0.5)]),
                ], p=1),
            ], n=4),
            A.Mosaic4(out_height=200, out_width=200, replace=False, value=(127, 127, 127), out_batch_size=1, p=1),
            A.ForEach([A.CoarseDropout(p=0.7, max_height=20, max_width=20), A.RandomScale(scale_limit=(-0.2, 0.0), p=1)]),
        ], n=4),
        A.ForEach([A.Rotate(limit=30, p=1)]),
        A.Mosaic4(out_height=400, out_width=400, replace=False, value=(127, 127, 127), out_batch_size=1, p=1),
        A.ForEach([
            A.RandomCrop(height=350, width=350, p=1)
        ]),        
    ]
)
data2 = demo2(image_batch=[image_single])

## Visualize

fig, axes = plt.subplots(1, 4, figsize=(10, 3))
axes = axes.flatten()
axes[0].set_title("demo1 input")
for ax in axes:
    ax.set_xticks([])
    ax.set_yticks([])
for i in range(4):
    visualize_image_and_annotations(
        image_batch[i], bboxes_batch[i], mask_batch[i], kpts_batch[i], blabel_batch[i], klabel_batch[i], axes[i]
    )
fig.savefig("./demo1_in.jpg", bbox_inches="tight")

fig, ax = plt.subplots(1, 1, figsize=(5, 5))
ax.set_title("demo1 output")
ax.set_xticks([])
ax.set_yticks([])
visualize_image_and_annotations(
    data1["image_batch"][0],
    data1["bboxes_batch"][0],
    data1["mask_batch"][0],
    data1["keypoints_batch"][0],
    data1["blabels_batch"][0],
    data1["klabels_batch"][0],        
    ax
)
fig.savefig("./demo1_out.jpg", bbox_inches="tight")

fig, ax = plt.subplots(1, 1, figsize=(3, 3))
ax.set_title("demo2 input")
ax.set_xticks([])
ax.set_yticks([])
ax.imshow(image_single)
fig.savefig("./demo2_in.jpg", bbox_inches="tight")
fig, ax = plt.subplots(1, 1, figsize=(5, 5))
ax.set_xticks([])
ax.set_yticks([])
data2 = demo2(image_batch=[image_single])
ax.set_title("demo2 output")
ax.imshow(data2["image_batch"][0])
fig.savefig("./demo2_out.jpg", bbox_inches="tight")

mikel-brostrom · 2023-03-10T07:11:40Z

Wow, this looks fantastic @i-aki-y. Should I adapt MixUp in my PR and make it inherit BatchBasedTransform if this is the new way of working with multi-image augmentations @Dipet?

mikel-brostrom · 2023-03-25T17:15:26Z

Any plans of getting this merged any time soon?

thiagoribeirodamotta · 2023-06-23T20:45:54Z

Any plans of getting this merged any time soon?

I'm also very interested on this matter. Are there any plans on merging this PR?

i-aki-y and others added 6 commits March 8, 2023 13:23

Add batch compose and mosaic augmentation

43c4704

Fix missing the keypoint_mosaic4 declaration

6bc4476

Fix doc tool and Readme

8e6b866

Fix README.md to avoid CI error

8036ba3

Merge branch 'master' into add-batch-compose

e876305

Merge branch 'master' into add-batch-compose

284fa0d

Merge branch 'master' into add-batch-compose

c137e2b

klemen1999 mentioned this pull request Jul 26, 2023

Augmentations refactor luxonis/luxonis-ml#18

Merged

Relax conditions so that image_batch is not required

ee062aa

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A proposal of a framework to handle multi-image augmentation (Including mosaic augmentation) #1420

A proposal of a framework to handle multi-image augmentation (Including mosaic augmentation) #1420

i-aki-y commented Mar 8, 2023

mikel-brostrom commented Mar 10, 2023 •

edited

mikel-brostrom commented Mar 25, 2023

thiagoribeirodamotta commented Jun 23, 2023

A proposal of a framework to handle multi-image augmentation (Including mosaic augmentation) #1420

Are you sure you want to change the base?

A proposal of a framework to handle multi-image augmentation (Including mosaic augmentation) #1420

Conversation

i-aki-y commented Mar 8, 2023

About the PR

How to use

Mosaic augmentation

Compatibility

Future work

Full example code

mikel-brostrom commented Mar 10, 2023 • edited

mikel-brostrom commented Mar 25, 2023

thiagoribeirodamotta commented Jun 23, 2023

mikel-brostrom commented Mar 10, 2023 •

edited