Vectorized interface for bboxes and keypoints #1577

Open
wants to merge 15 commits into
base: main

Dipet
Collaborator

@Dipet Dipet commented Mar 12, 2024

A simple benchmark shows a significant speedup: roughly 4x for pascal_voc and coco, and about 6.6x for yolo.
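To make "vectorized" concrete, here is a minimal sketch of the idea. It is not the code from this PR. The function names `to_coco_loop` and `to_coco_vectorized` are hypothetical, and the only assumption is that boxes arrive as normalized `(x_min, y_min, x_max, y_max)` rows. The first version converts boxes one by one in Python. The second does the same arithmetic on the whole `(N, 4)` array at once.

```python
import numpy as np


def to_coco_loop(bboxes, height, width):
    """Convert normalized (x_min, y_min, x_max, y_max) boxes one at a time."""
    result = []
    for x_min, y_min, x_max, y_max in bboxes:
        result.append(
            (
                x_min * width,
                y_min * height,
                (x_max - x_min) * width,
                (y_max - y_min) * height,
            )
        )
    return result


def to_coco_vectorized(bboxes: np.ndarray, height: int, width: int) -> np.ndarray:
    """Same conversion on an (N, 4) array, with no Python-level loop."""
    scale = np.array([width, height, width, height], dtype=np.float64)
    out = bboxes.astype(np.float64) * scale  # denormalize every coordinate at once
    out[:, 2:] -= out[:, :2]  # (x_max, y_max) -> (box width, box height)
    return out


# Random but valid boxes: top-left corner plus a positive size.
rng = np.random.default_rng(0)
top_left = rng.uniform(0.0, 0.5, size=(1000, 2))
size = rng.uniform(0.01, 0.5, size=(1000, 2))
boxes = np.hstack([top_left, top_left + size])

# Both implementations should agree up to floating-point rounding.
assert np.allclose(to_coco_loop(boxes, 100, 200), to_coco_vectorized(boxes, 100, 200))
```

The per-box work is trivial, so in the loop version most of the time goes to Python interpreter overhead. That overhead is what an array-based interface avoids.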

Benchmark:

import numpy as np
import random
import time
import cv2
import os
from typing import Tuple, Union

import albumentations as A
from albumentations.core.bbox_utils import convert_bboxes_from_albumentations

cv2.setNumThreads(0)
cv2.ocl.setUseOpenCL(False)

os.environ["OMP_NUM_THREADS"] = "1"
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["VECLIB_MAXIMUM_THREADS"] = "1"
os.environ["NUMEXPR_NUM_THREADS"] = "1"


def make_bboxes(count: int) -> np.ndarray:
    def make_bbox() -> Tuple[float, float, float, float, Union[float, str]]:
        x1, y1 = random.uniform(0, 1 - 2e-3), random.uniform(0, 1 - 2e-3)
        x2 = random.uniform(x1 + 1e-3, 1 - 1e-3)
        y2 = random.uniform(y1 + 1e-3, 1 - 1e-3)

        label = random.randint(int(-1e9), int(1e9))
        if random.random() > 0.5:
            label = str(label)

        return x1, y1, x2, y2, label

    result = [make_bbox() for _ in range(count)]
    return np.array(result, dtype=object)


def main():
    h, w = 100, 100
    image = np.empty([h, w, 3], dtype=np.uint8)
    formats = ["pascal_voc", "coco", "yolo"]
    bboxes_count = 1_000
    bboxes = make_bboxes(bboxes_count)
    total_iters = 1000

    for format_name in formats:
        cur_bboxes = np.copy(bboxes)
        cur_bboxes[:, :4] = convert_bboxes_from_albumentations(
            bboxes[:, :4].astype(float), format_name, h, w, check_validity=True
        )
        transforms = A.Compose([], bbox_params=A.BboxParams(format_name))

        s = time.time()
        for i in range(total_iters):
            transforms(image=image, bboxes=cur_bboxes)
        dt = time.time() - s
        print(f"format: {format_name} samples per seconds: {total_iters * bboxes_count / dt:.3f}")


if __name__ == '__main__':
    main()

Results:

numpy vectorized
format: pascal_voc samples per seconds: 681654.392
format: coco samples per seconds: 669183.501
format: yolo samples per seconds: 678640.870

current main
format: pascal_voc samples per seconds: 173860.101
format: coco samples per seconds: 167347.072
format: yolo samples per seconds: 102999.074
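
For reference, the per-format speedup can be worked out directly from the two result sets above. The snippet only divides the reported throughputs and measures nothing itself:

```python
# Throughput figures (bbox samples per second) copied from the results above.
vectorized = {"pascal_voc": 681654.392, "coco": 669183.501, "yolo": 678640.870}
main = {"pascal_voc": 173860.101, "coco": 167347.072, "yolo": 102999.074}

# Speedup = vectorized throughput / current-main throughput, per format.
speedup = {name: vectorized[name] / main[name] for name in vectorized}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x")
# pascal_voc: 3.92x
# coco: 4.00x
# yolo: 6.59x
```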

Result of bbox and keypoint benchmark from current PR

Bboxes
                      imgaug albumentations (np transforms) albumentations (main)
HorizontalFlip         0.151                          0.218                  0.91
VerticalFlip           0.148                          0.218                 0.902
Flip                   0.169                           0.23                 0.913
Rotate                 0.724                          1.211                 1.899
SafeRotate                 -                           1.68                 1.544
RandomRotate90         0.729                          0.251                 0.943
ShiftScaleRotate           -                          1.253                 1.943
Transpose                  -                           0.26                 0.944
Pad                    1.191                          1.283                 1.213
Perspective            2.471                          0.444                 3.931
RandomCropNearBBox         -                          1.071                 0.628
BBoxSafeRandomCrop         -                          1.862                 1.732
CenterCrop                 -                          1.065                 0.785
Crop                   2.942                          1.068                 0.978
CropAndPad                 -                          1.097                 1.017
RandomCropFromBorders      -                          1.057                 0.975
Affine                 0.728                          3.875                 3.716
PiecewiseAffine      267.564                         139.82               134.556
Sequence               5.133                          6.926                 7.593


Keypoints
                       imgaug albumentations (np transform) albumentations (main)
HorizontalFlip          0.131                         0.222                  0.28
VerticalFlip            0.112                         0.211                 0.272
Flip                    0.127                         0.231                 0.292
Rotate                  0.386                         0.476                  1.21
SafeRotate                  -                         0.755                 0.894
RandomRotate90          0.197                         0.256                 0.316
ShiftScaleRotate            -                         0.487                 1.271
Transpose                   -                         0.219                 0.275
Pad                     0.341                         0.447                 0.495
Perspective             0.778                         0.334                 1.081
RandomCropNearBBox          -                         0.192                 0.168
CenterCrop                  -                         0.198                 0.173
Crop                    0.724                         0.184                 0.242
CropAndPad                  -                          0.26                 0.497
RandomCropFromBorders       -                         0.192                 0.239
Affine                  0.373                          2.29                 2.799
PiecewiseAffine        77.803                        87.765                76.805
Sequence                1.975                         3.237                 3.792

@Dipet
Collaborator Author

Dipet commented Mar 12, 2024

Need to compare with #1396

@ternaus
Collaborator

ternaus commented Mar 13, 2024

Torchvision augmentations support boxes, while imgaug is not supported.

I suspect people would be more interested in how Albumentations compares with the popular Torchvision.

@Dipet Dipet marked this pull request as ready for review March 24, 2024 20:58
Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey @Dipet - I've reviewed your changes and they look great!

General suggestions:

  • Ensure the new vectorized interface is well-documented, including examples of how to use it.
  • Consider backward compatibility with the previous interface to ease the transition for existing users.
  • Perform thorough testing with real-world datasets to ensure the enhancements do not introduce any regressions.
Here's what I looked at during the review
  • 🟡 General issues: 4 issues found
  • 🟢 Security: all looks good
  • 🟡 Testing: 4 issues found
  • 🟡 Complexity: 1 issue found
  • 🟢 Docstrings: all looks good
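
One way to approach the regression-testing suggestion above is an equivalence test between a simple per-box reference and the array-based version. This is only a sketch under stated assumptions. `bbox_hflip_reference` and `bboxes_hflip_vectorized` are hypothetical names, not functions from this PR, and boxes are assumed to be normalized `(x_min, y_min, x_max, y_max)` rows.

```python
import numpy as np


def bbox_hflip_reference(bbox):
    """Per-box horizontal flip of a normalized (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = bbox
    return (1 - x_max, y_min, 1 - x_min, y_max)


def bboxes_hflip_vectorized(bboxes: np.ndarray) -> np.ndarray:
    """Horizontal flip of an (N, 4) array of normalized boxes, without a loop."""
    flipped = bboxes.copy()
    flipped[:, 0] = 1 - bboxes[:, 2]  # new x_min comes from the old x_max
    flipped[:, 2] = 1 - bboxes[:, 0]  # new x_max comes from the old x_min
    return flipped


def test_vectorized_hflip_matches_reference():
    rng = np.random.default_rng(42)
    # Sorting each coordinate pair guarantees min <= max for every box.
    x = np.sort(rng.uniform(0, 1, size=(500, 2)), axis=1)
    y = np.sort(rng.uniform(0, 1, size=(500, 2)), axis=1)
    bboxes = np.column_stack([x[:, 0], y[:, 0], x[:, 1], y[:, 1]])

    expected = np.array([bbox_hflip_reference(b) for b in bboxes])
    np.testing.assert_allclose(bboxes_hflip_vectorized(bboxes), expected)


test_vectorized_hflip_matches_reference()
```

The same pattern (keep the old scalar function as the oracle, compare on random inputs) would extend to other transforms and to keypoints.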


albumentations/core/bbox_utils.py (resolved)
albumentations/core/bbox_utils.py (resolved)
Comment on lines +75 to +81
@property
def values_dim(self) -> int:
    return DATA_DIM

@property
def internal_type(self) -> Optional[Type[DataWithLabels]]:
    return None
Contributor

suggestion (code_clarification): Consider explaining the purpose of returning None for internal_type in DataProcessor.

benchmark/benchmark_keypoints.py (outdated, resolved)
benchmark/benchmark_bbox.py (outdated, resolved)
tests/test_keypoint.py (resolved)
tests/test_transforms.py (resolved)
albumentations/core/bbox_utils.py (resolved)
albumentations/core/keypoints_utils.py (resolved)
benchmark/benchmark_keypoints.py (resolved)
@Dipet
Collaborator Author

Dipet commented Mar 24, 2024

Updated benchmark results, in images/sec (higher is better)

Bboxes

                imgaug albumentations torchvision
HorizontalFlip   10214           4394       11750
VerticalFlip     12331           4628       22442
Rotate            1489            835        2514
RandomRotate90    1662           3954           -
Pad                910            796           -
Perspective        432           2279        1893
Crop               362            954           -
Affine            1525            447        2654
PiecewiseAffine      3              7           -
Sequence           197            184           -

Keypoints

                      imgaug albumentations
HorizontalFlip          7833           4545
VerticalFlip            9179           4940
Flip                    7881           4291
Rotate                  2625           2101
SafeRotate                 -           1340
RandomRotate90          5165           3958
ShiftScaleRotate           -           2047
Transpose                  -           4621
Pad                     2806           2252
Perspective             1280           3044
RandomCropNearBBox         -           5284
CenterCrop                 -           5143
Crop                    1409           5617
CropAndPad                 -           3878
RandomCropFromBorders      -           5247
Affine                  2825            510
PiecewiseAffine           12             12
Sequence                 492            318
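
For anyone who wants to reproduce numbers of this kind, here is a rough sketch of how an images/sec figure can be measured. This is an assumption about the general approach, not the benchmark script from this PR. `images_per_second` is a hypothetical helper, and the workload is a stand-in NumPy operation rather than a real transform.

```python
import time
from typing import Callable

import numpy as np


def images_per_second(run_once: Callable[[], object], n_images: int = 1000, warmup: int = 10) -> float:
    """Call `run_once` n_images times and return the measured throughput."""
    for _ in range(warmup):  # warm caches before timing starts
        run_once()
    start = time.perf_counter()  # monotonic, high-resolution timer
    for _ in range(n_images):
        run_once()
    elapsed = time.perf_counter() - start
    return n_images / elapsed


# Stand-in workload: one "image" worth of 100 boxes, scaled in a single array op.
boxes = np.random.default_rng(0).uniform(0, 1, size=(100, 4))
rate = images_per_second(lambda: boxes * 2.0)
print(f"{rate:.0f} images/sec")
```

Absolute values depend heavily on the machine, so only ratios between libraries measured on the same hardware are meaningful.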
