DetectionDataset.as_coco having annotation id collision between splits #768

adbcode · 2024-01-23T12:26:22Z

Search before asking

I have searched the Supervision issues and found no similar bug report.

Bug

When exporting a YOLOv8-formatted dataset to COCO format (using the code below), the JSON file for each split uses its own sequence for assigning annotation ID values.

This causes issues when trying to import the output dataset with other libraries, which expect a unique ID across all splits for each annotation.
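To make the collision concrete, here is a minimal sketch that checks already-parsed COCO dictionaries for annotation IDs shared between splits. The helper name `find_colliding_annotation_ids` and the tiny sample splits are hypothetical and only illustrate the check; they are not part of supervision.

```python
from collections import defaultdict


def find_colliding_annotation_ids(splits):
    """Return {annotation_id: [split names]} for IDs that appear in more than one split.

    `splits` maps a split name to an already-parsed COCO dictionary.
    """
    seen = defaultdict(set)
    for split_name, coco in splits.items():
        for annotation in coco.get("annotations", []):
            seen[annotation["id"]].add(split_name)
    return {
        ann_id: sorted(names) for ann_id, names in seen.items() if len(names) > 1
    }


# Hypothetical miniature splits whose numbering restarts in each file.
train = {"annotations": [{"id": 1, "image_id": 1}, {"id": 2, "image_id": 1}]}
valid = {"annotations": [{"id": 1, "image_id": 1}]}

collisions = find_colliding_annotation_ids({"train": train, "valid": valid})
print(collisions)  # {1: ['train', 'valid']}
```

An empty result means every annotation ID is unique across the splits that were passed in.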

Example from dataset with train, valid and test splits:

test
train
valid

Kindly consider using a common sequence when generating annotation IDs for a dataset across splits.

Environment

Supervision 0.16.0

Minimal Reproducible Example

import supervision as sv

yolo = sv.DetectionDataset.from_yolo(
    images_directory_path=f"{dataset_root}/images",
    annotations_directory_path=f"{dataset_root}/labels",
    data_yaml_path=f"{dataset_root}/data.yaml",
    force_masks=True
)

yolo.as_coco(
    images_directory_path=f"{target}/images",
    annotations_path=f"{target}/annotations.json"
)

Additional

No response

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!

SkalskiP · 2024-01-23T14:18:27Z

Hi @adbcode 👋🏻 Thanks a lot for your interest in supervision. How does that influence your workflow?

adbcode · 2024-01-24T18:01:12Z

> Hi @adbcode 👋🏻 Thanks a lot for your interest in supervision. How does that influence your workflow?

Hello! When the exported dataset is used with other libraries, especially those that expect a unique ID for each annotation, it gets corrupted on import.

The current workaround is to recreate the IDs during import, but that loses the original order.
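For reference, one way such a renumbering workaround could look is sketched below. It is an assumption about the general approach, not the code actually used here: the function name `renumber_annotations` is made up, and it simply draws every annotation ID from one shared counter, so relative order within each split is kept but the originally exported ID values are not.

```python
import copy


def renumber_annotations(splits, start=1):
    """Return copies of the COCO dicts with annotation IDs taken from one shared counter."""
    next_id = start
    renumbered = {}
    for split_name, coco in splits.items():
        coco = copy.deepcopy(coco)  # leave the caller's dictionaries untouched
        for annotation in coco.get("annotations", []):
            annotation["id"] = next_id
            next_id += 1
        renumbered[split_name] = coco
    return renumbered


# Hypothetical splits with restarted numbering.
original = {
    "train": {"annotations": [{"id": 1}, {"id": 2}]},
    "valid": {"annotations": [{"id": 1}]},
}
fixed = renumber_annotations(original)
print([a["id"] for a in fixed["train"]["annotations"]])  # [1, 2]
print([a["id"] for a in fixed["valid"]["annotations"]])  # [3]
```

The splits are processed in the order they are passed in, so the same order has to be used on every import to get the same IDs back.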

SkalskiP · 2024-01-26T08:37:37Z

@adbcode, would something like this satisfy you?

1. Merging YOLO splits.
2. Converting YOLO to COCO.
3. Splitting COCO into subsets while preserving ID continuity.

adbcode · 2024-01-26T08:45:08Z

> @adbcode, would something like this satisfy you?
>
> 1. Merging YOLO splits.
> 2. Converting YOLO to COCO.
> 3. Splitting COCO into subsets while preserving ID continuity.

This will be fine as long as we can recreate the original split in the end.
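A rough sketch of the merge, number once, then split idea is below, with the original split recoverable at the end. It works on plain COCO-style dictionaries as a stand-in for the real conversion; `merge_splits` and `split_by_membership` are hypothetical helpers, not supervision functions, and categories are left out for brevity.

```python
def merge_splits(splits):
    """Merge COCO dicts into one, giving images and annotations dataset-wide IDs.

    Returns the merged dict plus {split name: set of new image IDs}, which is
    enough to recreate the original split afterwards.
    """
    merged = {"images": [], "annotations": []}
    membership = {}
    next_image_id, next_annotation_id = 1, 1
    for split_name, coco in splits.items():
        image_id_map = {}  # old per-split image ID -> new dataset-wide image ID
        for image in coco["images"]:
            image_id_map[image["id"]] = next_image_id
            merged["images"].append({**image, "id": next_image_id})
            next_image_id += 1
        membership[split_name] = set(image_id_map.values())
        for annotation in coco["annotations"]:
            merged["annotations"].append({
                **annotation,
                "id": next_annotation_id,
                "image_id": image_id_map[annotation["image_id"]],
            })
            next_annotation_id += 1
    return merged, membership


def split_by_membership(merged, membership):
    """Split a merged COCO dict back into subsets without changing any ID."""
    return {
        split_name: {
            "images": [i for i in merged["images"] if i["id"] in image_ids],
            "annotations": [
                a for a in merged["annotations"] if a["image_id"] in image_ids
            ],
        }
        for split_name, image_ids in membership.items()
    }


# Hypothetical splits whose image and annotation numbering both restart.
train = {
    "images": [{"id": 1, "file_name": "a.jpg"}],
    "annotations": [{"id": 1, "image_id": 1}, {"id": 2, "image_id": 1}],
}
valid = {
    "images": [{"id": 1, "file_name": "b.jpg"}],
    "annotations": [{"id": 1, "image_id": 1}],
}

merged, membership = merge_splits({"train": train, "valid": valid})
restored = split_by_membership(merged, membership)
```

Because the IDs are assigned once on the merged dataset, the subsets written out afterwards share a single sequence, and the recorded membership lets the train/valid/test layout be rebuilt.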

adbcode added the bug Something isn't working label Jan 23, 2024

SkalskiP self-assigned this Jan 23, 2024
